Site-level Cross-Validation for Synthetic Purposive Sampling

Usage

sps_cv(
  out = NULL,
  estimates_selected = NULL,
  K = 2,
  max_iter = 100,
  seed = 1234
)

Arguments

out

Output from the function sps().

estimates_selected

data.frame with two columns: the first column contains the estimates of the site-specific ATEs for the selected sites, and the second column contains their corresponding standard errors. The number of rows must equal the number of selected sites, and rownames(estimates_selected) should be the names of the selected sites.

K

(Default = 2) Number of folds used in the cross-validation.

max_iter

(Default = 100) Number of times the function repeats the K-fold cross-validation.

seed

Numeric. Random seed used internally. Default = 1234.
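
The following is a minimal sketch of how the arguments above fit together. It builds a data.frame in the layout described for estimates_selected and then shows one plausible way that K, max_iter, and seed could drive a repeated K-fold split of the selected sites. The site names, the numbers, the column names, and the fold-splitting logic are all assumptions made for illustration; they are not taken from the internals of sps_cv().

```r
# Illustrative sketch only: site names and numbers are made up, and the
# fold-splitting logic is an assumption, not the internals of sps_cv().

# estimates_selected: column 1 = site-specific ATE estimates,
# column 2 = standard errors, row names = names of the selected sites.
# The column names are hypothetical; only the column order is documented.
estimates_selected <- data.frame(
  estimate  = c(0.12, -0.05, 0.30, 0.08, 0.21, 0.02),
  std_error = c(0.04,  0.06, 0.10, 0.05, 0.07, 0.03),
  row.names = c("SiteA", "SiteB", "SiteC", "SiteD", "SiteE", "SiteF")
)

K        <- 2     # number of folds
max_iter <- 3     # number of repetitions (kept small here; the default is 100)
seed     <- 1234  # fixes the random fold assignment for reproducibility

set.seed(seed)
sites <- rownames(estimates_selected)

# For each repetition, shuffle balanced fold labels and group sites by fold.
fold_assignments <- lapply(seq_len(max_iter), function(i) {
  fold_id <- sample(rep(seq_len(K), length.out = length(sites)))
  split(sites, fold_id)
})

# Inspect the folds from the first repetition.
str(fold_assignments[[1]])
```

Each element of fold_assignments is a list of K character vectors that together cover every selected site exactly once. Repeating the split max_iter times averages over the randomness of any single partition, and fixing the seed makes the sequence of partitions reproducible.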

Value

sps_cv returns two objects.

  • p_value: P-value for the null hypothesis that the average-site ATE among non-selected sites is equal to the weighted average estimator of the average-site ATE based on site-specific ATE estimates in selected sites.

  • internal: Objects useful for internal use of the function.

References

Egami and Lee (2023+). Designing Multi-Context Studies for External Validity: Site Selection via Synthetic Purposive Sampling. Available at https://naokiegami.com/paper/sps.pdf.