Stratification for Synthetic Purposive Sampling
Arguments
- X
- Site-level variables for the target population of sites. Row names should be names of sites. X cannot contain missing data. 
- num_site
- A list of two elements, e.g., list("at least", 1). This argument specifies the number of sites that should satisfy condition (specified below). The first element should be either "at least" or "at most". The second element is an integer. For example, list("at least", 1) means that we stratify SPS such that we select *at least 1* site that satisfies condition.
- condition
- A list of three elements, e.g., list("GDP", "larger than or equal to", 1). This argument specifies the condition for stratification. The first element should be the name of a site-level variable. The second element should be either "larger than or equal to", "smaller than or equal to", or "between". The third element is a vector of length 1 or 2. When the second element is "between", the third element should be a vector of two values. For example, list("GDP", "larger than or equal to", 1) means that we stratify SPS such that we select num_site sites that have *GDP larger than or equal to 1*.
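The three arguments can be put together as in the sketch below. The data values and site names are made up for illustration, and the eligibility check is computed by hand in base R; it is not the package's own code. The final call is left commented out because it needs the package that provides stratify_sps to be installed.

```r
# Toy site-level data; all values and site names are hypothetical.
X <- data.frame(
  GDP        = c(0.5, 1.2, 2.0, 0.8),
  population = c(10, 25, 40, 15),
  row.names  = c("SiteA", "SiteB", "SiteC", "SiteD")
)
stopifnot(!anyNA(X))  # X cannot contain missing data

# "Select at least 1 site whose GDP is larger than or equal to 1"
num_site  <- list("at least", 1)
condition <- list("GDP", "larger than or equal to", 1)

# Hand-computed list of sites that meet the condition (illustration only)
eligible <- rownames(X)[X[[condition[[1]]]] >= condition[[3]]]
print(eligible)

# With the package loaded, the call would take these same arguments:
# st <- stratify_sps(X = X, num_site = num_site, condition = condition)
```

For a "between" condition, the third element would instead be a vector of two values, e.g., list("GDP", "between", c(1, 2)).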
Value
stratify_sps returns an object of class stratify_sps, which we supply to sps(). The object contains:
- C: A matrix on the left-hand side of linear constraints. The number of columns is the number of sites in the target population (= nrow(X)) and the number of rows is the number of constraints.
- c0: A vector on the right-hand side of linear constraints. The length is the number of constraints.
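To make the shapes of C and c0 concrete, here is a minimal base R sketch of one constraint, "at least 1 site with GDP larger than or equal to 1", over four hypothetical sites. It assumes the constraint is written as C %*% s >= c0, where s is a 0/1 selection vector; the package may use a different sign or direction convention internally, so treat this as an illustration of the dimensions rather than of the actual implementation.

```r
# Hypothetical GDP values for four sites in the target population
gdp <- c(SiteA = 0.5, SiteB = 1.2, SiteC = 2.0, SiteD = 0.8)

# One constraint row: 1 for sites meeting the condition, 0 otherwise.
# Columns correspond to sites, rows to constraints.
C  <- matrix(as.numeric(gdp >= 1), nrow = 1,
             dimnames = list(NULL, names(gdp)))
c0 <- 1  # right-hand side: "at least 1" such site (assumed convention)

# A candidate selection of sites (1 = selected, 0 = not selected)
s <- c(SiteA = 1, SiteB = 0, SiteC = 1, SiteD = 0)

# Check the assumed constraint form C %*% s >= c0
satisfied <- all(C %*% s >= c0)
print(dim(C))
print(satisfied)
```

Here the selection includes SiteC, which meets the condition, so the constraint holds under the assumed convention.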
References
Egami and Lee. (2023+). Designing Multi-Context Studies for External Validity: Site Selection via Synthetic Purposive Sampling. Available at https://naokiegami.com/paper/sps.pdf.