# Stratification for Synthetic Purposive Sampling

`stratify_sps.Rd`

Stratification for Synthetic Purposive Sampling

## Arguments

- X
Site-level variables for the target population of sites. Row names should be names of sites. X cannot contain missing data.

- num_site
A list of two elements, e.g., `list("at least", 1)`. This argument specifies the number of sites that should satisfy the `condition` specified below. The first element should be either `at least` or `at most`. The second element is an integer. For example, `list("at least", 1)` means that we stratify SPS such that we select *at least 1* site that satisfies `condition`.

- condition
A list of three elements, e.g., `list("GDP", "larger than or equal to", 1)`. This argument specifies the condition for stratification. The first element should be the name of a site-level variable. The second element should be one of `larger than or equal to`, `smaller than or equal to`, or `between`. The third element is a vector of length 1 or 2. When the second element is `between`, the third element should be a vector of two values. For example, `list("GDP", "larger than or equal to", 1)` means that we stratify SPS such that we select `num_site` sites that have *GDP larger than or equal to 1*.
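
To make the shape of these arguments concrete, here is a minimal base R sketch. The toy data, the variable `Pop`, and the helper `satisfies()` are hypothetical and not part of the package; treating `between` as inclusive at both ends is also an assumption. The helper only illustrates which sites a given `condition` would pick out.

```r
# Toy site-level data: row names are site names, no missing values.
X <- data.frame(
  GDP = c(0.5, 1.2, 2.0, 0.8),
  Pop = c(10, 25, 40, 15),
  row.names = c("A", "B", "C", "D")
)

# Argument values in the documented shape.
num_site  <- list("at least", 1)
condition <- list("GDP", "larger than or equal to", 1)

# Hypothetical helper (not the package's code): flag sites meeting `condition`.
satisfies <- function(X, condition) {
  x <- X[, condition[[1]]]   # the site-level variable
  v <- condition[[3]]        # threshold(s), length 1 or 2
  switch(condition[[2]],
    "larger than or equal to"  = x >= v[1],
    "smaller than or equal to" = x <= v[1],
    "between"                  = x >= v[1] & x <= v[2],  # assumed inclusive
    stop("unknown comparison: ", condition[[2]])
  )
}

eligible <- satisfies(X, condition)
rownames(X)[eligible]  # sites with GDP >= 1
```

With these inputs, `num_site` and `condition` together would ask for at least one selected site among those flagged in `eligible`.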

## Value

`stratify_sps` returns an object of `stratify_sps` class, which we supply to `sps()`.

- `C`: A matrix on the left-hand side of linear constraints. The number of columns is the number of sites in the target population (= `nrow(X)`) and the number of rows is the number of constraints.
- `c0`: A vector on the right-hand side of linear constraints. The length is the number of constraints.
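
As a rough illustration of how a stratification rule can be written as a linear constraint, the sketch below builds a one-row `C` and a length-one `c0` by hand for the rule "at least 1 site with GDP of 1 or more". The sign convention (`C %*% s >= c0`), the toy data, and the helper `meets()` are assumptions made for illustration; the package may encode its constraints differently.

```r
site_names <- c("A", "B", "C", "D")
gdp <- c(0.5, 1.2, 2.0, 0.8)

# One constraint row with one column per site:
# 1 if the site has GDP >= 1, 0 otherwise.
C  <- matrix(as.numeric(gdp >= 1), nrow = 1,
             dimnames = list(NULL, site_names))
c0 <- 1  # right-hand side: at least one such site

# s is a 0/1 selection vector over sites (hypothetical helper).
meets <- function(s) all(C %*% s >= c0)

meets(c(1, 0, 0, 1))  # selects A and D, neither has GDP >= 1
meets(c(1, 1, 0, 0))  # selects A and B, B has GDP >= 1
```

Here `ncol(C)` equals the number of sites and `length(c0)` equals `nrow(C)`, matching the dimensions described above.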

## References

Egami and Lee. (2023+). Designing Multi-Context Studies for External Validity: Site Selection via Synthetic Purposive Sampling. Available at https://naokiegami.com/paper/sps.pdf.