How can we generalize findings from specific studies to other populations, outcomes, treatments, and contexts? This long-standing question of external validity is fundamental because social scientists are almost always interested in general theories of politics, economics, and society, even though we can only use specific cases and data to test these theories.
In this research program, I develop a series of empirical strategies to improve external validity:

- A formal framework for systematically discussing external validity (Egami and Hartman 2023)
- Research design strategies that enable researchers to explicitly design studies with external validity in mind (Egami and Lee 2024; de la Cuesta, Egami, and Imai 2022; Egami et al. 2024)
- Data analysis methods that help researchers assess external validity even when the original data collection was designed primarily for internal validity (Devaux and Egami 2022; Egami and Hartman 2021; Huang et al. 2023)
Understanding network and spatial interactions is fundamental in the social sciences, where units of interest are often embedded in networks and physical space.
In this research program, I develop methods to identify and estimate causal effects from network and spatial data.
Specifically, I have worked on identifying causal effects from observational data in the presence of unmeasured network and spatial confounding (Egami 2024; Egami and Tchetgen Tchetgen 2024), as well as from network experiments with interference (Egami 2021).
We live in an era of rapid advances in machine learning (ML), large language models (LLMs), and artificial intelligence (AI). However, these technologies were not developed with social science questions in mind, and, as a result, naïve, direct applications of these tools can bias our statistical inferences.
In this research program, I develop frameworks and methodologies to harness ML and AI for social science questions while maintaining statistical validity.
My recent research focuses on how to use LLMs in the social sciences with statistical guarantees (Egami et al. 2023; Egami et al. 2024). In past work, I proposed a framework for making causal inferences with texts (Egami et al. 2022) and developed a regularized ML regression method to estimate causal interactions in randomized experiments (Egami and Imai 2019).