Causal inference has received increasing interest from both academia and industry across disciplines in recent decades. Rapid development in artificial intelligence (AI) and machine learning (ML) has facilitated the approximation of arbitrary relationships underlying large and complex data. However, ML-based predictions that do not account for endogeneity, stability, adaptability, and interpretability may drive misleading decisions in real-world practice. ML models can be trustworthy and relevant for decision making only if they reflect the true causal relationships among observables. To obtain effective data-driven decision rules, more and more researchers are turning their focus from mining associations to revealing causation under a proper and honest causal inference framework.
In technology companies, thousands of experiments (A/B tests) are run to justify the business impact of new product or feature launches. Practical challenges include: how long experiments should run to attain certain business goals with statistical significance; how to reduce the required sample sizes (or, equivalently, improve statistical power at a given sample size); how to minimize the negative impact of experiments on the existing system while addressing as many impact questions as possible; and how to eliminate confounding and interference so that experimental conclusions remain relevant. Besides experiments, companies are also taking automated actions on, and receiving feedback from, their customers via huge data streams. One important operational goal is to improve these automated systems to optimize certain managerial metrics based on historical observational data. This generates an unprecedented number of causal learning problems, as these companies seek to understand (policy evaluation) and optimize (policy learning) counterfactual actions from data.
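As a rough illustration of the sample-size question above, the following is a minimal sketch, assuming a two-arm experiment with equal allocation, a continuous metric with a known common standard deviation, and a two-sided z-test. The function name `sample_size_per_arm` and the numeric inputs are hypothetical and chosen for illustration only.

```python
import math
from statistics import NormalDist


def sample_size_per_arm(effect, sd, alpha=0.05, power=0.8):
    """Approximate units needed in EACH arm of a two-arm experiment.

    Assumes equal allocation, a known common standard deviation `sd`,
    and a two-sided z-test for a difference in means of size `effect`.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value of the test
    z_beta = NormalDist().inv_cdf(power)           # quantile for the target power
    n = 2 * ((z_alpha + z_beta) * sd / effect) ** 2
    return math.ceil(n)


# Hypothetical inputs: detect a 0.1 shift in a metric with standard deviation 1.0.
n_baseline = sample_size_per_arm(effect=0.1, sd=1.0)

# Halving the outcome standard deviation (for example through covariate
# adjustment) cuts the required sample size roughly by a factor of four.
n_reduced = sample_size_per_arm(effect=0.1, sd=0.5)

print(n_baseline, n_reduced)
```

Under these assumptions the required sample size grows with the outcome variance and shrinks with the square of the effect size, which is one reason variance reduction can shorten experiments.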
In this course, we will cover experiments and their analysis, observational data and their analysis, policy evaluation and learning to assist causal decision making, unmeasured confounding in observational data, and panel data and their analysis. The target audience is graduate students with some quantitative research background and coding experience in R or Python. Graduate-level courses in mathematical statistics (STAT 52800), advanced econometrics (ECON 67100, 67200), or their equivalents are helpful prerequisites for this course.
Understand causal concepts and problems (association versus causation)
Understand various experimental designs and their statistical analyses (estimation, statistical inference)
Apply ML-based prediction methods properly to obtain valid and interpretable causal estimates from observational data
Perform policy evaluation and learning to assist causal decision making based on observational data
Understand the challenges due to unmeasured confounding in observational studies. Apply instrumental variable (IV) methods to circumvent these challenges
Understand panel data with multi-period outcomes and the identification assumptions for interpretable causal effects. Perform statistical analyses (estimation, statistical inference) on panel data
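To make the distinction between association and causation concrete, here is a small simulated sketch. It assumes a single binary confounder, a treatment whose probability depends on that confounder, and a true treatment effect of 1.0. All names and numbers are hypothetical, and stratification is used only as one simple adjustment strategy, not as the course's prescribed method.

```python
import random


def simulate(n=20000, seed=0):
    """Simulate (x, t, y) rows where x confounds both treatment and outcome."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        x = 1 if rng.random() < 0.5 else 0           # binary confounder
        p_treat = 0.8 if x == 1 else 0.2             # treatment depends on x
        t = 1 if rng.random() < p_treat else 0
        y = 1.0 * t + 2.0 * x + rng.gauss(0.0, 1.0)  # true effect of t is 1.0
        rows.append((x, t, y))
    return rows


def mean(values):
    return sum(values) / len(values)


def naive_difference(rows):
    """Raw difference in mean outcomes between treated and control units."""
    treated = [y for _, t, y in rows if t == 1]
    control = [y for _, t, y in rows if t == 0]
    return mean(treated) - mean(control)


def stratified_difference(rows):
    """Difference in means within each level of x, weighted by stratum size."""
    n = len(rows)
    estimate = 0.0
    for level in (0, 1):
        stratum = [(t, y) for x, t, y in rows if x == level]
        treated = [y for t, y in stratum if t == 1]
        control = [y for t, y in stratum if t == 0]
        estimate += (len(stratum) / n) * (mean(treated) - mean(control))
    return estimate


rows = simulate()
naive = naive_difference(rows)
adjusted = stratified_difference(rows)
print(f"naive: {naive:.2f}, adjusted: {adjusted:.2f}")
```

In this setup the naive comparison mixes the treatment effect with the effect of the confounder, while the stratified estimate should land close to the true value of 1.0 because the confounder is observed and adjusted for.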
Ding, P. (2024). A First Course in Causal Inference. Chapman and Hall/CRC. DOI: 10.1201/9781003484080. Preprint also available at arXiv:2305.18793.
Wager, S. (2024). Causal Inference: A Statistical Learning Approach. Lecture notes. URL: web.stanford.edu/~swager/causal_inf_book.pdf.
Chernozhukov, V., Hansen, C., Kallus, N., Spindler, M., and Syrgkanis, V. (2024). Applied Causal Inference Powered by ML and AI. CausalML-book.org. Preprint also available at arXiv:2403.02467.
Golub Capital Social Impact Lab (2023). Machine Learning-Based Causal Inference Tutorial. URL: bookdown.org/stanfordgsbsilab/ml-ci-tutorial.
Imbens, G., and Rubin, D. (2015). Causal Inference for Statistics, Social, and Biomedical Sciences: An Introduction. Cambridge: Cambridge University Press. DOI: 10.1017/CBO9781139025751.
Hernán, M. A., and Robins, J. M. (2020). Causal Inference: What If. Boca Raton: Chapman & Hall/CRC. URL: www.hsph.harvard.edu/miguel-hernan/causal-inference-book/.
Workshop on Research Design for Causal Inference: www.law.northwestern.edu/research-faculty/events/conferences/causalinference/.
Guo, R., Cheng, L., Li, J., Hahn, P. R., and Liu, H. (2021). "A Survey of Learning Causality with Data: Problems and Methods." ACM Computing Surveys. 53(4):1-37. DOI: 10.1145/3397269. GitHub repository: github.com/rguo12/awesome-causality-algorithms/.
Imbens, G. W. (2022). "Causality in Econometrics: Choice vs Chance." Econometrica. 90(6):2541-2566. DOI: 10.3982/ECTA21204.