A Tour of Survival Analysis, from Classical to Modern

This tutorial is part of the 2021 SIGMETRICS conference taking place virtually on June 14-18. [official SIGMETRICS tutorial page]

A previous version of this tutorial (co-taught with Jeremy C. Weiss) with more of a healthcare focus was presented at the 2020 Conference on Health, Inference, and Learning. [previous tutorial webpage]

Presenter

George H. Chen, Assistant Professor of Information Systems, Carnegie Mellon University
(georgechen [at symbol] cmu.edu)

Abstract

Predicting time-to-event outcomes arises in all sorts of applications, whether we want to know how long until a device fails, how long a patient will stay in the hospital, or when a convicted criminal is likely to re-offend. These time-to-event prediction problems have been studied for decades, largely in the statistics and medical communities, within the field of survival analysis. Only recently has survival analysis been explored more by machine learning researchers, with a number of significant methodological advances that take advantage of neural nets. This tutorial goes over the fundamentals of survival analysis, including its basic problem formulation and how it is commonly used, before highlighting some of the recent machine learning innovations and open challenges.

Tutorial Material

The tutorial slides and supplemental notes are available here:
[video] [slides] [supplemental note: "An Introduction to Survival Analysis Math"]

A more polished & extensive version of the experiments shown in the tutorial can be found here, with code:
[paper on deep kernel survival analysis] [code]

Kvamme and Borgan's paper gives a nice introduction to the math for continuous and discrete-time survival models as they're used with neural nets: "Continuous and Discrete-Time Survival Prediction with Neural Networks"

Some Software Packages

We highlight some software packages here. More links to code are available in the references below associated with their specific papers.

Python

  • lifelines -- Kaplan-Meier, Cox model and regularized variants, Weibull AFT, Aalen additive model

  • glmnet_python -- regularized Cox variants; official port of glmnet from R

  • pycox -- unified PyTorch implementations of DeepSurv, Cox-Time, Cox-CC, MTLR, Nnet-survival, DeepHit, and others

  • pysurvival -- MTLR implementation by original author, also random survival forests, survival SVMs, and others

  • Reference implementations by original authors:

R

  • survival -- Kaplan-Meier and Cox proportional hazards model

  • glmnet -- lasso, ridge, and elastic-net regularized Cox models

  • hdnom -- Cox nomograms

  • randomForestSRC -- random survival forest implementation by original author

  • cmprsk -- Fine & Gray subdistribution hazards and cumulative incidence functions

  • riskRegression -- cause-specific hazard models

References

A recent survey

  • Ping Wang, Yan Li, and Chandan K. Reddy. "Machine learning for survival analysis: A survey". ACM Computing Surveys (CSUR) 51(6): 1-36, 2019.
    [paper (arXiv)]

Some classical survival estimators

  • (Kaplan-Meier estimator) Edward L. Kaplan and Paul Meier. Nonparametric estimation from incomplete observations. Journal of the American Statistical Association, 53(282):457–481, 1958.
    [paper (JSTOR)]

  • (Cox proportional hazards) David R. Cox. Regression models and life-tables. Journal of the Royal Statistical Society. Series B, 34(2): 187–202, 1972.
    [paper (JSTOR)]
    How to estimate the baseline hazard is in the official discussion of the Cox paper:

    • Norman Breslow. Discussion of the paper by D. R. Cox (1972). Journal of the Royal Statistical Society, Series B, 34(2):216–217, 1972.

    An explanation for how the Cox loss (for beta) relates to ranking:

    • Harald Steck, Balaji Krishnapuram, Cary Dehing-Oberije, Philippe Lambin, and Vikas C. Raykar. On ranking in survival analysis: Bounds on the concordance index. In Advances in Neural Information Processing Systems, pages 1209-1216, 2008.
      [paper (NeurIPS)]

  • (Logistic-hazard discrete-time model) Charles C. Brown. On the use of indicator variables for studying the time-dependence of parameters in a response-time model. Biometrics, 31(4):863–872, 1975.
    [paper (JSTOR)]

  • (Conditional Kaplan-Meier estimators) Rudolf Beran. Nonparametric regression with randomly censored survival data. Technical report, University of California, Berkeley, 1981.
    [paper (ResearchGate)]
    Finite-sample error bounds are provided by:

    • George H. Chen. Nearest neighbor and kernel survival analysis: Nonasymptotic error bounds and strong consistency rates. In International Conference on Machine Learning, pages 1001–1010, 2019.
      [paper (arXiv)] [code (Python)]

  • (Fine & Gray competing risks) Jason P. Fine and Robert J. Gray. A proportional hazards model for the subdistribution of a competing risk. Journal of the American Statistical Association, 94(446):496-509, 1999.
    [paper (JSTOR)]

  • (Random survival forests) Hemant Ishwaran, Udaya B. Kogalur, Eugene H. Blackstone, and Michael S. Lauer. Random survival forests. The Annals of Applied Statistics, 2(3):841–860, 2008.
    [paper (arXiv)] [code (R)]
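
As a concrete illustration of the simplest estimator in the list above, here is a minimal sketch of the Kaplan-Meier product-limit estimator in plain Python. It is not taken from any of the packages or papers listed here; the function name `kaplan_meier` and the toy data are made up for illustration, and it assumes right-censored data where an event indicator of 1 means the event was observed and 0 means the subject was censored. At each distinct time with at least one observed event, the running survival probability is multiplied by (1 - events / number at risk).

```python
from collections import Counter


def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct time with an observed event.

    times  -- observed time per subject (event time or censoring time)
    events -- 1 if the event was observed at that time, 0 if censored
    """
    event_counts = Counter(t for t, e in zip(times, events) if e)
    exit_counts = Counter(times)  # every subject leaves the risk set at their observed time
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(exit_counts):
        d = event_counts.get(t, 0)
        if d > 0:
            # Product-limit update: fraction of at-risk subjects surviving past t.
            survival *= 1.0 - d / at_risk
            curve.append((t, survival))
        # Subjects with an event or censoring at t are no longer at risk afterwards.
        at_risk -= exit_counts[t]
    return curve


# Hypothetical toy data: five subjects, two of them censored.
km_curve = kaplan_meier([1, 2, 2, 3, 4], [1, 1, 0, 1, 0])
for t, s in km_curve:
    print(f"S({t}) = {s:.3f}")
```

For real analyses, the packages listed earlier (for example lifelines in Python or survival in R) also provide confidence intervals and plotting, which this sketch omits.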

Accuracy/error metrics

  • Concordance index (ranking-based metric):

    • (Original) Frank E. Harrell, Robert M. Califf, David B. Pryor, Kerry L. Lee, and Robert A. Rosati. Evaluating the yield of medical tests. Journal of the American Medical Association, 247(18):2543–2546, 1982.
      [paper (JAMA)]

    • (Time-dependent) Laura Antolini, Patrizia Boracchi, and Elia Biganzoli. A time-dependent discrimination index for survival data. Statistics in Medicine, 24(24):3927–3944, 2005.
      [doi (Wiley)]

    • See also the mortality-based ranking used to compute the concordance index in the random survival forests paper: Hemant Ishwaran, Udaya B. Kogalur, Eugene H. Blackstone, and Michael S. Lauer. Random survival forests. The Annals of Applied Statistics, 2(3):841–860, 2008.
      [paper (arXiv)] [code (R)]

  • Brier score (error in estimating survival function):

    • Erika Graf, Claudia Schmoor, Willi Sauerbrei, and Martin Schumacher. Assessment and comparison of prognostic classification schemes for survival data. Statistics in Medicine, 18(17-18):2529–2545, 1999.
      [doi (Wiley)]

    • Thomas A. Gerds and Martin Schumacher. Consistent estimation of the expected Brier score in general survival models with right-censored event times. Biometrical Journal, 48(6):1029–1040, 2006.
      [doi (Wiley)]

  • Looking at average widths of subject-specific survival time prediction intervals:

    • George H. Chen. Deep kernel survival analysis and subject-specific survival time prediction intervals. In Machine Learning for Healthcare Conference, 2020.
      [paper (arXiv)] [code (Python)]
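
To make the ranking-based metric above concrete, here is a minimal sketch of a Harrell-style concordance index in plain Python. It is an illustration under simplifying assumptions rather than the exact procedure of any paper listed: a pair (i, j) is treated as comparable only when subject i has an observed event and a strictly smaller observed time than subject j, ties in predicted risk count as half-concordant, and ties in time are ignored. The function name `concordance_index` and the toy data are hypothetical.

```python
def concordance_index(times, events, risk_scores):
    """Fraction of comparable pairs that the risk scores order correctly.

    times       -- observed time per subject (event time or censoring time)
    events      -- 1 if the event was observed, 0 if censored
    risk_scores -- higher score means the model predicts an earlier event
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored subject cannot be the earlier member of a pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical toy data: four subjects, the third one censored.
toy_times = [1, 2, 3, 4]
toy_events = [1, 1, 0, 1]
toy_risks = [0.9, 0.4, 0.5, 0.1]
c_index = concordance_index(toy_times, toy_events, toy_risks)
print(c_index)
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking. This quadratic-time loop is meant only to show the definition; library implementations handle ties and large datasets more carefully.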

Calibration

  • (TRIPOD) Gary S. Collins, Johannes B. Reitsma, Douglas G. Altman, and Karel GM Moons. Transparent Reporting of a Multivariable Prediction Model for Individual Prognosis or Diagnosis (TRIPOD) The TRIPOD Statement. Circulation, 131(2):211-219, 2015.
    [doi (Annals of Internal Medicine)]

  • Olga V. Demler, Nina P. Paynter, and Nancy R. Cook. Tests of calibration and goodness-of-fit in the survival setting. Statistics in Medicine, 34(10):1659-1680, 2015.
    [doi (Wiley)]

  • (Countdown regression) Anand Avati, Tony Duan, Sharon Zhou, Kenneth Jung, Nigam H. Shah, and Andrew Ng. Countdown Regression: Sharp and Calibrated Survival Predictions. In Uncertainty in Artificial Intelligence, 2019.
    [paper (arXiv)] [code (Python)]

  • (D-calibration) Humza Haider, Bret Hoehn, Sarah Davis, and Russell Greiner. Effective ways to build and evaluate individual survival distributions. Journal of Machine Learning Research, 21(85):1-63, 2020.
    [paper (arXiv)] [code (R)]

Some standard datasets:

Note that some datasets have multiple versions floating around the web, sometimes with feature names and sometimes without. All of the following show up in some form as part of the pycox package (which also has other datasets).

Some recent neural net survival estimators (all with code)

  • (DeepSurv) Jared L. Katzman, Uri Shaham, Alexander Cloninger, Jonathan Bates, Tingting Jiang, and Yuval Kluger. DeepSurv: personalized treatment recommender system using a Cox proportional hazards deep neural network. BMC Medical Research Methodology, 18(1): 24, 2018.
    [paper (arXiv)] [authors' original code (Python)] [pycox code (Python)]
    In fact, the approach of DeepSurv was already known decades earlier:

    • David Faraggi and Richard Simon. A neural network model for survival data. Statistics in Medicine, 14(1):73-82, 1995.

  • (DeepHit) Changhee Lee, William R. Zame, Jinsung Yoon, and Mihaela van der Schaar. DeepHit: A deep learning approach to survival analysis with competing risks. In AAAI Conference on Artificial Intelligence, 2018.
    [paper (UCLA)] [authors' original code (Python)] [pycox code (Python)]

  • (Nnet-survival) Michael F. Gensheimer and Balasubramanian Narasimhan. A scalable discrete-time survival model for neural networks. PeerJ, 7:e6257, 2019.
    [paper (arXiv)] [authors' original code (Python)] [pycox code (Python)]

  • (Cox-CC, Cox-Time) Håvard Kvamme, Ørnulf Borgan, and Ida Scheel. Time-to-event prediction with neural networks and Cox regression. Journal of Machine Learning Research, 20(129):1–30, 2019.
    [paper (arXiv)] [JMLR] [pycox code (Python)]

  • (PC-Hazard) Håvard Kvamme and Ørnulf Borgan. Continuous and discrete-time survival prediction with neural networks. arXiv preprint arXiv:1910.06724, 2019.
    [paper (arXiv)] [pycox code (Python)]

  • (Dynamic-DeepHit) Changhee Lee, Jinsung Yoon, and Mihaela van der Schaar. Dynamic-DeepHit: A deep learning approach for dynamic survival analysis with competing risks based on longitudinal data. IEEE Transactions on Biomedical Engineering, 67(1):122-133, 2019.
    [paper (IEEE)] [code (Python)]

  • (Deep kernel survival analysis) George H. Chen. Deep kernel survival analysis and subject-specific survival time prediction intervals. In Machine Learning for Healthcare Conference, 2020.
    [paper (arXiv)] [code (Python)]

  • (Topic modeling with survival analysis) Linhong Li, Ren Zuo, Amanda Coston, Jeremy C. Weiss, and George H. Chen. Neural topic models with survival supervision: Jointly predicting time-to-event outcomes and learning how clinical features relate. In International Conference on Artificial Intelligence in Medicine, 2020.
    [paper (arXiv)] [code (Python)]

Deep generative models for survival analysis

  • Rajesh Ranganath, Adler Perotte, Noémie Elhadad, and David Blei. Deep survival analysis. In Machine Learning for Healthcare Conference, pages 101-114, 2016.
    [paper (arXiv)]

  • Chirag Nagpal, Xinyu Li, and Artur Dubrawski. Deep Survival Machines: Fully Parametric Survival Regression and Representation Learning for Censored Data with Competing Risks. In NeurIPS Machine Learning for Health Workshop, 2019.
    [paper (arXiv)] [code (Python)]

A recently proposed framework for causal inference with survival analysis

  • Yifan Cui, Michael R. Kosorok, Stefan Wager, and Ruoqing Zhu. "Estimating heterogeneous treatment effects with right-censored data via causal survival forests." arXiv preprint arXiv:2001.09887, 2020.
    [paper (arXiv)]