Abstract
We propose a causal hidden Markov model (Causal-HMM) for robust early-stage prediction of irreversible disease, a safety-critical task on which timely medical treatment depends. Specifically, we introduce hidden variables that propagate over time and generate the medical data at each time step. To avoid learning spurious correlations (e.g., confounding bias), we explicitly separate these hidden variables into three parts: a) the disease (clinical)-related part; b) the disease (non-clinical)-related part; and c) others. Only a) and b) are causally related to the disease, whereas c) may contain spurious correlations with the disease inherited from the data provided. With personal attributes provided as side information and the disease label provided as supervision, we prove that the disease-related hidden variables can be disentangled from the others, which implies that spurious correlations are avoided when generalizing to out-of-distribution medical data. Guided by this result, we propose a sequential variational auto-encoder with a reformulated objective function. We apply our model to the early prediction of peripapillary atrophy and achieve promising results on out-of-distribution test data. Further, the ablation study empirically shows the effectiveness of each component of our method, and the visualizations show that lesion regions are accurately distinguished from other regions.
The directed acyclic graph (DAG) for our Causal-HMM.
The time-series architecture for the proposed Causal-HMM.
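To make the generative structure described in the abstract concrete, the following is a minimal toy sketch, not the authors' implementation. It assumes a linear-Gaussian transition and observation, which the abstract does not specify, and it omits the personal attributes and the variational auto-encoder entirely. All names (step, disease_label, simulate, and the group names clinical, non_clinical, other) are hypothetical and chosen only for illustration. The sketch shows three groups of hidden variables propagating over time, every group contributing to the observation, and only the two disease-related groups determining the label.

```python
import random


def step(prev, rng, decay=0.9, noise=0.1):
    """Propagate one group of hidden variables with a noisy linear transition (assumed form)."""
    return [decay * h + rng.gauss(0.0, noise) for h in prev]


def disease_label(hidden):
    """Toy label that reads only the two disease-related groups, never 'other'."""
    score = sum(hidden["clinical"]) + sum(hidden["non_clinical"])
    return 1 if score > 0 else 0


def simulate(num_steps=5, dim=2, seed=0):
    """Roll out hidden variables and observations, then label the final step."""
    rng = random.Random(seed)
    groups = ("clinical", "non_clinical", "other")
    # Initial hidden state: one small vector per group.
    hidden = {name: [rng.gauss(0.0, 1.0) for _ in range(dim)] for name in groups}
    trajectory = []
    for _ in range(num_steps):
        # Each group propagates independently to the next time step.
        hidden = {name: step(values, rng) for name, values in hidden.items()}
        # The observation mixes all three groups, so 'other' can correlate
        # with the data even though it does not cause the disease.
        observation = [
            sum(values[i] for values in hidden.values()) + rng.gauss(0.0, 0.1)
            for i in range(dim)
        ]
        trajectory.append((hidden, observation))
    return trajectory, disease_label(trajectory[-1][0])


if __name__ == "__main__":
    trajectory, label = simulate()
    print(len(trajectory), label)
```

Because disease_label never reads the "other" group, perturbing that group changes the observations but leaves the label untouched, which is the separation the model aims to recover from data.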
Code
[Github]
Paper
[URL]
Acknowledgements
This work was supported by MOST-2018AAA0102004, NSFC-61625201 and NSFC-62061136001.