2024 EIMS Conference on Applied Statistics
Ewha Institute of Mathematical Sciences (EIMS)
Ewha Womans University, S. Korea, Oct 11, 2024
Abstracts
Discovering Causal Structures in Privacy-Protected and Noisy Data
(Prof. 박건웅 / Department of Statistics, Seoul National University)
This talk focuses on the recovery of anchored Gaussian directed acyclic graphical (DAG) models, addressing the challenge of discovering causal or directed relationships among variables in datasets that are either masked for privacy reasons or contaminated by measurement errors. A key contribution is the relaxation of the existing restrictive identifiability conditions for anchored Gaussian DAG models by introducing the anchored-frugality assumption. This assumption posits that the true graph is the most frugal among those that satisfy the possible distributions of the latent and observed variables, thereby making the true Markov equivalence class (MEC) identifiable. The validity of the anchored-frugality assumption is justified through both graph theory and probability theory. In the second part of the talk, I present another identifiability condition within quantile-discretization settings. Specifically, I provide a bi-partition process for recovering the covariance matrix from discretized variables.
On Sufficient Graphical Models
(Prof. 김경원 / Department of Statistics, Ewha Womans University)
We introduce a sufficient graphical model by applying the recently developed nonlinear sufficient dimension reduction techniques to the evaluation of conditional independence. The graphical model is nonparametric in nature, as it does not make distributional assumptions such as the Gaussian or copula Gaussian assumptions. However, unlike a fully nonparametric graphical model, which relies on the high-dimensional kernel to characterize conditional independence, our graphical model is based on conditional independence given a set of sufficient predictors with a substantially reduced dimension. In this way we avoid the curse of dimensionality that comes with a high-dimensional kernel. We develop the population-level properties, convergence rate, and variable selection consistency of our estimate. By simulation comparisons and an analysis of the DREAM 4 Challenge data set, we demonstrate that our method outperforms the existing methods when the Gaussian or copula Gaussian assumptions are violated, and its performance remains excellent in the high-dimensional setting.
Sensitivity Analysis Methods for Attributable Fraction: Addressing Unmeasured Confounding in Binary and Time-to-Event Outcomes
(Prof. 이우주 / Graduate School of Public Health, Seoul National University)
A main goal of epidemiology is to assess the impact of an exposure on a health outcome. The attributable fraction (AF) is a widely used measure for quantifying this impact. In this talk, we consider two types of AF: those for binary outcomes and those for time-to-event outcomes. Various methods have been developed for estimating the AF, such as standardization, inverse probability of treatment weighting, and doubly robust methods. However, the validity of these methods rests on the assumption of no unmeasured confounding, which cannot be verified using observed data alone. To assess how vulnerable research findings are to departures from this assumption, it is essential to conduct a sensitivity analysis. We propose novel sensitivity analysis methods for these two types of AF. The sensitivity analysis problems are formulated as optimization problems, and we derive analytic solutions to them, which dramatically reduces the computational burden of these methods.
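As a point of reference for the binary-outcome case, the AF is commonly defined as 1 - P(Y^0 = 1) / P(Y = 1), where Y^0 denotes the outcome that would be seen if the exposure were removed. The following is a minimal sketch of the standardization estimator mentioned in the abstract, assuming a single discrete measured confounder and at least one unexposed subject in every confounder stratum. The function name and arguments are illustrative and not taken from the talk, and the sketch does not implement the proposed sensitivity analysis.

```python
import numpy as np

def attributable_fraction(y, a, z):
    """Standardized AF for a binary outcome (illustrative sketch).

    y: binary outcome, a: binary exposure, z: discrete confounder.
    Assumes no unmeasured confounding and that every level of z
    contains at least one unexposed subject (a == 0).
    """
    y, a, z = (np.asarray(v) for v in (y, a, z))
    p_y = y.mean()  # observed outcome prevalence P(Y = 1)

    # P(Y^0 = 1) = sum over z of P(Y = 1 | A = 0, Z = z) * P(Z = z)
    p_y0 = 0.0
    for level in np.unique(z):
        in_stratum = z == level
        unexposed = in_stratum & (a == 0)
        p_y0 += y[unexposed].mean() * in_stratum.mean()

    return 1.0 - p_y0 / p_y
```

Standardizing over the strata of z is what adjusts for the measured confounder. Any confounder left out of z is exactly the kind of departure that a sensitivity analysis is meant to probe.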
Statistical Methods for Analyzing Restricted Mean Survival Time
(Prof. 이지현 / Department of Applied Statistics, Yonsei University)
In this talk, I will discuss the use of hazard ratio as a measure to quantify treatment effects and introduce the restricted mean survival time (RMST) as an alternative summary measure. In clinical studies with time-to-event outcomes, the RMST has attracted substantial attention due to its intuitive clinical interpretation, especially when the proportional hazards assumption is violated.
First, I will present a statistical model for flexibly estimating the treatment effects based on RMST, with treatment effects expressed as a function of restriction time to better capture the dynamic trend of its effect on survival. To account for possible heterogeneity across patients in large databases, we incorporate the propensity scores for receiving treatment into the model. I will then demonstrate the model’s finite sample properties and apply it to data from a study of primary inflammatory breast cancer, assessing the effect of trimodality therapy on survival.
In the second part of the talk, I will discuss how length-biased sampling, common in observational cohort studies, complicates the estimation of RMST using existing methods. To address this, we propose nonparametric and semiparametric regression methods tailored for length-biased data. Through simulations and real-world application to a Canadian dementia cohort study, I will illustrate the effectiveness of these methods in estimating RMST under this challenging framework.
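For readers unfamiliar with the RMST, it is the area under the survival curve up to a restriction time tau, that is, the expected value of min(T, tau). The sketch below computes it from a Kaplan-Meier curve under ordinary right censoring. It is an illustration only: the function name is hypothetical, and it implements neither the propensity-score model nor the length-biased estimators described in this abstract.

```python
import numpy as np

def km_rmst(time, event, tau):
    """Area under the Kaplan-Meier curve on [0, tau] (illustrative sketch).

    time: observed follow-up times, event: 1 if the event was observed
    and 0 if censored, tau: restriction time.
    """
    time = np.asarray(time, dtype=float)
    event = np.asarray(event, dtype=bool)

    # The Kaplan-Meier curve only drops at observed event times.
    event_times = np.unique(time[event & (time <= tau)])

    surv, prev_t, area = 1.0, 0.0, 0.0
    for t in event_times:
        area += surv * (t - prev_t)       # rectangle up to this event time
        at_risk = np.sum(time >= t)
        deaths = np.sum(event & (time == t))
        surv *= 1.0 - deaths / at_risk    # Kaplan-Meier product-limit step
        prev_t = t

    area += surv * (tau - prev_t)         # last rectangle up to tau
    return area
```

Because the Kaplan-Meier curve is a step function, the integral reduces to a sum of rectangles, so no numerical integration is needed.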
Censored Quantile Regression
(Prof. 최상범 / Department of Statistics, Korea University)
Quantile regression offers a powerful statistical framework for analyzing the effects of covariates on different points of the outcome distribution, providing a more comprehensive view than traditional mean regression. This approach is particularly useful in the presence of censored data, where outcomes are only partially observed due to limitations such as detection thresholds or dropout in survival studies. In this talk, we explore the principles of quantile regression and its adaptation to censored data settings, highlighting the challenges and the methodologies that address them, such as the Kaplan-Meier estimator and inverse probability weighting techniques. We will discuss recent advances in modeling strategies and computational algorithms, and showcase applications in fields such as survival analysis, finance, and environmental studies. By addressing the intricacies of censored data through quantile regression, this presentation aims to equip researchers with robust tools for more nuanced statistical modeling in complex data scenarios.
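The core of quantile regression is the check (pinball) loss, and a common way to handle censoring is to reweight that loss. The sketch below shows the loss and a weighted sample quantile obtained by minimizing it over the observed values. The weights are left as an input. One frequently used choice, assumed here for illustration and not taken from the talk, is the inverse probability of censoring weight: the event indicator divided by a Kaplan-Meier estimate of the censoring survival function at the observed time.

```python
import numpy as np

def check_loss(u, tau):
    """Check (pinball) loss rho_tau(u) = u * (tau - 1[u < 0])."""
    u = np.asarray(u, dtype=float)
    return u * (tau - (u < 0))

def weighted_quantile(y, w, tau):
    """Minimize the weighted check loss over the observed values of y.

    y: outcomes, w: nonnegative weights (e.g. censoring weights),
    tau: quantile level in (0, 1). Illustrative sketch without covariates.
    """
    y = np.asarray(y, dtype=float)
    w = np.asarray(w, dtype=float)
    candidates = np.unique(y)
    # A minimizer of this piecewise-linear objective lies at a data point.
    losses = [np.sum(w * check_loss(y - c, tau)) for c in candidates]
    return candidates[int(np.argmin(losses))]
```

With covariates, the same weighted loss would be minimized over regression coefficients instead of a single constant, typically by linear programming.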
Fitting a Time-dependent Accelerated Failure Time Model via Nonparametric Gaussian Scale Mixtures
(Prof. 강상욱 / Department of Applied Statistics, Yonsei University)
An accelerated failure time (AFT) model relates failure time to a set of covariates through a logarithmic link function, incorporating a random error component. The model can be either parametric or semiparametric, depending on the degree of specification of the error distribution. While covariates are typically assumed to be time-independent, many biomedical studies encounter time-dependent covariates. In this paper, we propose a semiparametric AFT mixture model with time-dependent covariates, where the baseline failure time distribution is specified as an infinite-scale Gaussian mixture density. This mixture model approach offers greater flexibility compared to traditional models that assume a single-component parametric density, addressing the limitations in estimating the regression intercept in the standard semiparametric AFT modeling framework. We employ nonparametric maximum likelihood estimation and propose to use the constrained Newton method for estimating both model parameters and the mixing distributions. The proposed methods are investigated via extensive simulation experiments to assess the finite sample properties. We also illustrate the application of our proposed methods using a nationwide population-based health screening database from South Korea.
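For orientation, the standard AFT model with time-independent covariates and a generic Gaussian scale mixture density can be written as below. The notation is an assumption made for illustration (T_i for failure time, x_i for covariates, phi for the standard normal density, G for the mixing distribution over scales). The time-dependent formulation used in the talk is not given in the abstract and is not reproduced here.

```latex
\log T_i = x_i^\top \beta + \varepsilon_i,
\qquad
f_\varepsilon(e) = \int_0^\infty \frac{1}{\sigma}\,
  \phi\!\left(\frac{e}{\sigma}\right)\, dG(\sigma)
```

Leaving G unspecified is what makes the mixture nonparametric, while a point mass at a single scale recovers the ordinary log-normal AFT model.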