2025-2026

11/21/2025

Speaker: Gary Hettinger, Assistant Professor, Department of Population Health, New York University

Title: Semi-parametric Causal Methods for Health Policy Evaluations Under Spillover

Abstract: Evaluating health policies requires understanding how they may spill over into nearby regions and differentially impact areas implementing the policy, which demands rigorous confounding adjustment and robust modeling to uncover these causal mechanisms. In this talk, I will introduce novel difference-in-differences methods to assess spillover effects and continuous exposures, developing semi-parametric approaches that relax model specification requirements and incorporate flexible modeling techniques. I demonstrate these methods by studying the effects of Philadelphia’s excise tax on sweetened beverages.

11/7/2025

Speaker: Ronghui Lily Xu, Professor, Department of Family and Preventive Medicine and the Department of Mathematics, University of California, San Diego

Title: Learning Treatment Effects under Covariate Dependent Left Truncation and Right Censoring

Abstract: Time-to-event outcomes are often subject to left truncation and right censoring. While many survival analysis methods have been developed to handle truncation and censoring, the majority of past work requires strong independence assumptions. We relax these assumptions by leveraging covariate information together with orthogonal learning, and develop a framework that liberates the analysis from left truncation and right censoring, so that desirable properties like double robustness can be immediately transferred from settings without truncation or censoring. To illustrate its generality and ease of use, the framework is applied to estimation of the average treatment effect (ATE) and the conditional average treatment effect (CATE). For the ATE, we establish both model and rate double robustness under confounding, truncation and censoring; for the CATE, we show that the orthogonal and the doubly robust learners under these three sources of bias can achieve the oracle rate of convergence. We study the estimators both theoretically and through extensive simulation, and apply them to analyzing the effect of mid-life heavy drinking on late-life cognitive-impairment-free survival, using data from the Honolulu-Asia Aging Study.

9/19/2025

Speaker: Quentin Le Coent, Postdoctoral Fellow, Johns Hopkins School of Medicine

Title: Index Date Imputation for Externally Controlled Trials

Abstract: Externally controlled trials (ECTs) compare outcomes between a single-arm trial and external controls drawn from sources such as historical trials, registries, or observational studies. In survival analysis, a major challenge arises when the time origin (index date) differs across groups, for example when treatment initiation occurs after a delay in the single-arm trial but is undefined in the external controls. This misalignment can bias treatment effect estimates. In this work we propose a statistical method, Index Date Imputation (IDI), that imputes comparable index dates for external control patients using the estimated distribution of treatment initiation times from the single-arm cohort. To address potential confounding due to lack of randomization, IDI is combined with propensity-score methods. We evaluate its performance through a simulation study. Applying IDI to a randomized oncology trial, we demonstrate that the method recovers the known treatment effect despite artificial index date misalignment. IDI provides a principled framework for time-to-event analyses in ECTs and is broadly applicable in oncology and rare disease settings.

8/29/2025

Title: Data Science-Powered Provider Profiling to Enhance Quality and Equity in Health Care Delivery

Speaker: Wenbo Wu, Assistant Professor, Department of Health Policy and Management, Johns Hopkins University

Abstract: Provider profiling is a widely used comparative evaluation tool to inform patients’ care decision making and to improve the quality of care delivered by health care providers. Based on standardized quality measures of patient outcomes, this process entails quantifying provider performance and pinpointing providers with subpar performance. Current profiling methods rely on risk adjustment models with a linearity assumption that is often too restrictive to characterize complex associations between risk factors and outcomes. Moreover, these methods, having been historically driven by the demand for controlling care expenditures, tend to pool all racial/ethnic groups without accounting for their socioeconomic heterogeneity. Despite the importance of distinguishing between cost-driven and equity-driven profiling, a theoretical framework capable of addressing these different but related profiling objectives is still lacking, due in part to the absence of a unifying approach that defines context-specific performance benchmarks. To address these issues, we propose a versatile probability framework based on hypothetical reference providers corresponding to specific profiling objectives. Furthermore, we develop flexible machine learning approaches that relax the linearity assumption. These methods will advance the methodology of provider profiling, thereby promoting better care-seeking decisions by patients and stakeholders and evidence-based accountability of providers.

4/25/2025

Title: Bayesian optimality of testing procedures for survival data in the nonproportional hazards setting

Speaker: Andrea Arfe, Assistant Attending Biostatistician, Department of Epidemiology and Biostatistics, Memorial Sloan Kettering Cancer Center

Discussant/Visitor: Yingying Wei, Associate Professor, Department of Statistics, The Chinese University of Hong Kong

Abstract: Most statistical tests for treatment effects used in randomized clinical trials with survival outcomes are based on the proportional hazards assumption, which often fails in practice. Data from early exploratory studies may provide evidence of nonproportional hazards, which can guide the choice of alternative tests in the design of practice-changing confirmatory trials. We developed a test to detect treatment effects in a late-stage trial, which accounts for the deviations from proportional hazards suggested by early-stage data. Conditional on early-stage data, among all tests that control the frequentist Type I error rate at a fixed α level, our testing procedure maximizes the Bayesian predictive probability that the study will demonstrate the efficacy of the experimental treatment. Hence, the proposed test provides a useful benchmark for other tests commonly used in the presence of nonproportional hazards, for example, weighted log-rank tests. We illustrate this approach in simulations based on data from a published cancer immunotherapy phase III trial.

(Joint work with Lorenzo Trippa and Brian Alexander at Dana-Farber Cancer Institute)

4/11/2025

Speaker: Shouhao Zhou, Associate Professor, Department of Public Health Sciences, Division of Biostatistics and Bioinformatics, Penn State University

Title: Posterior Predictive (PoP) Design

Abstract: We propose a novel Bayesian model-assisted trial design that uses predictive Bayes factors to determine the escalation and de-escalation boundaries for dose-finding trials. It overcomes the limitations of previous model-assisted designs and serves as the first model-assisted design to guarantee global optimality and asymptotic convergence to the true maximum tolerated dose (MTD). Extensive simulation results demonstrate superior operating characteristics.

3/28/2025

Speaker: Jiwei Zhao, Associate Professor, Dept. of Biostatistics and Medical Informatics, University of Wisconsin

Title: Statistical Benefits when Incorporating LLM-Derived Predictions: Old Wine in a New Bottle

Abstract: In biomedical studies involving electronic health records, manually extracting gold-standard phenotype data is labor-intensive and limited in scale. The rise of generative AI, particularly large language models (LLMs), offers a systematic and significantly faster alternative through predictions, such as automated computational phenotypes (ACPs). However, directly substituting gold-standard data with these predictions, without addressing their differences, can introduce biases and lead to misleading conclusions. To address this challenge, we adopt a semi-supervised learning framework that integrates both labeled data (with gold-standard annotations) and unlabeled data (without gold-standard annotations) under the covariate shift paradigm. We propose doubly robust and semiparametrically efficient estimators to infer general target parameters. Through a rigorous efficiency analysis, we compare scenarios with and without the incorporation of LLM-derived predictions. Furthermore, we situate our approach within the existing literature, drawing connections to prediction-powered inference and its extensions, as well as to seemingly unrelated concepts such as surrogacy. To validate our theoretical findings, we conduct extensive synthetic experiments and apply our method to real-world data, demonstrating its practical advantages.

2/21/2025

Speaker: Tianchen Qian, Assistant Professor, Department of Statistics, University of California at Irvine

Title: Causal inference and machine learning in mobile health - modeling time-varying effects using longitudinal functional data

Abstract: To optimize mobile health interventions and advance domain knowledge on intervention design, it is critical to understand how the intervention effect varies over time and with contextual information. This study aims to assess how a push notification suggesting physical activity influences individuals’ step counts using data from the HeartSteps micro-randomized trial (MRT). The statistical challenges include the time-varying treatments and the longitudinal functional step count measurements. We propose the first semiparametric causal excursion effect model with varying coefficients to model the time-varying effects within a decision point and across decision points in an MRT. The proposed model incorporates double time indices to accommodate the longitudinal functional outcome, enabling the assessment of time-varying effect moderation by contextual variables. We propose a two-stage causal effect estimator that is robust against a misspecified high-dimensional outcome regression nuisance model. We establish asymptotic theory and conduct simulation studies to validate the proposed estimator. Our analysis provides new insights into individuals’ change in response profiles (such as how soon a response occurs) due to the activity suggestions, how such changes differ by the type of suggestions received, and how such changes depend on other contextual information such as being recently sedentary and the day being a weekday.

1/24/2025

Speaker: Krithika Suresh, Research Assistant Professor, University of Michigan at Ann Arbor

Title: Bounded hazard ratio Cox model for the effect of time to treatment on mortality

Abstract: In resource-limited settings, there is often interest in assessing the effect of time to treatment (TTT) on subsequent mortality. We demonstrate that the traditional Cox proportional hazards model specifying the effect of TTT as a linear term in the log hazard ratio results in a mathematical anomaly that violates the expected monotonic TTT effect on survival (i.e., as TTT increases, survival probability should decrease). Additionally, the quantification of the TTT effect from these models is the hazard ratio, which provides an interpretation conditional on surviving until treatment rather than a quantification of the effect of delayed treatment at baseline. We propose a class of bounded hazard ratio (BHR) Cox models that attenuate the hazard ratio for TTT towards the null with increasing treatment time, such that the hazard for death after treatment does not exceed the hazard without treatment. Estimation is performed using direct optimization of the partial log-likelihood, and we propose a linearized approximation to fit the model in standard software for large sample sizes. From BHR models, the estimated hazard ratio curve describes how the treatment effect diminishes with delays in treatment. Additionally, we present the marginal survival probability difference comparing immediate treatment to a treatment time in the future. We evaluate the performance of model estimation in a simulation study and demonstrate the use of this approach in an application to treatment for colon cancer using NCDB data.

2/7/2025

Speaker: Guoqing Diao, Professor, Department of Biostatistics and Bioinformatics, The George Washington University

Title: Estimating Predictive Margins and Marginal Effects

Abstract: Predictive margins and marginal effects are useful tools to interpret regression model results in biomedical and epidemiological research, especially for models with nonlinear functional forms. Proper estimation of the marginal effects and their variances is also needed, but is lacking in existing statistical software for some commonly encountered data. This article discusses two use cases: survival analysis with competing risks and analysis of binary outcomes with hierarchical clustering. We review the pros and cons of a few methods that have been proposed to handle competing risks, including the Fine-Gray model, the cause-specific hazard model, mixture models, and the composite outcome approach. We also propose a generalized bootstrap method to construct confidence intervals for the marginal effect, accounting for the clustering effect. As illustrations, we analyze real data from a COVID Antimicrobial Resistance (AMR) study and a market-size study on new Gram-Negative Antibiotic Use. We developed an R program implementing the proposed method, with core code written in C.
