A recent focus of my research has been Item-Level Heterogeneous Treatment Effects (IL-HTE), where every item on an outcome measure gets its own treatment effect. It turns out that IL-HTE (a) is very common in real data, (b) means our uncertainty about average treatment effects is much higher than usually reported, and (c) creates weird biases in treatment-by-covariate interactions when the item-level treatment effects are correlated with item easiness. Below I list a few of my papers on this topic.
IL-HTE Papers
Gilbert, Kim, and Miratrix (2023) - Modeling Item-Level Heterogeneous Treatment Effects With the Explanatory Item Response Model: Leveraging Large-Scale Online Assessments to Pinpoint the Impact of Educational Interventions.
Here, I propose the idea and apply it to an RCT with about 8,000 3rd grade students.
Gilbert (2024) - Modeling item-level heterogeneous treatment effects: A tutorial with the glmer function from the lme4 package in R.
A tutorial for implementing the IL-HTE model in R.
Gilbert, Miratrix, Joshi, and Domingue (2024) - Disentangling Person-Dependent and Item-Dependent Causal Effects: Applications of Item Response Theory to the Estimation of Treatment Effect Heterogeneity
An illustration of how treatment-by-covariate interactions are confounded by IL-HTE.
Gilbert, Hieronymus, Eriksson, and Domingue (2024) - Item-level heterogeneous treatment effects of selective serotonin reuptake inhibitors (SSRIs) on depression: implications for inference, generalizability, and identification
Application to polytomous data in depression surveys.
Gilbert, Kim, and Miratrix (2024) - Leveraging Item Parameter Drift to Assess Transfer Effects in Vocabulary Learning
Extension to longitudinal data.
Gilbert, Himmelsbach, Soland, Joshi, and Domingue (2024) - Estimating Heterogeneous Treatment Effects with Item-Level Outcome Data: Insights from Item Response Theory
Application to 75 RCT datasets.
Gilbert and Soland (2024) - Mechanisms of Effect Size Differences Between Researcher Developed and Independently Developed Outcomes: A Meta-Analysis of Item-Level Data
We use item-level data as moderators of effect sizes in meta-analysis and find that IL-HTE is the most important moderator.
Halpin and Gilbert (2024) - Testing Whether Reported Treatment Effects are Unduly Dependent on the Specific Outcome Measure Used
Halpin's approach is somewhat different from mine, as it focuses on bias in the average treatment effect. Here, we apply his method to about 35 of my item-level RCTs.
Gilbert, Himmelsbach, Miratrix, Ho, and Domingue (2025) - Item-Level Heterogeneity in Value Added Models: Implications for Reliability, Cross-Study Comparability, and Effect Sizes
Extending the logic of the IL-HTE model to value-added modeling and generalizability theory. VAM estimates are much less reliable than we think they are once we consider cluster-by-item interactions.
Symptom-specific effects of SSRIs on depression (Hamilton Depression Rating Scale).