International Seminar on Selective Inference
A weekly online seminar on selective inference, multiple testing, and post-selection inference.
Gratefully inspired by the Online Causal Inference Seminar
For announcements and Zoom invitations please subscribe to our mailing list.
Monday, October 27, 2025 [link to join]
Speaker: Aaditya Ramdas (Carnegie Mellon University)
Title: Locally minimax optimal confidence sets for the best model
Abstract: This paper tackles a fundamental inference problem: given n observations from a distribution P over R^d with unknown mean µ, we must form a confidence set for the index (or indices) corresponding to the smallest component of µ. By duality, we reduce this to testing, for each r in {1, . . . , d}, whether µ_r is the smallest. Based on the sample splitting and self-normalization approach of Kim and Ramdas (2024), we propose “dimension-agnostic” tests that maintain validity regardless of how d scales with n, and regardless of arbitrary ties in µ. Notably, our validity holds under mild moment conditions, requiring little more than finiteness of a second moment, and permitting possibly strong dependence between coordinates. In addition, we establish the local minimax separation rate for this problem, which adapts to the cardinality of a confusion set, and show that the proposed tests attain this rate. Furthermore, we develop robust variants that continue to achieve the same minimax rate under heavy-tailed distributions with only finite second moments. While these results highlight the theoretical strength of our method, a practical concern is that sample splitting can reduce finite-sample power. We show that this drawback can be substantially alleviated by the multi-split aggregation method of Guo and Shah (2025). Finally, empirical results on simulated and real data illustrate the strong performance of our approach in terms of type I error control and power compared to existing methods.
Discussant: Lihua Lei (Stanford University)
Links: [Relevant papers: paper #1]
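The sample-splitting idea at the heart of the talk can be illustrated with a deliberately simplified sketch. This is not the dimension-agnostic, self-normalized statistic of the paper: the function `best_index_conf_set` and its normal-approximation test are illustrative assumptions only.

```python
import math
import numpy as np

def best_index_conf_set(X, alpha=0.05):
    """Toy confidence set for the index of the smallest mean, via sample
    splitting: the first half of the data picks each coordinate's strongest
    competitor; the second half runs a one-sided test against it.
    Illustrative only -- not the dimension-agnostic statistic of the talk."""
    n, d = X.shape
    X1, X2 = X[: n // 2], X[n // 2:]
    conf_set = []
    for r in range(d):
        # First half: r's toughest competitor is the smallest other mean.
        s = min((j for j in range(d) if j != r), key=lambda j: X1[:, j].mean())
        # Second half: test H0 "mu_r <= mu_s" with a normal approximation.
        diff = X2[:, r] - X2[:, s]
        t = diff.mean() / (diff.std(ddof=1) / math.sqrt(len(diff)))
        p = 1 - 0.5 * (1 + math.erf(t / math.sqrt(2)))  # one-sided p-value
        if p > alpha:  # cannot reject "r is (tied for) smallest": keep it
            conf_set.append(r)
    return conf_set
```

Because the competitor index is chosen on one half and the test is run on the other, the selection does not bias the test; the paper's contribution is making this valid and rate-optimal uniformly in d.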
Monday, November 3, 2025 [link to join]
Speaker: Lucas Janson (Harvard University)
Title: Chiseling: Powerful and Valid Subgroup Selection via Interactive Machine Learning
Abstract: In regression and causal inference, controlled subgroup selection aims to identify, with inferential guarantees, a subgroup (defined as a subset of the covariate space) on which the average response or treatment effect is above a given threshold. For example, in a clinical trial, it may be of interest to find a subgroup with a positive average treatment effect. However, existing methods either lack inferential guarantees, heavily restrict the search for the subgroup, or sacrifice efficiency by naive data splitting. We propose a novel framework called chiseling that allows the analyst to interactively refine and test a candidate subgroup by iteratively shrinking it. The sole restriction is that the shrinkage direction only depends on the points outside the current subgroup, but otherwise the analyst may leverage any prior information or machine learning algorithm. Despite this flexibility, chiseling controls the probability that the discovered subgroup is null (e.g., has a non-positive average treatment effect) under minimal assumptions: for example, in randomized experiments, this inferential validity guarantee holds under only bounded moment conditions. When applied to a variety of simulated datasets and a real survey experiment, chiseling identifies substantially better subgroups than existing methods with inferential guarantees. This is joint work with Nathan Cheng and Asher Spector.
Discussant: Jann Spiess (Stanford University)
Links: [Relevant papers: paper #1]
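The interaction protocol described in the abstract — shrink a candidate subgroup step by step, where each step may depend only on the points outside the current subgroup — can be sketched schematically. The names `chisel` and `linear_score` are hypothetical, and the paper's sequential test, which is what supplies the actual validity guarantee, is omitted here.

```python
import numpy as np

def chisel(X, y, n_steps, score_fn):
    """Schematic of the chiseling protocol: iteratively shrink a candidate
    subgroup (here, by one point per step).  The only restriction is that
    each shrinkage step may use the data of points *outside* the current
    subgroup; the paper's sequential test (omitted here) is what turns this
    interaction into a valid inference."""
    n = len(y)
    inside = np.ones(n, dtype=bool)
    for _ in range(n_steps):
        out = ~inside
        if out.sum() >= 2:
            # Any analyst-chosen model, fit on outside points only,
            # producing a score for every point.
            scores = score_fn(X[out], y[out], X)
        else:
            # Before any points are outside, fall back on prior information.
            scores = X[:, 0]
        cand = np.flatnonzero(inside)
        inside[cand[np.argmin(scores[cand])]] = False  # peel the weakest point
    return inside

def linear_score(X_out, y_out, X_all):
    """Hypothetical analyst model: least-squares fit on the outside points,
    used to score every point."""
    A = np.c_[np.ones(len(X_out)), X_out]
    beta, *_ = np.linalg.lstsq(A, y_out, rcond=None)
    return np.c_[np.ones(len(X_all)), X_all] @ beta
```

The design point is the information constraint: since each peeling decision looks only at already-discarded points, the remaining subgroup's data are untouched by the search at every step.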
Monday, November 24, 2025 [link to join]
Speaker: Wanrong Zhu (UC Irvine)
The seminars are held on Zoom and last 60 minutes:
45 minutes of presentation
15 minutes of discussion, led by an invited discussant
Moderators collect questions using the Q&A feature during the seminar.
You can attend by clicking the link to join (there is no need to register in advance).
More instructions for attendees can be found here.
Organizers:
Jelle Goeman (Leiden University)
Nikos Ignatiadis (University of Chicago)
Lihua Lei (Stanford University)
Zhimei Ren (University of Pennsylvania)
Will Fithian (UC Berkeley)
Rina Barber (University of Chicago)
Daniel Yekutieli (Tel Aviv University)
If you have feedback or suggestions or want to propose a speaker, please e-mail us at selectiveinferenceseminar@gmail.com.
Broadly construed, selective inference means searching for interesting patterns in data, usually with inferential guarantees that account for the search process. It encompasses:
Multiple testing: testing many hypotheses at once (and paying disproportionate attention to rejections)
Post-selection inference: examining the data to decide what question to ask, or what model to use, then carrying out one or more appropriate inferences
Adaptive / interactive inference: sequentially asking one question after another of the same data set, where each question is informed by the answers to preceding questions
Cheating: cherry-picking, double dipping, data snooping, data dredging, p-hacking, HARKing, and other low-down dirty rotten tricks; basically any of the above, but done wrong!