The Czech National Group of the International Society for Clinical Biostatistics (ISCB Czechia)
Location: ZOOM
Meeting ID: 990 6218 4007
Passcode: 183786
Date: Friday 21 November 2025
Time: 14:00 CET
Abstract:
Simulation studies are widely used to evaluate the performance of statistical methods using synthetic data sets generated from a known ground truth. However, the current methodological research paradigm requires researchers to develop and evaluate new methods at the same time. This creates misaligned incentives, such as the need to demonstrate the superiority of new methods, potentially compromising the neutrality of simulation studies. Furthermore, results of simulation studies are often difficult to compare due to differences in data-generating mechanisms, included methods, and performance measures. This fragmentation can lead to conflicting conclusions, hinder cumulative methodological progress, and delay the adoption of effective methods. To address these challenges, we introduce the concept of living synthetic benchmarks.
The key idea is to disentangle method development from data-generating mechanism development and to continuously update the benchmark whenever a new data-generating mechanism, method, or performance measure becomes available. This separation improves the neutrality of method evaluation, places greater focus on the development of both methods and data-generating mechanisms, and makes it possible to compare all methods across all data-generating mechanisms using all performance measures.
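The cross-evaluation idea above can be sketched in code. The following is a minimal illustrative sketch in Python, not the authors' implementation (their prototype is an R package); all names (`LivingBenchmark`, `register_dgm`, and so on) are hypothetical. It shows the core design: separate registries for data-generating mechanisms, methods, and performance measures, with a run step that re-evaluates every method on every mechanism with every measure whenever a new component is registered.

```python
"""Minimal sketch of a living synthetic benchmark (illustrative only)."""
import random
import statistics


class LivingBenchmark:
    """Holds independent registries of DGMs, methods, and measures."""

    def __init__(self):
        self.dgms = {}      # name -> fn(rng) returning (data, ground_truth)
        self.methods = {}   # name -> fn(data) returning an estimate
        self.measures = {}  # name -> fn(estimates, truth) returning a score

    def register_dgm(self, name, fn):
        self.dgms[name] = fn

    def register_method(self, name, fn):
        self.methods[name] = fn

    def register_measure(self, name, fn):
        self.measures[name] = fn

    def run(self, n_sim=200, seed=1):
        """Evaluate all methods across all DGMs using all measures."""
        rng = random.Random(seed)
        results = {}
        for dgm_name, dgm in self.dgms.items():
            for method_name, method in self.methods.items():
                estimates, truth = [], None
                for _ in range(n_sim):
                    data, truth = dgm(rng)
                    estimates.append(method(data))
                for measure_name, measure in self.measures.items():
                    results[(dgm_name, method_name, measure_name)] = (
                        measure(estimates, truth)
                    )
        return results


# Example components: one normal-mean DGM, two estimators, one measure.
bench = LivingBenchmark()
bench.register_dgm(
    "normal_mean",
    lambda rng: ([rng.gauss(1.0, 1.0) for _ in range(30)], 1.0),
)
bench.register_method("sample_mean", statistics.mean)
bench.register_method("sample_median", statistics.median)
bench.register_measure(
    "bias", lambda est, truth: statistics.mean(est) - truth
)

results = bench.run()
```

Because each registry is independent, contributing a new method (or a new data-generating mechanism) requires no changes to the evaluation logic; the next `run` automatically includes it in the full cross-comparison.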
In this paper, we (i) outline a blueprint for building and maintaining such benchmarks, (ii) discuss technical and organizational challenges of implementation, and (iii) demonstrate feasibility with a prototype benchmark for publication bias adjustment methods, including an open-source R package. We conclude that living synthetic benchmarks have the potential to foster neutral, reproducible, and cumulative evaluation of methods, benefiting both method developers and users.
Keywords:
Evidence Synthesis, Method Benchmarking, Neutral Method Comparison, Systematic Review
References:
Pawel, S., Kook, L., & Reeve, K. (2024). Pitfalls and potentials in simulation studies: Questionable research practices in comparative simulation studies allow for spurious claims of superiority of any method. Biometrical Journal, 66(1), 2200091.
Nießl, C., Herrmann, M., Wiedemann, C., Casalicchio, G., & Boulesteix, A. L. (2022). Over-optimism in benchmark studies and the multiplicity of design and analysis options when interpreting their results. Wiley Interdisciplinary Reviews: Data Mining and Knowledge Discovery, 12(2), e1441.
Heinze, G., Boulesteix, A. L., Kammer, M., Morris, T. P., White, I. R., & Simulation Panel of the STRATOS Initiative. (2024). Phases of methodological research in biostatistics—building the evidence base for new methods. Biometrical Journal, 66(1), 2200222.