Estimands and Estimators
(Week of October 06, 2025)
Module 4-4 – Estimands and Estimators in Pragmatic Trials (20-Minute Video)
This module breaks down one of the trickiest parts of clinical trials: the analysis. What exactly is an estimand? How does it differ from an estimator? And why do trialists talk so much about intention-to-treat versus per-protocol versus as-treated analyses?
**The video's content and narration were generated with the assistance of artificial intelligence, with human guidance and oversight throughout the process.**
Statistical considerations for outcomes in clinical research: A review of common data types and methodology (Source)
Abstract: With the increasing number and variety of clinical trials and observational data analyses, producers and consumers of clinical research must have a working knowledge of an array of statistical methods. Our goal with this body of work is to highlight common types of data and analyses in clinical research. We provide a brief yet comprehensive overview of common data types in clinical research and appropriate statistical methods for their analysis, including continuous data, binary data, count data, multinomial data, and time-to-event data. We include references for further study and real-world examples of the application of these methods. In summary, we review common continuous and discrete data, summary statistics for these data, common hypotheses and the appropriate statistical tests, and the underlying assumptions of those tests. This information is summarized in tabular format for additional accessibility.
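To make these pairings concrete, here is a minimal sketch matching two of the data types the review covers to their conventional tests. The synthetic data, SciPy calls, and variable names are illustrative assumptions, not taken from the paper.

```python
# Minimal sketch: conventional test choices for two common clinical data
# types, run on synthetic data (illustrative only, not from the review).
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Continuous outcome (e.g., systolic blood pressure): two-sample t-test,
# which assumes approximate normality and similar variances.
control = rng.normal(120, 10, size=50)
treated = rng.normal(115, 10, size=50)
t_stat, p_continuous = stats.ttest_ind(treated, control)

# Binary outcome (event / no event): chi-squared test on a 2x2 table.
table = np.array([[30, 20],   # control: events, non-events
                  [18, 32]])  # treated: events, non-events
chi2, p_binary, dof, expected = stats.chi2_contingency(table)

print(f"continuous: p = {p_continuous:.3f}; binary: p = {p_binary:.3f}")
```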
Addendum on estimands and sensitivity analysis in clinical trials to the guideline on statistical principles for clinical trials (Source)
The ICH E9(R1) addendum explains how to define exactly what treatment effect a trial aims to measure (the “estimand”) and how to run sensitivity analyses to check if results hold under different reasonable assumptions. It helps sponsors, clinicians, and regulators align trial objectives, design, conduct, analysis, and reporting, especially when participants stop, switch, or add treatments (so-called intercurrent events).
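A toy simulation helps show why intercurrent events force the estimand question. In the sketch below (all parameters and variable names are illustrative assumptions, not from the addendum), 20% of the treatment arm never takes the drug, so the intention-to-treat, per-protocol, and as-treated contrasts each answer a different question.

```python
# Toy simulation: how one intercurrent event (non-adherence) makes ITT,
# per-protocol, and as-treated analyses target different quantities.
import numpy as np

rng = np.random.default_rng(1)
n = 10_000
assigned = rng.integers(0, 2, n)                 # randomized arm
# Intercurrent event: 20% of the treatment arm never receives the drug.
adherent = np.where(assigned == 1, rng.random(n) > 0.2, True)
received = assigned * adherent                   # treatment actually taken
outcome = 1.0 * received + rng.normal(0, 1, n)   # effect of receiving drug = 1.0

# ITT: compare by assignment (a "treatment policy" style contrast).
itt = outcome[assigned == 1].mean() - outcome[assigned == 0].mean()
# Per-protocol: drop non-adherent treated participants (can bias results
# when adherence is not random, unlike in this toy example).
pp = outcome[(assigned == 1) & adherent].mean() - outcome[assigned == 0].mean()
# As-treated: compare by treatment actually received.
at = outcome[received == 1].mean() - outcome[received == 0].mean()
print(f"ITT {itt:.2f} | per-protocol {pp:.2f} | as-treated {at:.2f}")
```

Here the ITT estimate lands near 0.8 (the effect of receiving the drug diluted by non-adherence) while the other two land near 1.0; which of these is "right" depends entirely on the estimand the trial declared up front.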
Cluster randomized trials with a small number of clusters: which analyses should be used? (Source)
Background: Cluster randomized trials (CRTs) are increasingly used to assess the effectiveness of health interventions. The three main analysis approaches are cluster-level analyses, mixed models, and generalized estimating equations (GEEs). Mixed models and GEEs can lead to inflated type I error rates with a small number of clusters, and numerous small-sample corrections have been proposed to circumvent this problem. However, the impact of these corrections on power is still unclear.
Methods: We performed a simulation study to assess the performance of 12 analysis approaches for CRTs with a continuous outcome and 40 or fewer clusters. These included weighted and unweighted cluster-level analyses, mixed-effects models with different degree-of-freedom corrections, and GEEs with and without a small-sample correction. We assessed these approaches across different values of the intraclass correlation coefficient (ICC), numbers of clusters and variability in cluster sizes.
Results: Unweighted and variance-weighted cluster-level analyses, mixed models with degree-of-freedom corrections, and GEEs with a small-sample correction all maintained the type I error rate at or below 5% across most scenarios, whereas uncorrected approaches led to inflated type I error rates. However, these analyses had low power (below 50% in some scenarios) when fewer than 20 clusters were randomized, with none reaching the expected 80% power.
Conclusions: Small-sample corrections or variance-weighted cluster-level analyses are recommended for the analysis of continuous outcomes in CRTs with a small number of clusters. The use of these corrections should be incorporated into the sample size calculation to prevent studies from being underpowered.
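For reference, one of the robust approaches named above, an unweighted cluster-level analysis, is simple to carry out: collapse each cluster to its mean and compare arms with a t-test whose degrees of freedom come from the number of clusters, not participants. The sketch below uses synthetic data and assumed parameters, not the paper's simulation setup.

```python
# Unweighted cluster-level analysis for a two-arm CRT: collapse each cluster
# to its mean, then t-test the cluster means (df = 2k - 2). Synthetic data.
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
k, m, icc = 10, 30, 0.05                    # clusters per arm, cluster size, ICC
sd_between, sd_within = np.sqrt(icc), np.sqrt(1 - icc)

def simulate_cluster_means(arm_effect):
    u = rng.normal(0, sd_between, k)                    # cluster random effects
    e = rng.normal(0, sd_within, (k, m)).mean(axis=1)   # averaged residuals
    return arm_effect + u + e

control = simulate_cluster_means(0.0)
treated = simulate_cluster_means(0.3)
t_stat, p_value = stats.ttest_ind(treated, control)     # 18 df, valid for small k
print(f"t = {t_stat:.2f}, p = {p_value:.3f}")
```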
Increased risk of type I errors in cluster randomised trials with small or medium numbers of clusters: a review, reanalysis, and simulation study (Source)
Background: Cluster randomised trials (CRTs) are commonly analysed using mixed-effects models or generalised estimating equations (GEEs). However, these analyses do not always perform well with the small number of clusters typical of most CRTs, and they can lead to an increased risk of a type I error (finding a statistically significant treatment effect when none exists) if appropriate corrections are not used.
Methods: We conducted a small simulation study to evaluate the impact of using small-sample corrections for mixed-effects models or GEEs in CRTs with a small number of clusters. We then reanalysed data from TRIGGER, a CRT with six clusters, to determine the effect of using an inappropriate analysis method in practice. Finally, we reviewed 100 CRTs previously identified by a search on PubMed in order to assess whether trials were using appropriate methods of analysis. Trials were classified as at risk of an increased type I error rate if they did not report using an analysis method which accounted for clustering, or if they had fewer than 40 clusters and performed an individual-level analysis without reporting the use of an appropriate small-sample correction.
Results: Our simulation study found that using mixed-effects models or GEEs without an appropriate correction led to inflated type I error rates, even with as many as 70 clusters. Conversely, using small-sample corrections provided correct type I error rates across all scenarios. Reanalysis of the TRIGGER trial found that inappropriate methods of analysis gave much smaller P values (P ≤ 0.01) than appropriate methods (P = 0.04–0.15). In our review, of the 99 trials that reported the number of clusters, 64 (65%) were at risk of an increased type I error rate: 14 trials did not report using an analysis method which accounted for clustering, and 50 trials with fewer than 40 clusters performed an individual-level analysis without reporting the use of an appropriate correction.
Conclusions: CRTs with a small or medium number of clusters are at risk of an inflated type I error rate unless appropriate analysis methods are used. Investigators should consider using small-sample corrections with mixed-effects models or GEEs to ensure valid results.
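The inflation the paper describes can be reproduced in a few lines. The toy null simulation below (with assumed, illustrative parameters, not the paper's design) generates clustered data with no true treatment effect; the naive individual-level t-test rejects far more often than 5%, while a cluster-level analysis stays near the nominal rate.

```python
# Toy null simulation: ignoring clustering inflates type I error; a
# cluster-level analysis holds it near the nominal 5%. Parameters assumed.
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
k, m, icc, reps = 5, 50, 0.05, 2000          # clusters per arm, size, ICC, runs
sd_b, sd_w = np.sqrt(icc), np.sqrt(1 - icc)

naive_rej = cluster_rej = 0
for _ in range(reps):
    # 2k clusters, no treatment effect anywhere.
    y = rng.normal(0, sd_b, (2 * k, 1)) + rng.normal(0, sd_w, (2 * k, m))
    arm0, arm1 = y[:k], y[k:]
    naive_rej += stats.ttest_ind(arm1.ravel(), arm0.ravel()).pvalue < 0.05
    cluster_rej += stats.ttest_ind(arm1.mean(axis=1), arm0.mean(axis=1)).pvalue < 0.05

print(f"naive (individual-level) type I error: {naive_rej / reps:.3f}")
print(f"cluster-level analysis type I error:   {cluster_rej / reps:.3f}")
```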
Estimands in cluster-randomized trials: choosing analyses that answer the right question (Source)
Background: Cluster-randomized trials (CRTs) involve randomizing groups of individuals (e.g. hospitals, schools or villages) to different interventions. Various approaches exist for analysing CRTs but there has been little discussion around the treatment effects (estimands) targeted by each.
Methods: We describe the different estimands that can be addressed through CRTs and demonstrate how choices between different analytic approaches can impact the interpretation of results by fundamentally changing the question being asked, or, equivalently, the target estimand.
Results: CRTs can address either the participant-average treatment effect (the average treatment effect across participants) or the cluster-average treatment effect (the average treatment effect across clusters). These two estimands can differ when participant outcomes or the treatment effect depends on the cluster size (referred to as ‘informative cluster size’), which can occur for reasons such as differences in staffing levels or types of participants between small and large clusters. Furthermore, common estimators, such as mixed-effects models or generalized estimating equations with an exchangeable working correlation structure, can produce biased estimates for both the participant-average and cluster-average treatment effects when cluster size is informative. We describe alternative estimators (independence estimating equations and cluster-level analyses) that are unbiased for CRTs even when informative cluster size is present.
Conclusion: We conclude that careful specification of the estimand at the outset can ensure that the study question being addressed is clear and relevant, and, in turn, that the selected estimator provides an unbiased estimate of the desired quantity.
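The gap between the two estimands is easy to see numerically. In the sketch below (synthetic, illustrative data and parameters, not from the paper), the treatment works better in small clusters, so the unweighted cluster-average effect and the size-weighted participant-average effect give noticeably different answers.

```python
# Informative cluster size: when the treatment effect depends on cluster
# size, the cluster-average and participant-average estimands differ.
import numpy as np

rng = np.random.default_rng(4)
sizes = rng.integers(10, 200, 20)            # 20 clusters of varying size
arm = np.repeat([0, 1], 10)                  # 10 control, 10 treated clusters
# Treatment effect is larger in small clusters (informative cluster size).
effect = np.where(sizes < 100, 1.0, 0.2)
cluster_mean = arm * effect + rng.normal(0, 0.1, 20)

t1, t0 = arm == 1, arm == 0
# Cluster-average: every cluster counts equally.
cluster_avg = cluster_mean[t1].mean() - cluster_mean[t0].mean()
# Participant-average: clusters weighted by the participants they contain.
participant_avg = (np.average(cluster_mean[t1], weights=sizes[t1])
                   - np.average(cluster_mean[t0], weights=sizes[t0]))
print(f"cluster-average {cluster_avg:.2f} | participant-average {participant_avg:.2f}")
```

Neither number is wrong; they answer different questions, which is exactly why the authors urge specifying the estimand before choosing the estimator.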