Contributors: Saskia le Cessie, Bianca de Stavola, Vanessa Didelez, Els Goetghebeur, Erica Moodie
Refers to the activity of controlling for confounding variables in order to make the treated and untreated (or, more generally, exposed and unexposed) groups comparable. Adjustment can be performed in several ways, for example by including these covariates in a regression model for the outcome of interest, by inverse probability weighting, or by matching on or stratifying by the covariates. Depending on the technique used, the adjustment method can yield a conditional or a marginal treatment effect measure.
A rule for the identification of adjustment sets of variables in a causal DAG (Directed Acyclic Graph) that would allow the non-parametric identification of the causal effect of the exposure on the outcome if adjusted for. The set cannot include any variables that are affected by the exposure (e.g. mediators).
A backdoor path relates to a sequence of consecutive edges between the treatment (exposure) and the outcome in a (causal) directed acyclic graph whereby the edge that starts from the treatment is directed into it (i.e. the arrowhead points into the treatment). If open, it represents an association that would remain between treatment and outcome if the causal effect from the treatment to the outcome were removed. This is more general than a confounding path because a backdoor path could also be opened by conditioning on colliders. A backdoor path can be blocked by adjustment for a variable along that path that is not a collider.
A variable (node) on a path in a (causal) directed acyclic graph into which two arrowheads point (and thus “collide”). A path that includes a collider is blocked at that collider without any further adjustment. Adjusting for the collider or one of its descendants unblocks the path at that collider, so the path becomes open unless it is blocked elsewhere.
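The effect of adjusting for a collider can be illustrated with a small simulation. This is a minimal sketch under assumed conditions, not a general procedure: the variables `a`, `y` and `c`, the linear data-generating mechanism and the selection rule `c > 1` are all hypothetical. Treatment and outcome are generated independently, and both affect the collider; restricting to one stratum of the collider then induces an association between them.

```python
import random


def pearson(xs, ys):
    """Pearson correlation of two equally long sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5


def simulate_collider(n=20000, seed=0):
    """Return (marginal, within-stratum) correlation of a and y.

    Hypothetical mechanism: a -> c <- y, with no effect of a on y.
    """
    rng = random.Random(seed)
    a = [rng.gauss(0, 1) for _ in range(n)]
    y = [rng.gauss(0, 1) for _ in range(n)]  # generated independently of a
    c = [ai + yi + rng.gauss(0, 1) for ai, yi in zip(a, y)]  # collider

    marginal = pearson(a, y)

    # "Adjusting" for the collider here means restricting to the stratum c > 1.
    selected = [(ai, yi) for ai, yi, ci in zip(a, y, c) if ci > 1.0]
    conditional = pearson([s[0] for s in selected], [s[1] for s in selected])
    return marginal, conditional


if __name__ == "__main__":
    marginal, conditional = simulate_collider()
    print("correlation of a and y, whole sample:", round(marginal, 3))
    print("correlation of a and y, within c > 1:", round(conditional, 3))
```

In the whole sample the correlation should be close to zero, while within the selected stratum it should be clearly negative, even though `a` has no effect on `y` in this mechanism.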
Probabilistic independence between potential outcomes and treatment (exposure) when pre-treatment covariates are conditioned upon. See exchangeability.
A term frequently used to describe a common cause of treatment and outcome. It points to a variable that potentially affects the unadjusted association between treatment and outcome, but it is not strictly a well-defined term.
An association generated between treatment (exposure) and outcome through common causes. This can be removed by adjustment.
(related to backdoor path). A path involving common causes of treatment (exposure) and outcome.
This states that the potential outcome of an intervention that sets the treatment (exposure) to take the value that is actually observed is equal to the observed outcome, i.e. the potential outcome under observed exposure is factual and observed. For such consistency to hold, the exposure should be “well-defined” in the sense that, if there were different ways of setting the treatment (exposure) to take a particular value, they would all result in the same potential outcome (“treatment variation invariance”).
Refers to the potential outcome under the intervention level that is not observed, i.e., counter-to-fact. Consider an example of a public health intervention with two levels, promoting or not promoting physical exercise, which define two potential outcomes. For an individual in the study who belongs to the untreated group, the observed outcome is the outcome under no treatment, and the counterfactual outcome is the potential outcome under the treatment.
Describes distributional independence between the treatment variable and background variables. This is expected to occur when treatment is randomised. Covariate balance between treated and untreated groups is the goal of adjustment, and is often evaluated to check the validity of the adjustment. Balance checks compare the confounder distributions of the treated and untreated strata of a sample after conditioning on functions of covariates or after weighting (e.g. via inverse probability weighting). For example, constructing strata of the confounders directly yields covariate balance within each stratum. An early reference to balancing in the potential outcome framework is Rubin (1974). There, balance is a consequence of treatment randomisation, which ensures that no background variable systematically favours any of the treatment groups.
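One common way to carry out such a balance check is the standardized mean difference of a covariate between treated and untreated subjects, computed with and without weights. The sketch below is illustrative only: the function names, the pooled-standard-deviation convention and the toy data are assumptions, not something prescribed by the definition above.

```python
def weighted_mean(values, weights):
    """Weighted mean of a sequence of numbers."""
    total = sum(weights)
    return sum(w * v for v, w in zip(values, weights)) / total


def weighted_var(values, weights):
    """Weighted (population-style) variance of a sequence of numbers."""
    m = weighted_mean(values, weights)
    total = sum(weights)
    return sum(w * (v - m) ** 2 for v, w in zip(values, weights)) / total


def standardized_mean_difference(x_treated, x_untreated,
                                 w_treated=None, w_untreated=None):
    """Difference in (weighted) means divided by a pooled standard deviation.

    Convention assumed here: pooled SD = sqrt((var_treated + var_untreated) / 2).
    """
    if w_treated is None:
        w_treated = [1.0] * len(x_treated)
    if w_untreated is None:
        w_untreated = [1.0] * len(x_untreated)
    diff = (weighted_mean(x_treated, w_treated)
            - weighted_mean(x_untreated, w_untreated))
    pooled_sd = ((weighted_var(x_treated, w_treated)
                  + weighted_var(x_untreated, w_untreated)) / 2) ** 0.5
    return diff / pooled_sd


if __name__ == "__main__":
    # Hypothetical binary confounder: 3 of 4 treated have x = 1, 1 of 4 untreated.
    x_t, x_u = [1, 1, 1, 0], [1, 0, 0, 0]
    # Inverse probability weights from P(A = 1 | x = 1) = 0.75, P(A = 1 | x = 0) = 0.25.
    w_t = [1 / 0.75, 1 / 0.75, 1 / 0.75, 1 / 0.25]
    w_u = [1 / 0.25, 1 / 0.75, 1 / 0.75, 1 / 0.75]
    print("unweighted SMD:", standardized_mean_difference(x_t, x_u))
    print("weighted SMD:  ", standardized_mean_difference(x_t, x_u, w_t, w_u))
```

In this toy example the unweighted difference is large, while the weighted difference is essentially zero, because the weights were computed from the true stratum-specific treatment probabilities.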
A graph is an object of vertices (also referred to as nodes) and edges; a directed acyclic graph has directed edges (arrows) and no cycles: it is not possible to return to a node by following the directed edges. DAGs are typically used in the causal inference literature to visualise probabilistic assumptions regarding dependencies between variables in a data-generating mechanism.
A DAG is said to be causal if a directed edge represents a potential controlled direct effect (controlling for all other parent nodes). It is then the absence of an edge that is especially meaningful: no direct effect. We thus need that i) the arrows represent direct causal effects and ii) all common causes of vertices are also present in the DAG. DAGs are referred to as Bayesian networks in the computer science literature. A seminal paper introducing causal DAGs to statisticians is Pearl (Biometrika, 1995).
The do-operator is part of the theoretical framework of Structural Causal Models introduced by Judea Pearl for defining causal effects. Here, the do-operator is used to distinguish observing various treatment (exposure) levels A associated with outcome Y, as in P(Y=y|A=a), from intervening on a treatment level A, as in P(Y=y|do(A=a)). The do-operator do(A=a) points to a potential outcome Y(a) when setting the A level to a. For most practical purposes we can regard the two distributions as equal: P(Y(a)=y) = P(Y=y|do(A=a)).
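The distinction between observing and intervening can be made concrete in a simple case. The display below is a sketch under stated assumptions that are not part of the definition above: X denotes a discrete set of covariates that satisfies the backdoor criterion relative to A and Y, and every treatment level has positive probability within each level of X.

```latex
\begin{align*}
P(Y = y \mid A = a)
  &= \sum_{x} P(Y = y \mid A = a, X = x)\, P(X = x \mid A = a), \\
P(Y = y \mid \mathrm{do}(A = a))
  &= \sum_{x} P(Y = y \mid A = a, X = x)\, P(X = x).
\end{align*}
```

The two expressions differ only in how X is averaged: observing A = a averages over the covariate distribution among those who happened to receive a, whereas intervening averages over the covariate distribution of the whole population.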
Refers to the ability to replace the unobserved potential treated outcomes of the control group with the observed outcomes of the treated group, and vice versa for the potential untreated outcomes. Hence the treated and untreated groups are exchangeable with respect to their potential outcome variables. Conditional exchangeability means that this property holds conditional on some sufficient set of (pre-treatment) covariates. Examples of terminology used in the literature defining (conditional) exchangeability are: no unmeasured confounding, ignorability, unconfoundedness and selection on observables.
Proposed by Robins (1986), g-estimation most generally refers to methods based on estimating equations which exploit the conditional independence between treatment and the potential outcome (typically the potential outcome under no treatment) that must hold under (conditional) exchangeability. It is most prominently used to fit structural nested models for time-dependent treatments, which typically parametrise the effect of a so-called treatment ‘blip’, i.e. the contrast of treatment versus no treatment at the next time point under a particular treatment strategy at all subsequent times. Moreover, g-estimation is also the basic principle for fitting structural mean models which exploit instrumental variables, and for deriving doubly-robust estimators. G-estimation together with the g-formula and inverse probability of treatment weighting form the so-called ‘g-methods’.
(also known as g-computation, g-computation algorithm; often also: parametric g-formula). The g-formula describes how to obtain the marginal causal effect of multiple, typically sequential, treatments (or exposures), including dynamic/adaptive ones, on a possibly time-varying outcome. This is expressed in terms of a sequence of conditional distributions of the treatment, (time-dependent) covariates and the outcome. Under a sequential (conditional) exchangeability assumption, the g-formula constitutes a valid way of adjusting for time-varying confounding. The g-formula was proposed and named by Robins (1986). In the special case of a single treatment, the g-formula corresponds to Pearl’s (1995) back-door adjustment formula, and both can be seen as forms of standardization. The g-formula obtains the effect of a treatment strategy by marginalizing (summing/integrating out) all relevant time-varying covariates, and possibly the history of a time-varying outcome, with respect to their conditional distributions given the past. If estimation is based on parametric models for the covariates at each time point given the past (covariates measured prior to that point), then this procedure is known as the parametric g-formula.
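For the special case of a single treatment and a single discrete covariate, the standardization described above can be sketched in a few lines. The counts in `COUNTS`, the function names and the binary coding of x, a and the outcome are hypothetical and chosen only for illustration; the sketch is non-parametric and does not cover the time-varying setting.

```python
# Hypothetical aggregated data: (x, a) -> (number of subjects, number with Y = 1)
COUNTS = {
    (0, 0): (300, 30),
    (0, 1): (100, 20),
    (1, 0): (100, 40),
    (1, 1): (300, 150),
}


def crude_risk(counts, a):
    """Unadjusted risk P(Y = 1 | A = a)."""
    n = sum(n for (_x, a_key), (n, _e) in counts.items() if a_key == a)
    events = sum(e for (_x, a_key), (_n, e) in counts.items() if a_key == a)
    return events / n


def standardized_risk(counts, a):
    """Sum over x of P(Y = 1 | A = a, X = x) * P(X = x)."""
    total = sum(n for n, _events in counts.values())
    strata = sorted({x for x, _a in counts})
    risk = 0.0
    for x in strata:
        # Marginal stratum size, pooling both treatment groups.
        n_x = sum(n for (x_key, _a), (n, _e) in counts.items() if x_key == x)
        n_xa, events_xa = counts[(x, a)]
        risk += (events_xa / n_xa) * (n_x / total)
    return risk


if __name__ == "__main__":
    print("crude risk difference:       ",
          crude_risk(COUNTS, 1) - crude_risk(COUNTS, 0))
    print("standardized risk difference:",
          standardized_risk(COUNTS, 1) - standardized_risk(COUNTS, 0))
```

With these made-up counts the crude risk difference is about 0.25, while the standardized risk difference is about 0.10, because the covariate is more common among the treated and also raises the risk of the outcome.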
The collection of methods for causal inference concerning the effects of ‘generalised treatments’ (hence the prefix ‘g-’), in particular sequential (possibly dynamic/adaptive) treatments, as opposed to single point treatments. These methods include the g-formula, g-estimation and inverse probability of treatment weighting for marginal structural models.
A bias that arises from inappropriately treating exposure as being time-invariant rather than time-dependent. Under this setting, the rule for allocating individuals to treatment (or exposure) groups implies that they cannot possibly experience the outcome (say, death) for a certain period in one treatment group. A typical example is when only individuals who start treatment at any point in time during follow-up are allocated to the treatment group and everyone else to the untreated group; someone who starts treatment two years post baseline must therefore have survived for two years, while all people who died untreated within two years are allocated to the untreated group. Immortal-time bias leads to an overestimation of the survival probability, or the mean/median survival time, in the corresponding group (usually the treatment / exposure group). Allocation to either treatment or untreated group should be decided at baseline (which must be clearly defined), and not by ‘looking into the future’.
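The mechanism can be illustrated with a simulation in which treatment has no effect at all. This is a minimal sketch under assumed conditions: the constant hazard, the uniformly distributed treatment start time, the absence of censoring and the function name are all hypothetical. It compares a naive death rate that counts all follow-up of ever-treated individuals as treated with rates that allocate person-time according to the treatment status at each moment.

```python
import random


def immortal_time_demo(n=20000, hazard=0.2, seed=2):
    """Simulate a cohort in which treatment has NO effect on survival."""
    rng = random.Random(seed)
    naive_treated_deaths = 0
    naive_treated_time = 0.0
    treated_deaths = untreated_deaths = 0
    treated_time = untreated_time = 0.0

    for _ in range(n):
        death = rng.expovariate(hazard)  # time of death since baseline
        start = rng.uniform(0.0, 10.0)   # time at which treatment would start
        if start < death:
            # Individual survives long enough to start treatment.
            naive_treated_deaths += 1
            naive_treated_time += death  # naive: [0, start) counted as treated
            untreated_time += start      # time-dependent: [0, start) is untreated
            treated_time += death - start
            treated_deaths += 1
        else:
            # Individual dies before treatment could start.
            untreated_time += death
            untreated_deaths += 1

    return {
        "naive_treated_rate": naive_treated_deaths / naive_treated_time,
        "treated_rate": treated_deaths / treated_time,
        "untreated_rate": untreated_deaths / untreated_time,
    }


if __name__ == "__main__":
    for name, rate in immortal_time_demo().items():
        print(name, round(rate, 3))
```

Under these assumptions the naive rate in the ever-treated group should fall well below the true hazard of 0.2, suggesting a benefit that does not exist, whereas both time-dependent rates should be close to 0.2.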
An instrumental variable can help to avoid bias due to unobserved confounding by acting as a ‘substitute’ for randomisation. Formally, an instrumental variable (IV) (partially) identifies the causal effect of an exposure / treatment on an outcome in the presence of unmeasured confounding of the treatment/outcome relationship. The defining properties of an IV are: (1) the IV must be associated with treatment, (2) it must be independent of any unmeasured factors U confounding the treatment/outcome relation, and (3) it must be conditionally independent of the outcome given treatment and unmeasured confounders. The last of these conditions implies that, crucially, the IV must not itself have a direct causal effect on the outcome; this is known as the exclusion restriction.
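A simple estimator that uses an IV is the ratio of the instrument-outcome covariance to the instrument-treatment covariance. The simulation below is a sketch, not a general recipe: point identification here relies on an additional assumption that is not part of the three IV conditions, namely a linear model with the same treatment effect for everyone. The variable names, coefficients and sample size are hypothetical.

```python
import random


def cov(xs, ys):
    """Sample covariance (dividing by n) of two equally long sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / n


def simulate_iv(n=50000, true_effect=2.0, seed=1):
    """Return (naive regression slope, IV ratio estimate).

    Hypothetical mechanism: z -> a -> y, with u affecting both a and y.
    The instrument z is independent of u and has no direct effect on y.
    """
    rng = random.Random(seed)
    z = [1.0 if rng.random() < 0.5 else 0.0 for _ in range(n)]
    u = [rng.gauss(0, 1) for _ in range(n)]  # unmeasured confounder
    a = [zi + ui + rng.gauss(0, 1) for zi, ui in zip(z, u)]
    y = [true_effect * ai + 3.0 * ui + rng.gauss(0, 1)
         for ai, ui in zip(a, u)]

    naive = cov(a, y) / cov(a, a)  # slope of y on a, ignoring u
    iv = cov(z, y) / cov(z, a)     # ratio (Wald-type) IV estimate
    return naive, iv


if __name__ == "__main__":
    naive, iv = simulate_iv()
    print("naive slope:", round(naive, 2))
    print("IV estimate:", round(iv, 2))
```

In this setup the naive slope should be biased upwards by the unmeasured confounder, while the IV estimate should be close to the assumed true effect of 2.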
A form of adjustment for covariates confounding the association between for instance 1) treatment and outcome or 2) censoring and a survival time. Inverse weighting for the confounding of treatment refers to weighting subjects with the inverse of the (conditional) probability of observed treatment (or exposure). This is the inverse of the propensity score among the treated and the inverse of its complement for the untreated, given (measured) confounders. In the re-weighted population, treatment is independent of covariates included in the propensity score. Intuitively, an individual who did receive treatment even though this was unlikely based on their covariates will be up-weighted, and similarly for untreated, which creates a balance of the covariates in the re-weighted population. Inverse weighting to reduce informative censoring bias uses the inverse of the probability of remaining in study, thus of being uncensored, for those that remain in the study (or are uncensored) at a given time. Hence, inverse weighting can be used to adjust for confounding (or covariate-dependent selection, or informative censoring) provided that the measured covariates are sufficient for conditional exchangeability to hold.
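Inverse weighting for confounding of treatment can be sketched for a single binary treatment and one discrete confounder. The records, the function names and the use of stratum frequencies to estimate the propensity score are assumptions made for illustration; in practice the propensity score is often estimated with a model, and censoring weights are not covered here.

```python
from collections import Counter


def make_records():
    """Hypothetical data as (x, a, y) tuples, built from stratum counts."""
    table = {  # (x, a) -> (number of subjects, number with y = 1)
        (0, 0): (300, 30),
        (0, 1): (100, 20),
        (1, 0): (100, 40),
        (1, 1): (300, 150),
    }
    records = []
    for (x, a), (n, events) in table.items():
        records += [(x, a, 1)] * events + [(x, a, 0)] * (n - events)
    return records


def ipw_mean(records, a):
    """Inverse-probability-weighted estimate of the mean outcome under A = a."""
    n_x = Counter(x for x, _a, _y in records)
    n_xa = Counter((x, t) for x, t, _y in records)
    total = 0.0
    for x, t, y in records:
        if t == a:
            p = n_xa[(x, a)] / n_x[x]  # estimated P(A = a | X = x)
            total += y / p             # weight = 1 / probability of observed treatment
    return total / len(records)


if __name__ == "__main__":
    records = make_records()
    print("weighted risk under treatment:   ", ipw_mean(records, 1))
    print("weighted risk under no treatment:", ipw_mean(records, 0))
```

With these made-up records the weighted risks are about 0.35 and 0.25, compared with crude risks of 0.425 and 0.175, so the weighting removes the part of the crude contrast that is due to the imbalance in x.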
A marginal structural model is a semi-parametric model parameterizing the effect of a (typically time-dependent) exposure on a potential outcome (which may also be time-dependent). The defining feature of an MSM is that it marginalizes over all time-varying covariates, i.e. it makes no modelling assumptions about how the outcome depends on any post-baseline covariates (unlike the parametric g-formula, for instance). An MSM may be specified conditionally on baseline covariates. The parameters of an MSM are typically estimated by inverse probability of treatment weighting, but they can also be estimated by other means, including g-computation or targeted learning.
A variable on the causal pathway between a treatment (or exposure) and the outcome. This can be visualized in a DAG by directed arrows linking the exposure to the mediator and the mediator to the outcome while the exposure might also have a direct arrow into the outcome.
An analysis to examine the influence of intermediate factors (mediators) on the causal pathway between a treatment (exposure) and an outcome. It aims to yield estimates of components of the total effect that do or do not involve the mediator(s). Many different estimands have been defined for this setting.
A special type of instrumental variable analysis where genetic variation is assumed to have the properties that allow its use as an instrument to investigate the causal relation between a modifiable exposure and an outcome (such as the ALDH2 gene with alcohol intake).
(effect modifier). A variable that affects the strength of the causal relation between exposure and outcome. Statistically, a modifier is often represented by an interaction between the exposure and a covariate in a regression model. This is distinct from a causal interaction that needs a joint intervention to take effect.
An assumption often made in causal inference, stating that conditional on a sufficient set of confounders X, the (conditional) probability to be assigned to any of the treatment (exposure) groups compared is larger than 0 in the study population.
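An empirical counterpart of this assumption can be checked in a sample by looking for covariate strata in which some treatment level never occurs. The helper below is a hypothetical sketch for discrete covariates; an empty stratum-by-treatment cell in a sample suggests a possible violation but does not prove one, and full cells do not guarantee that the assumption holds in the population.

```python
from collections import defaultdict


def positivity_violations(records, treatment_levels=(0, 1)):
    """Map each covariate stratum to the treatment levels never observed in it.

    records: iterable of (x, a) pairs, where x is a hashable covariate
    pattern and a is the treatment received. Strata in which every
    treatment level occurs are omitted from the result.
    """
    seen = defaultdict(set)
    for x, a in records:
        seen[x].add(a)
    required = set(treatment_levels)
    return {
        x: sorted(required - levels)
        for x, levels in seen.items()
        if required - levels
    }


if __name__ == "__main__":
    sample = [("young", 0), ("young", 1), ("old", 0), ("old", 0)]
    print(positivity_violations(sample))
```

In this toy sample nobody in the "old" stratum is treated, so that stratum is reported with treatment level 1 missing.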
This is the outcome Y if the treatment A were set to a.
A function of (baseline) covariates X, which predicts the outcome. It may take the form of a linear function of the covariates derived from an outcome regression model cast in terms of mean, relative risks, odds ratios or log hazard ratios. Typical examples are the linear function of covariates in a generalized linear model or the log relative hazard. A prognostic score need not be restricted to take a linear functional form.
It is the conditional probability of treatment a given a set of confounders x. Methods using the propensity score are, e.g., propensity score matching or inverse probability of treatment weighting. In practice the propensity score often needs to be estimated, e.g. by a logistic model. When it conditions on a sufficient set of confounders it can be instrumental in estimating a causal effect of treatment.
Bias arising from study design or analysis that conditions on subject characteristics or collider variables.
An analysis that examines changes in results (e.g. outcome parameters and/or their precision) under assumptions different from those made in the primary analysis. Such analyses are especially relevant when untestable assumptions are needed to have a uniquely defined estimator, as with missing data or the ‘no unmeasured confounding’ assumption in causal inference.
The Stable Unit Treatment Value Assumption is typically invoked to justify the concept and formalization of potential outcomes for individuals under specific counterfactual exposures. It intends to have a well-defined unit of treatment (exposure) that comes with an expected causal effect size for subjects, irrespective of the treatment level received by others. This embedded assumption of ‘no interference’ or ‘no spillover’ typically does not hold in settings such as infectious diseases, where the treatment received by one individual (e.g. vaccination) can affect the outcomes of others.
A structural nested model is a semi-parametric model for contrasts between potential outcomes under different treatments. It is typically used for time varying treatments where the treatment effect (the contrast of treatment versus no treatment) at a certain time point is parametrized conditional on the observed treatment and covariate history. Parameters of these models are often estimated using g-estimation.