Speccheck Package

Speccheck is a Stata package that implements the specification check for p-hacking described in Brodeur et al. (2020).

This specification check reports t-curves and effect-size curves derived from regressions that use every possible combination of control variables from the researcher's set. Speccheck lets you visually inspect variation in effect sizes, statistical significance, and sensitivity to the inclusion of control variables. It also lets you inspect the effect sizes and statistical significance for every sequence in which the control variables could have been added.
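To give a sense of how quickly the number of specifications grows, here is a minimal sketch that counts the combinations for five candidate controls. It calls the tuples package (from SSC) directly. The control names are placeholders borrowed from the wage example in this README, and the asis option tells tuples to treat them as plain words so that no dataset needs to be loaded. This only illustrates the counting and is not a description of how speccheck works internally.

```stata
* Assumes tuples is installed: ssc install tuples
* asis: treat the names as plain words, not as variables in memory
tuples educ married exper expersq hours, asis

* tuples stores the number of non-empty combinations in the local ntuples
display "Non-empty control sets: `ntuples'"
```

With k candidate controls there are 2^k - 1 non-empty sets, so each additional control roughly doubles the number of regressions to run.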

Speccheck is currently available only for Stata. Feedback from users is appreciated. If you encounter a problem with the software, please check "help speccheck" in Stata.

To install speccheck (in Stata) for the first time, simply run the following code:

ssc install speccheck

ssc install tuples // speccheck requires the package tuples to run

To see the syntax and help file for speccheck, enter the following at the Stata command prompt:

help speccheck

Brodeur et al. (2020) provide an illustrative example by applying speccheck to study the returns to education using the NLSY-79 cohort.

Another example of using speccheck is:

ssc install bcuse // to access the data

bcuse wagepan, clear

xtset nr year

speccheck lwage union educ married exper expersq hours, method(xtreg) xt(fe) nocon(Yes)

This example uses the panel wage data from Wooldridge (2016), originally sourced from Vella and Verbeek (1998). The data are taken from the National Longitudinal Survey (Youth Sample). The sample consists of 545 men who worked full-time every year from 1980 to 1987, having completed their schooling by 1980. Vella and Verbeek (1998) estimate that union membership results in a wage premium of around 8% with individual fixed effects (Table 3, Cols 3-4).

The speccheck command regresses the natural logarithm of reported wages on union status (whether the worker's wages were determined by a collective agreement). Because we are using xtreg with panel (individual) fixed effects, we are estimating the effect of within-person changes from non-union to union status. The candidate control variables (education, marital status, a measure of experience, its square, and number of hours worked) could vary within person over time and could explain both wages and union status. speccheck runs one regression for every possible combination of these candidate controls (ignoring the case with no controls, as we have specified the nocon(Yes) option). For each regression, the estimated coefficient on union is recorded, along with its statistical significance. These estimates are then plotted in the output figures.
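The procedure described above can be approximated by hand. The following is a rough sketch, not speccheck's actual implementation: it uses tuples to build every non-empty control set, runs xtreg with fixed effects for each, and stores the union coefficient and its t-statistic with postfile. The names results, estimates, spec, b, and t are chosen here for illustration, and the sketch assumes bcuse and tuples are installed and that the wagepan data can be downloaded.

```stata
* Sketch only; assumes: ssc install bcuse, ssc install tuples
bcuse wagepan, clear
xtset nr year

* Build every non-empty subset of the candidate controls
tuples educ married exper expersq hours
display "Specifications to estimate: `ntuples'"

* Record the union coefficient and t-statistic from each regression
tempname results
tempfile estimates
postfile `results' int spec double(b t) using `estimates'
forvalues i = 1/`ntuples' {
    quietly xtreg lwage union `tuple`i'', fe
    post `results' (`i') (_b[union]) (_b[union] / _se[union])
}
postclose `results'

* One row per specification: inspect the spread of estimates
use `estimates', clear
summarize b t, detail
```

The resulting dataset has one row per specification, which is the raw material for plots like the t-curve and effect curve. Controls that do not vary within person are dropped by xtreg under fixed effects, so some specifications may produce identical union estimates.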

The t-curve tells us that the effect of unionization on wages is statistically significant at the conventional 5% threshold in every specification. The effect curve tells us that the estimates of the union premium range from 0 to 25%. However, the mass is concentrated just below 10%.

When we look at statistical significance by the number of controls, we see that as additional controls are added, the statistical significance of the union wage premium estimate falls, but remains above the t=1.96 threshold.

Turning to effect sizes by the number of additional controls, we see that the median estimate (the center line in each box-and-whisker plot) does not vary substantially. The median effect is estimated to be between 7% and 8%, depending on the specification. This is in line with the results from Vella and Verbeek (1998).


Brodeur, Abel, Nikolai Cook, and Anthony Heyes. "A proposed specification check for p-hacking." AEA Papers and Proceedings. Vol. 110. 2020.