Copas and Jackson proposed a "bound for publication bias", which they had demonstrated in other fields. Under publication bias, an apparent treatment effect derives mainly or wholly from the studies with the smallest sample sizes; a genuine treatment effect shows relatively even effect sizes across different sample sizes, though with greater statistical significance in the larger groups. Lösel and Schmucker rejected this idea, saying their data showed no difference between published and unpublished studies. But even their "unpublished" studies were in one sense published: their authors revealed them to Lösel and Schmucker. The Copas and Jackson hypothesis implies that unsuccessful studies would still be suppressed, because their authors would not want to reveal them to colleagues.
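The mechanism can be illustrated with a simulation. The sketch below uses invented parameters and a deliberately crude caricature of selective reporting (authors "reveal" a study only if it happens to favour treatment); nothing in it is drawn from Lösel and Schmucker's data. With no true treatment effect at all, the revealed small studies show a much larger average odds ratio than the revealed large ones, because chance swings further in small samples.

```python
import random

random.seed(0)

def study_odds_ratio(n_per_arm, p=0.3):
    """One simulated study with NO true effect: both arms reoffend with
    probability p. Returns the odds ratio (OR > 1 flatters treatment),
    with a 0.5 continuity correction to avoid division by zero."""
    treated = sum(random.random() < p for _ in range(n_per_arm))
    control = sum(random.random() < p for _ in range(n_per_arm))
    t_odds = (treated + 0.5) / (n_per_arm - treated + 0.5)
    c_odds = (control + 0.5) / (n_per_arm - control + 0.5)
    return c_odds / t_odds

def mean_revealed_or(n_per_arm, n_studies=2000):
    """Average odds ratio among the studies their authors 'reveal':
    here, crudely, only those that happened to favour treatment."""
    ors = (study_odds_ratio(n_per_arm) for _ in range(n_studies))
    revealed = [r for r in ors if r > 1]
    return sum(revealed) / len(revealed)

small = mean_revealed_or(25)    # small samples: chance alone gives big ORs
large = mean_revealed_or(500)   # large samples: ORs cluster near 1
print(f"small studies {small:.2f}, large studies {large:.2f}")
```

The selection rule is far simpler than Copas and Jackson's actual bound, but it reproduces the signature they describe: an apparent effect concentrated in the smallest studies.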
A proper test of the hypothesis requires a comparison of sample size with treatment effect size. If Copas and Jackson's hypothesis is correct, then the strongest treatment effect should be shown by the smallest studies. If there is a genuine treatment effect, however, then the largest studies will bring this out most strongly. Once again, Lösel and Schmucker's meta-analysis inadvertently provided data enabling this to be examined. The result is in the table below:

Sample size      Odds ratio
100 or fewer     4.03
101 to 200       1.65
More than 200    0.88
It is clear from this that the alleged treatment effect emanates almost entirely from the smallest studies, exactly as the bound for publication bias hypothesis predicts. To be fair, the supposed treatment effect for the middle group (samples of 101 to 200) reaches statistical significance, but the odds ratio is still only 1.65. The odds ratio for the smallest groups is 4.03. The odds ratio for the largest groups is 0.88, i.e., once again a suggestion that the treatment may actually be making people worse.
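Odds ratios like these can be read off directly from a study's 2×2 outcome table. A minimal sketch, using invented counts (not drawn from any study in the meta-analysis) and following the convention used here, where an odds ratio above 1 means the treated group did better:

```python
# Hypothetical illustration of how an odds ratio is computed and read.
# The counts below are invented; they are not Lösel and Schmucker's data.

def odds_ratio(treated_reoffended, treated_ok, control_reoffended, control_ok):
    """Odds ratio comparing the odds of reoffending in the control group
    with the treated group. A value above 1 means the treated group had
    lower odds of reoffending; a value below 1 means higher odds."""
    treated_odds = treated_reoffended / treated_ok
    control_odds = control_reoffended / control_ok
    return control_odds / treated_odds

# Invented counts: 10 of 100 treated reoffend vs 18 of 100 controls.
print(round(odds_ratio(10, 90, 18, 82), 2))  # prints 1.98
```

Note that an odds ratio below 1, like the 0.88 for the largest studies, arises when the treated group's odds of reoffending exceed the control group's.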
Before leaving Lösel and Schmucker, it is interesting to note that they recorded whether the authors of the research studies included in their meta-analysis were also the people who had conducted the treatment programme in question. When the Home Office introduced offending behaviour programmes it was intended to keep the clinical work and the evaluations of it separate, so that clinicians were not reporting on the effectiveness of their own work. They did not entirely keep to this, but it is a sound principle. This is not because clinicians are dishonest, but because human judgement is prone to various kinds of bias. One cannot always be certain where this bias enters into the reporting of research, but it is clear that having people report on the effectiveness of their own clinical work allows for some distortion to creep in where others might report more objectively. The odds ratio for those studies in which people were reporting on their own work was 1.92, indicating that treated offenders did almost twice as well as those in the control group. This was statistically significant. The comparable figure for studies in which people were reporting on someone else's work was 0.99, which means no treatment effect was found. In other words, treatment was only reported to be effective where people were reporting on their own work.
