Latest update was on 30th July 2025. Please see the Changelog for details if you haven't accessed the DEX website since then.

Linked and duplicate publication definitions

Core staff will search for duplicated or linked publications before assigning records to you. However, you may still come across duplicate or linked publications while you’re extracting.

Duplicates – publications of exactly the same study that share the same publication information: title, authors, journal name/issue/volume, and URL.

Linked publications – impact evaluation or systematic review* papers (articles, reports, etc.) that report findings for the same study, usually in different versions. Our definition is adopted from the What Works Clearinghouse (WWC) Procedures Handbook, Version 4.1 (Section V, Subsection A, p. 9). The key question you should ask yourself is: do these papers provide largely independent contributions to what we know about how this intervention works, or is this basically one research report split across multiple papers? If the latter, we consider the reports to be linked publications for the purposes of the DEP and 3ie-produced Evidence Gap Maps**.

*Updates of existing systematic reviews are also considered linked publications. An update is defined as “a new edition of a published systematic review with changes that can include new data, new methods, or new analyses to the previous edition” (Higgins et al. 2023, citing Garner et al. 2016).

**For DEP data extraction, we define linked publications differently than when extracting studies included in a systematic review. When producing a systematic review, the key question is not whether the papers provide a largely independent contribution but whether the effect estimates in these papers are statistically independent.

If you are a coordinator/PM, please read further here about how to identify and handle linked/duplicate publications in the DEP admin panel.

There is a compulsory condition for linked impact evaluation papers:

• To be considered as reporting findings from the same study, the papers must assess the impact of exactly the same intervention.

The following characteristics should be considered when identifying linked papers:

• Research team. When manuscripts share one or more authors, the reported findings in those manuscripts may be related.

• Sample members. Findings from analyses that include some or all of the same units of analysis (individuals, households, villages, etc.) may be related.

• Group formation procedures, such as the methods used to conduct random assignment or matching. When authors use identical (or nearly identical) methods to form the groups used in multiple analyses, or a single procedure was used to form the groups, the results may not provide independent tests of the intervention.

• Data collection and analysis procedures. Similar to group formation, when authors use identical or nearly identical procedures to collect and analyze data, the findings may be related. Sharing data collection and analysis procedures means collecting the same measures from the same data sources, preparing the data for analysis using the same rules, and using the same analytic methods with the same control variables.

Examples (in progress)

Below are examples of how this rule is applied in different circumstances to identify linked publications.

Example 1:

Findings authored by the same research team. A research team presents findings on the effectiveness of an intervention using two distinct samples in the same manuscript. Because the same research team might conduct analyses that have little else in common, sharing only the research team members is not sufficient for the WWC to consider the findings part of the same study. Therefore, these findings would be considered separate studies. But if the analyses in the manuscript also shared at least two of the remaining three characteristics, they would instead be considered the same study.
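The counting rule used in this example can be sketched in code. The snippet below is only a minimal illustration and is not part of the DEP or WWC tooling: the names Comparison, shared_count and is_linked are hypothetical, and the threshold of three shared characteristics is inferred from the examples on this page (the research team plus two of the remaining three). Borderline cases still call for reviewer judgement.

```python
from dataclasses import dataclass

# The four characteristics listed above (field names are hypothetical).
CHARACTERISTICS = (
    "research_team",
    "sample_members",
    "group_formation",
    "data_collection_and_analysis",
)


@dataclass(frozen=True)
class Comparison:
    """Reviewer's answers when comparing two papers or sets of findings."""
    same_intervention: bool             # compulsory condition
    research_team: bool                 # share one or more authors
    sample_members: bool                # some or all units of analysis overlap
    group_formation: bool               # same (or nearly the same) assignment or matching
    data_collection_and_analysis: bool  # same measures, sources, rules and methods


def shared_count(c: Comparison) -> int:
    """Count how many of the four characteristics are shared."""
    return sum(getattr(c, name) for name in CHARACTERISTICS)


def is_linked(c: Comparison, threshold: int = 3) -> bool:
    """Return True if the findings would be treated as the same study.

    Assumption: a threshold of 3 out of 4, inferred from the examples.
    """
    if not c.same_intervention:
        return False
    return shared_count(c) >= threshold


# Example 4: same team and group formation, but different data collection
# and non-overlapping samples, so only two characteristics are shared.
example4 = Comparison(
    same_intervention=True,
    research_team=True,
    sample_members=False,
    group_formation=True,
    data_collection_and_analysis=False,
)
print(shared_count(example4), is_linked(example4))  # 2 False
```

Treat the result as a prompt for discussion rather than a final decision, since some cases (such as Example 8) depend on how significant a difference is.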

Example 2:

Findings presented by gender. Within a school, authors stratified by gender and randomly assigned boys and girls to condition separately. The authors analyzed and reported findings separately by gender. The WWC would consider this to be a single study because all four of the characteristics listed above are shared by the two samples. First, the same teachers are likely present in both samples, so the sample members overlap. Next, even though boys and girls were randomly assigned to condition separately, the WWC considers strata or blocks within random assignment to be part of a single group formation process. Furthermore, the two samples likely share the same data collection and analysis procedures, and the research teams are the same. Considering this to be a single study is consistent with the goal of the WWC to provide evidence of effectiveness to a combined target population that includes both boys and girls.

Example 3:

Findings presented by grade within the same schools. Within a middle school, authors randomly assigned youth to condition, separately by grade. The authors analyzed and reported findings separately by grade but used the same procedures and data collection. The WWC would consider this to be a single study that tests the effect of an intervention for middle school students. Again, the two samples share all four characteristics.

Example 4:

Findings presented by grade across different schools. Within each participating elementary and middle school, authors randomly assigned youth to condition, separately by grade. The authors analyzed and reported findings separately for elementary and middle schools, and collected data on different outcome measures and background characteristics in the two grade spans. The WWC would consider this to be two distinct studies. The manuscripts share only two of the four characteristics: The data collection was different, and the samples do not overlap.

Example 5:

Findings presented by cohort. Study authors randomly assign teachers within a school to intervention and comparison conditions. The study authors examine the impact of the intervention on achievement outcomes for grade 3 students after one year (cohort 1) and after two years (cohort 2, same teachers but different students). The study authors report results for these two cohorts separately. The WWC would consider this to be a single study that tests the effect of an intervention on third graders because the two samples share all four characteristics.

Example 6:

Findings for the same students after re-randomization. Findings based on an initial randomization procedure and those based on re-randomizing the same units to new conditions might be considered different studies. Although the group formation procedures differ, the sample members characteristic is shared because the sample members are the same. If the findings were reported by the same research team members, the research team characteristic is also shared. Given the separation in time, it is unlikely, but not impossible, that the same data collection and analysis procedures were used. If they were not, the findings share only two of the four characteristics and would be considered different studies.

Example 7:

Findings reported by site separately over time. Separately for six states, study authors randomly assigned school districts within a state to intervention and comparison conditions. The same procedures were used at the same time to form the groups, and the same data elements were collected in all six states. The authors published each state’s findings separately, releasing them over time. The final report used a different analytic approach from the previous reports. The authors of the reports changed, but each report shared at least one author with the original report. The WWC would consider all of these but the final report to be a single study of the intervention, because the same group formation procedures were used, the same data collection and analysis procedures were used, and the reports all shared at least one research team member with another report. However, the WWC would consider findings from the site in the final report to be a separate study; because a different analytic approach was used, the findings from the final site only share two characteristics with the findings in the earlier reports.

Example 8:

Findings from replication studies by the same authors. After releasing a report with findings from a randomized controlled trial (RCT), study authors conduct a replication analysis using the same group formation and analysis procedures on a distinct sample: students in different schools and districts. The background characteristics used in the replication analysis differed from those in the original analysis because of differences in administrative data collection. Additionally, the authors introduced a new data collection procedure designed to limit sample attrition. The WWC would consider the replication analysis to be a separate study from the original analysis because the two sets of findings share neither the same sample members nor the same data collection procedures. If the only difference in the data collection procedures had been the background characteristics, the review team could exercise discretion and determine whether the difference is significant enough to consider these separate studies. For example, if the characteristics are specified in the review protocol as required for baseline equivalence, then how they are collected and measured may be significant.

Example 9:

Findings from related samples, based on different designs. Study authors randomly assigned students to a condition and conducted an RCT analysis. Using a subsample of the randomly assigned students, the same authors also examined a quasi-experimental design (QED) contrast that also examined the effectiveness of the intervention. They used different analysis procedures for the two designs. The WWC would consider the QED findings as a separate study from the RCT findings because the findings share only two of the four characteristics: sample members and research team. The WWC considers matching approaches to identifying intervention and comparison groups part of the analysis procedure, so a matching analysis based on data from an RCT would be considered to use different analysis procedures from an analysis of the full randomized sample, even if the analytical models were otherwise identical.

Example 10:

Findings reported for multiple contrasts. If authors compare an intervention group with two different comparison groups, the WWC would consider both contrasts to be part of the same study. They share a research team, sample members, and the group formation process (that is, the intervention group in both contrasts is the same). Because there are many different business-as-usual conditions, all comparisons between the intervention and a comparison group are informative and should be presented as main findings. However, if a contrast is between two versions of the intervention, then the findings should be presented as supplementary.

For further information and examples, see Appendix D of the following document:

https://ies.ed.gov/ncee/wwc/Docs/referenceresources/WWC-Procedures-Handbook-v4-1-508.pdf
