Deville: appraisal

This appraisal is for: Devillé WLJM, Bezemer PD, Bouter LM. Publications on diagnostic test evaluation in family medicine journals: an optimal search strategy. Journal of Clinical Epidemiology 2000;53(1):65-9.


This appraisal was prepared by Cynthia Fraser in March 2008.

Information and Methodological Issues

Categorisation Issues

Detailed information, as appropriate

A. Information

A.1 State the author's objective


To develop an optimal search strategy to identify diagnostic test evaluations, applicable to any clinical field.

A.2 State the focus of the search

[ ] Sensitivity-maximising

[ ] Precision-maximising

[ ] Specificity-maximising

[x] Balance of sensitivity and specificity / precision

[ ] Other

To develop a more specific strategy than previously reported without losing sensitivity.

A.3 Database(s) and search interface(s).


MEDLINE

A.4 Describe the methodological focus of the filter (e.g. RCTs).


Diagnostic test evaluations

A.5 Describe any other topic that forms an additional focus of the filter (e.g. clinical topics such as breast cancer, geographic location such as Asia or population grouping such as paediatrics).


None

A.6 Other observations.



B. Identification of a gold standard (GS) of known relevant records


B.1 Did the authors identify one or more gold standards (GSs)?


One reference set (RS) and one control set (CS).

B.2 How did the authors identify the records in each GS?


RS: Title/abstract screening of 9 family medicine journals published 1992-5. The journals selected were considered amongst the most relevant by eight Departments of Family Medicine in the Netherlands. The full publications of records indicating diagnostic content were read to assess relevance.

CS: False-positive papers retrieved by an adapted Haynes 1994 sensitive filter (see F.4) applied to MEDLINE records for the same 9 journals published 1992-5.

B.3 Report the dates of the records in each GS.


RS and CS: 1992-1995.

B.4 What are the inclusion criteria for each GS?


RS: Primary publications in which at least one diagnostic test was compared with a reference standard.

CS: False-positive records (i.e. non-RS records retrieved by the Haynes filter), limited to primary papers and excluding case reports and animal research.

B.5 Describe the size of each GS and the authors' justification, if provided (for example the size of the gold standard may have been determined by a power calculation).


RS: 75 records.

CS: 137 records.

B.6 Are there limitations to the gold standard(s)?

Yes

These are small sets of records. They are derived from family medicine journals, which may not be representative of the diagnostic test evaluation literature.

B.7 How was each gold standard used?

[x] to identify potential search terms

[x] to derive potential strategies (groups of terms)

[x] to test internal validity

[ ] to test external validity

[ ] other, please specify


B.8 Other observations.



C. How did the researchers identify the search terms in their filter(s) (select all that apply)?


C.1 Adapted a published search strategy.

No


C.2 Asked experts for suggestions of relevant terms.

No


C.3 Used a database thesaurus.

No


C.4 Statistical analysis of terms in a gold standard set of records (see B above).

Yes

Univariate analysis was used to calculate sensitivity (proportion of RS identified), specificity (proportion of CS correctly classified) and the diagnostic odds ratio (positive likelihood ratio divided by negative likelihood ratio) for each identified term.
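As a sketch of these three measures (the retrieval counts below are illustrative assumptions, not figures from the paper, which reports only percentages), each term or strategy can be scored against the RS and CS as follows:

```python
def filter_performance(tp, fn, tn, fp):
    """Performance of a search filter tested against a reference set (RS)
    and a control set (CS).
    tp / fn: RS records retrieved / missed by the filter.
    tn / fp: CS records correctly excluded / wrongly retrieved."""
    sensitivity = tp / (tp + fn)               # proportion of RS identified
    specificity = tn / (tn + fp)               # proportion of CS correctly classified
    lr_pos = sensitivity / (1 - specificity)   # positive likelihood ratio
    lr_neg = (1 - sensitivity) / specificity   # negative likelihood ratio
    dor = lr_pos / lr_neg                      # diagnostic odds ratio
    return sensitivity, specificity, dor

# Hypothetical counts against a 75-record RS and 137-record CS:
sens, spec, dor = filter_performance(tp=67, fn=8, tn=126, fp=11)
print(f"sensitivity={sens:.1%} specificity={spec:.1%} DOR={dor:.0f}")
```

With these assumed counts the sketch yields sensitivity and specificity close to those reported for Strategy 4, illustrating how a DOR in the mid-90s arises from that combination.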

C.5 Extracted terms from the gold standard set of records (see B above).

Yes

MeSH terms and text words of the MEDLINE records in the reference set were examined for terms relating to diagnosis or test evaluation.

C.6 Extracted terms from some relevant records (but not a gold standard).

No


C.7 Tick all types of search terms tested.

[x] subject headings

[x] text words (e.g. in title, abstract)

[ ] publication types

[ ] subheadings

[ ] check tags

[ ] other, please specify


C.8 Include the citation of any adapted strategies.



C.9 How were the (final) combination(s) of search terms selected?


Forward stepwise logistic regression analysis was used to identify the combination of terms that best discriminated between the RS and CS.
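A minimal sketch of the forward-selection idea, using hypothetical records, terms and a plain accuracy score (the authors instead refit a logistic regression model at each step, but the greedy add-the-best-term loop is the same):

```python
def accuracy(records, labels, terms):
    """Fraction of records classified correctly when a record counts as
    'retrieved' if it contains any of the selected terms (Boolean OR)."""
    hits = [any(t in r for t in terms) for r in records]
    return sum(h == l for h, l in zip(hits, labels)) / len(records)

def forward_select(records, labels, candidate_terms, score=accuracy):
    """Greedy forward stepwise selection: repeatedly add the candidate
    term that most improves the score; stop when no term improves it."""
    selected, best = [], score(records, labels, [])
    while True:
        best_term = None
        for term in candidate_terms:
            if term in selected:
                continue
            s = score(records, labels, selected + [term])
            if s > best:
                best, best_term = s, term
        if best_term is None:
            return selected
        selected.append(best_term)

# Hypothetical toy data: True = relevant (RS-like), False = control (CS-like).
records = ["sensitivity of the test", "specificity analysis",
           "treatment trial", "drug therapy"]
labels = [True, True, False, False]
print(forward_select(records, labels, ["sensitivity", "specificity", "therapy"]))
# → ['sensitivity', 'specificity']
```

The loop stops as soon as no remaining term raises the score, which is the stepwise stopping rule in its simplest form; a logistic-regression version would use a model fit statistic (e.g. likelihood) in place of `accuracy`.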

C.10 Were the search terms combined (using Boolean logic) in a way that is likely to retrieve the studies of interest?

Yes


C.11 Other observations.



D. Internal validity testing (This type of testing is possible when the search filter terms were developed from a known gold standard set of records).

D.1 How many filters were tested for internal validity?

4


D.2 Was the performance of the search filter tested on the gold standard from which it was derived?

Yes

All filters were tested on RS.

D.3 Report sensitivity data (a single value, a range, ‘Unclear’* or ‘not reported’, as appropriate). *Please describe.


1. 70.7%

2. 73.3%

3. 80.0%

4. 89.3%

D.4 Report precision data (a single value, a range, ‘Unclear’* or ‘not reported’ as appropriate). *Please describe.


3.48% (referred to as positive predictive value)

D.5 Report specificity data (a single value, a range, ‘Unclear’* or ‘not reported’ as appropriate). *Please describe.


1. 98.5%

2. 98.4%

3. 97.3%

4. 91.9%

D.6 Other performance measures reported.


Diagnostic odds ratio (DOR):

1. 158

2. 170

3. 143

4. 95

D.7 Other observations.


The Haynes 1994 sensitive filter was also tested on the RS for comparison (an external validity test for that filter). Strategy 3 was considered the most accurate, performing better than the Haynes filter.

E. External validity testing (This section relates to testing the search filter on records that are different from the records used to identify the search terms).

E.1 How many filters were tested for external validity on records different from those used to identify the search terms?

1

Strategy 4, which was the most sensitive.

E.2 Describe the validation set(s) of records, including the interface.


33 papers on physical diagnostic tests for meniscal lesions. The filter was applied across all of MEDLINE. No further details are provided.

For each filter report the following information.

E.3 On which validation set(s) was the filter tested?


See E2.

E.4 Report sensitivity data for each validation set (a single value, a range or ‘Unclear’ or ‘not reported’, as appropriate).


61%

E.5 Report precision data for each validation set (report a single value, a range or ‘Unclear’ or ‘not reported’, as appropriate).


4.7% (predictive value)

E.6 Report specificity data for each validation set (a single value, a range or ‘Unclear’ or ‘not reported’, as appropriate).


Not reported.

E.7 Other performance measures reported.


None

E.8 Other observations


The Haynes sensitive filter was also tested. The performance (sensitivity and predictive value) of Strategy 4 was better than that of the Haynes sensitive filter.

F. Limitations and Comparisons



F.1 Did the authors discuss any limitations to their research?


External validation results showed low sensitivity; the authors noted that further validation in different clinical areas should be undertaken.

F.2 Are there other potential limitations to this research that you have noticed?



F.3 Report any comparisons of the performance of the filter against other relevant published filters (sensitivity, precision, specificity or other measures).


Haynes 1994 sensitive strategy:

RS: Sensitivity 73.7%; Specificity 94.3%; DOR 45.

External validation set: Sensitivity 45%; Predictive value 3.4%.

F.4 Include the citations of any compared filters.


Haynes RB et al. Developing optimal search strategies for detecting clinically sound studies in MEDLINE. JAMIA 1994;1:447-58.

F.5 Other observations and / or comments.



G. Other comments. This section can be used to provide any other comments. Selected prompts for issues to bear in mind are given below.

G.1 Have you noticed any errors in the document that might impact on the usability of the filter?

No


G.2 Are there any published errata or comments (for example in the MEDLINE record)?

No


G.3 Is there public access to pre-publication history and / or correspondence?

No


G.4 Are further data available on a linked site or from the authors?

No


G.5 Include references to related papers and/or other relevant material.

None


G.6 Other comments.