van der Weijden: appraisal

This appraisal is for van der Weijden T, Ijzermans CJ, Dinant GJ, van Duijn NP, de Vet R, Buntinx F. Identifying relevant diagnostic studies in MEDLINE. The diagnostic value of the erythrocyte sedimentation rate (ESR) and dipstick as an example. Family Practice 1997;14(3):204-8.

This appraisal was prepared by Cynthia Fraser.

Information and Methodological Issues

Categorisation Issues

Detailed information, as appropriate

A. Information

A.1 State the author's objective


To examine the sensitivity and positive predictive value of MEDLINE searching for diagnostic studies of two tests relevant to primary care: the erythrocyte sedimentation rate (ESR) and the dipstick.

A.2 State the focus of the search

[ ] Sensitivity-maximising

[ ] Precision-maximising

[ ] Specificity-maximising

[ ] Balance of sensitivity and specificity / precision

[x] Other

No specific focus

A.3 Database(s) and search interface(s).


MEDLINE: SilverPlatter CD-ROM (ESR); Ovid CD-ROM (Dipstick)

A.4 Describe the methodological focus of the filter (e.g. RCTs).


Diagnostic test studies

A.5 Describe any other topic that forms an additional focus of the filter (e.g. clinical topics such as breast cancer, geographic location such as Asia or population grouping such as paediatrics).


Two tests commonly used in primary care: ESR and dipstick urinary analysis.

A.6 Other observations



B. Identification of a gold standard (GS) of known relevant records


B.1 Did the authors identify one or more gold standards (GSs)?

Two

One for ESR; one for dipstick test. They are described as “reference standards”.

B.2 How did the authors identify the records in each GS?


ESR: Journal articles from the personal reference database of an ‘expert’ in the field were critically appraised for inclusion.

Dipstick: MEDLINE and EMBASE were searched, supplemented by citation searching and a literature search from an ‘experienced member of the Cochrane Collaboration’. Combined search results were screened and assessed for inclusion.

B.3 Report the dates of the records in each GS.


ESR: 1985-1994
Dipstick: 1990-1995

B.4 What are the inclusion criteria for each GS?


ESR: primary care setting assessing diagnostic value of ESR.

Dipstick: assessing urinary tract infection and reported empirical data.
For both, articles were restricted to those indexed in Index Medicus.

B.5 Describe the size of each GS and the authors’ justification, if provided (for example the size of the gold standard may have been determined by a power calculation).


ESR: 221 records
Dipstick: 60 records

B.6 Are there limitations to the gold standard(s)?

Yes

Reporting is limited, so it is unclear how systematic and comprehensive the methods used to derive the gold standards were.
The Dipstick set was very small.

B.7 How was each gold standard used?

[ ] to identify potential search terms

[ ] to derive potential strategies (groups of terms)

[ ] to test internal validity

[x] to test external validity

[ ] other, please specify


B.8 Other observations.



C. How did the researchers identify the search terms in their filter(s) (select all that apply)?


C.1 Adapted a published search strategy.

No


C.2 Asked experts for suggestions of relevant terms.

No


C.3 Used a database thesaurus.

Unclear


C.4 Statistical analysis of terms in a gold standard set of records (see B above).

No


C.5 Extracted terms from the gold standard set of records (see B above).

Unclear


C.6 Extracted terms from some relevant records (but not a gold standard).

Unclear


C.7 Tick all types of search terms tested.

[x] subject headings

[x] text words (e.g. in title, abstract)

[ ] publication types

[x] subheadings

[x] check tags

[ ] other, please specify


C.8 Include the citation of any adapted strategies.



C.9 How were the (final) combination(s) of search terms selected?


There was no systematic selection: several variants were explored ‘until a reasonable and manageable number of citations was found’.

ESR

1. MeSH short version: MeSH diagnostic methodology terms only, including the MeSH term ‘Diagnosis’ exploded with the diagnosis subheading (exp Diagnosis/di)

2. MeSH extended: MeSH terms only, including the MeSH term ‘Diagnosis’ exploded with all subheadings (exp Diagnosis/)

3. MeSH extended plus free text: MeSH terms plus free-text terms.

4. These terms were combined with diagnostic test terms relating to ESR using the Boolean operator AND, and limited to humans.

Dipstick

1. MeSH terms only relating to test and condition

2. MeSH plus free text terms relating to test and condition

Neither combination included diagnostic methodology terms. Test and condition terms were combined using Boolean operator AND, and limited to humans.

ESR and Dipstick strategies were tested with and without the addition of Primary Care terms.
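The combination logic described above (term groups joined with Boolean AND, plus a humans limit) can be modelled as set intersection over the sets of records each group retrieves. A minimal sketch with made-up record IDs; the variable names and IDs are illustrative, not the paper's actual retrieval sets:

```python
# Each term group retrieves a set of MEDLINE record IDs (made-up here).
diagnosis_terms = {1, 2, 3, 4, 5}      # e.g. exp Diagnosis/di
esr_test_terms = {3, 4, 5, 6, 7}       # ESR-specific test terms
humans_limit = {1, 2, 3, 4, 6, 7}      # records indexed as human studies

# Boolean AND across the groups corresponds to intersecting the sets.
final_retrieval = diagnosis_terms & esr_test_terms & humans_limit
print(sorted(final_retrieval))  # [3, 4]
```

Adding a further group (such as the Primary Care terms) intersects another set, which can only shrink the retrieval, illustrating why sensitivity drops as groups are added.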

C.10 Were the search terms combined (using Boolean logic) in a way that is likely to retrieve the studies of interest?

Yes


C.11 Other observations.



D. Internal validity testing (This type of testing is possible when the search filter terms were developed from a known gold standard set of records).

D.1 How many filters were tested for internal validity?

None


D.2 Was the performance of the search filter tested on the gold standard from which it was derived?



D.3 Report sensitivity data (a single value, a range, ‘Unclear’* or ‘not reported’, as appropriate). *Please describe.



D.4 Report precision data (a single value, a range, ‘Unclear’* or ‘not reported’ as appropriate). *Please describe.



D.5 Report specificity data (a single value, a range, ‘Unclear’* or ‘not reported’ as appropriate). *Please describe.



D.6 Other performance measures reported.



D.7 Other observations.



E. External validity testing (This section relates to testing the search filter on records that are different from the records used to identify the search terms).

E.1 How many filters were tested for external validity on records different from those used to identify the search terms?


ESR: 3 filters

Dipstick: 2 filters

E.2 Describe the validation set(s) of records, including the interface.


See section B: Gold standard

For each filter report the following information.

E.3 On which validation set(s) was the filter tested?


Gold standards

E.4 Report sensitivity data for each validation set (a single value, a range or ‘Unclear’ or ‘not reported’, as appropriate).


ESR:

MeSH short version: 31.0%
MeSH extended: 69.0%
MeSH extended plus free text: 91.0%

Dipstick:

MeSH: 68.0%
MeSH plus free text: 98.0%

E.5 Report precision data for each validation set (report a single value, a range or ‘Unclear’ or ‘not reported’, as appropriate).

Reported as Positive Predictive Value

ESR:

MeSH short version: 34.0%
MeSH extended: 11.0%
MeSH extended plus free text: 10.0%

Dipstick:

MeSH: 72.0%
MeSH plus free text: 67.0%
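The sensitivity and positive predictive value figures above follow the standard definitions: sensitivity is the proportion of gold-standard records the filter retrieves, and PPV (precision) is the proportion of retrieved records that are in the gold standard. A minimal sketch using hypothetical retrieval counts (the total number of hits per strategy is not restated here, so the numbers below are illustrative only):

```python
def sensitivity(relevant_retrieved: int, gold_standard_size: int) -> float:
    """Proportion of the gold-standard records that the filter retrieves."""
    return relevant_retrieved / gold_standard_size

def positive_predictive_value(relevant_retrieved: int, total_retrieved: int) -> float:
    """Proportion of retrieved records that are in the gold standard (precision)."""
    return relevant_retrieved / total_retrieved

# Hypothetical example: a filter retrieving 201 of the 221 ESR
# gold-standard records among 2010 total hits.
print(round(sensitivity(201, 221), 2))                 # 0.91
print(round(positive_predictive_value(201, 2010), 2))  # 0.1
```

The example makes the trade-off visible: a broad strategy can score high on sensitivity while its PPV collapses, which matches the pattern in the ESR results above.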

E.6 Report specificity data for each validation set (a single value, a range or ‘Unclear’ or ‘not reported’, as appropriate).

N/A


E.7 Other performance measures reported.


Inclusion of Primary Care terms increased positive predictive value but lowered sensitivity to 10% (ESR) and 7% (Dipstick).

E.8 Other observations


The lower positive predictive value for the ESR search may be due to the use of diagnostic methodology terms instead of disease/condition terms.

F. Limitations and Comparisons



F.1 Did the authors discuss any limitations to their research?


The gold standards were “not perfect”, but the authors believed it unlikely that important key studies were excluded.

F.2 Are there other potential limitations to this research that you have noticed?


Different test-specific search terms were combined with diagnostic methodology terms in one strategy (ESR) and with disease terms in the other (Dipstick), so comparison between the two strategies is problematic.

F.3 Report any comparisons of the performance of the filter against other relevant published filters (sensitivity, precision, specificity or other measures).



F.4 Include the citations of any compared filters.



F.5 Other observations and / or comments.



G. Other comments. This section can be used to provide any other comments. Selected prompts for issues to bear in mind are given below.

G.1 Have you noticed any errors in the document that might impact on the usability of the filter?

No


G.2 Are there any published errata or comments (for example in the MEDLINE record)?

No


G.3 Is there public access to pre-publication history and / or correspondence?

No


G.4 Are further data available on a linked site or from the authors?

No


G.5 Include references to related papers and/or other relevant material.



G.6 Other comments