Bachmann JMLA filter: appraisal

This appraisal is for Bachmann LM, Estermann P, Kronenberg C, ter Riet G. Identifying diagnostic accuracy studies in EMBASE. Journal of the Medical Library Association 2003;91(3):341-6.

This appraisal was prepared by Julie Glanville

Information and Methodological Issues

Categorisation Issues

Detailed information, as appropriate

A. Information

A.1 State the author's objective


To develop and test search strategies to identify diagnostic articles recorded on EMBASE.

A.2 State the focus of the search

[ ] Sensitivity-maximising

[ ] Precision-maximising

[ ] Specificity-maximising

[x] Balance of sensitivity and specificity / precision

[ ] Other


A.3 Database(s) and search interface(s).


EMBASE. Datastar was used for identifying the gold standard, but it is unclear which interface was used for the testing.

A.4 Describe the methodological focus of the filter (e.g. RCTs).


Diagnostic studies

A.5 Describe any other topic that forms an additional focus of the filter (e.g. clinical topics such as breast cancer, geographic location such as Asia or population grouping such as paediatrics).


None

A.6 Other observations



B. Identification of a gold standard (GS) of known relevant records


B.1 Did the authors identify one or more gold standards (GSs)?

One

GS for construction of filter.

B.2 How did the authors identify the records in each GS? wn relevant records


Four general medical journals for the year 1999 were handsearched by one researcher. A second researcher independently duplicated the handsearch in a random 10% of issues.

B.3 Report the dates of the records in each GS.


1999

B.4 What are the inclusion criteria for each GS?


Diagnostic accuracy studies where “at least one test was compared with a reference standard.” Tests were defined as procedures used to change the estimate of the likelihood of disease presence.

B.5 Describe the size of each GS and the authors’ justification, if provided (for example the size of the gold standard may have been determined by a power calculation).


61 records.

B.6 Are there limitations to the gold standard(s)?

Yes

The GS set is small (61 records).

B.7 How was each gold standard used?

[x] to identify potential search terms

[ ] to derive potential strategies (groups of terms)

[x] to test internal validity

[ ] to test external validity

[ ] other, please specify


B.8 Other observations.



C. How did the researchers identify the search terms in their filter(s) (select all that apply)?


C.1 Adapted a published search strategy.

No


C.2 Asked experts for suggestions of relevant terms.

No


C.3 Used a database thesaurus.

No


C.4 Statistical analysis of terms in a gold standard set of records (see B above).

Yes

Word frequency analysis was performed using Idealist to obtain the frequencies of all words in the records.

Terms were examined by two researchers and were excluded if not semantically associated with diagnosis. Truncated terms were used if there were words with a common stem, e.g. “diagnos*”.
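
As a rough illustration of this word frequency step, the following minimal Python sketch counts word frequencies across a set of records. It is not the authors' method (they used Idealist), and the records variable is a hypothetical placeholder:

    # Minimal sketch of a word frequency analysis over GS records.
    # Hypothetical stand-in for the Idealist analysis used in the paper.
    from collections import Counter
    import re

    records: list[str] = []  # placeholder: title/abstract text of the 61 GS records

    counts = Counter()
    for record in records:
        counts.update(re.findall(r"[a-z]+", record.lower()))

    # The most frequent words would then be screened manually (as two
    # researchers did in the paper) for semantic association with diagnosis.
    for word, freq in counts.most_common(30):
        print(word, freq)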

C.5 Extracted terms from the gold standard set of records (see B above).

No


C.6 Extracted terms from some relevant records (but not a gold standard).

No


C.7 Tick all types of search terms tested.

[x] subject headings

[x] text words (e.g. in title, abstract)

[ ] publication types

[ ] subheadings

[ ] check tags

[ ] other, please specify


C.8 Include the citation of any adapted strategies.



C.9 How were the (final) combination(s) of search terms selected?


Sensitivity and precision in retrieving GS records were calculated for the 23 most frequent text words, and the product sensitivity × precision was then calculated for each term. The ten terms with the highest sensitivity × precision scores were combined with OR into a series of strategies whose retrieval performance was tested.

Only one EMTREE term (Diagnostic Accuracy) was identified for consideration, but it was rejected before this selection process because of its low precision.
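
A hedged sketch of the selection arithmetic described above: the ranking rule (sensitivity × precision, top ten terms OR-ed together) is from the paper, but the term names (other than “diagnos*”) and all retrieval counts below are invented placeholders:

    # Sketch of ranking candidate terms by sensitivity x precision.
    # Counts are hypothetical; only the ranking rule follows the paper.
    GS_SIZE = 61  # size of the gold standard set

    # term -> (GS records retrieved, total records retrieved)
    terms = {
        "diagnos*": (56, 610),      # term from the paper; counts invented
        "sensitivity": (30, 150),   # hypothetical
        "specificity": (25, 120),   # hypothetical
    }

    def score(gs_hits: int, total_hits: int) -> float:
        sensitivity = gs_hits / GS_SIZE
        precision = gs_hits / total_hits
        return sensitivity * precision

    ranked = sorted(terms, key=lambda t: score(*terms[t]), reverse=True)
    # The ten highest-scoring terms were combined with OR in the paper.
    print(" OR ".join(ranked[:10]))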

C.10 Were the search terms combined (using Boolean logic) in a way that is likely to retrieve the studies of interest?

Yes


C.11 Other observations.



D. Internal validity testing (This type of testing is possible when the search filter terms were developed from a known gold standard set of records).

D.1 How many filters were tested for internal validity?

Eight

Eight search combinations were tested and three were recommended for use.

D.2 Was the performance of the search filter tested on the gold standard from which it was derived?

Yes


D.3 Report sensitivity data (a single value, a range, ‘Unclear’* or ‘not reported’, as appropriate). *Please describe.


1. 91.8%

2. 100%

3. 73.8%

D.4 Report precision data (a single value, a range, ‘Unclear’* or ‘not reported’ as appropriate). *Please describe.


1. 9.2%

2. 3.7%

3. 17.6%

D.6 Other performance measures reported.


Number needed to read (NNR) (1/precision):

1. 10.9

2. 27

3. 5.7
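
These NNR figures follow directly from the precision values reported in D.4, since NNR = 1/precision; a quick arithmetic check:

    # NNR = 1 / precision, using the precision values reported in D.4
    for precision in (0.092, 0.037, 0.176):
        print(round(1 / precision, 1))  # prints 10.9, 27.0, 5.7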

D.7 Other observations.



E. External validity testing (This section relates to testing the search filter on records that are different from the records used to identify the search terms).

E.1 How many filters were tested for external validity on records different from those used to identify the search terms?



E.2 Describe the validation set(s) of records, including the interface.



For each filter report the following information.

E.3 On which validation set(s) was the filter tested?



E.4 Report sensitivity data for each validation set (a single value, a range or ‘Unclear’ or ‘not reported’, as appropriate).



E.5 Report precision data for each validation set (report a single value, a range or ‘Unclear’ or ‘not reported’, as appropriate).



E.6 Report specificity data for each validation set (a single value, a range or ‘Unclear’ or ‘not reported’, as appropriate).



E.7 Other performance measures reported.



E.8 Other observations



F. Limitations and Comparisons



F.1 Did the authors discuss any limitations to their research?


The authors acknowledged that their choice of journals to handsearch and the restriction to a single year of publication might limit the generalisability of the search filter.

The authors noted that their strategies had not been validated against other gold standard(s).

Phrase analysis was not possible, although it might have identified additional useful terms for the filter.

F.2 Are there other potential limitations to this research that you have noticed?



F.3 Report any comparisons of the performance of the filter against other relevant published filters (sensitivity, precision, specificity or other measures).



F.4 Include the citations of any compared filters.



F.5 Other observations and / or comments.


There was limited exploration of suitable EMTREE terms.

G. Other comments. This section can be used to provide any other comments. Selected prompts for issues to bear in mind are given below.

G.1 Have you noticed any errors in the document that might impact on the usability of the filter?

No


G.2 Are there any published errata or comments (for example in the MEDLINE record)?

No


G.3 Is there public access to pre-publication history and / or correspondence?

No


G.4 Are further data available on a linked site or from the authors?

No


G.5 Include references to related papers and/or other relevant material.

None


G.6. Other comments


The authors provide specific and comprehensive filters for the Datastar, Ovid and SilverPlatter interfaces to EMBASE. The searches differ slightly between interfaces, but the differences are not explained.