Appraisal of: Wilczynski NL, Marks S, Haynes RB. Search Strategies for Identifying Qualitative Studies in CINAHL. Qualitative Health Research. 2007;17(5):705-710
Reviewer(s):
Andrew Booth
Full Reference:
Wilczynski NL, Marks S, Haynes RB. Search Strategies for Identifying Qualitative Studies in CINAHL. Qualitative Health Research. 2007;17(5):705-710
Short description:
This study developed and tested optimal search strategies to retrieve qualitative studies from CINAHL for the 2000 publishing year. The research team hand-searched 75 journals indexed in CINAHL, identifying 277 qualitative studies from 8,493 total articles. Qualitative studies were defined by three criteria: content related to how people experience certain situations, appropriate data collection methods for qualitative data, and appropriate analytical methods for qualitative data.
The authors tested 5,020 unique search terms (both indexing terms and text words) and 17,921 search strategies combining multiple terms using Boolean OR logic. Search strategies were evaluated as diagnostic tests against the gold standard of hand-searching, with performance measured by sensitivity, specificity, precision, and accuracy.
The study presents three types of optimal search strategies: those maximizing sensitivity (98.9% sensitivity, 54.0% specificity), those maximizing specificity (53.1% sensitivity, 99.5% specificity), and those balancing the two (94.2% for both sensitivity and specificity). The results demonstrate that combining indexing terms and text words can achieve high performance for retrieving qualitative studies from CINAHL, with CINAHL-specific subject headings (such as "audiorecording," "qualitative studies," "grounded theory," and "thematic analysis") performing particularly well compared to MEDLINE equivalents.
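The evaluation approach described above (terms combined with Boolean OR, then scored as a diagnostic test against the hand-search gold standard) can be sketched as follows. This is a minimal illustration, not the authors' software: the function name `evaluate_strategy`, the `FilterPerformance` class, and the toy record IDs are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FilterPerformance:
    sensitivity: float
    specificity: float
    precision: float
    accuracy: float


def _ratio(numerator, denominator):
    # Return NaN rather than raising when a denominator is empty.
    return numerator / denominator if denominator else float("nan")


def evaluate_strategy(term_hits, gold_standard, all_records):
    """Score an OR-combined strategy against a hand-search gold standard.

    term_hits     -- list of sets, one per search term, of retrieved record IDs
    gold_standard -- set of record IDs judged qualitative by hand-searching
    all_records   -- set of every record ID in the hand-searched collection
    """
    # Boolean OR: a record is retrieved if any single term retrieves it.
    retrieved = set().union(*term_hits) & all_records
    relevant = gold_standard & all_records

    tp = len(retrieved & relevant)   # qualitative and retrieved
    fp = len(retrieved - relevant)   # not qualitative but retrieved
    fn = len(relevant - retrieved)   # qualitative but missed
    tn = len(all_records) - tp - fp - fn

    return FilterPerformance(
        sensitivity=_ratio(tp, tp + fn),
        specificity=_ratio(tn, tn + fp),
        precision=_ratio(tp, tp + fp),
        accuracy=_ratio(tp + tn, len(all_records)),
    )


if __name__ == "__main__":
    # Toy data only (hypothetical record IDs, not figures from the paper).
    records = set(range(1, 11))
    qualitative = {1, 2, 3, 4}
    hits_term_a = {1, 2, 5}
    hits_term_b = {2, 3, 6}
    print(evaluate_strategy([hits_term_a, hits_term_b], qualitative, records))
```

In the toy data, adding a second term with OR raises sensitivity but also admits an extra false positive, which is the trade-off behind the three strategy types reported.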
Limitations stated by the author(s):
The authors do not explicitly state limitations in a dedicated section. However, they acknowledge that precision figures reported are based on a subset of CINAHL records (75 journals only) and that when searching the entire CINAHL database, precision will likely be lower due to the lower concentration of qualitative studies in the complete database. They also note ongoing research to determine the effects on precision when combining these search strategies with content terms and when used in journal subsets.
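The dependence of precision on the concentration of qualitative studies follows from the standard relationship between precision, sensitivity, specificity, and prevalence. The sketch below assumes that sensitivity and specificity stay constant when the filter is applied to a larger database, which the paper does not establish. The function name `expected_precision` and the two lower prevalence values are hypothetical; only the 277 of 8,493 figure and the 94.2% values come from the study.

```python
def expected_precision(sensitivity, specificity, prevalence):
    """Precision implied by sensitivity, specificity, and prevalence."""
    true_positive_rate = sensitivity * prevalence
    false_positive_rate = (1.0 - specificity) * (1.0 - prevalence)
    total_retrieved = true_positive_rate + false_positive_rate
    if total_retrieved == 0:
        return float("nan")  # nothing retrieved, so precision is undefined
    return true_positive_rate / total_retrieved


if __name__ == "__main__":
    # Prevalence in the hand-searched subset reported by the study.
    subset_prevalence = 277 / 8493
    # The two lower prevalences are hypothetical, for illustration only.
    for prevalence in (subset_prevalence, 0.01, 0.005):
        precision = expected_precision(0.942, 0.942, prevalence)
        print(f"prevalence={prevalence:.4f} -> precision={precision:.3f}")
```

Under this assumption, precision falls as prevalence falls even though sensitivity and specificity are unchanged, which is the effect the authors anticipate for the complete database.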
Limitations stated by the reviewer(s):
1. Limited temporal scope and potential obsolescence: The study analyzed only journals from the publishing year 2000, so the underlying data are now more than 20 years old. Database indexing practices, controlled vocabularies, and the volume and characteristics of qualitative research have evolved substantially since then. The applicability of these search strategies to contemporary CINAHL searching is uncertain. [External Validity; Temporal Validity]
2. Restricted journal sample: Only 75 of the 170 hand-searched journals were indexed in CINAHL, representing a selective subset chosen based on clinical relevance and impact factors for internal medicine, general practice, mental health, and general nursing. This sampling approach may not represent the full diversity of qualitative research indexed in CINAHL, particularly studies from allied health professions, specialized nursing areas, or international journals with different indexing practices. [Selection Bias; Limited Generalizability]
3. Narrow and potentially problematic definition of qualitative research: The criteria used to classify qualitative studies (content related to experiences, appropriate data collection, appropriate analytical methods) are stated only in general terms and may nonetheless exclude legitimate qualitative studies. The paper does not specify how "appropriate" methods were determined or how mixed-methods studies were handled. More rigorous application of established qualitative research frameworks (e.g., distinguishing ethnography, phenomenology, grounded theory) might have improved study classification validity. [Classification Bias; Construct Validity]
4. No reported inter-rater reliability testing for the qualitative classification: The larger project reported 89% agreement for methodological criteria (kappa 0.78-0.99), but this analysis applied qualitative-specific criteria. The paper does not report whether inter-rater reliability was assessed for classifying studies as qualitative versus non-qualitative using the three stated criteria, which raises questions about the consistency and reproducibility of the gold standard. [Measurement Error; Reliability Issues]
5. Absence of external validation: The search strategies were developed and tested on the same dataset without external validation in a separate sample of journals or a different time period. This risks overfitting the strategies to the specific characteristics of the development set and may overestimate their performance in real-world applications. [Internal Validity; Optimism Bias]
6. Limited consideration of search strategy complexity and usability: While the study reports multiple search strategies with different performance characteristics, there is minimal discussion of the practical implications of strategy complexity (number of terms, ease of implementation) or guidance on when to select each approach. The trade-offs between sensitivity, specificity, and practical feasibility for different review contexts are not fully explored. [Applicability; Implementation Considerations]
7. Potential bias in term selection and testing: The initial compilation of 5,020 search terms was based on input from clinicians and librarians in the United States and Canada, which may not represent international perspectives or capture emerging qualitative research terminology. Additionally, the automated testing procedure, while comprehensive, could miss potentially effective term combinations if individual component terms had sensitivity or specificity below 10%. [Selection Bias; Search Bias]
8. Incomplete reporting of methodology: Several methodological details are missing or unclear, including the specific operational definitions used to assess whether data collection and analytical methods were "appropriate for qualitative data," how disagreements in hand-searching were resolved, and whether any strategies were employed to ensure consistent application of inclusion criteria across the six research assistants over time. [Incomplete Reporting; Reproducibility Concerns]
9. Statistical considerations: The study does not report confidence intervals for precision estimates or discuss the statistical power of comparisons between different search strategies. With 277 qualitative studies as the target set, the stability of performance estimates for strategies retrieving small numbers of studies may be questionable. [Statistical Validity; Precision of Estimates]
10. Database-specific limitations: The strategies are specific to CINAHL's indexing structure and the Ovid search interface as it existed in 2000. Changes to database indexing, the introduction of new subject headings, modifications to explosion hierarchies, and differences across search platforms (Ovid vs. EBSCO) may affect current performance. [Temporal Validity; Platform Dependency]
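To illustrate the statistical point in item 9, a confidence interval for a reported proportion can be computed with the Wilson score method. This is a sketch of one common interval, not a calculation from the paper. The function name `wilson_interval` is hypothetical, and the counts in the usage example are illustrative values chosen to be consistent with the reported 98.9% sensitivity on 277 studies; they are not figures quoted by the authors.

```python
import math


def wilson_interval(successes, total, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 for ~95%)."""
    if total <= 0:
        raise ValueError("total must be positive")
    if not 0 <= successes <= total:
        raise ValueError("successes must lie between 0 and total")

    proportion = successes / total
    denominator = 1.0 + z * z / total
    centre = (proportion + z * z / (2.0 * total)) / denominator
    half_width = (
        z
        * math.sqrt(
            proportion * (1.0 - proportion) / total
            + z * z / (4.0 * total * total)
        )
        / denominator
    )
    # Clamp to [0, 1] to guard against floating-point overshoot.
    return max(0.0, centre - half_width), min(1.0, centre + half_width)


if __name__ == "__main__":
    # Illustrative counts only: 274 of 277 rounds to 98.9%.
    lower, upper = wilson_interval(274, 277)
    print(f"sensitivity interval: {lower:.3f} to {upper:.3f}")
```

The same function applies to precision, where the denominator is the number of records retrieved; strategies that retrieve few records would yield correspondingly wider intervals.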
Study Type:
Diagnostic test accuracy study (search filter development and validation study)
Related Chapters:
Tags:
• Search filters
• Qualitative research
• CINAHL database
• Search strategies
• Sensitivity and specificity
• Information retrieval
• Hand searching
• Methodological research
• Database indexing
• Evidence synthesis
• B. Designing strategies - general
• Validation study