What's new in this update
One new study relevant to search strategy development for systematic reviews or HTAs has been added. The study compares the recall and precision of MeSH terms (or similar controlled vocabulary) versus text-word searching (37). Minor revisions have also been made to reflect the latest update of the Cochrane Handbook (version 6.3, February 2022).
The Cochrane Information Retrieval Methods Group have published an evidence-based chapter on search methods for the Cochrane Handbook (1) and accompanying technical supplement (29), which provides the basis for this summary alongside guidance produced by the Centre for Reviews and Dissemination (2), the Agency for Healthcare Research and Quality (AHRQ) (3) and EUnetHTA (32).
Sensitivity and precision
Searches for systematic reviews (SRs) aim to be as extensive as possible to ensure that as many of the relevant studies as possible are identified. However, the Cochrane Handbook states that it is still necessary to "strike a balance between striving for comprehensiveness and maintaining relevance when developing a search strategy" (1). Increasing the sensitivity of a search increases the possibility of identifying all relevant studies, but also tends to reduce precision because the number of irrelevant results is increased (1, 2). Sampson et al examined a cross-section of 94 health-related SRs that reported the flow of bibliographic records through the review process and found that search precision of approximately 3% was typical (4). The number of results retrieved, all of which must be screened against eligibility criteria, has implications for the resources required to conduct a SR. This trade-off between sensitivity and precision should be acknowledged and discussed with the wider review team, and an appropriate balance sought within the context of the resources available.
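The trade-off described above can be made concrete with a small calculation. The following is a minimal sketch, not taken from any of the cited sources: the function names and the figures for the two strategies are hypothetical, chosen only to illustrate how sensitivity, precision and the number of records screened per relevant record relate to each other. At a precision of about 3%, roughly 33 records are screened for each relevant one found.

```python
def sensitivity(relevant_retrieved: int, total_relevant: int) -> float:
    """Proportion of all relevant records that the search retrieved."""
    return relevant_retrieved / total_relevant


def precision(relevant_retrieved: int, total_retrieved: int) -> float:
    """Proportion of retrieved records that are relevant."""
    return relevant_retrieved / total_retrieved


def number_needed_to_read(prec: float) -> float:
    """Records screened, on average, per relevant record found (1 / precision)."""
    return 1 / prec


# Hypothetical figures for illustration only: 50 relevant studies are assumed to exist.
TOTAL_RELEVANT = 50
strategies = {
    "broad": {"retrieved": 4000, "relevant_retrieved": 48},
    "narrow": {"retrieved": 900, "relevant_retrieved": 42},
}

for name, s in strategies.items():
    sens = sensitivity(s["relevant_retrieved"], TOTAL_RELEVANT)
    prec = precision(s["relevant_retrieved"], s["retrieved"])
    print(f"{name}: sensitivity={sens:.2f}, precision={prec:.3f}, "
          f"records per relevant study={number_needed_to_read(prec):.0f}")
```

In practice the total number of relevant studies is unknown, so sensitivity can only be estimated against a reference set of known relevant records.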
An emphasis on search strategy sensitivity over precision typically reflects the context of SRs of quantitative research on clinical interventions. This emphasis may not be the same in searches developed for different purposes within the health technology assessment (HTA) context. In the context of qualitative SRs or qualitative evidence syntheses for example, there is discussion as to whether these types of reviews share the same need as SRs of quantitative research for ‘comprehensive’, ‘exhaustive’ bibliographic database searches (5). Guidance from the Cochrane Qualitative and Implementation Methods Group recommends that search procedures in the context of qualitative evidence synthesis should generally privilege specificity over sensitivity (6). Similarly, in the context of conducting a search to inform an ‘evidence-map’ (where an overview of the extent, nature and characteristics of a research area is of interest) research has indicated that less sensitive searches may be appropriate. In a study which compared a ‘highly sensitive’ search strategy with a ‘highly specific’ search strategy for an evidence-mapping exercise on diabetes and driving to inform clinical guidance development, the authors reported that the results of the ‘highly specific’ search would have been sufficient for answering the research question (7). The authors concluded that using highly specific instead of sensitive search strategies is “fully adequate for evidence maps with the aim of covering mainly the breadth rather than depth of a research spectrum”.
Recent research has also suggested that the conventional approach to search methodology – with its focus on sensitive searches of bibliographic databases as the primary method of study identification – may not be optimal for some SRs on complex topics, or in areas other than clinical health. In the context of a SR to evaluate the health benefits of environmental enhancement and conservation activities, Cooper et al (8) compared an approach led by searches of bibliographic databases with an approach led by supplementary search methods. The authors found that extensive bibliographic database searching was of limited value in terms of contribution to synthesis, but that grey literature searching was valuable and identified studies that made unique contributions to both the quantitative and qualitative synthesis. The authors concluded that the approach led by supplementary search methods (where the primary methods of study identification were grey literature searching and contacting experts, supplemented by bibliographic database searches which emphasised precision over sensitivity) was valid when compared with the conventional approach. An investigation in the context of a systematic review of prognostic factors has also assessed the performance of a targeted database search used in conjunction with supplementary search methods (30). The study found that the supplementary search methods were necessary to retrieve the known relevant studies. A broader, more sensitive database search would not have found all of the known studies without also using supplementary search methods. The authors concluded that designing a more precise search strategy for bibliographic databases, and putting more comprehensive efforts into supplementary search methods, was the most efficient way to locate prognosis studies.
Structuring the search
The Cochrane Handbook suggests a search strategy should be structured around the main concepts being examined by the review (1). For reviews of interventions, this can often be expressed using PICO (Patient (or Participant or Population), Intervention, Comparison and Outcome).
It is usually seen as undesirable to include all elements of the PICO in the search strategy as some concepts are often poorly described or non-existent in the title and abstract of a database record or the assigned indexed terms. Frandsen et al (31) investigated the presence of PICO elements in a sample of database records from Embase and PubMed. They reported that PICO elements C and O had lower retrieval potential than elements P and I, and that this was a particular issue when searching for primary and secondary outcomes. The authors concluded that their findings support recommendations not to include outcomes as a search concept. Tsujimoto et al (33) assessed Cochrane reviews that included outcomes in their literature search strategy. They found that approximately 10% of the reviews included terms related to the studies’ outcomes in their search strategies, but that the limitations of this practice were rarely acknowledged. The authors suggested that the systematic reviewers who decide to search for outcomes should both justify this decision and comment on the potential limitations. Further, most reviews that implemented the outcome terms in their search strategies also assessed outcomes that were not included in those strategies. The authors stated that outcomes that were included in search strategies were more likely to have results that favoured the intervention and were statistically significant, when compared with those that were not included.
Frandsen et al (34) evaluated the retrieval potential of each element of conceptual frameworks in the context of health science qualitative systematic reviews. The authors analysed the presence of elements from conceptual frameworks in publication titles, abstracts, and controlled vocabulary in CINAHL and PubMed, using a set of qualitative reviews and their included studies as a gold standard. The authors determined whether particular publications could be retrieved if a specific element from the conceptual framework was used in the search strategy. The authors found high relative recall for patient/population (99%) and research type (97%). The relative recall of intervention/phenomenon of interest was 74.3% and outcome was 78.6%. The relative recall for context was relatively low (61.3%). Relative recall of elements in conceptual frameworks was found to be much lower in the context of qualitative reviews compared to that reported in a study on the effect of using the PICO model to develop search strategies for quantitative reviews (31). Based on their findings, the authors suggest that searching the literature for a qualitative review requires careful planning and maybe even the use of several strategies to compensate for the lower recall of many of the elements of conceptual frameworks.
The Cochrane Handbook states that the selection of concepts should be made on a question-by-question basis. It is asserted that in some cases it is possible and acceptable to search for the comparator, for example if the comparator is explicitly placebo; and in other cases, outcomes can be well-defined and consistently reported in abstracts (1).
For reviews of many interventions a search may reasonably comprise the population, intervention, and a study design filter if appropriate (32). A validated search filter is recommended where one exists for the concept of interest (3). In some topic areas, for example complex interventions, where many of the concepts are particularly ill-defined, it may be preferable to use a broader search strategy (such as searching only for the population or intervention) and increase the resources allocated to sifting records (2). Alternatively, searchers may explore a multi-stranded or multi-faceted approach that uses a series of searches, with different combinations of concepts, to try to capture a complex research question (1).
Alternatives to the PICO framework have also been evaluated for searches in some fields; examples include the SPIDER tool to structure searches for qualitative and mixed methods research (9) and the BeHEMoTh tool to structure searches for theory (10). In a structured methodological review on searching for qualitative research, Booth lists 11 different notations for use in this context (including PICO, SPIDER and BeHEMoTh), but states that, as with quantitative reviews, there is little empirical data to support the merits of question formulation (5). In a SR published in 2018, Eriksen and Frandsen investigated whether the use of the PICO model as a search strategy tool affected the quality of a literature search (11). The authors found only three studies which assessed the effect of using the PICO model versus other available models or unguided searching. The authors concluded that no solid conclusions could be drawn about the effect of using the PICO model on the quality of the literature search.
Selecting search terms
The Cochrane Handbook recommends each concept of a robust search strategy should consist of text words together with subject terms, if the latter are available (29). In their study comparing the recall and precision of MeSH terms (or similar controlled vocabulary) versus text-word searching, DeMars and Perruso found that the combination of text-word and MeSH strategies provided the most comprehensive results (37). The choice of free-text terms should include consideration of synonyms, related terms, acronyms and variant spellings. Syntax such as truncation and proximity operators should also be considered when using free-text terms.
Methods for identifying search terms include techniques such as checking the bibliographic records of known relevant studies, consulting topic experts and scanning database subject indexing guides (3). EUnetHTA describes the use of these types of sources as a 'conceptual approach' to search term identification (32). Alternatives to the 'conceptual approach', designed to increase search design objectivity, have been proposed and explored.
Bramer et al evaluated a structured approach where thesaurus terms and synonyms for title / abstract searching were collected from the Emtree thesaurus, combined into a search strategy, and then tested for completeness using an ‘optimization method’ (12). This method involved identifying articles indexed with identified Emtree thesaurus terms but which did not include the synonyms already used in the search strategy in their title or abstract. Relevant terms from the titles and abstracts of these records were then added to the search strategy, and their added value was evaluated in discussion with the researcher who had requested the search. Further optimisation was done by reversing this process: looking for new thesaurus terms in articles where the titles and/or abstracts contained one of the identified synonyms but lacked the thesaurus terms already identified. The authors concluded that the method creates opportunities for faster development of SR search strategies that find more relevant studies than other methods with equivalent search precision.
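The two optimisation passes described above can be read as set differences over a pool of records. The sketch below is an illustration under stated assumptions, not the authors' implementation: the record structure, the function names and the three sample records are hypothetical, and simple substring matching stands in for real database field searching.

```python
def text_of(record):
    """Lower-cased title and abstract, standing in for title/abstract field searching."""
    return f"{record['title']} {record['abstract']}".lower()


def has_free_text(record, synonyms):
    """True if any free-text synonym occurs in the title or abstract."""
    text = text_of(record)
    return any(s.lower() in text for s in synonyms)


def has_thesaurus_term(record, terms):
    """True if the record is indexed with any of the given thesaurus terms."""
    return bool(set(record["thesaurus"]) & set(terms))


def indexed_without_synonyms(records, terms, synonyms):
    """First pass: indexed records whose title/abstract may reveal missing synonyms."""
    return [r for r in records
            if has_thesaurus_term(r, terms) and not has_free_text(r, synonyms)]


def synonyms_without_indexing(records, terms, synonyms):
    """Reverse pass: free-text hits whose indexing may reveal missing thesaurus terms."""
    return [r for r in records
            if has_free_text(r, synonyms) and not has_thesaurus_term(r, terms)]


# Hypothetical records and terms for illustration only.
records = [
    {"id": 1, "title": "Statin therapy after myocardial infarction",
     "abstract": "Randomised trial.", "thesaurus": ["heart infarction"]},
    {"id": 2, "title": "Recovery after heart attack in older adults",
     "abstract": "Cohort study of rehabilitation.", "thesaurus": ["heart infarction"]},
    {"id": 3, "title": "Myocardial infarction registry data",
     "abstract": "Descriptive analysis.", "thesaurus": ["acute coronary syndrome"]},
]
terms = ["heart infarction"]
synonyms = ["myocardial infarction"]

print([r["id"] for r in indexed_without_synonyms(records, terms, synonyms)])
print([r["id"] for r in synonyms_without_indexing(records, terms, synonyms)])
```

Records returned by the first function are candidates for reading titles and abstracts to spot new free-text terms; records returned by the second are candidates for checking which other thesaurus terms were assigned. Whether a candidate term is worth adding remains a judgement for the searcher and the review team.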
Text mining is a rapidly developing tool with potential application in a range of tasks associated with the production of SRs, including the identification of search terms (2). AHRQ published a review on the use of text-mining tools as an emerging methodology within SR processes, including the literature search (13). The aim of the AHRQ project was to provide a ‘snapshot’ of the state of knowledge, rather than an in-depth assessment. The review referred to 12 studies where text-mining tools were used for development of ‘topic’ search strategies and identified several general approaches to development. These included assessing word frequency in citations (using tools such as PubReMiner or EndNote) and automated term extraction (using tools such as Termine). The review reported that all of the identified studies found benefit in automating term selection for SRs, especially those comprising large unfocused topics. The AHRQ review drew no conclusions which were specific to the use of text-mining tools for the literature search process. The general conclusions on the use of text-mining for SR processes were that text-mining tools appeared promising, but further research was warranted.
Studies cited in the AHRQ review included a study by O'Mara-Eves et al which evaluated whether additional search terms for the topic of ‘community engagement’ were generated when using the text-mining data-extraction tool Termine (14) in addition to typical search development techniques. The study authors reported that although in many cases the terms generated by text-mining had already been identified by the reviewers as relevant, text-mining did reveal some useful synonyms and terms associated with the topic that had not previously been considered. The study authors stated that the text-mining approach studied should never be used on its own but alongside usual search development processes. The authors concluded that text mining helped to identify relevant search terms for a broad topic that was inconsistently referred to in the literature.
The use of text-analytic software to identify free text terms and subject headings through frequency analysis has been explored by researchers at the German HTA agency IQWiG in three published studies (15, 16, 21) and the findings debated in related correspondence (17, 18, 19, 20). The most recent paper compared their ‘objective approach’ with the ‘conceptual approach’ (21). The authors reported that the ‘objective approach’ yielded higher sensitivity than the ‘conceptual approach’, with similar precision, and stated that ‘objective approaches’ should be routinely used in the development of high-quality search strategies.
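Frequency analysis of this kind can be illustrated with a short script. This is a minimal sketch only and does not reproduce the IQWiG method or any particular text-analytic tool: the function names, the thresholds and the sample titles are all hypothetical. It flags words that appear in a large share of known relevant records but in a small share of a background sample.

```python
import re
from collections import Counter


def document_frequency(records):
    """Count, for each word, the number of records in which it appears at least once."""
    counts = Counter()
    for text in records:
        counts.update(set(re.findall(r"[a-z]+", text.lower())))
    return counts


def overrepresented_terms(relevant, background,
                          min_share=0.5, max_background_share=0.2):
    """Words common in the relevant set but rare in the background set.

    The two thresholds are arbitrary illustrative defaults, not recommended values.
    """
    rel_df = document_frequency(relevant)
    bg_df = document_frequency(background)
    candidates = []
    for term, n in rel_df.items():
        rel_share = n / len(relevant)
        bg_share = bg_df[term] / len(background)  # Counter returns 0 for unseen words
        if rel_share >= min_share and bg_share <= max_background_share:
            candidates.append((term, rel_share, bg_share))
    # Highest share in the relevant set first, then alphabetical.
    return sorted(candidates, key=lambda c: (-c[1], c[0]))


# Hypothetical titles for illustration only.
relevant = [
    "Community engagement in public health programmes",
    "Engagement of community members in health promotion",
    "Community participation and health outcomes",
]
background = [
    "Health outcomes after surgery",
    "Public health surveillance of influenza",
    "Drug therapy in hypertension",
    "Health economics of screening",
    "Influenza vaccination in children",
]

for term, rel_share, bg_share in overrepresented_terms(relevant, background):
    print(f"{term}: relevant={rel_share:.2f}, background={bg_share:.2f}")
```

A real analysis would use far larger record sets, include abstracts and subject headings, and handle phrases and word variants; the candidate terms would then be tested in the strategy rather than accepted automatically.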
Stansfield et al (22) used a case study of searching to inform a guideline on the care and support of older people with learning disabilities, and other examples, to reflect on the utility of text-mining technologies in improving the precision and sensitivity of search strategies. The technologies investigated include term frequency–inverse document frequency (TF-IDF) analysis and the Lingo3G automated clustering tool within EPPI-Reviewer 4.0, Termine, BibExcel, and EndNote. The authors concluded that text mining could aid the discovery of search terms for search strategies for diversely-described topics to support an iterative search strategy development process, and that using multiple tools appeared to be particularly fruitful, though the overriding challenge of finding efficient ways to identify an unknown body of literature for incorporation in SRs still remained.
Paynter et al (35, 36) compared the process of developing MEDLINE strategies using text-mining tools with 'usual practice'. The authors reported that across all reviews, usual practice searches seemed to perform better than strategies developed using text-mining tools, but because of the small sample size, none of these differences was statistically significant. For simple SR topics (i.e., single indication–single drug), strategies developed using text-mining tools were slightly less sensitive, but reduced time spent in search design. For complex SR topics (e.g., multicomponent interventions), strategies developed using text-mining tools were less sensitive than usual practice searches, although they did identify unique eligible citations not found by the usual practice searches. The authors stated that based on the findings of their study, text-mining technology is not ready to be used as the sole process for developing systematic review searches, but the time savings in search design and relatively high sensitivity for complex reviews suggest that this technology may be useful in reviews that do not require maximum sensitivity, such as rapid or scoping reviews. In addition, they stated that text-mining tools are useful in combination with usual practice to find citations missed by the usual search process (whilst acknowledging that adding a text-mining step to the 'usual practice' search strategy development process will increase the screening burden and time required for search development).
The Technical Supplement to Chapter 4 of the Cochrane Handbook summarises the potential role of text mining in search term selection and provides examples of the available tools (29). The authors conclude that whilst text mining has great potential in this context, more research is needed to help searchers identify which of the available tools work best and for which types of question. It is also noted that it can be challenging to document and report the use of text-mining for strategy development, and little guidance is currently available.
Combining search terms with Boolean operators and other search syntax
The Cochrane Handbook describes how a search strategy should be built up one concept at a time using controlled vocabulary terms, text words, synonyms and related terms, joining together the terms within each concept with the Boolean ‘OR’ operator. The sets of terms may then be combined with ‘AND’, which limits the results to those records that contain at least one search term from each of the sets. If an article does not contain at least one of the search terms from each of the sets then it will not be retrieved. Cochrane advise against the use of the ‘NOT’ operator where possible to avoid inadvertently excluding relevant records (1, 29).
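The OR-within-concept, AND-between-concepts logic can be sketched as follows. This is an illustration only: the function names and search terms are hypothetical, the generated string uses generic syntax rather than that of any particular database interface, and substring matching stands in for real field searching.

```python
def or_block(terms):
    """Join the terms for one concept with OR, quoting multi-word phrases."""
    quoted = [f'"{t}"' if " " in t else t for t in terms]
    return "(" + " OR ".join(quoted) + ")"


def build_strategy(concepts):
    """Combine the OR blocks for each concept with AND."""
    return " AND ".join(or_block(terms) for terms in concepts.values())


def is_retrieved(record_text, concepts):
    """A record is retrieved only if it matches at least one term from every concept."""
    text = record_text.lower()
    return all(any(term.lower() in text for term in terms)
               for terms in concepts.values())


# Hypothetical concepts and terms for illustration only.
concepts = {
    "population": ["hypertension", "high blood pressure"],
    "intervention": ["exercise", "physical activity"],
}

print(build_strategy(concepts))
print(is_retrieved("Exercise training in adults with hypertension", concepts))
print(is_retrieved("Dietary salt and hypertension", concepts))
```

The second record matches the population concept but no intervention term, so it is not retrieved. This shows how each additional concept combined with AND can exclude relevant records that describe that concept poorly.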
EUnetHTA methods guidance suggests that the use of separate search lines for each subject heading and for free-text terms facilitates the quality assurance of the search strategy by enhancing readability and therefore making it easier to identify and correct errors (32).
The AHRQ manual refers searchers to the PRESS (Peer Review of Electronic Search Strategies) Checklist (23) and states that search strategies should make use of the advanced search techniques such as truncation, wildcards and proximity searching described in the PRESS document (3). In 2015, the PRESS 2015 Guideline Statement was published, which updated and expanded on the previous PRESS publications (24).
Although search strategy development and construction for SRs conventionally aims for sensitivity, researchers have investigated the potential of 'focusing' search terms to reduce the number of search results and therefore screening burden. Focusing techniques which have been investigated include searching with subject headings limited to those with a major focus (major subject headings) and searching using terms in titles and abstracts alone (i.e. not including controlled vocabulary in the search strategy).
In a 2015 report produced by CADTH (25), researchers reran the search strategies reported in HTAs or SRs produced by a range of agencies, varying the use of major Emtree headings. The impact of the changes on the retrieval of the known relevant records (included studies) in the HTAs or SRs was assessed. The authors stated that overall their findings suggested that focusing Emtree headings was likely to reduce already suboptimal sensitivity for only small gains in precision. The report's recommendations for practice stated that searchers who were confident that their strategy was highly sensitive might wish to use focused Emtree terms for the intervention concept of their search. They suggested using caution when considering focusing the Emtree terms for the population concept, when considering focusing Emtree terms in more than two concepts, or when considering focusing terms in non-drug treatment reviews.
In a 2018 study, Bramer et al investigated whether researchers could use 'focused' searches to reduce the screening time burden (26). The original search strategies (designed by a single librarian) from a broad range of SRs were modified in four ways: by searching Embase thesaurus terms as major descriptors; by removing thesaurus terms from the Embase search so that terms were searched in the title and/or abstract fields only; by searching both MEDLINE and Embase thesaurus terms as major descriptors; and by searching both MEDLINE and Embase for terms in the title and/or abstract fields only. The authors concluded that if the number of search results retrieved was too high for the project resource context, search strategies in Embase alone or in both Embase and MEDLINE could be focused by searching for thesaurus terms as major descriptors. They stated that this approach 'may not ultimately have negative consequences in SRs', as long as thorough searches in other databases (such as Web of Science) were performed in addition to the MEDLINE and Embase searches. They also stated, however, that the reduction in search result numbers was likely to be limited. The authors did not recommend searching Embase and MEDLINE using terms in titles and abstracts alone, as this resulted in too many relevant articles being missed.
Checking and testing search strategies
Search strategies should be checked to ensure they are fit for purpose: that they are likely to find relevant studies. This is difficult to ascertain but checking of search strategies can be carried out by expert / peer review (for example, using the PRESS Checklist (23, 24)), comparing against previously published strategies, or by testing that known relevant documents are retrieved by the strategy (3). The Cochrane Handbook cautions that relying on testing strategies against only known documents can cause the strategy to be biased towards known studies and as a result other relevant records may be missed (1). Citation searching and reference checking are suggested as additional useful checks of strategy performance as they may identify documents that the searches have already retrieved but that were not necessarily known about in advance. If the search strategy is able to retrieve such documents, then this may suggest that its performance is acceptable (1).
Alternatively, more formal testing can be undertaken. Such methods are summarised by Booth, whose brief review identified eight methods for determining optimal retrieval of studies for inclusion in HTAs (27). The review concluded that although numerous methods were described in the literature, there was little formal evaluation of the strengths and weaknesses of each approach.
Sampson and McGowan developed and assessed a method (Inquisitio Validus Index Medicus) for validation of MEDLINE search strategies (28). The method used a version of the known relevant item approach, testing recall of relevant indexed studies identified through all search methods and indexed in the database being tested. The validation occurred once screening had been completed and the eligible studies were known. Poorly performing search strategies could be amended, re-tested and re-run. New studies identified by the amended search could be screened and any relevant studies could be included in the review. The authors reported that the validation method was robust and was able to demonstrate that the retrieval of relevant studies from MEDLINE in a sample of six updated Cochrane reviews was sub-optimal. The authors concluded that the Inquisitio Validus test was a simple method of validating the search, and could determine whether the search of the main database performed adequately or needed to be revised to improve recall, allowing the searcher an opportunity to improve their search strategy.
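The known relevant item approach outlined above amounts to a relative recall calculation restricted to studies that are indexed in the database under test. The sketch below is a simplified illustration, not the authors' published procedure: the function name and the record identifiers are hypothetical.

```python
def relative_recall(included_ids, indexed_ids, retrieved_ids):
    """Recall of the database search against included studies indexed in that database.

    included_ids:  studies included in the review, found by any search method
    indexed_ids:   included studies confirmed as indexed in the database under test
    retrieved_ids: records retrieved by the database search strategy
    Returns the recall and a sorted list of the indexed studies the search missed.
    """
    testable = set(included_ids) & set(indexed_ids)
    if not testable:
        raise ValueError("No included studies are indexed in this database.")
    found = testable & set(retrieved_ids)
    missed = testable - found
    return len(found) / len(testable), sorted(missed)


# Hypothetical identifiers for illustration only.
included = {"A", "B", "C", "D", "E"}   # E is assumed not to be indexed in the database
indexed = {"A", "B", "C", "D"}
retrieved = {"A", "B", "X", "Y"}

recall, missed = relative_recall(included, indexed, retrieved)
print(f"relative recall = {recall:.2f}; missed = {missed}")
```

The missed records are the starting point for revising the strategy, for example by checking which terms appear in their titles, abstracts and indexing, before the amended search is re-tested and re-run.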