Search strategy development

Last revised: 


The Cochrane Information Retrieval Methods Group have published an evidence-based chapter on search methods for the Cochrane Handbook (1), which provides the basis for this summary alongside guidance produced by the Centre for Reviews and Dissemination (2) and the Agency for Healthcare Research and Quality (AHRQ) (3). 

The revised and updated searching chapter of the Cochrane Handbook is in preparation. To avoid duplication of effort with the development of the Cochrane Handbook, appraisals have not been prepared for studies in this chapter. Once the revised Cochrane Handbook is available it will be used to update this chapter.

Sensitivity and precision

To retrieve as many relevant studies as possible, and to compensate for the limitations of information source records and indexing, search strategy development for systematic reviews should generally aim for sensitivity (1). Increasing the sensitivity of a search increases the likelihood of identifying all relevant studies, but tends to reduce precision because more irrelevant results are retrieved (1, 2). Sampson et al examined a cross-section of 94 health-related systematic reviews that reported the flow of bibliographic records through the review process and found that a search precision of approximately 3% was typical (4). The number of results retrieved, all of which must be screened against the eligibility criteria, has implications for the resources required to conduct a systematic review. This trade-off between sensitivity and precision should be acknowledged and discussed with the wider review team, and an appropriate balance sought within the context of the resources available.
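The trade-off can be made concrete with the standard definitions: sensitivity (recall) is the proportion of all relevant studies that a search retrieves, and precision is the proportion of retrieved records that are relevant. A minimal sketch in Python, using invented screening counts chosen so that precision lands near the 3% figure reported by Sampson et al:

```python
def sensitivity(relevant_retrieved: int, total_relevant: int) -> float:
    """Proportion of all relevant studies that the search retrieved."""
    return relevant_retrieved / total_relevant

def precision(relevant_retrieved: int, total_retrieved: int) -> float:
    """Proportion of retrieved records that are relevant."""
    return relevant_retrieved / total_retrieved

# Hypothetical screening figures, for illustration only.
retrieved = 3200           # records returned by the search
relevant_in_results = 96   # records judged eligible after screening
relevant_overall = 120     # all eligible records known to exist

print(f"sensitivity: {sensitivity(relevant_in_results, relevant_overall):.0%}")
print(f"precision:   {precision(relevant_in_results, retrieved):.0%}")
```

Broadening the search (adding synonyms, truncation, extra concepts joined with OR) typically moves sensitivity towards 100% while pushing precision — and hence the screening workload — in the opposite direction.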

The emphasis on sensitivity over precision typically reflects the context of systematic reviews of quantitative research, and may not apply to searches developed for other purposes within the HTA context. In qualitative systematic reviews or qualitative evidence syntheses, for example, there is debate as to whether these types of review share the same need as systematic reviews of quantitative research for ‘comprehensive’, ‘exhaustive’ bibliographic database searches (5). Similarly, in the context of a search to inform an ‘evidence map’ (where an overview of the extent, nature and characteristics of a research area is of interest), research has indicated that less sensitive searches may be appropriate. In a study comparing a ‘highly sensitive’ with a ‘highly specific’ search strategy for an evidence-mapping exercise on diabetes and driving to inform clinical guidance development, the authors report that the results of the ‘highly specific’ search alone would have been sufficient to answer the research question (6). They conclude that using highly specific rather than sensitive search strategies is “fully adequate for evidence maps with the aim of covering mainly the breadth rather than depth of a research spectrum”.

Structuring the search

The Cochrane Handbook suggests a search strategy should be structured around the main concepts being examined by the review. For reviews of interventions, this can be expressed using PICO (Patient (or Participant or Population), Intervention, Comparison and Outcome). It is usually undesirable to include all elements of the PICO in the search strategy, as some concepts are often poorly described, or absent altogether, from the title, abstract and assigned index terms of a database record. For reviews of many interventions a search may reasonably consist of the population, the intervention and, if appropriate, a study design filter (1). A validated search filter is recommended where one exists for the concept of interest (3).

In some topic areas, for example complex interventions, where many of the concepts are particularly ill-defined, it may be preferable to use a broader search strategy (such as searching only for the population or the intervention) and to increase the resources allocated to sifting records (2). Alternatives to the PICO framework have also been evaluated for searches in some fields; examples include the SPIDER tool for structuring searches for qualitative and mixed-methods research (7) and the BeHEMoTh tool for structuring searches for theory (8). In a structured methodological review on searching for qualitative research, Booth lists 11 different notations for use in this context (including PICO, SPIDER and BeHEMoTh), but notes that, as with quantitative reviews, there is little empirical evidence on the merits of the different approaches to question formulation (5). Methley et al tested the SPIDER search tool in a systematic narrative review of qualitative literature, comparing it with the PICO tool and a modified version of PICO with added qualitative search terms (PICOS) (9). The authors conclude that where comprehensiveness is a key factor the PICO tool should be used preferentially, because of the risk of missing relevant studies with the SPIDER tool.

Selecting search terms

The Cochrane Handbook recommends that in order to identify as many relevant records as possible, search strategies should combine subject headings selected from the database’s controlled vocabulary or thesaurus (with appropriate “explosions”) and a wide range of free-text terms (1). The choice of free-text terms should include consideration of synonyms, related terms and variant spellings. 

Methods for identifying search terms have traditionally included checking the bibliographic records of known relevant studies, consulting topic experts and scanning database subject indexing guides (3). However, text mining is a rapidly developing technology with potential application to a range of tasks in the production of systematic reviews, including the identification of search terms (2). AHRQ have published a review of text-mining tools as an emerging methodology within systematic review processes, including the literature search (10). The aim of the AHRQ project was to provide a ‘snapshot’ of the state of knowledge rather than an in-depth assessment. The review refers to 12 studies in which text-mining tools were used to develop ‘topic’ search strategies and identifies several general approaches, including assessing word frequency in citations (using tools such as PubReminer or EndNote) and automated term extraction (using tools such as Termine). All of the identified studies found benefit in automating term selection for systematic reviews, especially for large, unfocused topics. The AHRQ review drew no conclusions specific to the use of text-mining tools in the literature search process; its general conclusion was that text-mining tools appear promising but that further research is warranted.
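The word-frequency approach can be sketched in a few lines: count the words in the titles (or abstracts) of a set of known relevant records and inspect the most frequent as candidate free-text terms. This is a toy illustration with invented titles, not a substitute for dedicated tools such as PubReminer:

```python
import re
from collections import Counter

# Invented titles standing in for a set of known relevant records.
records = [
    "Community engagement in public health interventions",
    "Engaging communities to improve health outcomes",
    "Public involvement and community participation in health research",
]

# A minimal stopword list; a real analysis would use a fuller one.
STOPWORDS = {"in", "to", "and", "the", "of", "a"}

words = Counter(
    w for title in records
    for w in re.findall(r"[a-z]+", title.lower())
    if w not in STOPWORDS
)

# The most frequent words suggest candidate search terms and synonyms.
print(words.most_common(5))
```

Note that ‘communities’ and ‘engaging’ are counted separately from ‘community’ and ‘engagement’ — frequency counts over raw words surface variant forms that a searcher would then consolidate with truncation.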

Studies cited in the AHRQ review include one by O'Mara-Eves et al, which evaluated whether the text-mining term extraction tool Termine generated additional search terms for the topic of ‘community engagement’ beyond those identified by typical search development techniques (11). The authors report that although many of the terms generated by text mining had already been identified by the reviewers as relevant, text mining did reveal some useful synonyms and associated terms that had not previously been considered. They state that the text-mining approach studied should never be used on its own but alongside usual search development processes, and conclude that text mining helped to identify relevant search terms for a broad topic that was inconsistently referred to in the literature.

The AHRQ review also cited two studies published by researchers at the German HTA agency IQWiG. In the first, the authors propose an ‘objective approach’ to strategy development using text analysis methods (12). They argue that this method makes the selection of search terms transparent and reproducible, and allows a searcher with little specialist knowledge of the search topic to make evidence-informed decisions on the inclusion of terms. In the second study the authors aim to validate the ‘objective approach’, and conclude that it was non-inferior to the standard ‘conceptual approach’ (13). Subsequent correspondence on this publication (14, 15) and the authors' responses (16, 17) have debated the study’s conclusions and the strengths and limitations of the methods used. Since the publication of the AHRQ review, IQWiG researchers have published a third paper comparing the ‘objective approach’ with the ‘conceptual approach’ (18). The authors report that the ‘objective approach’ yielded higher sensitivity than the ‘conceptual approach’ with similar precision, and state that ‘objective approaches’ should be routinely used in the development of high-quality search strategies.

Combining search terms with Boolean operators and other search syntax

The Cochrane Handbook describes how a search strategy should be built up concept by concept, gathering controlled vocabulary terms, text words, synonyms and related terms for each concept and joining the terms within a concept with the Boolean ‘OR’ operator. The sets of terms are then combined with the ‘AND’ operator, which limits the results to records containing at least one search term from each set; a record that lacks a term from any one of the sets will not be retrieved. Cochrane advise against the use of the ‘NOT’ operator where possible, to avoid inadvertently excluding relevant records (1).
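The Boolean logic described above can be illustrated with Python sets, treating each concept line of a strategy as the set of record IDs it retrieves (the IDs and example terms are arbitrary):

```python
# Hypothetical record IDs retrieved by each concept's OR'd terms.
population_terms = {1, 2, 3, 4, 5}    # e.g. "diabetes" OR "diabetic"
intervention_terms = {3, 4, 5, 6, 7}  # e.g. "exercise" OR "physical activity"
animal_records = {5}                  # records indexed as animal-only studies

# AND across concepts is set intersection: only records matching
# at least one term from EVERY set survive.
combined = population_terms & intervention_terms

# NOT is set difference -- used cautiously, since a relevant record
# that also matches the excluded term is silently dropped.
without_animals = combined - animal_records

print(combined, without_animals)
```

Record 5 illustrates Cochrane's caution about ‘NOT’: if it were a human study that merely also carried animal indexing, the exclusion would lose a relevant record.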

The AHRQ manual refers searchers to the PRESS (Peer Review of Electronic Search Strategies) Checklist (19) and states that search strategies should make use of the advanced search techniques described in the PRESS document, such as truncation, wildcards and proximity searching (3). In 2015, the PRESS 2015 Guideline Statement was published, updating and expanding on the previous PRESS publications (20).

Testing search strategies and deciding when to stop searching

Search strategies should be tested to ensure they are fit for purpose: that is, that they find relevant studies. This is difficult to ascertain, but search strategies can be tested informally by expert review, by checking that known relevant documents are retrieved by the strategy, or by comparison with previously published strategies (3).

Alternatively, more formal testing can be undertaken. Such methods are summarised by Booth, whose brief review identified eight methods for determining optimal retrieval of studies for inclusion in HTAs (21). The review concludes that although numerous methods are described in the literature, there is little formal evaluation of the strengths and weaknesses of each approach. Sampson and McGowan developed and assessed a method (Inquisitio Validus Index Medicus) for validating MEDLINE search strategies (22). The method is a version of the known relevant item approach: once screening is complete and the eligible studies are known, it tests the strategy’s recall of relevant studies that were identified through all search methods and are indexed in the database being tested. Poorly performing search strategies can be amended, re-tested and re-run, and any relevant studies identified by the amended search can be screened and included in the review. The authors report that the validation method was robust and demonstrated that retrieval of relevant studies from MEDLINE was sub-optimal in a sample of six updated Cochrane reviews. They conclude that the Inquisitio Validus test is a simple way of validating a search, determining whether the search of the main database performs adequately or needs revision to improve recall.
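The known-relevant-item logic is straightforward to express: once screening is complete, measure what fraction of the eligible, database-indexed studies the strategy actually retrieved. A sketch with invented identifiers:

```python
def validate_recall(retrieved_ids: set, eligible_indexed_ids: set) -> float:
    """Recall of a search strategy against eligible studies known
    (after screening) to be indexed in the database being tested."""
    if not eligible_indexed_ids:
        raise ValueError("no eligible indexed studies to test against")
    found = retrieved_ids & eligible_indexed_ids
    return len(found) / len(eligible_indexed_ids)

# Hypothetical IDs: studies included in the review and indexed in
# the database, versus everything the strategy retrieved.
eligible = {"id1", "id2", "id3", "id4"}
retrieved = {"id1", "id2", "id4", "id9"}

recall = validate_recall(retrieved, eligible)
print(f"recall: {recall:.0%}")
```

A low value signals that the strategy should be amended and re-run; any newly retrieved studies are then screened in the usual way.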

One purpose of testing searches is to inform reviewers when searching has retrieved 'enough' studies. There is little research evidence on empirically based 'stopping rules', but methods such as capture-mark-recapture have been explored for developing them (23). Capture-mark-recapture has also been used to evaluate searches by estimating retrospectively how close they came to capturing the total body of literature (24, 25). It involves hand-searching a sample journal and running a search strategy on information sources indexing the same journal; the number of relevant records identified by each process is then used to derive a statistical estimate of what has been missed by all the searches conducted (24).
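The capture-mark-recapture estimate uses the overlap between two independent 'captures' of relevant studies (for example a handsearch and a database search) to estimate the size of the total body of literature, as in the classic Lincoln-Petersen estimator from ecology. A sketch with invented counts, using Chapman's bias-corrected form:

```python
def lincoln_petersen(n1: int, n2: int, overlap: int) -> float:
    """Chapman's corrected Lincoln-Petersen estimate of total population.

    n1      -- relevant studies found by the first method (e.g. handsearch)
    n2      -- relevant studies found by the second method (e.g. database search)
    overlap -- studies found by both methods
    """
    # The +1 terms (Chapman correction) also avoid division by zero
    # when the two captures do not overlap at all.
    return (n1 + 1) * (n2 + 1) / (overlap + 1) - 1

# Hypothetical counts for illustration.
total_est = lincoln_petersen(n1=30, n2=40, overlap=20)
found_so_far = 30 + 40 - 20
print(f"estimated total: {total_est:.0f}, found so far: {found_so_far}")
```

The gap between the estimated total and the number found so far suggests how much further searching might still yield; a small gap supports a decision to stop.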

Despite these investigations, the AHRQ guidelines state that no currently available method can easily be applied to searches for comparative effectiveness reviews. The searcher’s judgement is required to decide whether searching additional sources is likely to retrieve unique items or whether the search has reached saturation. The decision must balance the desire to identify all relevant studies against the resources available to carry out the search (3).


Reference list