52 weeks of BetterEvaluation: Week 40: How to find evidence and use it well

When reviewing evidence for decision-making, the first challenge is deciding which types of evidence to include.

In this blog, Jessica Hagen-Zanker from the Overseas Development Institute introduces a new approach to literature reviews that retains the rigour of full systematic reviews without their disadvantages of resource intensiveness and inflexibility.


Under pressure to achieve ever greater (cost) effectiveness, many donors are increasingly enamoured of Systematic Reviews (SRs), one among a range of tools used to assess the relative effectiveness of different policies and programmes. SRs have been called the ‘cleanest form of research summary’ by the Guardian’s Ben Goldacre: they are rigorous, transparent and seemingly objective. However, SRs can also be resource intensive and can miss relevant studies from the so-called ‘grey literature’. In an attempt to find a robust alternative to SRs, Richard Mallett and I have developed a guidance note which proposes an empirically informed approach that is more rigorous and evidence-focused than an orthodox literature review, but at the same time more reflexive and user-friendly than SRs. The good news is that we are already receiving a positive reaction from many researchers and policy influencers.

Why systematic reviews fall short

At first sight, SRs seem appealing. They promise a neutral, objective and comprehensive approach to evidence retrieval, assessment and synthesis. However, those who have first-hand experience of conducting and using systematic reviews are increasingly questioning these assumptions (e.g. Mallett et al., Walker et al. and Pritchett and Sandefur). The practical challenges have been discussed at length (I won’t go into the detail here). The actual policy relevance of SR findings has also been debated, for example at the Dhaka Colloquium on SRs in International Development. At the core of the critiques is the concern that SRs, if carried out in a rigid and non-reflexive manner, may generate partial and misleading statements about ‘what works’ that are nevertheless seen as authoritative and trustworthy: a risky prospect for evidence-based policy-making and programming.

Even something that appears as simple as DFID’s how-to-note on the ‘assessment of the strength of evidence’ is neither a straightforward nor a bias-free exercise: any answer to the question of whether a particular research method is appropriate for answering a research question inevitably involves some degree of subjectivity. I would also argue that ranking evidence exclusively on the basis of the methods used to generate it presumes that variations in research quality are primarily determined by the type of method used, rather than by how the method was applied. For example, two Randomised Control Trials looking at the same research question may look very different in terms of design, implementation and interpretation of results. Why should we classify both of them as high quality, when there may in fact be striking differences in the quality of each study?

While SRs clearly privilege studies scoring high on internal validity (i.e. demonstrating causal relationships), they do so at the risk of neglecting external validity (i.e. whether findings are transferable to another context), which undermines their very purpose: to inform decision-makers about ‘what works’ across contexts. I’m inclined to echo Pritchett and Sandefur’s warning: “Avoid strict rankings of evidence. These can be highly misleading. At a minimum, evidence rankings must acknowledge a steep trade-off between internal and external validity. We are wary of the trend towards meta-analyses or ‘systematic reviews’ in development”.

Making literature reviews fit for purpose

Building on our experience of doing systematic reviews and literature reviews, we have been experimenting with rigorous, evidence-based literature reviews, which stick to the core principles of ‘full’ systematic reviews (rigour, transparency and a commitment to taking questions of evidence seriously) while allowing for a more flexible, robust and user-friendly (and often cheaper) handling of retrieval and analysis methods. This process is similar to Rapid Evidence Assessments and is outlined in our new guidance note. The note is aimed at anyone who is planning to do an evidence-based literature review. It covers the whole literature review process, but places particular emphasis on the need to get the retrieval phase right, using three interrelated tracks: academic literature search; snowballing; and grey literature capture.
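For readers who keep their search results in a spreadsheet or script, here is a rough sketch of how the output of the three retrieval tracks might be pooled. It is an illustration only and not part of the guidance note: the function names, the track labels and the title-matching rule are all assumptions made for the example, and real reviews will usually need more careful de-duplication.

```python
from __future__ import annotations


def normalise(title: str) -> str:
    """Lower-case a title and collapse punctuation and whitespace,
    so that near-identical titles from different tracks match."""
    cleaned = "".join(c.lower() if c.isalnum() else " " for c in title)
    return " ".join(cleaned.split())


def merge_tracks(tracks: dict[str, list[str]]) -> dict[str, dict]:
    """Pool titles from several retrieval tracks (hypothetical labels such
    as 'academic', 'snowballing', 'grey') and record which tracks found
    each study. Matching on normalised titles is an assumption."""
    merged: dict[str, dict] = {}
    for track, titles in tracks.items():
        for title in titles:
            key = normalise(title)
            entry = merged.setdefault(key, {"title": title, "tracks": []})
            if track not in entry["tracks"]:
                entry["tracks"].append(track)
    return merged
```

Keeping a record of which track located each study is one way to stay transparent about retrieval, for instance by showing how many studies would have been missed by the academic search alone.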

Our approach is more sensitive to the realities of finding information within international development and places greater emphasis on locating grey literature and resources not found within the standard peer review channels. Instead of ignoring the subjectivity inherent in different steps of the process, we use it to our advantage, for example by using snowballing to include influential studies in the field.

We also provide alternatives to outright evidence assessment and give examples of how to classify but not grade evidence, for example using infographics. In the example below the reader can obtain an immediate sense of the thematic composition of the evidence reviewed and the size of the evidence base, as well as the kinds of methods used within each thematic category.
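To make the idea of classifying without grading concrete, here is a minimal sketch that counts studies by theme and by method and prints a simple text chart. It is not the infographic from the guidance note: the field names `theme` and `method`, the function names and the layout are assumptions made for illustration, and no quality score is assigned to any study.

```python
from collections import Counter, defaultdict


def classify(studies):
    """Count studies per theme and per method within each theme.
    Each study is a dict with hypothetical 'theme' and 'method' keys."""
    table = defaultdict(Counter)
    for study in studies:
        table[study["theme"]][study["method"]] += 1
    return table


def render(table):
    """Return a plain-text summary: one block per theme showing the size
    of the evidence base and a bar of '#' marks for each method."""
    lines = []
    for theme in sorted(table):
        total = sum(table[theme].values())
        lines.append(f"{theme} ({total} studies)")
        for method, count in sorted(table[theme].items()):
            lines.append(f"  {method:<15} {'#' * count} {count}")
    return "\n".join(lines)
```

Calling `render(classify(studies))` on a list of screened studies gives a quick view of where the evidence is concentrated and which methods dominate each theme, without implying that one method outranks another.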

This process shouldn’t be set in stone…

We don’t see this as the final word in reviewing evidence: not every review serves the same objective, and retrieval, assessment and synthesis methods should be driven by the nature of the research question – not vice versa. Because different subject areas may be characterised by very different bodies of evidence – particularly in terms of methodologies and data – a degree of reflexivity and innovation may benefit the analysis phase. As reviewers will not know (or at least should not know) what the evidence base looks like until after they have completed the retrieval and screening phases of their review, sticking rigidly to a predetermined assessment and analysis framework may not make sense in many cases. An iterative approach, on the other hand, should help rebalance the scale towards usefulness, while not disregarding rigour. We hope you will experiment with the process so that it can help you meet your objective – let us know how you get on!

Resource

Methods for synthesising data across evaluations
These methods answer questions about a type of intervention rather than about a single case – questions such as “Do these types of interventions work?” or “For whom, in what ways and under what circumstances do they work?”

Photo: My research workflow and note taking by Raul P/Flickr

