52 weeks of BetterEvaluation: Week 34 Generalisations from case studies?

An evaluation usually involves some level of generalising of the findings to other times, places or groups of people. If an intervention is found to be working well, we could generalise to say that it will continue to work well, that it will work well in another community, or that it will work well when expanded to wider populations. But how far can we generalise from one or more case studies? And how do we go about constructing a valid generalisation? In this blog, Rick Davies explores a number of different types of generalisation and some of the options for developing valid generalisations.

These questions are prompted by two events. One was my recent reading of the July 2013 issue of Evaluation, which reported on the uses of case studies in evaluation. The other was my involvement in an advisory committee meeting reviewing the impact of evaluations on government policies. The issue of generalisation came up in both discussions.

In the advisory committee meeting two challenges were discussed, nicknamed Scylla and Charybdis (after the Greek myth in which sailors are forced to choose between passing too close to a whirlpool or a sea monster). One was the risk of producing truisms such as “it is important to consult with stakeholders when attempting to influence policy” – while this may be true of all policy-influencing activities, if it does not help us distinguish between successful and unsuccessful attempts it will be of little value. The other challenge was the risk of producing no generalisations at all, other than to say “all success is context-specific”. This type of generalisation is likely to be equally unhelpful.

But there are three more possible types of generalisation to consider. A third type came to mind after the meeting, which could be called evidence-free generalisations. These are claims which look useful, but which can’t be linked back to specific cases within the study. These linkages are important for two reasons. One is to check the validity of the claim, and the other is to have some idea of the coverage of the claim (how many cases it relates to).
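
To make the traceability point concrete, here is a minimal sketch in Python (the claims and case identifiers are invented for illustration) of how generalisations from a study might be checked against the cases that support them:

```python
# Hypothetical mapping of generalisations to the study cases that support them.
claims = {
    "Early stakeholder consultation aided policy uptake": {"case_03", "case_12", "case_27"},
    "Evaluations timed to budget cycles were more influential": {"case_05", "case_12"},
    "Strong champions guarantee policy change": set(),  # no supporting cases recorded
}

for claim, supporting_cases in claims.items():
    if not supporting_cases:
        # A claim with no traceable cases is an evidence-free generalisation.
        print(f"EVIDENCE-FREE: '{claim}' cannot be traced to any case")
    else:
        # Coverage: how many (and which) cases the claim relates to.
        print(f"'{claim}' covers {len(supporting_cases)} case(s): {sorted(supporting_cases)}")
```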

A fourth type could be called within-sample generalisations. In the study I mentioned, around 100 evaluations were being examined for their influence on policy, but there are more evaluations outside that set, implemented with funding from the same source. Within the 100 evaluations it would be reasonable to expect to find some generalisations that apply to a number of those cases, and it should be possible to make strong claims about these, because the generalisations would refer to known cases.
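
As a rough sketch of the idea (again in Python, with hypothetical cases and conditions), a within-sample generalisation can cite exactly which known cases it covers:

```python
# Hypothetical within-sample data: each case records its conditions and outcome.
cases = {
    "eval_01": {"conditions": {"consulted", "timely"}, "influenced_policy": True},
    "eval_02": {"conditions": {"consulted"}, "influenced_policy": True},
    "eval_03": {"conditions": {"timely"}, "influenced_policy": False},
    "eval_04": {"conditions": {"consulted", "timely"}, "influenced_policy": True},
}

successes = [c for c in cases.values() if c["influenced_policy"]]
all_conditions = set().union(*(c["conditions"] for c in cases.values()))

# A within-sample generalisation can point to the specific known cases it covers.
for cond in sorted(all_conditions):
    covered = [cid for cid, c in cases.items()
               if c["influenced_policy"] and cond in c["conditions"]]
    print(f"'{cond}' present in {len(covered)}/{len(successes)} successful cases: {covered}")
```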

The last type of generalisation could be called beyond-sample generalisations. This is the territory of random sampling and generalisations based on inferential statistics, where findings from a sample of cases are used to generalise to a wider population. Larger sample sizes typically mean more secure generalisations. In the same issue of Evaluation, Yin argues that a specific kind of case-based generalisation, called “analytic generalisation”, can have a similar scope, reaching beyond the cases themselves. He also refers to an example in the same volume: a review of multiple case studies of immunisation practice.
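
For example, a beyond-sample generalisation about a proportion can be hedged with a confidence interval. The figures below are invented, and the Wald (normal-approximation) interval is just one simple choice, but the sketch shows why larger samples give more secure generalisations:

```python
import math

# Hypothetical figures: 62 of a random sample of 100 evaluations influenced policy.
n, successes = 100, 62
p_hat = successes / n

# 95% Wald confidence interval for the population proportion (normal approximation).
z = 1.96
margin = z * math.sqrt(p_hat * (1 - p_hat) / n)
print(f"Sample proportion: {p_hat:.2f}")
print(f"95% CI for the wider population: ({p_hat - margin:.2f}, {p_hat + margin:.2f})")

# A larger sample narrows the interval, i.e. supports a more secure generalisation.
n_large = 400
margin_large = z * math.sqrt(p_hat * (1 - p_hat) / n_large)
print(f"With n={n_large}, the same proportion gives a margin of ±{margin_large:.2f}")
```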

So how can we go about constructing a valid generalisation? It struck me that the process being used by the policy influence study to produce within-sample generalisations could be usefully informed by the literature on Qualitative Comparative Analysis (QCA), a method described in detail in Befani’s article in Evaluation (July 2013). This needn’t mean adopting the approach in its entirety, but rather paying attention to particular concepts: “causal configurations” – combinations of conditions associated with a specific outcome – and the distinction between necessary and/or sufficient conditions, both of which have been described on the BetterEvaluation website.
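
A minimal sketch of these two concepts, using invented crisp-set data (1 = present, 0 = absent), might look like the following; it computes simple consistency-style scores rather than running the full QCA minimisation procedure:

```python
# Hypothetical crisp-set data: 1 = condition/outcome present, 0 = absent.
cases = [
    {"consulted": 1, "timely": 1, "outcome": 1},
    {"consulted": 1, "timely": 0, "outcome": 0},
    {"consulted": 0, "timely": 1, "outcome": 0},
    {"consulted": 1, "timely": 1, "outcome": 1},
    {"consulted": 1, "timely": 1, "outcome": 0},
]

def sufficiency(config: dict, cases: list) -> float:
    """Share of cases matching the configuration that also show the outcome."""
    matching = [c for c in cases if all(c[k] == v for k, v in config.items())]
    return sum(c["outcome"] for c in matching) / len(matching) if matching else 0.0

def necessity(condition: str, cases: list) -> float:
    """Share of outcome-positive cases in which the condition is present."""
    with_outcome = [c for c in cases if c["outcome"] == 1]
    return sum(c[condition] for c in with_outcome) / len(with_outcome) if with_outcome else 0.0

config = {"consulted": 1, "timely": 1}  # a candidate causal configuration
print(f"Sufficiency of {config}: {sufficiency(config, cases):.2f}")
print(f"Necessity of 'consulted': {necessity('consulted', cases):.2f}")
```

A sufficiency score near 1 suggests the configuration is close to sufficient for the outcome; a necessity score near 1 suggests the condition is close to necessary for it.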

There is also an important connection that needs to be made with process tracing. This typically involves the examination of events within a specific case, whereas QCA-type analyses involve comparisons across cases. The two methods are more valuable when used in conjunction. Examination of individual cases can prompt initial ideas about what conditions to examine across all cases when looking for common causal configurations. When configurations are found, their validity as causal explanations can then be tested using various process tracing methods (e.g. hoop tests and smoking gun tests), which make use of the same concepts of causal configurations and necessary and/or sufficient causes. When such tests fail, further examination of individual cases may be needed to generate new conditions, which then need to be included in a re-analysis of causal configurations.
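
The asymmetric logic of these two tests can be sketched schematically (this illustrates the test logic only, with hypothetical evidence items, and is not a substitute for actual within-case analysis):

```python
# Hypothetical within-case evidence for one evaluation's claimed influence pathway.
evidence = {
    "officials_received_report": True,     # hoop: must hold if influence occurred
    "report_cited_in_policy_paper": True,  # candidate smoking-gun evidence
}

def hoop_test(found: bool) -> str:
    # Necessary evidence: absence eliminates the hypothesis;
    # presence merely keeps it alive.
    return "hypothesis survives" if found else "hypothesis ELIMINATED"

def smoking_gun_test(found: bool) -> str:
    # Sufficient evidence: presence strongly confirms;
    # absence is inconclusive.
    return "hypothesis CONFIRMED" if found else "inconclusive"

print("Hoop test:", hoop_test(evidence["officials_received_report"]))
print("Smoking gun test:", smoking_gun_test(evidence["report_cited_in_policy_paper"]))
```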

The QCA perspective provides us with a useful middle ground between Scylla and Charybdis: between overly-inclusive generalisations and the complete avoidance of generalisations. By ensuring that there is an ongoing dialogue between within-case analysis and between-case analysis, we can also avoid evidence-free generalisations and ensure that within-sample generalisations are as strong as possible.