Evaluability assessment

An assessment of the extent to which an intervention can be evaluated in a reliable and credible fashion.

This overview is based on a literature review of Evaluability Assessment commissioned by the UK Department for International Development (DFID) in 2012 and published as a DFID Working Paper (Davies 2013). The review identified 133 documents, including journal articles, books, reports and web pages, published from 1979 onwards. Approximately half of the documents were produced by international development agencies; most of the remaining documents covered American domestic agency experience with Evaluability Assessments, which has more recently been summarised by Trevisan and Walser (2014).

1. What is evaluability?

Amongst international development agencies there appears to be widespread agreement on the meaning of the term “evaluability”. The following definition from the Organisation for Economic Co-operation and Development-Development Assistance Committee (OECD-DAC) is widely quoted and used:

“The extent to which an activity or project can be evaluated in a reliable and credible fashion” (OECD-DAC 2010, p. 21)

2. Where is it used?

Evaluability Assessments have been used since the 1970s, initially by government agencies in the United States, and subsequently by a wider range of domestic organisations. International development agencies have been using Evaluability Assessments since 2000. Although the most common focus of an Evaluability Assessment is a single project, Evaluability Assessments have also been carried out on sets of projects, policy areas, country strategies, strategic plans, work plans, and partnerships.

3. What do Evaluability Assessments examine?

The DFID Working Paper (Davies 2013) on Evaluability Assessment identified three dimensions of evaluability:

  • Evaluability “in principle”, given the nature of the project’s theory of change
  • Evaluability “in practice”, given the availability of relevant data and the capacity of management systems to provide it
  • The utility and practicality of an evaluation, given the views and availability of relevant stakeholders

The overall purpose of an Evaluability Assessment is to inform the timing of an evaluation and to improve the prospects of an evaluation producing useful results. However, the focus and results of an Evaluability Assessment will depend on its timing: early assessments may have wider effects on long-term evaluability, while later assessments provide the most up-to-date picture of evaluability. The typical focus and results at each stage are summarised below:

| Stage | Evaluability Assessment focus | Evaluability Assessment results |
| --- | --- | --- |
| Design | Theory of change (ToC) | Improved project design |
| Inception | ToC and data availability | Improved monitoring and evaluation (M&E) framework |
| Implementation | ToC, data availability and stakeholders | Improved evaluation terms of reference (ToR) |

4. How do you do an Evaluability Assessment?

Two forms of advice are commonly provided. The first is about the sequencing of activities, given in the form of various stage models. The second is about the contents of inquiries, often structured in the form of checklists.

Stage models include largely predictable (but often iterated) steps involving planning, consultation, data gathering, analysis, report writing and dissemination. Two of these steps are worth commenting on here:

The first relates to the planning stage. An important early step in an Evaluability Assessment is reaching agreement on the boundaries of the task, which has two aspects:

  1. The extent to which the Evaluability Assessment should proceed from a diagnosis of evaluability to a prescription, and then to implementation of the changes needed to address evaluability problems (for example, revision of a theory of change or development of an M&E framework).
  2. The range of project documents to be examined and stakeholders to be interviewed. These choices have direct consequences for the scale and duration of the work to be done.

The second relates to the analysis stage, where two tasks can be identified (a minimal sketch of both follows this list):

  1. At the base is the synthesis of answers from multiple documents and interviews with respect to a specific checklist question. Here, the assessment needs to (a) weigh the validity and reliability of the data, and then (b) identify the consensus and outlier views.
  2. At the next level is the synthesis of answers across multiple questions within a given evaluability dimension. Here, the assessment needs to (a) identify any “obstacle” problems that must be removed before any other progress can be made, and then (b) assess the relative importance of all other problems.
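
As an illustration only, the following Python sketch walks through both levels of synthesis on hypothetical checklist data. The questions, the 1–5 scoring scale, the outlier gap and the “obstacle” threshold are all assumptions introduced here, not part of any published checklist.

```python
# A minimal sketch of the two-level synthesis described above,
# using hypothetical data throughout.

from statistics import median

# Level 1: answers to each checklist question, scored 1 (poor) to
# 5 (good), gathered from several documents and interviews.
answers = {
    "Is the theory of change documented?": [4, 4, 5, 2],
    "Are causal assumptions made explicit?": [2, 1, 2, 2],
    "Are intended outcomes measurable?": [1, 1, 2, 1],
}

def synthesise(scores, outlier_gap=2):
    """Return the consensus (median) score and any outlier views."""
    consensus = median(scores)
    outliers = [s for s in scores if abs(s - consensus) >= outlier_gap]
    return consensus, outliers

# Level 2: aggregate across questions within one evaluability
# dimension, flagging "obstacle" problems (consensus at or below a
# floor) that must be addressed before ranking the rest.
OBSTACLE_FLOOR = 2  # assumption: scores this low block progress

results = {q: synthesise(s) for q, s in answers.items()}
obstacles = [q for q, (c, _) in results.items() if c <= OBSTACLE_FLOOR]
ranked = sorted(
    (q for q in results if q not in obstacles),
    key=lambda q: results[q][0],  # weakest remaining areas first
)

for q, (consensus, outliers) in results.items():
    print(f"{q} consensus={consensus} outliers={outliers}")
print("Obstacle problems:", obstacles)
print("Remaining problems, weakest first:", ranked)
```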

Checklists are used by many international agencies, with varying degrees of rigour and flexibility. At best, their use provides an accountable means of ensuring systematic coverage of all relevant issues. The DFID Working Paper synthesised the checklists used by 11 different agencies into a set of three checklists covering the dimensions of evaluability listed above; these can provide a useful “starter pack” to be adapted according to circumstances. If an aggregate score on evaluability (or on multiple aspects of evaluability) is to be calculated, explicit attention needs to be paid to the weighting of each item on the checklist, since it is unlikely that all items are of equal importance (see the sketch below).
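
As an illustration of the weighting point, here is a minimal sketch of an aggregate evaluability score with unequal item weights. The items, weights and scores are hypothetical, not drawn from any agency's checklist.

```python
# A minimal sketch of a weighted aggregate evaluability score.
# All items, weights and scores below are hypothetical.

checklist = [
    # (item, weight, score on a 0-1 scale)
    ("Clear theory of change", 3.0, 0.8),
    ("Baseline data available", 2.0, 0.4),
    ("Stakeholders support use of findings", 1.0, 0.9),
]

total_weight = sum(w for _, w, _ in checklist)
aggregate = sum(w * s for _, w, s in checklist) / total_weight
print(f"Weighted evaluability score: {aggregate:.2f}")  # 0.68
```

With equal weights the same scores would average 0.70, so the choice of weights visibly changes the result; this is why the weighting scheme needs to be made explicit.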

5. How much time and money is involved?

The time required to complete an Evaluability Assessment can range from a few days to a month or more. A key determinant is the extent to which stakeholder consultations are required and whether multiple projects are involved. Evaluability Assessments at the design stage may be carried out largely as desk-based work, whereas Evaluability Assessments prior to a proposed evaluation are much more likely to require extensive stakeholder consultation.

What matters is the cost of an Evaluability Assessment relative to the cost of the evaluation itself, rather than its absolute cost. When the proportionate cost of an Evaluability Assessment is high, correspondingly large improvements in evaluation results will be needed to justify it.
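
To make the proportionality argument concrete, a back-of-the-envelope sketch with hypothetical figures:

```python
# Hypothetical figures for the proportionality argument above.
evaluation_cost = 100_000  # cost of the planned evaluation
ea_cost = 10_000           # cost of the Evaluability Assessment

proportion = ea_cost / evaluation_cost
print(f"EA adds {proportion:.0%} to the evaluation budget")
# To justify itself, the EA must improve the usefulness of the
# evaluation by at least as much as it adds to the cost; the higher
# the proportion, the larger the improvement needed.
```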

6. When would you not do an Evaluability Assessment?

Some project designs are manifestly unevaluable and some M&E frameworks are manifestly inadequate at first glance. In these circumstances, an Evaluability Assessment would not be needed to make a decision about whether to go ahead with an evaluation. Efforts need to focus on the more immediate tasks of improving project design and/or the M&E framework.  

In other circumstances, the cost of a proposed evaluation may be quite small, so the cost-effectiveness of an additional investment in an Evaluability Assessment may be questionable. On the other hand, with large projects, even those that appear relatively evaluable, investment in an Evaluability Assessment could still deliver cost-effective changes.

7. What are the alternatives to an Evaluability Assessment?

At the design and approval stages of a project, the associated quality assurance processes can include evaluability-oriented questions. The process of Evaluability Assessment can in effect be institutionalised within existing systems rather than contracted as a special event.

At the inception stage, some organisations may routinely commission the development of an M&E framework that should intrinsically address evaluability questions. Alternatively, they may have established procedures for reviewing the M&E system that are more purpose-specific than a generic Evaluability Assessment tool of the kind provided by the DFID Working Paper.

Prior to a proposed evaluation, some organisations may commission preparatory work that takes on a wider ambit than an Evaluability Assessment. Approach Papers may cover issues listed in Evaluability Assessment checklists but also scan a much wider literature for evidence for and against the relevance and effectiveness of the type(s) of interventions being evaluated.

An example

In 2000, ITAD, a UK consultancy firm, carried out an Evaluability Assessment of 28 human rights and governance projects funded by the Swedish International Development Cooperation Agency (Sida) in four countries in Africa and Latin America (Poate et al. 2000). This assessment is impressive in a number of respects. Analysis was done with the aid of a structured checklist that helped minimise divergences of treatment by the consultants who worked on the study. Nineteen evaluation criteria were investigated by means of subsidiary questions, and a score given for each criterion. The most common evaluability problems found related to the unavailability of data, followed by issues of project design, including insufficient clarity of purpose and the difficulties of causal attribution. Nevertheless, the authors were able to spell out a range of evaluation methods that could be explored, along with the type of capacity building work needed to address the identified issues. Their report includes a full data set of checklist ratings of all projects on all criteria, thus enabling others to do further analysis of this experience with other research or evaluation purposes in mind.

Advice for evaluability assessments

The design of checklists can usefully be informed by theory, not just ad hoc or experience-based conjecture. Sources can include relevant evaluation standards, codes of ethics and syntheses of studies of evaluation use.

Checklist weightings have been used by a number of agencies. Given the diversity of possible approaches to evaluation and of specific evaluation contexts, it is hard to justify any universally applicable set of weightings for a given checklist. However, weightings can be assigned “after the fact” (i.e., after a specific Evaluability Assessment has been carried out for a particular project in a given context). As with any weighting scheme, their use needs to be accompanied by text explanations, as sketched below.
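
A minimal sketch of what “after the fact” weightings with accompanying text explanations might look like in practice; all items, weights and rationales here are hypothetical.

```python
# Each weight is assigned once the assessment is complete and is
# paired with the text explanation that justifies it.
# All values below are hypothetical.

weights = {
    "Clear theory of change": (
        3.0, "Central: without it, no evaluation design is credible here."),
    "Baseline data available": (
        2.0, "Important, but a recall-based baseline could substitute."),
    "Stakeholders support use of findings": (
        1.0, "Relevant mainly to the dissemination stage."),
}

for item, (weight, rationale) in weights.items():
    print(f"{item} (weight {weight}): {rationale}")
```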

Resources

Davies, R. (2013). Planning Evaluability Assessments: A Synthesis of the Literature with Recommendations. DFID Working Paper 40. Available at: https://www.gov.uk/government/uploads/system/uploads/attachment_data/file/248656/wp40-planning-eval-assessments.pdf

OECD-DAC (2010). Glossary of Key Terms in Evaluation and Results Based Management. Paris: OECD-DAC. Available at: http://www.oecd.org/development/peer-reviews/2754804.pdf

Poate, D., Riddell, R., Curran, T., & Chapman, N. (2000). The Evaluability of Democracy and Human Rights Projects, Volumes 1 & 2. Available at: https://www.oecd.org/derec/sweden/46223163.pdf

Trevisan, M., & Walser, T. (2014). Evaluability Assessment: Improving Evaluation Quality and Use. SAGE Publications. Available at: https://au.sagepub.com/en-gb/oce/evaluability-assessment/book240728
