
On the way forward for SDG indicator 4.1.1a: setting the record straight

By Silvia Montoya, Director of the UNESCO Institute for Statistics, and Luis Crouch, member of the UIS Governing Board

Following the approval of the SDG monitoring framework in 2017, two comprehensive reviews were scheduled by the Inter-agency and Expert Group on SDG Indicators (IAEG-SDGs), the UN-coordinated group of countries that is charged with indicator development. The first review in 2019/20 focused on indicator methodology. The second, upcoming review in 2024/25 will focus on indicator coverage. Last October, the IAEG-SDGs issued the review criteria: ‘data must be available for at least 40 percent of countries and of the population across the different regions where the indicator is relevant; and a plan for how data coverage will be expanded must be included if current data coverage is below 50 percent’.

In the case of SDG 4, two indicators have coverage below 40%: early childhood development (4.2.1) and youth and adult literacy proficiency (4.6.1). But it is indicator 4.1.1, the percentage of students who achieve the minimum proficiency level in reading and mathematics, that has attracted the most interest. Coverage of indicator 4.1.1 is sufficient at the end of primary (4.1.1b) and the end of lower-secondary education (4.1.1c), at 46% of the population and 60% of countries. But coverage at grades 2/3 (4.1.1a) is low, at 16% of the population and 20% of countries, which led the IAEG-SDGs to reclassify it from Tier I to Tier II. The available data come from two cross-national assessment initiatives: LLECE in Latin America (grade 3) and PASEC in francophone Africa (grade 2).

Many viewed this reclassification with alarm because of the signal it might send that early grade learning matters less, even though it is an issue of global significance. A handful of blogs were written to protest – and they almost invariably asked why three other assessments have not been used to report on indicator 4.1.1a. This blog explains the issues with these assessments, recent efforts to address them, and how more countries can report on this indicator.

Three assessments have been proposed as ways to increase coverage

Of the two authors of this blog, one helped create one of these assessments (the Early Grade Reading Assessment, EGRA) and advised the two others (the Foundational Learning Module of the Multiple Indicator Cluster Surveys, MICS; and the citizen-led assessments of the People’s Action for Learning, or PAL, Network). The other author is responsible for defining the standards for reporting on the indicator – and championed the process that encouraged and ultimately convinced the IAEG-SDGs to add the early grade level to indicator 4.1.1 in 2018. We are therefore writing with both experience and a sense of responsibility in outlining the issues.

The first thing to note is that these three assessments were not originally designed for global, comparative reporting. Their potential to generate such comparable results has also not been sufficiently documented in a clear, standardized and centralized way. These assessments are administered on a one-to-one basis, either at school (EGRA) or at home (MICS and PAL Network), rather than to a group of children in a classroom.

EGRA and PAL Network assessments were created in the mid-2000s with the original intent to generate policy awareness, almost always on a country-by-country basis, by measuring concrete and easy-to-communicate skills that are precursors to reading with understanding. EGRA was used to evaluate the effectiveness of donor-funded projects, often in selected regions of a country. The PAL Network assessments were citizen-led initiatives intended to put pressure on governments to pay attention to low levels of learning. They both gained popularity and spread. They were often also used in research.

Soon after the SDGs were declared and the concept of a learning outcome indicator was floated, in the mid-2010s, the team behind the multipurpose MICS household survey also decided to develop a module to expand measurement, at a time when the UNESCO Institute for Statistics, the custodian agency of indicator 4.1.1, was beginning to define the minimum proficiency level.

What are the issues?

Contrary to what seems to be a current of opinion voiced in the above-mentioned blogs, it is important to note that there was never any reluctance to consider the potential of these assessments to be used for global reporting, as an alternative that countries could consider. But it was also noted that issues related to their original intent, which affected their design and rigour, may make them unable to withstand the scrutiny of global reporting.

These assessments:

  • are not backed by evidence, documented in an agreed-upon and centralized way, on how the transparency of each language’s orthography affects reading accuracy and therefore how results would need to be adjusted to make reporting comparable;
  • tend to measure precursor skills to reading with understanding: the level they assess is below minimum proficiency, according to the globally agreed definition as visually described in the figure;
  • vary in how they are administered, and such processes are not always centrally documented: for example, whether different assessors in one-on-one assessments reach the same conclusions on children’s learning is often not measured, not reported, or not reported in a standardized way;
  • often lack clear, accessible, and centralized documentation of their sampling (e.g. who was excluded, which children may replace those who were sampled but could not take part, whether children who could not be assessed the first time were approached again, etc.), even though such differences in survey design affect results; moreover, many samples are not nationally representative.

What are ways forward?

Some blogs have suggested ignoring these issues – in other words, ignoring the definition of minimum proficiency agreed by global consensus – in order to boost the number of countries that can potentially report. They have pointed to, for instance, how measurement of child mortality is carried out. But while there is some leeway in, say, defining what counts as a live birth, a death is a biologically clear event. In contrast, the accumulation and progression of learning is a long and culturally determined process. Accepting the results of assessments that we positively know are not measuring the minimum proficiency level and are loose in their documentation is not likely to lead to progress. What needs to be done?

A first step to overcome the issues noted above is to develop detailed criteria for reporting, to help guide these and other assessments that focus on foundational and precursor skills on how to improve in the future. This work has begun. In early December, the UIS convened a meeting of the Global Alliance to Monitor Learning, a constituent group of the Technical Cooperation Group (TCG) on SDG 4 Indicators, to chart a way forward. Criteria were proposed and then vetted by a Technical Advisory Group (TAG) in early March. Building on the TAG’s feedback, the criteria will be refined and published.

It is one thing to establish whether an assessment measures well; a second step is to map its scores onto the UIS-generated Global Proficiency Framework and Minimum Proficiency Level statements.

A third step envisaged for the future is for the UIS to vet reports to make sure they meet the criteria set out in steps 1 and 2 above.

In parallel to – and independently of – these efforts, countries will need to develop plans for how they might report this indicator to the UIS. Equipped with those plans, which will make firm commitments on how to increase reporting, the UIS can engage the IAEG-SDGs and the UN Statistical Commission in dialogue to argue for reclassifying indicator 4.1.1a back to Tier I.

Finally, in addition to all these steps, it is ultimately each country’s decision and responsibility to choose which eligible assessment it wants to use to report on the indicator – and the responsibility of organizations associated with particular assessments to support country decisions by providing them with the best possible documentation.

