Rosetta Stone: Improving the global comparability of learning assessments

By Silvia Montoya, Director of the UNESCO Institute for Statistics, and Andres Sandoval-Hernandez, Senior Lecturer, University of Bath

International large-scale assessments (ILSAs) in education are considered by many to be the best source of information for measuring and monitoring progress on several SDG 4 indicators. They currently provide information about literacy levels among children and youth from around 100 education systems, with unrivalled data quality assurance mechanisms.

However, while there are many of these assessments, they are not easy to compare, making it hard to assess the progress of one area of the world against another. Each assessment: has a different assessment framework; is measured on a different scale; and is designed to inform decision-making in different educational contexts.

For this reason, the UNESCO Institute for Statistics (UIS) has spearheaded Rosetta Stone. This is a methodological programme led by the International Association for the Evaluation of Educational Achievement (IEA) and the TIMSS & PIRLS International Study Center at the Lynch School of Education at Boston College. Its aim is to offer a strategy for countries participating in different ILSAs to measure and monitor progress on learning to feed into SDG indicator 4.1.1 in a comparable fashion. This is a pioneering effort, perhaps the first of its kind in the field of learning measurement.

The methodology and first results from this effort have just been published by the UIS in the Rosetta Stone study. It has successfully aligned the findings from the Trends in International Mathematics and Science Study (TIMSS) and the Progress in International Reading Literacy Study (PIRLS) – two international, long-standing sets of metrics and benchmarks of achievement – to two regional assessment programmes:

  • UNESCO’s Regional Comparative and Explanatory Study (ERCE; Estudio Regional Comparativo y Explicativo) in Latin America and Caribbean countries; and
  • the Programme for the Analysis of Education Systems (PASEC; Programme d’Analyse des Systèmes Éducatifs) in francophone sub-Saharan African countries.

Using the Rosetta Stone study, countries with PASEC or ERCE scores can now make inferences about the likely score range on TIMSS or PIRLS scales. This allows countries to compare their students’ achievement on IEA’s scales, specifically against the minimum proficiency level, and so to measure global progress towards SDG indicator 4.1.1. Details of the method used to produce these estimations and the limitations of their interpretation can be consulted in the Analysis Reports. The dataset used to produce Figures a and b, including standard errors, can be found in the Rosetta Stone Policy Brief.
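To make the idea of a concordance concrete, the following is a minimal sketch in Python of how a lookup from a regional score to a likely range on an international scale could be used to estimate the share of students at or above a minimum proficiency level. It is not the method documented in the Analysis Reports: the concordance bands, the cut score, the use of the range midpoint and the function names are all hypothetical and chosen only for illustration.

```python
from bisect import bisect_right

# HYPOTHETICAL concordance table, invented for illustration only.
# Each entry: (lower bound of a regional score band,
#              (low, high) likely range on the international scale).
CONCORDANCE = [
    (400, (250, 330)),
    (500, (310, 390)),
    (600, (370, 450)),
    (700, (430, 510)),
    (800, (490, 570)),
]

# ASSUMED cut score for the minimum proficiency level on the
# international scale; treat it as a parameter, not a published value.
MPL_CUT = 400


def projected_range(regional_score):
    """Return the (low, high) range for the band containing the score.

    Scores below the first band are assigned to the first band.
    """
    bounds = [lower for lower, _ in CONCORDANCE]
    index = max(bisect_right(bounds, regional_score) - 1, 0)
    return CONCORDANCE[index][1]


def share_above_mpl(regional_scores, cut=MPL_CUT):
    """Percentage of students whose projected midpoint reaches the cut.

    Using the midpoint of the range is a simplification made for this
    sketch; it ignores the uncertainty that the range itself expresses.
    """
    if not regional_scores:
        raise ValueError("regional_scores must not be empty")
    hits = 0
    for score in regional_scores:
        low, high = projected_range(score)
        if (low + high) / 2 >= cut:
            hits += 1
    return 100.0 * hits / len(regional_scores)


if __name__ == "__main__":
    sample = [450, 550, 650, 750]  # made-up regional scores
    print(share_above_mpl(sample))
```

In practice, the ranges and their standard errors would need to be carried through to the final percentage rather than collapsed to a single midpoint, which is why the published estimates are reported with standard errors.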

Percentage of students above the minimum proficiency level

Figure a. ERCE and Rosetta Stone scales

Note: ERCE is administered to grade 6 and PIRLS and TIMSS to grade 4 students; MPL = minimum proficiency level.

Figure b. PASEC and Rosetta Stone scales

Note: PASEC is administered to grade 6 and PIRLS and TIMSS to grade 4 students; MPL = minimum proficiency level.

The following are some of the key findings from the analysis:

  • Rosetta Stone opens up many possibilities for secondary analyses that can help improve global reporting on learning outcomes and facilitate comparative analyses of education systems around the globe.
  • The Rosetta Stone study results for ERCE and PASEC suggest that similar alignment can be established for other regional assessments (e.g. SACMEQ, SEA-PLM, PILNA). This would allow all regional assessments to be compared not only with TIMSS and PIRLS but also with each other.
  • As the figures show, the percentages estimated using Rosetta Stone are in many cases considerably different from those reported on the basis of PASEC and ERCE scores. In most cases, the Rosetta Stone estimates are higher than the reported percentages for ERCE and lower for PASEC. These discrepancies could be due to differences in the assessment frameworks, or to differences in the minimum performance level set by each assessment to represent SDG indicator 4.1.1. For example, while ERCE considers that the minimum performance level has been reached when students can ‘interpret expressions in figurative language based on clues that are implicit in the text’, PASEC considers that it has been reached when students can ‘[…] combine their decoding skills and their mastery of the oral language to grasp the literal meaning of a short passage’.
  • Increasing national sample sizes and adding more countries per regional assessment would further improve the accuracy of the concordance and would allow research to be conducted to explain the observed differences in the percentage of students achieving minimum proficiency when estimated with Rosetta Stone versus ERCE or PASEC.
  • Further reflection about the establishment of the minimum proficiency levels for global and regional studies that best map into the agreed global proficiency level is needed. This would ensure more accurate comparisons of the percentages of students that achieve the minimum proficiency level in each education system.

Both regional assessments and Rosetta Stone play an irreplaceable role in the global strategy for measuring and monitoring progress on learning under SDG indicator 4.1.1. Together, they allow deeper analyses at the country level and broader global comparisons and, in consequence, improve the quality and relevance of the information available to policymakers.

