Datasets

2024 (forthcoming in Cliometrica) - "A new data set on learning-adjusted years of schooling"


This paper presents the largest data set on learning-adjusted years of schooling (LAYS), a measure that combines years of schooling with learning outcomes. Our internationally comparable database covers both the quantity and the quality of schooling. The quantity dimension is measured by years of schooling and uses the latest data from Barro and Lee (2017), while the quality dimension is obtained by linking standardized, psychometrically robust international and regional achievement tests (PISA, TIMSS, SACMEQ, LLECE, PIRLS, and PASEC) and hybrid tests (EGRA and ASER). The data are available for almost all countries in the world between 1970 and 2020, although the panel is unbalanced. Several findings stand out. Learning outcomes and enrollment have both converged globally since 1970. Very few countries have improved the quality of their schooling over time, while most have kept a stable level of learning outcomes. Our LAYS indicator is more predictive of economic growth than standard years of schooling.
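As a rough illustration of how a learning-adjusted measure can combine quantity and quality, the sketch below scales years of schooling by the ratio of a harmonized test score to a benchmark score. It follows the general LAYS idea used in the wider literature and is not necessarily the exact computation in this paper; the function name, the default benchmark of 625, and the sample values are assumptions made for illustration only.

```python
def learning_adjusted_years(years_of_schooling: float,
                            test_score: float,
                            benchmark_score: float = 625.0) -> float:
    """Scale years of schooling by learning relative to a benchmark.

    All names and the default benchmark are illustrative assumptions,
    not values taken from the data set described above.
    """
    if benchmark_score <= 0:
        raise ValueError("benchmark_score must be positive")
    # Quantity (years) weighted by quality (score relative to the benchmark).
    return years_of_schooling * (test_score / benchmark_score)


# Hypothetical country: 10 years of schooling, harmonized score of 500.
example_lays = learning_adjusted_years(10.0, 500.0)
print(example_lays)
```

Under these assumed inputs, a country whose score is below the benchmark ends up with fewer learning-adjusted years than actual years of schooling.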

Download dataset (external link)

2018 - "Global Data Set on Education Quality (1965-2015)" (With Noam Angrist and Harry Patrinos)

This paper presents the largest globally comparable panel database of education quality. The database includes 163 countries and regions over 1965-2015. The globally comparable achievement outcomes were constructed by linking standardized, psychometrically robust international and regional achievement tests. The paper contributes to the literature in the following ways: (1) it is the largest and most current globally comparable data set, covering more than 90 percent of the global population; (2) the data set includes 100 developing areas, the most developing countries included in such a data set to date, and these are the countries that have the most to gain from the potential benefits of a high-quality education; (3) the data set contains credible measures of globally comparable achievement distributions as well as mean scores; (4) the data set uses multiple methods to link assessments, including mean and percentile linking methods, thus enhancing the robustness of the data set; (5) the data set includes the standard errors for the estimates, enabling explicit quantification of the degree of reliability of each estimate; and (6) the data set can be disaggregated across gender, socioeconomic status, rural/urban, language, and immigration status, thus enabling greater precision and equity analysis. A first analysis of the data set reveals a few important trends: learning outcomes in developing countries are often clustered at the bottom of the global scale; although variation in performance is high in developing countries, the top performers still often perform worse than the bottom performers in developed countries; gender gaps are relatively small, with high variation in the direction of the gap; and distributions reveal meaningfully different trends than mean scores, with fewer than 50 percent of students reaching the global minimum threshold of proficiency in developing countries relative to 86 percent in developed countries.
The paper also finds a positive and significant association between educational achievement and economic growth. The data set can be used to benchmark global progress on education quality, as well as to uncover potential drivers of education quality, growth, and development. 

Download dataset (external link)

2014 - "International Database on Human Capital Quality : An Update" (With Claude Diebolt and Jean-Luc de Meulemeester)

The aim of this article is to propose a new database allowing a comparative evaluation of the relative performance of schooling systems around the world. We measure this performance through pupils’ achievement in standardized tests. We merge all existing regional and international student achievement tests by using a specific methodology. Compared with other existing databases, our approach innovates in several ways, especially by including regional student achievement tests and intertemporally comparable indicators. We provide a data set of indicators of the quality of student achievement for 103 countries/areas in primary education and 111 countries/areas in secondary education between 1965 and 2010.

Download dataset (Excel format)

2007 - "International Database on Human Capital Quality" (with Hatidje Murseli)

In this research, we use a methodology that yields qualitative indicators of human capital (QIHC) for 105 countries. The methodology makes survey results comparable by analysing the results of countries that took part in at least two different surveys. This allowed us to build comparable indicators of the quality of human capital for numerous countries between 1964 and 2005; our results offer a valuable point of comparison with what has been done so far.
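The idea of using countries that took part in at least two surveys can be sketched as a simple mean-linking step: the overlapping countries give a conversion factor that places one survey's scores on the scale of another. This is a simplified illustration and not the authors' exact procedure; the function name, country labels, and scores below are all hypothetical.

```python
from statistics import mean


def link_to_reference(scores_new: dict[str, float],
                      scores_ref: dict[str, float]) -> dict[str, float]:
    """Rescale scores_new onto the scale of scores_ref (illustrative only).

    The conversion factor is the ratio of mean scores among the countries
    that appear in both surveys.
    """
    overlap = sorted(set(scores_new) & set(scores_ref))
    if not overlap:
        raise ValueError("no countries took part in both surveys")
    factor = (mean(scores_ref[c] for c in overlap)
              / mean(scores_new[c] for c in overlap))
    return {country: score * factor for country, score in scores_new.items()}


# Hypothetical scores: countries A and B sat both surveys, C only the regional one.
regional = {"A": 50.0, "B": 60.0, "C": 40.0}
international = {"A": 500.0, "B": 600.0}
linked = link_to_reference(regional, international)
print(linked["C"])
```

With these made-up numbers, country C receives a score on the international scale even though it never sat the international survey, which is what allows coverage to extend beyond the participants of any single test.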

Download in STATA format:

Download cross-section dataset

Download panel dataset

Download in Excel format