Data for Development

Last changes made: 17th October 2016

This website links to a variety of datasets for empirical development economists, most of which are freely accessible. Since many users will be university students or faculty there are also links to subscription-based databases - marked $$ - since commonly their institutions will provide access to these. There are links to over 280 datasets in the macro section alone. Go to macro and micro sub-pages for direct links.

All the datasets listed are strictly for research purposes only; sources should be referenced and previous work acknowledged.

I split the data links into three separate webpages: the present one, and then separate pages for macro and micro data, where the data are organised by topic. I've lumped together all historical data (19th century or earlier) on the macro page, where you will also find all GIS/geo-spatial data. See my Stata code to analyse macro panel data. Follow me on Twitter @MEDevEcon to get updates. Disclaimer: All the links are provided in good faith, and I cannot take responsibility for pages maintained by external providers. If a link is broken you can typically still find the data if you google some of the details of the blurb provided. Of course over the past few years Our World in Data (Max Roser and his team) has provided a wealth of data (plus visualisation) which are easy to use and download.

Note: Every now and then I receive requests to 'share' my website, i.e. to allow others editorial rights on these pages - I am more than happy to receive suggestions for additional data by email, but since these are my personal webpages I am naturally unwilling to have other people make any changes they want.

My Top 10 Data Links and Tools

As of 13th April 2011, these are my personal favourites:

  1. Stata tool wbopendata which allows you to download entire 'topics' of data from the World Bank's archives. My little do-file helps you to transform these into Stata long format (so that we can carry out panel empirics).
  2. Easy-to-use geo-spatial data (including a GDP measure!) from the G-Econ research project at Yale University.
  3. The latest World Bank World Development Report Conflict, Security and Development comes with a comprehensive data file covering a wide range of sources.
  4. UN ComtradeTools and Stata: the Stata Daily blog suggests an easy way of getting trade data (using UN ComtradeTools) into Stata. See also my simple Stata 10 do-file with additional information.
  5. The Penn World Table (PWT) data compiled by the Center for International Comparison at UPenn (for the last time with version 7!) is still one of the standard resources for development economists.
  6. The World Bank Wealth of Nations dataset provides country-level data on comprehensive wealth, adjusted net saving and non-renewable resource rents indicators.
  7. The GTAP group at Purdue's AgEcon Department not only provides resources and tools for trade analysis but also free data on FDI, migration, CO2...
  8. The disaggregated ACLED (Armed Conflict Location and Events Dataset), compiled by the Centre for the Study of Civil War (CSCW) at the Peace Research Institute Oslo (PRIO).
  9. 'The' Data blog developmentdata.org by Gunilla Petterson at Sussex University.
  10. Another excellent data blog DEVECONDATA by Masayuki Kudamatsu at IIES (Stockholm University).
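
Item 1 above mentions turning World Bank downloads into long format for panel work. As a rough illustration of what such a reshape does (this is not the do-file itself; the function name, column names and figures are all made up for the example), a minimal Python sketch:

```python
def wide_to_long(rows, id_keys=("country", "indicator")):
    """Reshape rows with one column per year into one row per country-year.

    `rows` is a list of dicts; every key not listed in `id_keys` is assumed
    to be a year label such as "2000". Missing values (None) are dropped.
    """
    long_rows = []
    for row in rows:
        ids = {k: row[k] for k in id_keys}
        for key, value in row.items():
            if key in id_keys or value is None:
                continue
            long_rows.append({**ids, "year": int(key), "value": value})
    # Sort so that each country's observations appear in time order.
    long_rows.sort(key=lambda r: (r["country"], r["indicator"], r["year"]))
    return long_rows


# Hypothetical wide-format input: one row per country, one column per year.
wide = [
    {"country": "AAA", "indicator": "gdp", "2000": 1.0, "2001": 1.5},
    {"country": "BBB", "indicator": "gdp", "2000": 2.0, "2001": None},
]
panel = wide_to_long(wide)
```

Here `panel` holds three country-year rows, because the missing 2001 value for the second country is dropped, which leaves an unbalanced panel.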

Latest Additions to MEDevEcon

Note: If you've followed a Twitter link then the data should be listed below. However, searching/browsing in the Micro and Macro sites by topic may be more fruitful. Please also read the above disclaimer and additional information. If you're using long-T panel data (i.e. data from a bunch of countries for many years), I've written a few simple Stata routines for empirical analysis (Stata Journal article here, gentle introduction to the panel time series field here). Applications can be found in my research papers here.

2016

17/10/2016 Economic History: The Jordà-Schularick-Taylor Macrohistory Database covers 17 advanced economies since 1870 on an annual basis. It comprises 25 real and nominal variables. Among these, there are time series that had been hitherto unavailable to researchers, among them financial variables such as bank credit to the non-financial private sector, mortgage lending and long-term house prices. The database captures the near-universe of advanced-country macroeconomic and asset price dynamics, covering on average over 90 percent of advanced-economy output and over 50 percent of world output.

2015

17/02/2015 Economic History: The Electronic Repository for Russian Historical Statistics, compiled by Andrei Markevich at Stanford's Hoover Institution, contains a selection of basic indicators of social and economic development within seven broad topics for five historical cross-sections (1795, 1858, 1897, 1959, 2002). Subdivided in twenty-six subtopics these data can be downloaded in excel format. Data are provided for individual regions according to the administrative-territorial division of the Russian state for each cross-section. Note: the data labels/variable names and (seemingly) very detailed descriptions are all in Russian.

17/02/2015 Political Economy: Benjamin Graham at the University of Southern California has created the International Political Economy Data Resource. "In some respects, the dataset is akin to the Quality of Government Institute's "standard" panel dataset in that it includes many of the same regime-type measures and such. But this dataset has much more of an IPE focus than the QoG data and includes various measures of exchange-rate classifications, financial openness, tariffs/trade policy, membership in international organizations, and such that are not currently in the QoG data. So it doesn't cast as wide a net as the QoG data do, but it delves into IPE-related measures to a greater degree than does any of the QoG datasets."

A working paper associated with the dataset can be found here: http://papers.ssrn.com/sol3/papers.cfm?abstract_id=2534067

[Thanks to Rob O'Reilly, a political scientist at Emory University, for pointing me to this dataset - the above blurb was written by Rob as well!].

2014

10/10/2014 Intangible Investment OECD: The INTAN-Invest project run by folk at Imperial College, The Conference Board and LUISS including Carol Corrado and Jonathan Haskel has published broad sector-level data for knowledge based capital in a number of OECD countries. The data cover 8 sectors (agri, services, manufacturing, construction, etc.) in 14 countries from 1995 to 2010 and can be downloaded in excel format alongside detailed documentation from the project website.

11/09/2014 Workhorse PolEcon Database: Freedom House is one of the most commonly used data providers for all measures related to political economy. Their current version ranges from 1973 to 2014 and covers 195 countries; data can be downloaded in Excel format. [Thanks to Corcaigh (UCC) PhD student Sean O'Conner for pointing out this oversight]

14/07/2014 More Historical Data: The Department of Economic History at Lund University provides a number of very long time series of prices for agricultural products as well as prices and wages for other industrial sectors. The earliest data are for 1776, and the coverage is typically at the regional level (22 regions - whatever happened to region 9?). Data can be downloaded in excel spreadsheet format.

11/07/2014 Historical Data: The International Institute of Social History based in Amsterdam provides access to a wide variety of historical datasets: wages, prices, and exchange rates for many countries around the world.

09/07/2014 Deposit Insurance: A team of researchers at the IMF and World Bank (Asli Demirgüç-Kunt, Edward Kane and Luc Laeven) have updated a previous database on deposit insurance at private banks: they asked "national officials for information on capital requirements, ownership and governance, activity restrictions, bank supervision, as well as on the specifics of their deposit insurance arrangements." In addition to the data in excel format the link provides access to a working paper which discusses construction and provides some descriptives. Coverage is for over 100 countries in the years 2003, 2010 and 2013. [h/t Andrea Presbitero]

23/06/2014 Climate Classification: The Veterinärmedizinische Universität Wien provides detailed information on the Köppen-Geiger climate classification for the entire world: "Based on recent data sets from the Climatic Research Unit (CRU) of the University of East Anglia and the Global Precipitation Climatology Centre (GPCC) at the German Weather Service, we present here a new digital Köppen-Geiger world map on climate classification for the second half of the 20th century." Data in ASCII, as well as shape format and as grid file for GIS software.

23/06/2014 All the Ginis in one Spot: Branko L. Milanovic, of the World Bank and CUNY, has amalgamated all the different sources for measures of income inequality (Gini Coefficients) in one Stata dataset. Includes data from "Luxembourg Income Study (LIS), Socio-Economic Database for Latin America (SEDLAC), Survey of Living Conditions (SILC) by Eurostat, World Income Distribution (WYD; the full data set is available here), World Bank Europe and Central Asia dataset, World Institute for Development Research (WIDER), World Bank Povcal, and Ginis from individual long-term inequality studies" where the last item is novel to the latest update in 2013. There are also Excel files and documentation at the same link. [via @UrbDemogrphics]

23/06/2014 New WIID inequality data: UNU WIDER has updated their World Income Inequality Database (WIID) today: "The data covers new countries, and nearly 2000 new observations have been added. The current update includes observations for seven more years, with the latest observations now reaching the year 2012." Follow UNU-WIDER on twitter for updates at https://twitter.com/UNUWIDER.

26/05/2014 Everything on Social Policy: The Quality of Government Institute at the University of Gothenburg publishes the QoG Dataset in Stata, SPSS and csv format. "The aim of the QoG Social Policy Dataset is to promote cross-national comparative research on social policy output and its correlates, with a special focus on the connection between social policy and quality of government (QoG). To accomplish this we have compiled a number of freely available data sources, including aggregated public opinion data." There are three versions: a cross-section with global coverage (2002) and two panels for 40 countries, one annual (1946-2009) and one 5-yearly (1970-2005). The topics covered are Social policy, Tax system, structural conditions for social policy, Public opinion, Political indicators and Quality of government. ***NEW*** This is now also provided in Stata format via two user-written commands. ***NEW*** [Thanks to Andrea Presbitero for the pointer]

26/03/2014 Everything on Malaria: The Malaria Atlas Project (MAP) was founded in 2005 and is now led by a team based at the University of Oxford, consisting of Professor Simon Hay, Dr Peter Gething, Catherine Moyes, Professor Dave Smith, Dr Kevin Baird at the Eijkman-Oxford Clinical Research Unit in Indonesia, and Dr Andy Tatem at the University of Southampton. The project website brings together various data to provide detailed information on malaria risk. Data are often available in the form of maps (for individual years) or spreadsheets for panels. [This data source was highlighted in a recent paper by Tracy Jones which featured at the annual CSAE conference 2014]

12/03/2014 Set of health-related datasets: Plamen Nikolov, an economics PhD candidate at Harvard, has put together a handout covering a number of micro and macro datasets for development economics. The micro datasets have a distinct focus on household and health-related datasets. [Thanks to Plamen for the pointer]

2013

20/11/2013 Public Expenditure: The Statistics of Public Expenditure for Economic Development (SPEED) database, compiled by Washington-based IFPRI (International Food Policy Research Institute - their name somewhat undersells the wide-ranging research carried out in the institution), provides the most comprehensive and publicly available public expenditure information for 147 countries in the following sectors: agriculture, education, health, defense, social protection, mining, transport and communication (as well as these two sectors separately), and total expenditure for the period of 1980-2010. Data is downloadable as a vast Excel spreadsheet with lots of additional documentation. [Thanks to my friend and recent IMF recruit La-Bhus Fah Jirasavetakul for the pointer]

03/11/2013 Stock returns and disasters: Scott Baker and Nick Bloom, both at Stanford University, provide all the quarterly stock market returns and disaster data (in Stata) for 60 countries (including a considerable number of emerging economies) over the 1970-2010 time period used in their uncertainty and growth paper. They also provide the do-files to replicate all empirical results in the paper.

25/10/2013 Civil Conflict: James Fearon at Stanford University provides a number of datasets he created/compiled to analyse civil conflict. His personal website provides the Stata datasets as well as access to the academic papers he has written with various co-authors. [Thanks to my buddy Eoin McGuirk @eoinmcguirk at Berkeley/Trinity Dublin for the pointer]

25/10/2013 Women in Parliament: The Geneva-based Inter-Parliamentary Union (IPU) provides data on Women in National Parliaments from 1997 (archive link) to the present day, covering 188 countries.

[Via the Norwegian Social Science Data Services MacroDataGuide]

25/10/2013 Political Science: The International Institute for Democracy and Electoral Assistance (International IDEA) provides the Unified Database, which covers topics including Direct Democracy, Electoral Justice, Electoral System Design, Gender quotas, Voter Turnout, Voting from Abroad, Electoral Management Design, Political Finance. The Unified Database has global coverage, including data from provinces and previously existing, now dissolved, states. [Via the Norwegian Social Science Data Services MacroDataGuide]

25/10/2013 Fractionalisation: The Fractionalisation dataset, compiled by Alberto Alesina and associates, measures the degree of ethnic, linguistic and religious heterogeneity in various countries. Covering 215 countries (past and present) the dataset contains only one observation for each country. The language and religion indices are based on data from 2001. Most of the data used to compute the ethnic fractionalisation index are from the 1990s, but for some countries older data are used (as far back as 1979). [Via the Norwegian Social Science Data Services MacroDataGuide]
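
A fractionalisation index of this kind is conventionally computed as one minus the sum of squared group shares (one minus a Herfindahl index), which can be read as the probability that two randomly drawn individuals belong to different groups. A minimal Python sketch of that calculation, where the function name and the group shares are purely hypothetical and not taken from the dataset:

```python
def fractionalisation(shares):
    """Return 1 - sum(s_i^2) for a list of non-negative group sizes or shares.

    The inputs are rescaled to sum to one, so population counts or
    percentages can be passed directly.
    """
    if any(s < 0 for s in shares):
        raise ValueError("group shares must be non-negative")
    total = sum(shares)
    if total <= 0:
        raise ValueError("group shares must sum to a positive number")
    normalised = [s / total for s in shares]
    return 1.0 - sum(s * s for s in normalised)


# Hypothetical examples: a single group, two equal groups, three unequal groups.
one_group = fractionalisation([1.0])
two_equal = fractionalisation([0.5, 0.5])
three_groups = fractionalisation([50, 30, 20])
```

A country with a single group scores 0, and the index approaches 1 as the population splits into many small groups.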

25/10/2013 Roads: Columbia University hosts NASA's Socioeconomic Data and Applications Center (SEDAC) which provides gROADS, the Global Roads Open Access Data Set (1980 – 2010): "an open access, well documented global data set of roads between settlements using a consistent data model (UNSDI-T v.2) which is, to the extent possible, topologically integrated." [Via the Norwegian Social Science Data Services MacroDataGuide]

25/10/2013 Environment: The Yale/Columbia Environmental Sustainability Index (ESI) is a measure of overall progress towards environmental sustainability. The index provides a composite profile of national environmental stewardship based on a compilation of indicators derived from underlying datasets. They provide access to the ESI data and associated maps for the Pilot 2000, 2001, 2002 and 2005 versions of the ESI. "Note that because of data and methodological improvements to each subsequent version of the ESI, the country scores cannot be utilized in time-series analysis."

[Via the Norwegian Social Science Data Services MacroDataGuide]

25/10/2013 More Electoral Systems: The Democratic Electoral Systems Around the World (DES) dataset, compiled by Nils-Christian Bormann and Matt Golder, describes some of the more important electoral institutions used in legislative and presidential elections around the world in a consistent and comparative manner. In total, the data contain information on 1,197 legislative and 433 presidential elections that occurred in democracies from 1946 (or independence) through 2011. Available in Stata and Excel format with detailed codebook.

[Via the Norwegian Social Science Data Services MacroDataGuide]

25/10/2013 Electoral Systems: Comparative Study of Electoral Systems (CSES) is a collaborative program of research among election study teams from around the world. "The CSES is composed of three parts: first, a common module of public opinion survey questions is included in each participant country's post-election study. These "micro" level data include vote choice, candidate and party evaluations, current and retrospective economic evaluations, evaluation of the electoral system itself, in addition to standardized sociodemographic measures. Second, district level data are reported for each respondent, including electoral returns, turnout, and the number of candidates. Finally, system or "macro" level data report aggregate electoral returns, electoral rules and formulas, and regime characteristics." Covers >50 countries from 1996-2011.

[Via the Norwegian Social Science Data Services MacroDataGuide]

+++ Go to the Macro data website.

25/10/2013 Governance: The Centripetal Democratic Governance dataset was compiled by John Gerring, Strom Thacker and Carola Moreno for a study that examined various political institutions’ impact on the quality of governance. The dataset consists of 42 variables measuring the degree to which government institutions centralise power. In addition, the dataset contains social and economic variables and measures of bureaucratic quality from other sources. The dataset covers 225 countries and territories over the 1960-2011 time horizon. Available in Stata, ASCII and Excel format.

[Via the Norwegian Social Science Data Services MacroDataGuide]

25/10/2013 African elections: The African Elections Database (AED) created by Albert C. Nunley provides election data on 48 sub-Saharan countries, from 1990 to 2011. Each country's election page starts with a political profile. The political profile gives an overview of the political leadership and a brief history of the political situation in the country since its independence. A list of political parties (sorted alphabetically by acronym) and coalitions is found at the end of the political profile. [Via the Norwegian Social Science Data Services MacroDataGuide]

25/10/2013 Ethnicity GIS data: Nils Weidmann, Jan Ketil Rød and Lars-Erik Cederman from ETH in Zurich have created GREG: the 'Geo-referencing of ethnic groups' dataset employs geographic information systems (GIS) to represent group territories as polygons, covering a total of 8969 polygons, provided in ESRI shapefile format.

24/10/2013 African Infrastructure: The African Development Bank provides the Africa Interactive Infrastructure Atlas in the form of PDFs and GIS maps, covering ICT (International gateways, Backbone networks, and GSM coverage), Power (Power plants and Transmission network) and Transport (Airports and air traffic, Ports and sea traffic, Roads (condition and traffic), and Railways). For the interactive PDFs you need to download and open them in Acrobat Reader - the interactive element won't work in your web-browser. Raw data and models employed to create the maps are also available for download.

24/10/2013 Child Mortality estimates: The UN Inter-agency Group for Child Mortality Estimation (IGME) provides various estimates on their website. These can be downloaded in Excel format and cover almost 200 economies with estimates reaching as far back as 1950 - for the U5MR I looked at there were not just median estimates but upper and lower bounds.

07/05/2013 Project-level data on Chinese finance projects in Africa: AidData, a partnership between Brigham Young University, the College of William and Mary, and a non-profit development organization, Development Gateway, has released a new database that captures China's development finance activities in Africa. "This database will provide a foundation for researchers, policymakers, journalists, and civil society organizations to analyze the distribution and impact of Chinese development finance to the region. The database contains nearly 1,700 official finance projects in 50 African countries, totaling over $70 billion in reported financial commitments [...] The dataset uses a media-based data collection methodology developed by AidData, which helps synthesize and standardize vast amounts of project-specific information contained in thousands of English and Chinese language media reports." Data can be downloaded in full (excel) or visually analyzed. Right now the database runs from 2000 to 2011.

+++ Go to the Micro data website.

02/05/2013 Maps: A vast repository of digitised maps is now available on the web. "The David Rumsey Map Collection was started over 25 years ago and contains more than 150,000 maps. The collection focuses on rare 18th and 19th century maps of North and South America, although it also has maps of the World, Asia, Africa, Europe, and Oceania." This covers maps from periods of the 1700s to the 1950s.

25/04/2013 Banking competition: Sofronis Clerides, Manthos D. Delis and Sotirios Kokas from the Department of Economics, University of Cyprus (Delis is at the University of Surrey) have created a dataset for the estimated degree of competition in the banking sectors of 148 countries over the period 1997-2010. The dataset is contained in tables at the end of their working paper, so a bit of copy and pasting will do the job. You'll find some relevant work on banking regulation and competition (including a 2012 JDE paper) on Delis' personal website.

25/04/2013 Macro data: The data aggregation website Quandl provides access to a vast number (they say 5m) of data series for countries around the world. This resource picks up data from various well-known sources (e.g. IMF, World Bank) and links to them from their own easy-to-use website. Perhaps the best feature is the set of dedicated programs for R, Python, Matlab, Excel, Maple, Julia, Clojure [they're making up these names, or is there really a stats program called Julia?] and also Stata. The latter can be installed by typing "ssc install quandl" in Stata (see helpfile for syntax). The only downside so far seems to be that you cannot download panel data like for instance in the World Bank WDI Stata command wbopendata. Maybe Felix Leung is already busy coding that feature...

02/04/2013 Macroeconomic disasters: Robert Barro at Harvard University provides a number of datasets on macroeconomic disasters related to his own research work on his website. This includes GDP and consumption time series for developing and developed economies from the late 19th century onward. The data can be downloaded in Excel format and related working papers are provided in a separate section of the website.

02/04/2013 Credit to the Private Sector: The Bank for International Settlements (BIS) "has constructed long series on credit to the private non-financial sector for 40 economies, both advanced and emerging. Credit is provided by domestic banks, all other sectors of the economy and non-residents. The 'private non-financial sector' includes non-financial corporations (both private-owned and public-owned), households and non-profit institutions serving households as defined in the System of National Accounts 2008. In terms of financial instruments, credit covers loans and debt securities." The data is quarterly from 1940 to 2012 (unbalanced panel) and can be downloaded in excel format alongside detailed documentation.

[Thanks to my buddy Andrea Presbitero at Università Politecnica delle Marche for the pointer]

28/03/2013 Bank ownership: Stijn Claessens and Neeltje van Horen from De Nederlandsche Bank have compiled a database with ownership information for 5,324 banks active in 137 countries over the period 1995-2009 (year of establishment can be earlier and is recorded; Banca Monte dei Paschi di Siena is the oldest in this database, established 1472). "It includes for each bank its year of establishment, its year of inactivity, its ownership (foreign or domestic) and if foreign owned the home country of the majority shareholder." Downloadable as an Excel file. Sadly no information about state-share of ownership (e.g. recent nationalisation following the global financial crisis). For detailed description of the database, see Claessens and Van Horen, 2013, "Foreign banks: Trends and impact", Journal of Money, Credit and Banking, forthcoming. [Thanks to my buddy Andrea Presbitero at Università Politecnica delle Marche for the pointer]

26/03/2013 Barriers to trade: A World Bank research team including Kee Hiau Looi, Alessandro Nicita and Marcelo Olarreaga has devised Overall Trade Restrictiveness Indices (OTRI) for aggregate trade, as well as manufacturing and agricultural trade separately. For now this measure is only available for 2009, but the team suggests that this will be updated once new data become available. "The Overall Trade Restrictiveness Index (OTRI) summarizes the trade policy stance of a country by calculating the uniform tariff that will keep its overall imports at the current level when the country in fact has different tariffs for different goods. In a nutshell, the OTRI is a more sophisticated way to calculate the weighted average tariff of a given country, with the weights reflect the composition of import volume and import demand elasticities of each imported product." These data as well as demand elasticities are available in excel/CSV format, a number of papers by the authors in the EJ and REStat are also referenced/linked.

26/03/2013 Debt, crises, etc.: The companion website to Carmen Reinhart and Ken Rogoff's This time is different: Eight centuries of financial follies provides access to all the great data they compiled for their research. Topics covered include very long time series for debt/GDP ratio, inflation, exchange rate regimes and many more.

21/02/2013 Banking regulation: James R. Barth, Gerard Caprio, Jr. and Ross Levine (Auburn, Williams, UC Berkeley) have compiled data on Bank Regulation and Supervision in 180 Countries from 1999 to 2011. "[T]he measures are based upon responses to hundreds of questions, including information on permissible bank activities, capital requirements, the powers of official supervisory agencies, information disclosure requirements, external governance mechanisms, deposit insurance, barriers to entry, and loan provisioning. The dataset also provides information on the organization of regulatory agencies and the size, structure, and performance of banking systems. Since the underlying surveys are large and complex, we construct summary indices of key bank regulatory and supervisory policies to facilitate cross-country comparisons and analyses of changes in banking policies over time." Me-thinks: Was there no banking regulation before 1999?

06/02/2013 World KLEMS: I recently had the opportunity to find out more about the India part of the new World KLEMS project, which like its EU namesake involves the University of Groningen with a number of international partners. Asia, India and Latin America KLEMS are not live yet but China KLEMS, involving the China Industrial Productivity Database 2011, is live. If you're interested in sectoral data it is also worth noting that Margaret McMillan (IFPRI/Tufts) together with Dani Rodrik (Harvard), Jon Temple (Bristol) and Marcel Timmer (Groningen) started an ESRC/DfID-funded project last year with the aim of constructing "a harmonised long-term sectoral dataset for several countries in Sub-Saharan Africa... This dataset will consist of time series information on value added in international prices and employment for ten broad economic sectors for the period from 1960 to 2010." Guess it doesn't harm pointing to my recent work on the analysis of sectoral data vs aggregate data forthcoming in the World Bank Economic Review.

06/02/2013 Trade Costs: The World Bank provides the Trade Costs Dataset which contains estimates of bilateral trade costs in agriculture and manufactured goods for the 1995-2010 period. It is built on trade and production data collected in 178 countries. Symmetric bilateral trade costs are computed using the Inverse Gravity Framework (Novy 2009), which estimates trade costs for each country pair using bilateral trade and gross national output.
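
To give a sense of how an inverse-gravity measure of this type is built, the tariff-equivalent trade cost in Novy's framework is usually written as the geometric mean of the two countries' ratios of domestic to bilateral trade, raised to a power that depends on the elasticity of substitution, minus one. The Python sketch below is illustrative only: the function name and the numbers are hypothetical, the default elasticity of 8 is an assumed value, and how domestic trade is measured (for instance gross output less total exports) is likewise an assumption rather than a description of the World Bank's exact procedure.

```python
def novy_trade_cost(x_ii, x_jj, x_ij, x_ji, sigma=8.0):
    """Tariff-equivalent bilateral trade cost between countries i and j.

    x_ii, x_jj : domestic (intranational) trade of i and j
    x_ij, x_ji : exports from i to j and from j to i
    sigma      : elasticity of substitution (assumed value, must exceed 1)
    """
    if sigma <= 1:
        raise ValueError("sigma must be greater than 1")
    if min(x_ii, x_jj, x_ij, x_ji) <= 0:
        raise ValueError("all trade flows must be strictly positive")
    ratio = (x_ii * x_jj) / (x_ij * x_ji)
    return ratio ** (1.0 / (2.0 * (sigma - 1.0))) - 1.0


# Hypothetical flows: when countries trade as much with each other as at home,
# the implied bilateral cost is zero.
frictionless = novy_trade_cost(100.0, 100.0, 100.0, 100.0)

# Hypothetical flows where domestic trade dwarfs bilateral trade.
costly = novy_trade_cost(128.0, 128.0, 1.0, 1.0)
```

The measure is symmetric by construction: swapping the two countries leaves the result unchanged, which is why the dataset reports one cost per country pair.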

06/02/2013 Prussian Statistics: The ifo Prussian Economic History Database (iPEHD) is a county-level database covering a rich collection of variables for all counties of Prussia during the 19th century. The Royal Prussian Statistical Office collected these data in a number of censuses over the period 1816-1901 (over 600,000 observations), with much county-level information surviving in the archives. These data provide a unique treasure for unprecedented micro-regional empirical research in economic history, analyzing the importance of such factors as education, religion, fertility, and many others for the economic development of Prussia in the 19th century. Excellent documentation is provided. [Thanks to Branko Milanovic (@BrankoMilan) who tweeted about this data source]

17/01/2013 Export for Development: A team at the World Bank comprising Tolga Cebeci, Ana M. Fernandes, Caroline Freund, and Martha Denisse Pierola have come up with the Exporter Dynamics Database. This presently covers around 45 developed and developing countries, covering mainly 2003-2009 but also the 1990s for some countries. "It allows for cross-country comparisons of exporters based on factors such as size, survival, growth, and concentration. More countries will be added as the database expands. Until now, most databases focus not on exporting firms, but on the aggregate flow of goods across borders based on countries or products." Melitz will be happy!

2012

21/12/2012 Migration to OECD countries: Giovanni Peri of UC Davis has now published the bilateral migration data used in some of his recent work (aka the Ortega-Peri Database). These can be downloaded in Stata format from Giovanni's personal website where the papers are also available. The data cover 1980-2008, 15 migration destinations in the developed world and 221 migration source countries. [Thanks to Chris Parsons at Oxford's International Migration Institute for the pointer; Chris' own efforts have helped to build a decadal bilateral migration matrix which includes developing economies as recipient countries]

21/12/2012 More debt data: The World Bank's International Debt Statistics are now available as part of the institution's Open Data Initiative: "high frequency, quarterly, external and public debt data for both high-income and developing countries collected and compiled by the World Bank in partnership with the International Monetary Fund. Now users can not only examine trends in debt flows within the developing world, but also take a closer look at the external debt of high-income countries, and develop a more complete understanding of global financial flows". Picking the standard measure of external debt burden (in % of GNI) I found data from 1970 to 2011 for around 140 countries (unbalanced). A large number of more differentiated data are available, with varying time series and cross-section coverage.

+++ Go to the Macro data website.

28/11/2012 Debt restructuring: Christoph Trebesch at Munich University (LMU) provides data on debt restructuring episodes from 1950-2010 from a research project with Udaibir Das and Michael Papaioannou (IMF). The data can be downloaded in Excel format and provides information on the timing (month/year) of the restructuring, amount, etc. Over 600 episodes are recorded. An accompanying IMF working paper provides details on concepts and reviews the existing literature.

+++ Go to the Macro data website.

22/11/2012 More econ history data: Patrick Manning at the World History Center, University of Pittsburgh, is governor to the World-Historical Dataverse Project, which is "intended to contribute to the creation of a comprehensive set of data on social-scientific, health, and environmental data for the world as a whole and for its constituent regions and localities, for the past four or five centuries." At present a total of nineteen datasets are linked but I imagine this is going to increase soon. [Also check out Manning's article in the Journal of Comparative Economics "Historical datasets on Africa and the African Atlantic" (subscription required) from which the previous entry was taken]

+++ Go to the Macro data website.

21/11/2012 African historical trade data: Data on Anglo-African Trade (1699-1808), originally compiled by Marion Johnson, is available on the Dutch Data Archiving and Networked Services webpages, crediting J. Th. Lindblad at Leiden. "This dataset contains figures on the trade between England and Africa during the period 1699-1808: imports, exports, re-exports and indirect imports. A distinction is made between different trade flows (Londen, outports, re-exports in time and out of time, etc.). Quantities and values are given for 1100 different commodities in the eighteenth century, units (also decimalized) and pounds. Aggregates are given for each year and for each type of trade. The dataset also includes the total trade figures for England between 1700 until 1800. The dataset has been created for research purposes, in order to analyse the trade between England and Africa in the eighteenth century." Documentation is limited and you have to register and log in to get access to these data (in txt format).

+++ Go to the Macro data website.

14/11/2012 Looong-run temperature data: Michael E. Mann, Raymond S. Bradley, and Malcolm K. Hughes provide the data to go with their 1998 Nature article entitled 'Global-Scale Temperature Patterns and Climate Forcing over the past Six Centuries'. There are annual gridded temperature data for 1730-1980 and even longer time series going back to the 1400s. [Thanks to James Fenske at Oxford for pointing out this database]

+++ Go to the Macro data website.

07/11/2012 More Economic History: A team of researchers headed by David Eltis and Martin Halbert (both at Emory University in Atlanta) provide a fantastic resource for the empirical analysis of the slave trade: The Trans-Atlantic Slave Trade Database "comprises nearly 35,000 individual slaving expeditions between 1514 and 1866. Records of the voyages have been found in archives and libraries throughout the Atlantic world. They provide information about vessels, enslaved peoples, slave traders and owners, and trading routes. [...] The website provides full interactive capability to analyze the data and report results in the form of statistical tables, graphs, maps, or on a timeline." The dataset contains 99 variables and is made available in three formats: SPSS (.sav), comma delimited (.csv), and dBase (.dbf). [Thanks to James Fenske at Oxford for pointing out this database]

+++ Go to the Macro data website.

18/10/2012 Economic History: The European State Finance Database is an open repository for economic historians co-managed by Dr D’Maris Coffman (Centre for Financial History, Newnham College, University of Cambridge) and Dr Anne Murphy (University of Hertfordshire). It "represents the outcome of an international collaborative research project for the collection, archiving and dissemination of data on European fiscal history across the medieval, early modern and modern periods." At the moment there are links to around 60 datasets, covering Spanish crown finance, Restoration Excise Receipts (1660-1708) and many other interesting-sounding datasets. Definitely a treasure trove for empirically-minded economic historians. [Thanks to my buddy Mark Koyama for pointing out this database]

+++ Go to the Macro data website.

29/09/2012 FDI for GER/USA/JPN: The Kiel Institute for the World Economy provides very detailed foreign-investment data for three OECD economies, namely Germany, Japan and the United States. The data are annual for 1980 to 2010 and give you the share of each of these three countries' sectoral investment in geographic regions (and a small group of named countries within each region outside the OECD) as a percentage of total sectoral FDI.

+++ Go to the Macro data website.

29/09/2012 A Smörgåsbord of Data: The somewhat ominously-named Economics Web Institute provides a great deal of data for download, much of it focused on OECD economies, but also some LDC gems. You need to look through the individual links and perhaps download some stuff to see what's on offer, as there is typically not much info on coverage (countries, years, level of (dis-)aggregation). A link here led me to the Princeton-based International Networks Archive, which also provides lots of data, organised under headlines. Many of the datasets assembled seem dated, but are nevertheless worth a look. If links are broken then the 'Sources' information may offer some leads.

+++ Go to the data repositories linked at MEDevEcon.

17/09/2012 Replication data for aid regressions: A fantastic resource for fans of aid empirics is provided by AidData: replication data for a vast number of empirical papers related to aid and development (all those Tarp et al, Rajan and Subramanian, Burnside and Dollar, Roodman papers) are linked or provided for download. [Thanks to Paddy Carter at Bristol for the link]

+++ Go to the Macro data website.

14/09/2012 More Data on Financial Development: The World Bank (Cihák, Demirgüç-Kunt, Feyen & Levine) provides the Global Financial Development Database (GFDD) which covers 1960-2010 for 203 countries. "The Global Financial Development Database is based on this 4x2 framework. It builds on, updates, and extends previous efforts, in particular the data collected for the “Database on Financial Development and Structure”, the Financial Access Survey, the Global Findex and Financial Soundness Indicators. The database includes measures of (a) size of financial institutions and markets (financial depth), (b) degree to which individuals can and do use financial services (access), (c) efficiency of financial intermediaries and markets in intermediating resources and facilitating financial transactions (efficiency), and (d) stability of financial institutions and markets (stability). The dataset can be used to document cross-country differences and time series trends." Data can be downloaded in an Excel file and there is additional documentation.

+++ Go to the Macro data website.

28/08/2012 Agriculture in Africa: HarvestChoice, a collaboration between IFPRI and researchers at the University of Minnesota, "generates knowledge products to help guide strategic investments to improve the well-being of poor people in sub-Saharan Africa through more productive and profitable farming." A vast number of datasets related to agricultural production, markets, demography, climate, etc. is available for download from their website. A lot of emphasis is placed on spatial/GIS data with further tools for map-making etc. available on the website. There is also a wealth of publications and policy briefs on all topics related to production, R&D and innovation in agriculture (includes analysis of US data).

+++ Go to the Macro data website.

02/08/2012 Services Trade Restrictions: The World Bank’s Services Trade Restrictions Database provides comparable information on services trade policy measures for 103 countries, five sectors (telecommunications, finance, transportation, retail and professional services) and key modes of delivery. "Compared to the vast empirical literature on policies affecting trade in goods, the empirical analysis of services trade policy is still in its infancy. One major constraint has been inadequate data on policies affecting services trade. Our limited knowledge of the pattern of services policy contrasts with the importance of services. Today, some 80 percent of GDP in the United States and the European Union originates from services, and the proportion is well over 50 percent in most countries, industrial and developing alike."

+++ Go to the Macro data website.

18/07/2012 Micro data on migration and development: Britain's ippr in partnership with the Global Development Network (GDN) provides data from a major project on migration and development, aimed to assess migration’s impacts, collect evidence on those impacts, help to build research capacity on migration and development issues in developing countries and examine fresh policy options for improving migration’s contribution to development. Apart from rich qualitative data the researchers collected new nationally-representative household surveys in Colombia, Fiji, Georgia, Ghana, Jamaica, Macedonia and Vietnam. The final implemented survey questionnaires are also provided alongside the datasets, which are provided in Stata format. [This project was featured in a recent tweet by CGD's Michael Clemens @m_clem]

+++ Go to the Micro data website.

18/07/2012 Historical Prices: David Ormrod, James M. Gibson and Owen Lyne (University of Kent) provide decadal time series data on rent movements in London and the South-East of England for 1580-1914. Their paper lends support to the notion that ‘the city drove the countryside, not the reverse’ in terms of development, which is (said to be) expressed in research by Bob Allen amongst others. The debate seems relevant to development economics today where some people talk about anti-agriculture bias and suggest that because the largest share of the work force is engaged in agriculture this sector must be the focus of development/policy efforts.

+++ Go to the Macro data website. Note that I have placed all historical data on the 'Macro' website, regardless of unit of observation.

16/07/2012 More on Inequality: The World Bank Development Economics Department has developed the Global Income Distribution Dynamics (GIDD), the first global CGE-microsimulation model. "The GIDD takes into account the macro nature of growth and of economic policies and adds a microeconomic—that is, household and individual—dimension to it. The GIDD includes distributional data for 121 countries and covers 90 percent of the world population." The data cover the period 1992 to 2005 (survey year), although most observations are in 2000-2002 and 2005. "The GIDD database is not a mere compilation of secondary cross-country inequality indices. Instead, it is an actual presentation of a truly global income distribution based entirely on household survey data. Additionally, the GIDD global income distribution data includes information on the conditional distribution of important household income determinants like education, age, household size, among others." There is extensive documentation for the data on the website, together with a research agenda and recent work by the group. The data can be downloaded as an Excel spreadsheet or as a Stata file. [This is used in a recent paper on trade in agriculture and global poverty by Bussolo, De Hoyos and Medvedev (all World Bank) in World Economy Vol.34(12), December 2011.]

+++ Go to the Macro data website.

09/07/2012 Terrorism: The Global Terrorism Database (GTD) compiled by the National Consortium for the Study of Terrorism and Responses to Terrorism (START) is "an open-source database including information on terrorist events around the world from 1970 through 2010 (with additional annual updates planned for the future) ... [The] GTD includes systematic data on domestic as well as transnational and international terrorist incidents that have occurred during this time period and now includes more than 98,000 cases. For each GTD incident, information is available on the date and location of the incident, the weapons used and nature of the target, the number of casualties, and --when identifiable-- the group or individual responsible." You need to register to gain access to the data. [This is used in research by Walter Enders (the time-series man) and Gary Hoover, both of the University of Alabama, presented at the Chicago AEA and discussed in the AER P&P 102(3), pp.267-272. Hoover, incidentally, has an interesting survey of journal editors about plagiarism/research ethics.]

+++ Go to the Macro data website.

26/06/2012 Inequality: The University of Texas Inequality Project provides the UTIP-UNIDO dataset which calculates industrial pay-inequality measures for 156 countries from 1963-2003. It has a total of 3,554 observations based on the UNIDO Industrial Statistics, thus representing a very large cross-section dimension and containing annual data. A paper detailing the methodology of data construction can be found here. [This is used in research by Silke Bumann, a PhD student at Groningen University in the Netherlands]

+++ Go to the Macro data website.

23/05/2012 Human Capital: The Washington-based Education Policy and Data Center (EPDC) "provides global education data, tools for data visualization, and policy-oriented analysis aimed at improving schools and learning in developing countries." They say they have "the world’s largest international education database with over 3.8 million data points from 200 countries. The data comes from national and international websites including household survey datasets as well as studies and reports." This is not just macro data, but also household surveys and census data; another very useful thing they do is to provide Stata do-files to construct indicators from the hh data.

+++ Go to the Macro data website. +++ Go to the Micro data website.

15/05/2012 Terrorism: Haverford College in the United States hosts the Global Terrorism Resource Database, compiled by Nicholas Lotito (class of 2010), and updated by Katie Drooyan (class of 2011), under the direction of Professor Barak Mendelsohn. Although the bulk of terrorism research findings are presented via traditional literature (e.g. articles, journals, reports, and press releases), this database focuses on other sources. In particular, this database lists sources for raw datasets and databases that combine a significant number of resources (e.g. the US government's Worldwide Incidents Tracking System; the Global Terrorism Database compiled by the National Consortium for the Study of Terrorism and Responses to Terrorism (START) at the University of Maryland; Al-Qa’ida Attacks: 1994-2007 – RAND Corporation; International Terrorism: Attributes of Terrorist Events (ITERATE); University of Oklahoma and the University of Arkansas American Terrorism Database; and many more). The site also hosts the Al-Qaeda Statements Index, a student-created Haverford resource.

+++ Go to the Macro data website.

30/04/2012 Financial Inclusion: The Global Financial Inclusion (Global Findex) Database is a project funded by the Bill & Melinda Gates Foundation to measure how people in 148 countries --- including the poor, women, and rural residents --- save, borrow, make payments and manage risk. The dataset has been compiled by Leora Klapper and Asli Demirguc-Kunt of the World Bank and can be downloaded from the World Bank Open Data website (there are a total of 517 indicators for a max of 164 countries --- at the moment this is for 2011 only).

+++ Go to the Macro data website.

20/04/2012 Global Health Database: The Institute for Health Metrics and Evaluation (IHME) in Seattle has created GHDx, the Global Health Data Exchange. This is an excellent data resource, a "catalog of the world's health and demographic data. Use the GHDx to research population census data, surveys, registries, indicators and estimates, administrative health data, and financial data related to health." Follow IHME on Twitter: @IHME_UW - they've already got 1,200 followers so their tweets are obviously very useful.

+++ Go to the Micro data website.

19/04/2012 Brazilian Data: The Institute for Applied Economic Research (Ipea) in Brazil provides a range of macro data for the country and its regions. The link is for the Portuguese site; there's also an English version. [Thanks to Manoel Bittencourt, Senior Lecturer at the University of Pretoria/South Africa, for the link]

+++ Go to the Macro data website.

19/04/2012 Bolivian Household Survey: The National Statistical Office of Bolivia provides access to a number of demographic and health surveys, as well as income expenditure surveys for the 1989-2009 period. The website is in Spanish and registration (free) is required. [Thanks to Gustavo Canavire-Bacarreza, graduate student at Georgia State in Atlanta, for the link]

+++ Go to the Micro data website.

11/04/2012 Taxes: The World Tax Indicators (WTI) at the International Center for Public Policy, Georgia State University, offers extensive coverage of the Personal Income Tax (PIT), Corporate Income Tax (CIT), and Value Added Tax (VAT)/ Retail Sales Tax (RST) with greater year, country, and tax category coverage than is currently offered via existing data portals. The WTI uses the raw data to develop several tax indicators such as time varying measures of PIT structural progressivity and tax complexity and offers a large representative dataset with variables that are consistent within countries over time. PIT data is already available for download as Excel or Stata files (175 countries or more, depending on measure; up to 25 years of data), including substantial documentation (brief registration required).

+++ Go to the Macro data website.

02/04/2012 Aquastat: The Food and Agriculture Organisation (FAO) of the UN publishes AquaStat which represents a "global information system on water and agriculture, developed by the Land and Water Division. The main mandate of the programme is to collect, analyze and disseminate information on water resources, water uses, and agricultural water management with an emphasis on countries in Africa, Asia, Latin America and the Caribbean." A bit more specifically, the main Aquastat database reports 70 variables under the headings 'Land use and population', 'Climate and water resources', 'Water use (by sector and by source)', 'Irrigation and drainage development' and 'Environment and health' for 5-year intervals from 1958-1962 onwards for a large number of countries. Other databases include the excellent 'Geo-referenced database on dams' and data on 'River sediment yields'. The data can be exported in CSV format.

+++ Go to the Macro data website.

30/03/2012 Mexican Household Survey: The Mexican Family Life Survey (MxFLS) is a multi-thematic and longitudinal database which collects, with a single scientific tool, a wide range of information on socioeconomic indicators, demographics and health indicators on the Mexican population. MxFLS is the first Mexican survey with national representation departing from a longitudinal design, tracking the Mexican population for long periods of time regardless of migration decisions with the objective of studying the dynamics of economy, demographics, epidemiology, and population migration throughout this panel study of at least a 10-year span. The data can be downloaded in Stata format.

+++ Go to the Micro data website.

23/03/2012 Macro indicators and rankings: The (deep breath) European Commission Joint Research Centre's Institute for the Protection and Security of the Citizen have a very nice website of Statistical Sources gathering links to various datasets from a wide range of institutions (FAO, IMF, OECD, UN, World Bank). The FAO databases are particularly interesting, as are the SIPRI data (see 21/3/2012).

+++ Go to the Macro data website.

21/03/2012 Dams in Africa and the Middle East: As part of AQUASTAT the Food and Agriculture Organisation (FAO) of the UN provides databases for over 1,300 dams in Sub-Saharan Africa and over 1,100 dams in the Middle East/Central Asia (excel files with substantial documentation). Each dam is geo-referenced and additional information includes dam height, capacity, reservoir area, river, nearest city, among others.

+++ Go to the Macro data website.

21/03/2012 War and Peace: The Stockholm International Peace Research Institute (SIPRI) has a number of extremely detailed databases related to military expenditure, arms transfers, arms embargoes as well as multilateral peace operations. The arms transfers database, for instance, includes trade registers with information on each deal including, inter alia, the suppliers and recipients, the type and number of weapon systems ordered and delivered, the years of deliveries, and the financial value of the deal. Some of the data can be downloaded as Excel files, others in Word rich text format.

+++ Go to the Macro data website.

21/03/2012 Russia: The Russia Longitudinal Monitoring Survey (RLMS) is a series of nationally representative surveys designed to monitor the effects of Russian reforms on the health and economic welfare of households and individuals in the Russian Federation. These effects are measured by a variety of means: detailed monitoring of individuals' health status and dietary intake, precise measurement of household-level expenditures and service utilization, and collection of relevant community-level data, including region-specific prices and community infrastructure data. Data have been collected 19 times since 1992. Of these, 15 represent the RLMS Phase II, which has been run jointly by the Carolina Population Center at the University of North Carolina at Chapel Hill, headed by Barry M. Popkin, and the Demoscope team in Russia, headed by Polina Kozyreva and Mikhail Kosolapov. You need to register to get access to the data and describe your research project. In return the website is probably one of the best I've come across to give information about the data and what has been done with it [This link features on Stefania Lovo's website].

+++ Go to the Micro data website.

21/03/2012 Emerging Economies: As part of a project analyzing poverty and social assistance in the transition economies a team at the World Bank under the guidance of Branko Milanovic have created HEIDE (Household Expenditure and Income Data for Transitional Economies), a very large integrated household and individual-level dataset for nine Eastern European economies in 1993. The (Stata) data covers expenditure, income, assets, household descriptives, individual characteristics and amounts to a total of around 3 million observations. There are files describing variables, data cleaning etc. and a link to a working paper about the project. [This link features on Stefania Lovo's website].

+++ Go to the Micro data website.

18/02/2012 Labour market regulations: Mariya Aleksynska and Martin Schindler at the IMF have created a new database of labor market regulations covering 1980-2005 in 91 countries, including low-, middle- and high-income countries. The database contains information on unemployment insurance systems, minimum wage regulations, and employment protection legislation. [Thanks to my former PhD colleague Bob Rijkers, now at the World Bank, for the link].

+++ Go to the Macro data website.

18/02/2012 Emerging and developing economies: Geert Bekaert and Campbell R. Harvey at Duke have compiled a country risk database which provides 'A Chronology of Important Financial, Economic and Political Events in Emerging Markets' for 55 countries. The data is presented on country-specific websites so you'll have a little copying and pasting to do before you can analyse the data. [Thanks to my former PhD colleague Bob Rijkers, now at the World Bank, for the link].

+++ Go to the Macro data website.

10/01/2012 Long-run growth: Jerry Dwyer at the Federal Reserve Bank of Atlanta provides data from his 2006 Economic Inquiry article with Scott L. Baier and Robert Tamura. This covers output, physical and human capital for 145 countries over a long time horizon (1831-2000); the data provides between 2 and 17 time-series observations per country, with an average of around 7. Additional variables of particular interest include average age and experience of the workforce, which allow for Mincerian wage equation-type analysis at the macro level. The data is provided in a neat excel file with additional information on variable definition and construction also provided (along with the article).

+++ Go to the Macro data website.
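
With an unbalanced panel like the long-run growth data above, it is worth checking how many time-series observations each country actually has before running any regressions. The following is only a minimal sketch in Python: the country codes and years are invented placeholders, not values from the Baier, Dwyer and Tamura file, and the variable names are my own.

```python
from collections import Counter

# Hypothetical (country, year) pairs, invented purely for illustration.
# They are NOT taken from the actual dataset.
observations = [
    ("AAA", 1850), ("AAA", 1900), ("AAA", 1950), ("AAA", 2000),
    ("BBB", 1950), ("BBB", 2000),
    ("CCC", 1900), ("CCC", 1950), ("CCC", 2000),
]

# Count the number of time-series observations per country.
obs_per_country = Counter(country for country, _ in observations)

min_obs = min(obs_per_country.values())
max_obs = max(obs_per_country.values())
mean_obs = sum(obs_per_country.values()) / len(obs_per_country)

print("min:", min_obs, "max:", max_obs, "mean:", mean_obs)
```

The same three numbers (minimum, maximum and average observations per country) are what the blurb above reports for the real data.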

2011

28/11/2011 Conflict and Peacekeeping: Page Fortna at Columbia University has a couple of interesting datasets for the analysis of civil war and interstate conflict. 'Peacekeeping and the Peacekept: Data on Peacekeeping in Civil Wars 1989-2004' and 'The Cease-Fires Data Set: The Duration of Peace after Interstate Wars 1946-1994' are provided in Stata format together with some more information on the data. Page's own research papers (on the same site) should also be insightful. [Thanks to Martha Ross at Nottingham for the pointer]

+++ Go to the Macro data website.

25/11/2011 Ghanaian Cocoa Farmers: Fellow CSAE member Andy Zeitlin provides data and background material on a survey of Ghanaian Cocoa Farmers in which he has been involved for a considerable number of years now. The data is now available in 5 waves from 2002 to 2010. Please note: "[t]he data are available in Stata format for public use, and the CSAE is very happy for these to be used. I only ask that you contact me to let me know if you are planning to make use of these data." On Andy's research page (link above) you can find a couple of his papers using this unique dataset.

+++ Go to the Micro data website.

25/11/2011 Management and Motivation in Ugandan Primary Schools: Fellow CSAE member Andy Zeitlin provides data and background material on a project which investigates the impact of strengthening information flows on learning outcomes in rural, government primary schools in Uganda. "The baseline survey includes data collected in 100 schools, in 4 districts. This field exercise included collection of a school-level survey instrument, standardized testing of pupils in P3 and P6, and individual questionnaires administered to a sample of head teachers, teachers, School Management Committee members, and parents. Data from the baseline survey are available in Stata format, together with supporting documentation." You should also check out the papers Andy has written with my colleague Abigail Barr using the data, which are available in the 'Research' section of his website.

+++ Go to the Micro data website.

25/11/2011 Field Experiments in Development: John List at University of Chicago has created a website where he lists "publications and discussion papers in experimental economics that make use of the 'field' in some manner". The information includes a link to the paper, year of publication and sometimes JEL codes. Papers are classified into three categories: "1. Artefactual field experiments, which are the same as conventional lab experiments but with a non-standard subject pool (i.e., non-students). Running Peruvian borrowers through lab games (Karlan, 2005 AER) would be an example of an artefactual field experiment. 2. Framed field experiments, which are identical to artefactual field experiments but with field context in either the commodity, task, or information set that the subjects use. An example would be work that elicits valuations for public goods that occur naturally in the environment of the subjects (see some of Bohm's work). 3. Natural field experiments, which are identical to framed field experiments except that the subjects do not know that they are participants in an experiment. An example could be found among the recent surge in fundraising experiments (see, e.g., List and Lucking-Reiley, 2002, JPE)."

+++ Go to the Micro data website.

18/11/2011 Climate data: A team headed by Cort J. Willmott at the University of Delaware has put together a website with four to five decades of data on Monthly Air Temperature, Monthly Total Precipitation, Monthly Terrestrial Water Budgets and Monthly Moisture Indices. You'll need some help from a GIS person to get the data transformed.

+++ Go to the Macro data website.

06/11/2011 Human Capital data: A collaborative effort by the IIASA World Population Program and the Vienna Institute of Demography (VID) has reconstructed population data by Age, Gender and Level of Educational Attainment for 120 Countries over the 1970-2000 period. The authors use a method which 'backprojects' the past levels from 2000 data. The files are in excel format and there are a number of working papers with technical details, comparison with observed data, etc. [Thanks to my buddy and human capital wizard Fabio Manca for the link].

+++ Go to the Macro data website.

17/10/2011 Migration in Latin America: Mexico's Universidad de Guadalajara and Princeton University host the Latin American Migration Project (LAMP), "which was created in 1982 by an interdisciplinary team of researchers to advance our understanding of the complex processes of international migration and immigration to the United States." The researchers have conducted surveys in Colombia, Costa Rica, Dominican Republic, El Salvador, Guatemala, Haiti, Nicaragua, Paraguay, Perú and Puerto Rico, each time in various communities. There's a wealth of information on the website, including survey design, questionnaires, etc. The data is available in SAS, SPSS and Stata format for all country studies.

+++ Go to the Macro data website.

17/10/2011 More Macro Data: The Norwegian Social Science Data Services (NSD) have compiled The Macro Data Guide, "An International Social Science Resource" covering many sources with data arranged by country or topic. It seems that coverage is particularly strong on topics of political science, including elections, parties, etc (but that's just my perception). For each dataset there is very useful background information on coverage, time span, topics, documentation and when the dataset was last accessed. Definitely a good starting point for any macro data search.

+++ Go to the Macro data website.

14/10/2011 Brazilian Macro Data: Marc Muendler at UCSD has brought together a number of useful tools for the analysis of Brazilian data (and some data, too). This includes various price indices, sectoral FDI (1980-2000), tariffs and exchange rates.

+++ Go to the Macro data website.

18/09/2011 World Top Incomes: Facundo Alvaredo, Tony Atkinson, Thomas Piketty and Emmanuel Saez at Oxford, PSE and Berkeley have created the World Top Incomes Database. "The world top incomes database aims to providing convenient on line access to all the existent series. This is an ongoing endeavour, and we will progressively update the base with new observations, as authors extend the series forwards and backwards. Despite the database's name, we will also add information on the distribution of earnings and the distribution of wealth. Around forty-five further countries are presently under study." This is very much work in progress.

+++ Go to the Macro data website.

04/09/2011 Corporate Taxation: A recent article by Simeon Djankov and co-authors in the AEJ: Macro comes with cross-section data on effective corporate income tax rates in 85 countries (2004). "The data come from a survey, conducted jointly with PricewaterhouseCoopers, of all taxes imposed on "the same" standardized mid-size domestic firm." The authors provide the data in Stata format, together with a do-file.

+++ Go to the Macro data website.

06/07/2011 Migration Data: The Global Bilateral Migration Database compiled by the World Bank provides "global matrices of bilateral migrant stocks spanning the period 1960-2000, disaggregated by gender and based primarily on the foreign-born concept... Over one thousand census and population register records are combined to construct decennial matrices corresponding to the last five completed census rounds". Data for up to 226 countries can be downloaded into an excel file.

+++ Go to the Macro data website.

04/07/2011 Inequality Data: A 2005 IMF working paper by Garbis Iradian (Deputy Director, Africa/Middle East at the Institute of International Finance, Washington) provides inequality data for 82 countries over the period 1965–2003 (the data is averaged over periods of three to seven years). The data is constructed from household surveys.

+++ Go to the Macro data website.

08/06/2011 Mortality/Nutrition/Vaccination: The Complex Emergency Database (CE-DAT) is an international initiative that monitors and evaluates the health status of populations affected by complex emergencies. CE-DAT is managed by the Centre for Research on the Epidemiology of Disasters (CRED), based at the School of Public Health of the Université catholique de Louvain in Brussels, Belgium. The data is at subnational level (building on over 2,000 surveys) and covers 1998-2010 (with gaps). It can be viewed in table format or as a map.

+++ Go to the Macro data website.

28/05/2011 World Bank Data Apps: The following dedicated apps for World Bank data have caught my eye: (1) the winner of the apps competition, StatPlanet, provides tables and maps for individual WDI and other WB data - the most appealing feature is that the maps can be stored as a png file, just as if you'd done it with Stata's spmap. (2) Blind Data gives you a quick and easy visual check of the data coverage (years, # of countries) for WDI and other WB data. (3) MDG Maps, which does exactly what it says. (4) Development Timelines provides 'historical context to international development data', i.e. what events took place in the country (Education policy, Economy, Conflict, Other (domestic) and the 'International agenda').

+++ Go to the Macro data website.

20/05/2011 African Infrastructure: The World Bank has a dedicated website for the Africa Infrastructure Country Diagnostic (AICD) which combines data collection and analysis on the status of the main network infrastructures. "The AICD database provides cross-country data on network infrastructure for nine major sectors: air transport, information and communication technologies, irrigation, ports, power, railways, roads, water and sanitation." This is a relatively young data collection effort, with only a few years of data available at the time of writing. Download is via the WB's excellent open data system (view data, download as excel or CSV).

+++ Go to the Macro data website.

20/05/2011 Landmines: The World Bank provides the Landmine Contamination, Casualties and Clearance database, which contains country level data on a broad range of issues related to landmines and cluster munitions, including contamination, casualties and clearance, and their associated cost. The data was compiled from two sources: Landmine and Cluster Munition Monitor and annual surveys by the United Nations Mine Action Service (UNMAS). Coverage is 1999-2009 with annual updates scheduled for October.

+++ Go to the Macro data website.

20/05/2011 Barriers to Trade: The World Bank's Temporary Trade Barriers Database (TTBD) website hosts newly collected, freely available, and detailed data on more than thirty different national governments’ use of policies such as antidumping (AD, 1980s-2010), global safeguards (SG, 1995-2010), China-specific transitional safeguard (CSG, 2002-2010) measures, and countervailing duties (CVD, 1980s-2010). The information provided here in this detailed database will cover over 95% of the global use of these particular import-restricting trade remedy instruments. Information is provided in excel files on a country-by-country basis, given the amount of detail provided for each country. The website also features research reports and meta-information. Chad P. Bown seems to be the person in charge.

+++ Go to the Macro data website.

09/05/2011 Gapminder: The visualisation folk at Gapminder (including multiple Roslings) provide very convenient access to a lot of demographic and health data (HIV/AIDS, birthrates, cancer, ...) alongside other useful development data (aid, trade, employment). "Gapminder is a non-profit venture – a modern 'museum' on the Internet – promoting sustainable global development and achievement of the United Nations Millennium Development Goals... The initial activity was to pursue the development of the Trendalyzer software. Trendalyzer sought to unveil the beauty of statistical time series by converting boring numbers into enjoyable, animated and interactive graphics... In March 2007, Google acquired Trendalyzer from the Gapminder Foundation and the team of developers who formerly worked for Gapminder joined Google in California in April 2007." Poor chaps: New salary = googol*previous salary? The data commonly span several decades and are available for download in excel format (wide). [Thanks to Christoph Lakner at CSAE for the pointer.]

+++ Go to the Macro data website.
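Since the Gapminder downloads come in wide format (one column per year), panel work usually needs them reshaped to long format first. Here is a minimal sketch of that step using only the Python standard library; the column name `country`, the function name `wide_to_long` and the numbers are made up for illustration and are not taken from any Gapminder file.

```python
import csv
import io


def wide_to_long(rows, id_col="country"):
    """Turn rows with one column per year into (country, year, value) records."""
    long_rows = []
    for row in rows:
        for key, value in row.items():
            if key == id_col or value == "":
                continue  # skip the identifier column and empty cells
            long_rows.append(
                {id_col: row[id_col], "year": int(key), "value": float(value)}
            )
    return long_rows


# Made-up numbers, purely for illustration (assumed layout: one row per country)
wide_csv = "country,2000,2005\nA,1.0,2.0\nB,,3.5\n"
records = wide_to_long(csv.DictReader(io.StringIO(wide_csv)))

for rec in records:
    print(rec["country"], rec["year"], rec["value"])
```

Empty cells are dropped rather than filled, so the resulting panel is unbalanced wherever the source has gaps.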

05/05/2011 Global Migration: Louis Putterman at Brown University provides another historical dataset, the World Migration Matrix (1500-2000), detailing for each of 165 countries "the proportion of the ancestors in 1500 of that country's population today that were living within what are now the borders of that and each of the other countries." There's a lot of documentation provided to reference all these estimates.

+++ Go to the Macro data website.

05/05/2011 Agricultural Transition: Louis Putterman at Brown University has compiled an Agricultural Transition Year Data Set which provides estimates for "the year when the first significant region within each of 165 present-day countries underwent a transition from reliance mainly on gathered wild and hunted food sources to reliance mainly on cultivated crops (and livestock)." This data is very much in line with the long-run growth theory work coming out of Brown.

+++ Go to the Macro data website.

22/04/2011 Social Policy and Quality of Government: The Quality of Government Institute at the University of Gothenburg publishes the QoG Dataset in Stata, SPSS and csv format. "The aim of the QoG Social Policy Dataset is to promote cross-national comparative research on social policy output and its correlates, with a special focus on the connection between social policy and quality of government (QoG). To accomplish this we have compiled a number of freely available data sources, including aggregated public opinion data." There are three versions: (1) a cross-section with global coverage (2002); and two panels for 40 countries either annual (1946-2009) or 5-yearly (1970-2005). The topics covered are Social policy, Tax system, structural conditions for social policy, Public opinion, Political indicators and Quality of government.

+++ Go to the Macro data website.

18/04/2011 Banking Crises and Measures: The personal website of Luc Laeven (Deputy Division Chief in the Research Department of the International Monetary Fund and Full Professor of Finance at CentER, Tilburg University) carries a number of interesting datasets for cross-country analysis, including the 'Banking Crisis Database (2010)', Crisis resolution database and Deposit Insurance Database, together with some papers he's written describing and analysing the data. [Thanks to my buddy Andrea Presbitero at Università Politecnica delle Marche for the pointer]

+++ Go to the Macro data website.

15/04/2011 World Development Indicators: The World Bank has just published the latest edition of its World Development Indicators, leading up to 2009 or even 2010 for some indicators.

+++ Go to the Macro data website.

15/04/2011 Microdata repository: The World Bank has just created a new Central Microdata Catalog for all the micro-level datasets "in catalogs maintained by the World Bank and a number of contributing external repositories." At the moment of writing this repository includes 378 datasets. Slowly, slowly this Open Data malarky is getting serious...

+++ Go to the Micro data website.

13/04/2011 Human Rights data: The Cingranelli-Richards (CIRI) Human Rights Dataset, hosted by SUNY Binghamton, contains standards-based quantitative information on government respect for 15 internationally recognized human rights for 195 countries, annually from 1981-2009. The data describe a wide variety of government human rights practices (15) including torture, workers' rights, and women’s rights over a 29-year period. This dataset is featured in the World Bank WDR 2011 (and is conveniently included in its dedicated Excel file).

+++ Go to the Macro data website.

13/04/2011 WDR 2011 - Fragile States: The World Bank has recently published its annual World Development Report, which this year focuses on Conflict, Security and Development. A dedicated website makes the data underlying the analysis in the report easily accessible. The excel spreadsheet covers a total of 211 countries, with maximum coverage over the years 1960-2009. The data is not limited to conflict and political economy issues but also covers geography, colonial history and foreign aid among other topics. All of the data is publicly available (and many datasets are featured here on MEDevEcon), but the unique advantage here is bringing a vast number of conflict-related data from dozens of sources (PRIO, UNHCR, Polity IV, etc.) together in a single spreadsheet (and doing a great job documenting the data and sources).

+++ Go to the Macro data website.

08/04/2011 Aid Flow Stats and Visualisation: The World Bank and the Organisation for Economic Co-operation and Development (OECD) "have partnered to make global data on aid funding more easily accessible. Aidflows offers new transparency about the flow of development funds from countries providing aid resources (donors) to countries receiving these funds (beneficiaries). This initiative is part of ongoing efforts to enhance the open access to data and information on development aid." For the moment it seems (conditional on my not being too inept to find the option) that display of data is limited to the last decade - it might be useful to change this given that lots more data is available. There are a lot of graphs and tables, bringing together WB and OECD indicators/data - a useful feature is the link to the WB and OECD data sources, i.e. you get taken to OECD DAC dataset if you want more details on ODA. [This was mentioned in a blog entry by Neil Fathom of the World Bank]

+++ Go to the Macro data website.

06/04/2011 Financial Openness: The Chinn-Ito index (KAOPEN) is an index measuring a country's degree of capital account openness. The index was initially introduced in Chinn and Ito (Journal of Development Economics, 2006). KAOPEN is based on the binary dummy variables that codify the tabulation of restrictions on cross-border financial transactions reported in the IMF's Annual Report on Exchange Arrangements and Exchange Restrictions (AREAER). The dataset is available in Excel or Stata format. The data file contains the Chinn and Ito index series for the time period of 1970-2007 for 182 countries. [Thanks to Malgorzata Sulimierska at Sussex University for the link]

+++ Go to the Macro data website.

05/04/2011 Humanitarian Aid Tracking: The UN Office for Coordination of Humanitarian Affairs (OCHA) maintains the Financial Tracking Service (FTS). "FTS is a global, real-time database which records all reported international humanitarian aid (including that for NGOs and the Red Cross/Red Crescent Movement, bilateral aid, in-kind aid, and private donations). FTS features a special focus on consolidated and flash appeals, because they cover the major humanitarian crises and because their funding requirements are well defined - which allows FTS to indicate to what extent populations in crisis receive humanitarian aid in proportion to needs... All FTS data are provided by donors or recipient organisations." [this data was featured on the UK Guardian newspaper's Global Development Data website]

+++ Go to the Macro data website.

04/04/2011 EU Trade Data: The European Commission's eurostat COMEXT database covers trade data from 1988 to 2009 (monthly or annual) for trade with the EU or its member countries. There are some restrictions on the maximum number of cells that can be downloaded, though. You may be better off going to COMTRADE and using the help by Mitch Abdon of the Stata Daily blog to combine the UN ComtradeTools and Stata: having installed the software which allows one to download Comtrade data (registration/subscription required for access) there are a number of simple steps to pull this data directly into Stata and save it. The entire process is run from within Stata once everything is installed. Since I had some minor trouble setting up and getting this tool to work I've written a simple Stata 10 do-file with additional information. [the COMEXT data is used in a recent ECB paper by Gabor Pula and Daniel Santabárbara]

+++ Go to the Macro data website.

04/04/2011 Micro Finance Data: Microfinance Information Exchange (MIX) provides MIX Market, "the premier source for microfinance data and analysis. Our mission is to promote microfinance transparency through integrated performance information on microfinance institutions (MFIs), investors, networks and service providers associated with the industry. MIX provides objective data and analysis with the goal of strengthening the microfinance sector." You can go down to the level of an individual MFI project (of which there are currently over 1,800 'registered' with MIX) and download the data on performance, borrowers, etc. or you can pick an indicator and view data for all MFIs over the past 5 years. [via DEVECONDATA by Masa Kudamatsu]

+++ Go to the Macro data website.

01/04/2011 Historical Data: The Center for Financial Stability (CFS) hosts the Historical Financial Statistics, which aims "to be a source of comprehensive, authoritative, easy-to-use macroeconomic data stretching back several centuries. Our target range of coverage is from 1492 to the present, with special emphasis on the years before 1950, which few databases cover in detail." (hm, why start with 1492 if most data are for other countries than North American ones?). The archive, edited by Kurt Schuler, was only started in late 2010, so there are for now a lot of empty spreadsheets in the 'Country' section of the website (which splits statistics into 'Country tables' and 'International tables'). [I found a link to HFS on GMU's David Youngberg's website]

+++ Go to the Macro data website.

31/03/2011 Seminal Cross-Country Panel: The last UPenn PWT has just been published (after 2012 PWT will be jointly maintained by Robert Feenstra at UC-Davis, and Marcel Timmer and Robert Inklaar at the University of Groningen): Penn World Table version 7. The data covers 189 countries and territories for 1950-2009, with 2005 as reference year. The official reference is "Alan Heston, Robert Summers and Bettina Aten, Penn World Table Version 7.0, Center for International Comparisons of Production, Income and Prices at the University of Pennsylvania, March 2011."

+++ Go to the Macro data website.

31/03/2011 Alternative GDPpc measure: Michael Clemens (CGD) and Lant Pritchett (HKS) have produced an interesting alternative measure to per capita income/GDP: 'income per natural', the mean annual income of persons born in a given country, regardless of where that person now resides. The data is a cross-section for 2000 and the related paper is here. I copied that data into an excel file for ease of use.

+++ Go to the Macro data website.

30/03/2011 Demography: A bunch of data from the UN DESA - Population Division, including World Contraceptive Use 2010, International Migrant Stock, World Population Prospects, World Urbanization Prospects (very 'open data', these last three: you can pick a max of 5 countries... Muppets). [Thanks to Jackie Carter for the tweet]

+++ Go to the Macro data website.

30/03/2011 UNCTADstat: The UN body which covers trade and investment, UNCTAD, has created a snazzy website that combines all of its statistical databases: UNCTADstat has lots of data on trade (merchandise, services), FDI flows and stocks (inward FDI from 1970!), external finance (incl. remittances), labour force/employment, global commodity price indices (from 1960!) as well as some more recent rubrics such as the creative and information economies and maritime transport (from around 2000).

+++ Go to the Macro data website.

29/03/2011 Household data in LDCs: Bob Baulch at the Chronic Poverty Research Centre at the University of Manchester has compiled an annotated listing of Household Panel Data Sets in Developing and Transition Countries, featuring among many others the data used for his own work in Pakistan, Vietnam and Bangladesh. The listing is by country and includes information on the waves/years, sample size and major references. [via DEVECONDATA by Masa Kudamatsu]

+++ Go to the Macro data website.

29/03/2011 WHO data: The Global Health Observatory (GHO) database is the World Health Organization's main health statistics repository. You can find a range of health topics like mortality, the burden of disease, infectious diseases, risk factors and health expenditures. I had a quick look at the figures for 'Number of people (all ages) living with HIV' which provides full coverage of mortality rate estimates (i.e. extrapolation/interpolation, etc., distinguished by reporting confidence intervals) for 1990-2009 across a very large number of countries. [referred to in a paper by Paul Calu, World Bank, and Falilou Fall, Sorbonne]

+++ Go to the Macro data website.

28/03/2011 Human Trafficking: Seo-Young Cho (Goettingen), Axel Dreher (Heidelberg) and Eric Neumayer (LSE) have created the 3P Anti-trafficking Policy Index and a dedicated website. Sub-indices cover three policy dimensions: Prosecution, Prevention, Protection; each is scored from 1 (worst) to 5 (best). Annual data are available for up to 177 countries over the 2000-2009 period.

+++ Go to the Macro data website.

28/03/2011 External Debt: The Joint External Debt Hub (JEDH - pronounced Jedi?) — jointly developed by the Bank for International Settlements (BIS), the International Monetary Fund (IMF), the Organization for Economic Cooperation and Development (OECD) and the World Bank (WB) — brings together external debt data and selected foreign assets from international creditor/market and national debtor sources. The JEDH replaces the Joint BIS-IMF-OECD-WB Statistics on External Debt and brings together 34 data series from the above institutions. Coverage starts in 1990 and can be up to quarterly, for all countries in the world, although this depends on the variable (e.g. 'International Reserves' seemed to have a very good [from 1990] and quarterly coverage but other variables start later, are less frequent and less broad in country-terms). [This dataset was featured in an article by Sarah Bracking in the Google magazine thinkquarterly]

+++ Go to the Macro data website.

26/03/2011 Net Foreign Aid flows: CGD's David Roodman has updated his Net Aid Transfer database at the beginning of this year. "NAT is built from the same underlying DAC data as ODA. The NAT data set includes totals by donor (for 1960–2009), by recipient (1965–2009), and by donor and recipient (1965–2009), all in current and constant dollars. Figures by donor are also available in national currencies. The data tables by donor and recipient are too large to fit in a Microsoft Excel 2003 file, and so are provided as comma-delimited text files in a zip archive." This paper, which Roodman wrote in 2005, is also relevant.

+++ Go to the Macro data website.

26/03/2011 Geospatial gems: "In a world of secrets and closed access to data, it comes as a pleasant surprise to discover that there is a huge quantity of data available to anyone, free of charge. This data has complete world coverage, and an astonishing range of data types all gathered together in one package": Vector Map (VMap) Level 0, provided by mapAbility.com. The VMap Level 0 database provides worldwide coverage of vector-based geospatial data which can be viewed at 1:1,000,000 scale, i.e. 1cm=10km. "Need the national coastlines, elevation contours, roads and railways for any country you can think of? They are there, of course. Populated places, administrative boundaries, inland waterways? There too. But how about the more obscure data types - Lighthouse, Fish Farm, Cease-Fire Line, Oasis, Wharf, Communication Tower? All there as well." [via DEVECONDATA by Masa Kudamatsu]

+++ Go to the Macro data website.

26/03/2011 World Ports: The World Shipping Register provides free access to their World Sea Ports database. For each country its ports' longitude, latitude and time zone are provided; for some ports the maximum draft is also provided. Given the geospatial information this data could be used to calculate distance to closest port.

+++ Go to the Macro data website.
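The distance-to-closest-port idea mentioned for the World Sea Ports data can be sketched with the haversine (great-circle) formula. This is only an illustration in plain Python: the function names are hypothetical, the Earth is treated as a sphere with a mean radius of 6,371 km, and the coordinates below are rough values typed in by hand, not taken from the World Shipping Register.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; a spherical approximation


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def nearest_port(lat, lon, ports):
    """Return (name, distance_km) for the closest port in a list of (name, lat, lon)."""
    return min(
        ((name, haversine_km(lat, lon, plat, plon)) for name, plat, plon in ports),
        key=lambda pair: pair[1],
    )


# Approximate, hand-typed coordinates for illustration only
ports = [("Mombasa", -4.05, 39.67), ("Dar es Salaam", -6.82, 39.29)]
name, dist = nearest_port(-1.29, 36.82, ports)  # a point near Nairobi
print(name, round(dist), "km")
```

This is straight-line distance over the globe; it says nothing about road or rail access, for which the travel-time maps listed elsewhere on this page would be the better source.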

25/03/2011 Updated tools in comparative political economy: The Database of Political Institutions (1975-2009) is based originally on the 2001 dataset created by Thorsten Beck, George Clarke, Alberto Groff, Philip Keefer, and Patrick Walsh for the World Bank. This is the updated version from December 2010 (Stata 10 file). [thanks to Sarah Brierley who tweets from Accra @sabrierley]

+++ Go to the Macro data website.

25/03/2011 Bilateral trade, FDI, CO2, Migration and more: The Global Trade Policy Analysis group at the AgEcon Department of Purdue University provides a number of datasets related to trade but also climate change and geography. "The GTAP Data Base is a fully documented, publicly available global data base which contains complete bilateral trade information, transport and protection linkages among 113 regions for all 57 GTAP commodities for a single year (2004 in the case of the GTAP 7 Data Base)." Single academic user licenses for GTAP 7 are $520, but a large number of free datasets (including summaries of GTAP, Social Accounting Matrix [SAM] extraction, the Global [bilateral] FDI Dataset, Project on Bilateral Labor Migration, CO2 emissions) can be found here.

+++ Go to the Macro data website.

24/03/2011 Geospatial data on lakes and wetlands: The World Wildlife Fund (WWF) hosts the global lakes and wetlands database (GLWD) which has been developed in partnership with the Center for Environmental Systems Research, University of Kassel, Germany. It is available for download as three separate ArcView layers (two polygon shapefiles and one grid). Further shapefiles for lakes are available here from Natural Earth. [via DEVECONDATA by Masa Kudamatsu]

+++ Go to the Macro data website.

24/03/2011 Geospatial data including travel time and fire probability: The Global Environment Monitoring Unit (GEM), one of seven scientific units that make up the Institute for Environment and Sustainability (IES) at the European Commission's Joint Research Centre (JRC) [so I could have just said: The EC] provides a large number of geo-spatial datasets. Topics include land cover, biodiversity and fire (Global AVHRR fire probability map 1982-1999). One of the gems of this collection (no pun intended) is the global map of accessibility which charts travel time to major cities. Some components underlying the travel time maps are available here. [via DEVECONDATA by Masa Kudamatsu]

+++ Go to the Macro data website.

19/03/2011 Development in the Long-Run: The PBL Netherlands Environmental Assessment Agency provides the History Database of the Global Environment (interestingly, the acronym is HYDE). HYDE presents (gridded) time series of population and land use for the last 12,000 years! It also presents various other indicators such as GDP, value added, livestock, agricultural areas and yields, private consumption, greenhouse gas emissions and industrial production data, but only for the last century.

+++ Go to the Macro data website.

18/03/2011 Natural Hazards: The U.S. National Oceanic and Atmospheric Administration's National Geophysical Data Center (NGDC) provides "geophysical data from the Sun to the Earth and Earth's sea floor and solid earth environment, including Earth observations from space". This includes data on natural hazards, such as the 'Global Significant Earthquake Database, 2150 B.C. to present' and 'The Significant Volcanic Eruption Database' among others. Other intriguing categories for data are 'Space Weather' and 'Bathymetry' (the study of underwater depth of lake or ocean floors). Download as ArcIMS interactive maps, tab-delimited data files or just plain-old html.

+++ Go to the Macro data website.

17/03/2011 Development Data Archive: The Guardian newspaper (UK) maintains the Global Development Datastore, a searchable database for data and visualisation tools. They also run a blog on data and development.

+++ Go to the Macro data website.

17/03/2011 Terrorism data: Some folk at the University of Maryland have created the Global Terrorism Database (GTD), "an open-source database including information on terrorist events around the world from 1970 through 2008 (with annual updates planned for the future). Unlike many other event databases, the GTD includes systematic data on domestic as well as international terrorist incidents that have occurred during this time period and now includes more than 87,000 cases." This includes more than 38,000 bombings, 13,000 assassinations, and 4,000 kidnappings since 1970. Registration required. [This website was first featured on the ever-evolving DEVECONDATA by Masa Kudamatsu]

+++ Go to the Macro data website.

17/03/2011 Data on conflict in Africa: The Robert S. Strauss Center for International Security and Law, at the University of Texas at Austin hosts the Social Conflict in Africa Database (SCAD), "a resource for conducting research and analysis on various forms of social and political unrest in Africa. It includes over 6,000 social conflict events across Africa from 1990 to 2009, including riots, strikes, protests, coups, and communal violence." The entire database can be downloaded as an Excel CSV file and contains very detailed information on location, actors, duration etc. of the conflict. This resource is part of the Strauss Center's 'Climate Change and African Political Stability' (CCAPS) program. [I found out about this website via Masa Kudamatsu's DEVECONDATA]

+++ Go to the Macro data website.

17/03/2011 More data on conflict: The Political Instability Task Force (PITF) has compiled annual information on each of four types of political instability events for all countries with a total population of 500,000 or greater, covering the period 1955 to the most current year; these events include ethnic wars, revolutionary wars, genocides and politicides, and adverse regime changes (all of these are contained in separate 'problem sets' for download as excel files). The PITF website is hosted by the Center for Global Policy at George Mason University and the funding comes from the CIA. [Thanks to Masa Kudamatsu's DEVECONDATA blog for listing the link]

+++ Go to the Macro data website.

17/03/2011 Drugs and Money: The U.S. State Department International Narcotics Control Strategy Report (INCSR) covers both 'Drug and Chemical Control' as well as 'Money laundering and financial crime'. Archived reports go back to 1996 (all the way to 2010). Note that these are reports, not data - if one were willing to pay an RA to go through the material there's an incredible amount of information on these two topics, given that there are reports for each country and the more recent ones are written in a questionnaire style ("Ability to freeze terrorist assets without delay: YES") so could be easily coded. Note that financial crime does not seem to cover anything that went on in Iceland, on Wall Street or in the City over the past few years...

+++ Go to the Macro data website.

16/03/2011 Refugee Statistics: The UN High Commissioner for Refugees (UNHCR) publishes a statistical yearbook, covering "Trends in Displacement, Protection and Solutions". It contains statistics on refugees, asylum-seekers, internally displaced persons (IDPs), returnees (refugees and IDPs), stateless persons, among others. From 2000 to 2009 these reports include excel files for download, from 1994-1999 the data tables are contained in pdf files.

+++ Go to the Macro data website.

16/03/2011 Global Data on Communicable Diseases: The World Health Organisation (WHO) offers the Global Health Atlas. "In a single electronic platform, the WHO’s Communicable Disease Global Atlas is bringing together for analysis and comparison standardized data and statistics for infectious diseases at country, regional, and global levels... [The database covers] the major diseases of poverty including malaria, HIV/AIDS, tuberculosis, the diseases on their way towards eradication and elimination (such as guinea worm, leprosy, lymphatic filariasis) and epidemic prone and emerging infections for example meningitis, cholera, yellow fever and anti-infective drug resistance."

+++ Go to the Macro data website.

16/03/2011 Global Data on Health Workers: The World Health Organisation (WHO) offers the Global Atlas of the Health Workforce, which features two datasets: the first, aggregated dataset "includes estimates of the stock (absolute numbers) and density (per 1000 population) of health workers for up to 9 occupational categories." In the second, disaggregated dataset "estimates of the stock of health workers are available for some countries for up to 18 occupational categories, reflecting greater distinction of some categories of workers according to assumed differences in skill level and skill specialization".

+++ Go to the Macro data website.

15/03/2011 Terrorism Reports: The US State Department publishes reports on an annual basis which provide rich information (plenty an RA could be employed to do grounded research on this) on global terrorism: "U.S. law requires the Secretary of State to provide Congress, by April 30 of each year, a full and complete report on terrorism with regard to those countries and groups meeting criteria set forth in the legislation. This annual report is entitled Country Reports on Terrorism. Beginning with the report for 2004, it replaced the previously published Patterns of Global Terrorism." The latter go back to 1995, so in total around 15 years of terrorism data are available here.

+++ Go to the Macro data website.

14/03/2011 Geo-spatial data on environment & climate change: Over a year ago I featured the Center for International Earth Science Information Network (CIESIN) at Columbia's Earth Institute, but only mentioned a small fraction of the data they provide. Apart from the 'gridded population of the world' the Institute features dozens of datasets under the headlines of Agriculture, Biodiversity & Ecosystems, Climate Change, Economic Activity, Environmental Assessment & Modeling, Environmental Health, Environmental Treaties, Indicators, Land Use (LU)/ Land Cover (LC) and LU/LC Change, Natural Hazards, Population, Poverty, Remote Sensing for Human Dimensions Research. The overarching theme for all datasets is environment and climate (change). Since not all data are accessible from the website there's a separate page for downloadable data.

+++ Go to the Macro data website.

13/03/2011 Hydrological data: The Global Runoff Data Centre at BfG (sadly not the Big Friendly Giant but the German Bundesanstalt für Gewässerkunde) provides access to global hydrological data. "The initial dataset of monthly river discharge data over a period of several years around 1980 was supplemented with the 'UNESCO monthly river discharge data collection 1965-85'. Today the database comprises discharge data of more than 7.000 gauging stations from all over the world. Since 1993 the total number of station-years has increased by a factor of around 10." 'Standard services' include Freshwater Fluxes into the World Oceans, Major River Basins of the World and Long-Term Mean Monthly Discharges. [this data features in the work by Abhishek Chakravarty at Essex.]

+++ Go to the Macro data website.

13/03/2011 Malaria data: The MARA/ARMA (Mapping Malaria Risk in Africa/Atlas du Risque de la Malaria en Afrique) project has published extensive data related to malaria, including the MARA LITe malaria prevalence data, malaria distribution maps and estimated populations at risk (as 'raw data' and maps); also available are entomological inoculation rates and reported presence/absence of six species of the Anopheles gambiae group (to you and me: mosquitoes) in Africa and islands. The website also features a wealth of resources on malaria in Africa. [this data features in the work by Abhishek Chakravarty at Essex.]

+++ Go to the Macro data website.

12/03/2011 Geospatial Data: A vast number of geo-spatial datasets including the Gridded Population of the World and Global Earthquake Hazard Distribution are linked by the Socio-Economic Data and Applications Center (SEDAC) at Columbia University. [thanks to Yanos Zylberberg at PSE for the link]

+++ Go to the Macro data website.

12/03/2011 Subnational Crop Atlas of the World: The International Center for Tropical Agriculture (CIAT) has produced a Crop Atlas of the World: "The map shows derived estimates of the spatial distribution and productivity of crops for 10-km grids using a novel allocation approach involving the fusion of sub-national crop production statistics. The values in this digital [map] are the number of harvested hectares within each 10 km grid cell. This data includes area harvested in multiple season (therefore this is NOT the physical harvested area, but rather the total area harvested) [...] The sub-national crop production data comes from agricultural censuses and surveys and has scaled values, so as to obtain national production estimates that were compatible with the annual average FAO national crop statistics for 1999-2001. The prototype crop distribution database used in this study is available from the authors upon request but is currently being regenerated using newer and additional data sources (including revisions based on expert validation) and an enhanced allocation algorithm." If you have Google Earth you can look at these data maps.

+++ Go to the Macro data website.

11/03/2011 This time really all the WB data: So -wbopendata- allows you to download World Bank data directly into (and from within) Stata... I added a little Stata 10 do-file to this which allows you to bring a data 'topic' (example here: energy) with multiple variables/'indicators' into Stata long format. Otherwise the data arrives in wide format, which has to be reshaped before any panel analysis can begin.
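
For readers who do not work in Stata, the wide-to-long step can be sketched in plain Python. This is only an illustration of the idea, not my do-file: the row layout, the indicator names and all numbers below are invented for the example, and the real World Bank downloads will look different.

```python
# Hypothetical wide-format rows: one row per country and indicator,
# one column per year. All names and values are made up for illustration.
wide_rows = [
    {"country": "KEN", "indicator": "energy_use", "1990": 450.0, "1991": 455.5},
    {"country": "KEN", "indicator": "electricity", "1990": 110.0, "1991": None},
    {"country": "GHA", "indicator": "energy_use", "1990": 380.2, "1991": 390.1},
]

ID_KEYS = ("country", "indicator")  # every other key is treated as a year


def wide_to_long(rows):
    """Return one record per country-year, with one field per indicator."""
    panel = {}
    for row in rows:
        for key, value in row.items():
            if key in ID_KEYS:
                continue
            # Collect all indicators observed for this country-year.
            cell = panel.setdefault((row["country"], int(key)), {})
            cell[row["indicator"]] = value
    # Sort by country, then year, which is the usual panel ordering.
    return [
        {"country": country, "year": year, **values}
        for (country, year), values in sorted(panel.items())
    ]


long_rows = wide_to_long(wide_rows)
for record in long_rows:
    print(record)
```

Each output record is one country-year observation, which is the shape panel estimators expect; indicators that are absent for a country simply do not appear in its records.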

+++ Go to the Macro data website.

11/03/2011 UN COMTRADE data in Stata: Mitch Abdon of the Stata Daily blog recently suggested a way of combining the UN ComtradeTools and Stata. Comtrade is the International Merchandise Trade Statistics (IMTS) of the UN, which records item-level trade for all countries in the world and contains around 1.8 bn observations from 1962 onwards. Access to this data is free, but for technical reasons there is a maximum of 50,000 observations per query (even more reason to use the Stata Daily application). Having installed the software which allows one to download Comtrade data (registration/subscription required for access) there are a number of simple steps to pull this data directly into Stata and save it. In fact, the entire process is run from within Stata once everything is installed. Since I had some minor trouble setting up and getting this tool to work I've written a simple Stata 10 do-file with additional information.
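
The 50,000-observation cap means larger extracts have to be split into several queries. The following Python sketch shows one simple way to plan such batches. It does not talk to Comtrade or ComtradeTools at all: the function name and the expected row counts are hypothetical, chosen only to illustrate the batching logic.

```python
MAX_ROWS = 50_000  # per-query cap mentioned above


def batch_queries(estimates, cap=MAX_ROWS):
    """Greedily group (key, expected_rows) pairs into batches under the cap."""
    batches, current, total = [], [], 0
    for key, n_rows in estimates:
        if n_rows > cap:
            raise ValueError(f"{key} alone exceeds the cap; split it further")
        if current and total + n_rows > cap:
            # Adding this request would break the cap: close the batch.
            batches.append(current)
            current, total = [], 0
        current.append(key)
        total += n_rows
    if current:
        batches.append(current)
    return batches


# Hypothetical (reporter, year) requests with invented row counts.
estimates = [
    (("KEN", 2005), 20_000),
    (("KEN", 2006), 25_000),
    (("KEN", 2007), 30_000),
    (("GHA", 2005), 15_000),
]
batches = batch_queries(estimates)
print(batches)
```

The grouping is greedy and keeps the original order, so it will not always find the smallest possible number of queries, but each batch is guaranteed to stay within the limit.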

+++ Go to the Macro data website.

10/03/2011 All the WB data: The wbopendata command (in Stata: findit wbopendata) written by a group of World Bank folk headed by Joao Pedro Azevedo allows Stata users to download thousands of indicators from the World Bank databases, including: Africa Development Indicators; Doing Business; Education Statistics; Enterprise Surveys; Global Development Finance; Gender Statistics; Health Nutrition and Population Statistics; International Development Association - Results Measurement System; Millennium Development Goals; World Development Indicators; Worldwide Governance Indicators. These indicators include information from over 256 countries and regions, since 1960. Instead of downloading one variable at a time you can specify one of 16 'topics' (e.g. Aid Effectiveness, Science & Technology, Infrastructure) to obtain all relevant variables for all countries and time periods.

+++ Go to the Macro data website.

10/03/2011 Amazing Geo-Data: The G-Econ research project at Yale University is devoted to developing a geophysically based data set on economic activity for the world. The current data set (GEcon 3.3) is now publicly available and covers "gross cell product" for all regions for 1990, 1995, 2000, and 2005 and includes 27,500 terrestrial observations. The basic metric is the regional equivalent of gross domestic product. Gross cell product (GCP) is measured at a 1-degree longitude by 1-degree latitude resolution at a global scale. Updates will be posted as they become available. The project director is Professor William Nordhaus, Yale University. The GEcon 3.4 (Aug 2010) spreadsheet has over 27,000 entries (cells) and there are at least two time-series data points for GCP of various types (non-mineral, mineral) and more for cell population. [via Masa Kudamatsu's DEVECONDATA blog]
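
To link your own geo-referenced observations to data at a 1-degree by 1-degree resolution you need to assign each point to a grid cell. Below is a minimal Python sketch, assuming cells are labelled by the latitude and longitude of their south-west corner; that labelling is my assumption for the illustration and G-Econ's own cell identifiers may follow a different convention, so check the documentation. The point names and coordinates are invented.

```python
import math
from collections import Counter


def grid_cell(lat, lon):
    """Return the (lat, lon) of the south-west corner of the 1-degree cell."""
    if not (-90 <= lat <= 90 and -180 <= lon <= 180):
        raise ValueError("coordinates out of range")
    # floor() rounds towards minus infinity, so -1.29 falls in cell -2.
    # Points exactly on the northern/eastern edge go into the last cell.
    return min(math.floor(lat), 89), min(math.floor(lon), 179)


# Hypothetical observation points (latitude, longitude).
points = {
    "point_a": (-1.29, 36.82),
    "point_b": (-1.10, 36.64),
    "point_c": (0.35, 32.58),
}

cells = {name: grid_cell(lat, lon) for name, (lat, lon) in points.items()}
points_per_cell = Counter(cells.values())
print(cells)
print(points_per_cell)
```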

+++ Go to the Macro data website.

10/03/2011 Natural Resource Rents: The World Bank Wealth of Nations dataset provides country-level data on comprehensive wealth, adjusted net saving and non-renewable resource rents indicators. It presents a set of “wealth accounts” for over 150 countries for 1995, 2000, and 2005 which allows a longer-term assessment of global, regional and country performance in building wealth. Adjusted Net Saving (takes into account CO2 damages, natural resource depletion etc.) and non-renewable resource rent (oil, gas, tin, copper, etc.) indicators are calculated annually from 1970 to 2008.

+++ Go to the Macro data website.

10/03/2011 Journal Data Archives: Many academic journals in economics and development economics now require authors to post their data and/or code on dedicated websites (or, less handily, next to each article in the table of contents, indicated below as 'browse') for replication and exploration. This includes: The American Economic Journal: Applied Economics (browse), The American Economic Journal: Macroeconomics (browse), The American Economic Journal: Microeconomics (browse), The American Economic Review (browse), Econometrica (from Vol.72, 2004), The Journal of Applied Econometrics (from Vol.3, 1988), The Journal of Development Economics (browse), The Journal of Development Studies (browse), The Review of Economics and Statistics (from Vol.92, 2010) and The Review of Economic Studies (browse). The JAE also has a 'Replication Section'! Note that sadly some 'supplementary materials' for the above journals are merely online appendices and summary statistics, not data (e.g. for the JDS I couldn't find any data). Many of the above-linked sites require subscriptions. Sometimes you can get lucky and find the data (ideally in 'raw' format, not when all the observations that would destroy the result have been dropped) on individual academics' websites. Especially top US departments (Harvard, MIT amongst many others) seem to push their members to make this available, with some UK/Europeans following suit.

10/03/2011 School Enrolment and Completion: The UNESCO Institute for Statistics publishes historical time series data for key indicators of school enrolment and completion (gross enrolment ratios, repetition and completion rates) covering pre-primary to tertiary education. They are reported on a roughly five-year basis since 1970 (some countries more frequently). As far as I could see most of this data ends in the late 1990s... but the other data provided by UNESCO (UIS Data Center) begins in the same period - not sure why they didn't bring these together.

+++ Go to the Macro data website.

09/03/2011 Remittance Flows: The World Bank publishes the Migration and Remittances Factbook (2011) as part of the OpenData initiative. This covers inflows and outflows of remittances from 1970 to 2009 (+2010 estimated) for basically all countries in the world (naturally: lots of missing observations, but from the mid-1970s onwards the data coverage is pretty impressive).

+++ Go to the Macro data website.

09/03/2011 A wealth of conflict data sources: The Integrated Network for Societal Conflict Research (INSCR) was established to coordinate and integrate information resources produced and used by the Center for Systemic Peace, based in Vienna, Virginia. They provide a wealth of datasets: Forcibly Displaced Populations (1964-2008), Major Episodes of Political Violence (MEPV, 1946-2008), PITF State Failure Problem Set (1955-2009), High Casualty Terrorist Bombings (1992-2010), Memberships in Conventional Intergovernmental Organizations (1952-1997), Polity IV (1800-2009), Coups d'Etat (1946-2009), State Fragility Index and Matrix Time-Series Data (1995-2009), Crime in India: Riots, Murders, and Dacoity (1954-2006), India Sub-National Problem Set (1960-2004). The INSCR data resources cover all independent countries with a total population of more than 500,000 people in 2008 (163 countries in 2009). Most of the data are regularly updated and can be downloaded in SPSS and Excel format. [I found out about this resource through a paper by Olaf de Groot (DIW) and Anja Shortland (Brunel)]

+++ Go to the Macro data website.

09/03/2011 Arrrgh - Piracy data: The International Maritime Bureau's (IMB) Piracy Reporting Centre (err, PRC) logs incidents of piracy. The data is used by Olaf de Groot (DIW) and Anja Shortland (Brunel) in an aptly entitled paper on 'Gov-arrrgh-nance - Jolly Rogers and Dodgy Rulers' (to be presented at the RES 2011 conference at Royal Holloway next month; link to paper here). They write "The IMB provides narratives on all incidents of piracy reported (voluntarily) by captains and ship-owners as well as annual counts of incidents of piracy for each country" and make a number of suggestions/changes as to the way piracy incidents are coded. Data is from 1997 to 2009.

+++ Go to the Macro data website.

25/02/2011 Education in Pakistan: The Learning and Educational Achievement in Punjab Schools Survey (LEAPS) project is run by "the World Bank, Pomona College and Harvard University in collaboration with the Government of Punjab and highly trained local counterparts". "The LEAPS Survey consists of data from 823 schools in 112 villages in 3 districts of Punjab. [...] To measure learning outcomes, the LEAPS project administered detailed exams on English, Math, and Urdu to students in Grade III, then followed those same children and tested them again in Grade IV, Grade V, and Grade VI. Teachers were also tested and given extensive surveys so that child-learning outcomes could be linked to teacher qualifications, and parents were surveyed to provide information on educational contributions made at home."

+++ Go to the Micro data website.

25/02/2011 Agricultural production in Ethiopia: The Washington-based International Food Policy Research Institute (IFPRI) has an interesting data set for Ethiopia which combines a household survey with a plot-level survey. The title of the project was "Policies for sustainable land management in the Ethiopian Highlands dataset 1998-2000" and the data is in SPSS format.

+++ Go to the Micro data website.

23/02/2011 Lots of historical datasets: Joerg Baten, a professor of economic history at Tuebingen University (or as we folk from nearby Metzingen would say: Gogenhausen) provides a wealth of historical data on the website for his chair. One data hub provides height measures ("Data on heights and the biological standard of living are among the most important sources of information in social- and economic-historical research, especially for the pre-statistical period") for Germany, the US, Austria, and a number of other countries. The second data hub is entitled 'Firms and Capital Markets' and offers stock exchange data from Germany, Russia, the US, England and China starting from the early 19th century. Users need to register to access the data and are also encouraged to deposit their own historical datasets.

+++ Go to the Macro data website.

23/02/2011 Historical stock market data: The Yale School of Management has a dedicated website for Historical Financial Research Data which includes the Shanghai Stock Exchange project (during the nineteenth and beginning of the twentieth centuries) and data for the famous South Sea Bubble: "The South Seas Bubble 1720 Project is a collection of stock prices for a large number of the traded companies in 1720. These include Dutch firms quoted in markets in the Netherlands, British firms quoted in the Netherlands, and some previously unstudied British firms quoted in London."

+++ Go to the Macro data website.

22/02/2011 Multi-country micro data on job flow and productivity: Eric Bartelsman (VU Amsterdam), John Haltiwanger (U Maryland) and Stefano Scarpetta (OECD) have created a unique dataset for sectoral productivity and job flow analysis in a number of developing and emerging economies. "The job flow measures are available at a country, sector, size, and year level of observation and the productivity measures are available at a country, sector, year level of observation. As described in detail in the documentation, available measures include not just first moments but higher moments including measures of dispersion and covariances. For example, the job flow measures permit decomposing net employment growth at a disaggregated level into job creation, job destruction as well as the contribution of entry and exit to job creation and job destruction. The data were produced from a series of projects funded by the OECD, the World Bank and other sources." The datasets and code (Stata, SAS) and detailed documentation are all downloadable in one zipped folder.
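
The decomposition mentioned in the quote (net employment growth into job creation and job destruction, with separate contributions from entry and exit) is easy to illustrate. The Python sketch below uses the standard textbook definitions as I understand them, not the authors' code, and the firm-level employment figures are invented; see the dataset documentation for the exact definitions they use.

```python
def job_flows(emp_prev, emp_curr):
    """Decompose employment change between two periods.

    emp_prev and emp_curr map firm id -> employment; a firm missing from
    one of them is treated as having zero employment in that period.
    """
    flows = {
        "creation_entry": 0,
        "creation_continuers": 0,
        "destruction_exit": 0,
        "destruction_continuers": 0,
    }
    for firm in set(emp_prev) | set(emp_curr):
        before = emp_prev.get(firm, 0)
        after = emp_curr.get(firm, 0)
        change = after - before
        if change > 0:
            key = "creation_entry" if before == 0 else "creation_continuers"
            flows[key] += change
        elif change < 0:
            key = "destruction_exit" if after == 0 else "destruction_continuers"
            flows[key] += -change
    flows["job_creation"] = flows["creation_entry"] + flows["creation_continuers"]
    flows["job_destruction"] = (
        flows["destruction_exit"] + flows["destruction_continuers"]
    )
    flows["net_change"] = flows["job_creation"] - flows["job_destruction"]
    return flows


# Hypothetical firm-level employment in two consecutive years.
year_1 = {"A": 10, "B": 20, "C": 5}
year_2 = {"A": 14, "B": 15, "D": 8}
flows = job_flows(year_1, year_2)
print(flows)
```

In this made-up example total employment rises from 35 to 37, i.e. a net change of 2, which hides 12 jobs created and 10 destroyed.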

+++ Go to the Macro data website.

21/02/2011 International Data on Educational Achievements: The Lynch School of Education at Boston College provides two unique resources for comparative analysis of educational achievements: (i) the Trends in International Mathematics and Science Study (TIMSS), which "is the largest and most ambitious international study of student achievement ever conducted" and has data from 40 countries in 1995 and a partially overlapping sample for three more recent waves (next wave is 2011); (ii) the Progress in International Reading Literacy Study (PIRLS), which has waves in 2001, 2006 and 2011 (forthcoming), evaluating 150,000 fourth graders (9- and 10-year-olds) in thirty-five (2001) and forty-odd (2006) countries. Some of these are middle-income countries (e.g. TTO, MAR, IND, IRN).

+++ Go to the Macro data website.

21/02/2011 Historical Data on Primary Education: Quite a number of years ago Aaron Benavot, now at SUNY Albany, and Phyllis Riddle at St Vincent College, PA, wrote an article entitled The Expansion of Primary Education, 1870-1940: Trends and Issues, which provides new estimates of primary school enrollment rates for 126 nations and colonies from 1870 to 1940. The data is printed in the Appendix and can easily be imported into Excel. The article was published in the journal Sociology of Education, Vol.61(3), July 1988, pp.191-210. [via Masa Kudamatsu's DevEconData blog]

+++ Go to the Macro data website.

21/02/2011 Population and Demography: The Office of Population Research (OPR) at Princeton University is a rich source of data for demography and especially migration research (among other topics). Projects include the ongoing Mexican Migration Project and Latin American Migration Project as well as the Addis Ababa Mortality Surveillance Project. The World Fertility Survey (for 41 LDCs) should also be of interest. Access to some of the data requires registration. [Thanks to Gunilla Pettersson, who featured these data on her developmentdata.org site]

+++ Go to the Macro data website.

21/02/2011 Population and Demography: The Data & Information Services Center (DISC) Archive at University of Wisconsin-Madison provides access to raw data and documentation for a number of population/demography datasets for North and Latin America. [Thanks to Gunilla Pettersson, who featured these data on her developmentdata.org site]

+++ Go to the Macro data website.

21/02/2011 Data on Slavery: The Data & Information Services Center (DISC) Archive at University of Wisconsin-Madison provides access to the raw data and documentation which contains information on the following slave trade topics from the eighteenth and nineteenth centuries: records of slave ship movement between Africa and the Americas, slave ships of eighteenth century France, slave trade to Rio de Janeiro, Virginia slave trade in the eighteenth century, English slave trade (House of Lords Survey), Angola slave trade in the eighteenth century, internal slave trade to Rio de Janeiro, slave trade to Havana, Cuba, Nantes slave trade in the eighteenth century, and slave trade to Jamaica. [Thanks to Gunilla Pettersson, who featured the DISC site on developmentdata.org]

+++ Go to the Macro data website.

19/02/2011 Child Labour: The International Labour Organisation's (ILO) International Programme on the Elimination of Child Labour (IPEC) collects data on the extent, characteristics and determinants of child labour. The micro datasets (mostly cross-sections) are predominantly for African and Latin American countries (data for a total of 30 countries). Their website further contains additional documentation such as the questionnaires, publications and reports compiled from the data.

+++ Go to the Macro data website.

14/02/2011 Historic Trade Routes: Matthew Ciolek at Australian National University edits the site for the Old World Trade Routes (OWTRAD) Project: "This site supports online research in the field of dromography and provides a public-access electronic archive of geo/chrono-referenced data on land, river and maritime trade routes of Eurasia and Africa during the period 10,000 BCE - circa 1820 CE." The files are published in CSV, MapInfo and Google Earth (KML) formats, downloadable by region. There's also a link to the Trade Routes Resources blog. [via Masa Kudamatsu's DevEconData blog]

+++ Go to the Macro data website.

11/02/2011 Tanzania: The Tanzania National Bureau of Statistics ('Statistics for Development') has a number of surveys on its website 'Tanzania National Data Archive'. You need to be registered to request data (top-right corner of the screen has the link to the registration). Examples include the Integrated Labour Force Survey 2006 and the Agriculture Sample Census Survey 2002-2003. Data aside, the website also has a citations tab, which features articles by Stefan Dercon and Gabriel Demombynes (both with co-authors) among others.

+++ Go to the Micro data website.

09/02/2011 Religiosity: Robert Barro and Rachel McCleary have compiled a cross-country dataset on the share of religious people in the population. "Adherence fractions of population are shown for 10 religion groups and non-religion (incl. atheists) in 1970, 2000, and 1900 (from Barrett)." Data is available for download in Excel format from Barro's Harvard data page. His working paper page offers a considerable number of papers on the topic of religion and growth. [via Masa Kudamatsu's DevEconData blog]

+++ Go to the Macro data website.

07/02/2011 Macro panel data: Fulvio Castellacci and Jose Miguel Natera have created a balanced panel dataset for cross-country analyses of national systems, growth and development (CANA) hosted by the Norwegian Institute of International Affairs. The originality of this dataset (which draws on a variety of sources) is that the gaps in the data have been filled, using a methodology of multiple (and repeated) imputations developed by two political scientists, Honaker and King (2010). I have not looked at the Castellacci & Natera paper describing the data construction and robustness checks in detail, but am a priori quite sceptical about imputations: these macro variables are likely to be integrated, so imputations could be rather misleading. On the other hand, missing data is a serious problem for a lot of the dimensions they consider: (1) Innovation and technological capabilities; (2) Education and human capital; (3) Infrastructures; (4) Economic competitiveness; (5) Social capital; (6) Political and institutional factors. There are a total of 41 indicators for 134 countries over the period 1980-2008. The data is in Excel format and well-documented. I'd say keep an eye out for reviews and applications of this dataset.

+++ Go to the Macro data website.

23/01/2011 Corruption data: Global Advice Network (funded by the Governments of Austria, Denmark, Germany, Netherlands, Norway, Sweden and the UK) provides the Business Anti-Corruption Portal, intended as an information source for SMEs operating in developing countries. Within the 'Country Profiles' you can go to the 'Sources' page to pick out a wealth of WEF, Transparency Intl., World Bank, etc. reports and (importantly) also (micro-)data such as enterprise surveys with relevance for corruption and investment/business climate.

+++ Go to the Macro data website.

23/01/2011 Indonesian Micro data: Conducted by the World Bank in January/February 2006 (covering 2005 but with some recall data for 2002) the Indonesian Rural Investment Climate Survey (RICS) is an in-depth, quantitative survey of 2549 non-farm enterprises, 2782 households and 149 communities in 6 rural Kabupaten. The RIC Survey data provides the first representative snapshot of the investment climate in six different types of rural Kabupaten, allowing policymakers to identify and address the key constraints to investment and growth. Data is provided in SPSS and Stata format, together with full documentation. [Via Masa Kudamatsu at DEVECONDATA]

+++ Go to the Micro data website.

12/01/2011 Census data across countries: The Minnesota Population Center provides the Integrated Public Use Microdata Series (IPUMS International). "IPUMS-International is composed of microdata, which means that it provides information about individual persons and households. This makes it possible for researchers to create tabulations tailored to their particular questions [...] The data series includes information on a broad range of population characteristics, including fertility, nuptiality, life-course transitions, migration, labor-force participation, occupational structure, education, ethnicity, and household composition [...] The database currently describes approximately 325 million persons recorded in 158 censuses taken from 1960 to the present. The database includes censuses from 55 countries" (including LDCs such as Uganda, Rwanda, Cambodia, Kenya and many LAC countries). A large amount of documentation is provided, as well as supplemental data including GIS boundary files. Registration required (provide research project summary).

+++ Go to the Macro data website.

03/01/2011 Chinese Maps and GIS data: The Center for Geographic Analysis at Harvard University in collaboration with Shanghai's Fudan University provides a large number of historical GIS 'maps' for China: once mastered (no simple task) this type of Geographical Information Systems (GIS) data allows for spatial analysis of Chinese development. You need to register but access is free, data is in shapefiles or xls or Access (depending on the dataset). There are a large number of datasets from the days of the Legalists and Qin Shihuang (221 BC) to the 1990s (AD).

+++ Go to the Macro data website.

2010

16/12/2010 Resources for the study of conflict: The Households in Conflict Network, funded by The Leverhulme Trust and supported by the Institute of Development Studies at Sussex, the German Institute for Economic Research (DIW) in Berlin and the University of Antwerp, has a Resource & Data website where they provide Philip Verwimp's dataset on victims of genocide in Kibuye, Rwanda (Stata file). This aside, the site contains a lot of information on this research topic.

16/12/2010 Local conflict data: ACLED (Armed Conflict Location and Events Dataset), compiled by the Centre for the Study of Civil War (CSCW) at the Peace Research Institute Oslo (PRIO), "is designed for disaggregated conflict analysis and crisis mapping. This dataset codes the location of all reported conflict events in 50 countries in the developing world. Data are currently being coded from 1997 to early 2010 and the project continues to backdate conflict information for African states to the year of independence. These data contain information on the date and location of conflict events, the type of event, the rebel and other groups involved, and changes in territorial control. Specifics on battles, killings, riots, and recruitment activities by rebels, governments, militias, armed groups, protesters and civilians are collected. Events are derived from a variety of sources, mainly concentrating on reports from war zones, humanitarian agencies, and research publications. These data can be used in any GIS, any mapping program, or statistical package." [Thanks to Anke Hoeffler at CSAE]

10/12/2010 World Bank projects: The Mapping for Results Platform (beta version) of the World Bank provides detailed information about "our work to reduce poverty and promote sustainable development around the world. This pilot website aims to visualize the location of our projects and to provide access to information about indicators, sectors, funding and results."

10/12/2010 Merging data: The UK Economic and Social Data Service (ESDS) has produced 'Countries and Citizens: Linking international macro and micro data'. This is "an interactive training resource with online tutorials, activities, study guides and videos, designed to show how to combine socio-economic data from country-level aggregate databanks (macro data) with individual-level survey datasets (micro data). It comprises five units, each of which was written by a subject specialist and has been designed as a self guided learning resource. Though specifically for postgraduates and researchers, it may also be of interest to undergraduates."
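
The basic idea of linking macro and micro data can be sketched in a few lines of Python: country-year aggregates are attached to each individual-level record via the shared country and year identifiers. This is only an illustration of the principle, not taken from the ESDS materials; the variable names and all values are invented.

```python
# Hypothetical country-year aggregates (macro data).
macro = {
    ("KEN", 2005): {"gdp_pc": 520.0},
    ("UGA", 2005): {"gdp_pc": 310.0},
}

# Hypothetical individual-level survey records (micro data).
micro = [
    {"id": 1, "country": "KEN", "year": 2005, "schooling": 8},
    {"id": 2, "country": "UGA", "year": 2005, "schooling": 6},
    {"id": 3, "country": "TZA", "year": 2005, "schooling": 7},
]


def attach_macro(micro_rows, macro_table):
    """Many-to-one merge: add country-year variables to each individual."""
    merged = []
    for row in micro_rows:
        context = macro_table.get((row["country"], row["year"]))
        # Keep unmatched individuals but flag them, so that gaps in the
        # macro data are visible rather than silently dropped.
        merged.append(
            {**row, **(context or {}), "macro_matched": context is not None}
        )
    return merged


merged = attach_macro(micro, macro)
for record in merged:
    print(record)
```

Remember that after such a merge the macro variables take the same value for everyone within a country-year, which matters for how standard errors should be computed.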

08/12/2010 Children and Women: UNICEF assists countries in collecting and analyzing data in order to fill data gaps for monitoring the situation of children and women through its international household survey initiative the Multiple Indicator Cluster Surveys (MICS). The first round of MICS was conducted around 1995 in more than 60 countries; the second round was conducted in 2000 (around 65 surveys); the third round (50 countries) in 2005-06; the fourth round is scheduled for 2009-2011 and survey results are expected to be available from 2010 on. Data coverage: in MICS3, as in the previous rounds, three model questionnaires were developed: a household questionnaire, a questionnaire for women aged 15-49, and a questionnaire for children under the age of 5 (addressed to the mother or primary caretaker of the child). [via Sebastian Bauhoff @Harvard]

08/12/2010 Microdata repository: The Institute for Social & Economic Research at the University of Essex hosts Keeping Track - A guide to longitudinal resources. The site "aims to provide an up-to-date guide to major longitudinal sources of data. The central purpose of this site is to allow users to see what kinds of longitudinal data are available and to locate information about studies which may provide data useful to their research interests. The site covers data sets collected by governmental, academic, private social research, medical and private industrial sources. This site includes household panel surveys, studies following the health of individuals, birth cohort studies, studies following the quality of a product design, and administrative records. Users of this site can find out basic details of the purpose, methodology, timing, coverage, and availability of the longitudinal data sets covered here. The site also offers links to the web pages of individual studies, and provides contact details for people wishing to get more information about any particular study." [via Sebastian Bauhoff @Harvard]

30/11/2010 Data quality: The World Bank Development Economics Data Group provides the Bulletin Board on Statistical Capacity, an online database that measures and monitors the statistical capacity of developing countries. The database contains information encompassing various aspects of national statistical systems. It also includes a country-level composite statistical capacity indicator based on evaluation of countries against a set of criteria consistent with international recommendations.

30/11/2010 Statistics on 'Progress': Wikiprogress is the official platform for the OECD-hosted Global Project on "Measuring the Progress of Societies" and Wikiprogress.Stat allows users to upload their data and metadata, and to navigate through a robust database of progress indicators. Themes on the website include Ecosystems Condition, Human Well-Being, Economy, Social and Welfare Statistics and Peace. There's a wealth of indicators here (sometimes cross-sectional or limited to a few time-series observations) and the data sources are clearly identified. Available for download to Excel. [Thanks to Angela Costrini Hariche, OECD Development Centre and Statistics Directorate and Project Manager of Wikiprogress]

30/11/2010 Educational and Social Research: Emma Smith at the School of Education, University of Birmingham provides a number of resources and data links for educational and social research, including Afrobarometer, Asiabarometer, PISA and World Values Survey. Her website acts as a portal for all the sources of secondary data that are listed in her book ('Using Secondary Data in Educational and Social Research', OUP 2008), as well as providing links to new sources and current developments in the field of secondary data analysis.

10/11/2010 Nighttime Lights Time Series: The US National Oceanic and Atmospheric Administration’s (NOAA) National Geophysical Data Center (NGDC) provides data on nighttime light from 5 different satellites covering 1992 to 2009 (Version 4). "Each satellite observes every location on the planet (between 65 degrees S latitude and 65 degrees N latitude) every night at some time between 8:30 and 10:00pm. Using night lights during the dark half of the lunar cycle in seasons when the sun sets early removes intense sources of natural light, leaving mostly man-made light. Readings affected by auroral activity (the northern and southern lights) and forest fires are also removed both manually and using frequency filters." There are a total of 30 files, each a zipped folder of 300MB. [The above quote is taken from Henderson, Storeygard and Weil (2008) Measuring Economic Growth from Outer Space, who use a previous version of the data]

22/10/2010 Data on Slavery: Nathan Nunn at Harvard University provides the data for his papers on his personal website, which includes (among others) US state-level data on slavery (1790-1860) and slavery data for The Americas in 1750. The data is in Stata format.

22/10/2010 Geographical data: Diego Puga at the Madrid Institute for Advanced Studies (IMDEA) provides data on 'terrain ruggedness' (the Terrain Ruggedness Index was originally devised by Riley, DeGloria, and Elliot (1999) to quantify topographic heterogeneity in wildlife habitats providing concealment for prey and lookout posts) which is used in a paper of Diego's with Nathan Nunn. The data (which also includes some other geographical variables) is in Stata format.
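
For the curious, here is a small Python sketch of how a ruggedness index of this kind can be computed for a single grid point. It assumes the definition as I understand it from the literature: the square root of the sum of squared elevation differences between a point and its eight neighbours. The elevation grid is invented and the published dataset involves further steps (e.g. averaging within countries), so treat this as an illustration only.

```python
import math


def ruggedness(elev, row, col):
    """Ruggedness at an interior grid point of a 2-D elevation grid.

    Assumed definition: square root of the sum of squared elevation
    differences between the point and its eight adjacent points.
    """
    centre = elev[row][col]
    total = 0.0
    for d_row in (-1, 0, 1):
        for d_col in (-1, 0, 1):
            if d_row == 0 and d_col == 0:
                continue  # skip the central point itself
            total += (elev[row + d_row][col + d_col] - centre) ** 2
    return math.sqrt(total)


# Hypothetical elevations in metres: a 2 m bump on a flat plain.
bump = [
    [10, 10, 10],
    [10, 12, 10],
    [10, 10, 10],
]
flat = [[10] * 3 for _ in range(3)]

tri_bump = ruggedness(bump, 1, 1)
tri_flat = ruggedness(flat, 1, 1)
print(tri_bump, tri_flat)
```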

22/10/2010 Subjective Well-Being: Betsey Stevenson at the Wharton School of UPenn has a bunch of data on subjective well-being, both US and cross-country, which resulted in a couple of papers with her colleague Justin Wolfers. Zipped data is in Stata 9 or 10 format (huge files!).

20/10/2010 Conflict data: The Heidelberg Institute for International Conflict Studies (HIIK) constructed the COSIMO database (project leader Frank R. Pfetsch), which records information on political conflicts between 1945 and today. At present, COSIMO 2.0 includes information on far more than 500 conflicts with over 2,500 phases. By the systematic recording of single conflict measures, the new conception enables the detailed description of the conflict development in violent and non-violent phases. In addition, the databank includes extensive information on the structure of state and non-state actors, that are recorded per year. At the moment the 2.0 version (renamed CONIS) is not available online, but you can email the project team. Version 1.3 is available in Excel format for 1945 to 1998.

08/10/2010 PISA studies: The OECD provides access to PISA data (Programme for International Student Assessment) for 2000 to 2009 (4 waves). The most recent data wave will be made available on 7 December 2010. The data is in SAS, SPSS or Text format and contains student, school and parent information/questions. This is for 30 OECD/high- and middle-income countries. There is a vast number of variables so you had better see for yourself. [via Gunilla Pettersson's developmentdata.org]

08/10/2010 Actionable Governance Indicators: The World Bank has consolidated the data on 'actionable' governance indicators in a single web portal, the AGI data portal. Actionable governance indicators are narrowly defined and disaggregated indicators that focus on relatively specific aspects of governance and could provide guidance on the design of reforms and monitoring of impacts. This means it provides links to over 1,000 indicators taken from sources such as AfroBarometer, the Doing Business surveys or the Press Freedom Index by Reporters without Borders. [via Gunilla Pettersson's developmentdata.org]

01/10/2010 A Century of Latin American development: Funded by the IADB, the Oxford Latin American Economic History Database (OxLAD) contains statistical series for a wide range of economic and social indicators covering twenty countries in the region for the period 1900-2000. Its purpose is to provide economic and social historians worldwide with a systematic recompilation of available statistical information in a single on-line source. The website also provides other resources including a long list of references, many of them in Spanish, and detailed discussion of the methodology of data construction. Downloads are in csv format.

29/09/2010 Inequality data: The Society for the Study of Economic Inequality (ECINEQ) has links to a number of datasets for the analysis of inequality. These include the Cross-National Equivalent File (CNEF) which contains equivalently defined variables for the British Household Panel Study (BHPS), the Household Income and Labour Dynamics in Australia (HILDA), the Korea Labor and Income Panel Study (KLIPS) (new!), the Panel Study of Income Dynamics (PSID), the Swiss Household Panel (SHP), the Canadian Survey of Labour and Income Dynamics (SLID), and the German Socio-Economic Panel (SOEP).

29/09/2010 New Empirical Microeconomics: Innovations for Poverty Action (IPA) is a research group comprising many of the most prominent academics of what I'd call the 'new empirical micro'. The outfit was founded by Dean Karlan and brings together the usual suspects at the frontier of development micro (Banerjee, Duflo, Fischer, Kremer, Miguel, etc). Their data website links to some of the data used in published work, e.g. for the de Mel, McKenzie and Woodruff RCT with firms in Indonesia, among many other RCTs. A second interesting resource (primarily in order to get to see where the field is going) is the database of ongoing and complete IPA projects, which can be searched by sector, researcher or country.

29/09/2010 African Governance Indicators: The Intrastate Conflict Program at Harvard's KSG creates the Index of African Governance. This "measures the degree to which five categories of political goods [are] provided within Africa's fifty-three (forty-eight in prior Indexes) countries. By comprehensively measuring the performance of government in this manner, that is, by measuring governance, the Index is able to offer a report card on the accomplishments of each government for the years being investigated-2000 and 2002 (for baseline indications) and 2005, 2006, and 2007... For those analysts who would like separately to explore the performance of countries on various aspects of governance, the Index includes scores in each of the five categories."

29/09/2010 Patents and IP legislation: The NBER Patent Data Project has US patent data for 1976-2006 and there are also some firm-matches available in this database. The World Intellectual Property Organisation (WIPO) offers WIPO Lex, a "one-stop search facility for national laws and treaties on intellectual property (IP) of WIPO, WTO and UN Members".

28/09/2010 Analysing previously unpublished UK micro and macro data: The UK Data Archive at the University of Essex has recently launched its Secure Data Service. Funded by the ESRC, this is intended to promote excellence in research by enabling safe and secure remote access by bona fide researchers to data hitherto deemed too sensitive, detailed, confidential or potentially disclosive to be made available under standard licensing and dissemination arrangements. Upon registration you'll be able to analyse UK data with your statistical software of choice via a remote desktop (i.e. you don't get the data for download, but you can analyse it on the UK SDS server).

28/09/2010 Trade intensity and Business Cycles: The IADB website hosts the data used in the work on trade intensity and business cycles by César Calderón, Alberto Chong and Ernesto Stein (2006, JIE). From the abstract: "Using annual information for 147 countries for the period 1960-99 we find that the impact of trade intensity on business cycle correlation among developing countries is positive and significant, but substantially smaller than that among industrial countries. Our findings suggest that differences in the responsiveness of cycle synchronization to trade integration between industrial and developing countries are explained by differences in the patterns of specialization and bilateral trade."

28/09/2010 Entrepreneurship: The Global Entrepreneurship Monitor (GEM) research program is an annual assessment of the national level of entrepreneurial activity. Data is collected for 'activity', 'aspirations', and 'attitudes and perceptions' (multiple variables under each rubric). Started as a partnership between London Business School and Babson College, it was initiated in 1999 with 10 countries, expanded to 21 in the year 2000, with 29 countries in 2001 and 37 countries in 2002. GEM 2009 is set to conduct research in 56 countries. GEM data for 1999 - 2006 is currently in the public domain. Full GEM datasets are made available to the public three years after the end of an annual data collection cycle. As such, GEM 2007 data will be made available to the public in January 2011. The data is in SPSS format.

27/09/2010 Public Debt panel data: The Inter-American Development Bank provides the Public Debt around the World database, which includes complete time-series of central government debt for 89 countries over the 1991-2005 period and for seven other countries for the 1993-2005 period. The data (both in STATA and EXCEL format) is described in Dany Jaimovich and Ugo Panizza (2006) "Public Debt around the World: A New Dataset of Central Government Debt" which is included in the zipped folder.

27/09/2010 Bank Ownership & Performance: The Inter-American Development Bank provides data on Bank Ownership and Bank Performance covering 119 countries over the 1995-2002 period. The methodology used to generate the data is described in Micco, Panizza and Yanez (2004) "Bank Ownership and Performance," IDB-RES working paper No. 518.

27/09/2010 Cross-country and micro conflict data: The Department of Economics at Royal Holloway, University of London hosts the Conflict Analysis Resources website. This not only comprises a large number of datasets related to the topic (Correlates of War, Termination of Civil War etc.) but also additional resources such as surveys of the literature and active researchers.

27/09/2010 Social Assistance programs database: A database of a different sort is provided by people at the Chronic Poverty Research Institute at Manchester University: in its 5th update/version the Social Assistance in Developing Countries Database "provide[s] a summary of the evidence available on the effectiveness of social assistance interventions in developing countries". If, for instance, you want to find out what the actual cash transfers of Progresa/Oportunidades amounted to, this document gives you a concise overview of the program.

13/09/2010 European Statistics: The United Nations Economic Commission for Europe (UNECE) provides data on Economics, Transport, Gender and Forestry (interesting mix!) on their website. Coverage varies, but the earliest date seems to be 1990. [via economicslinks]

13/09/2010 Financial Access Surveys: The IMF has a new database reporting the access to basic consumer financial services worldwide. At present this data covers 138 economies, nominally for the period 1998-2009, although most countries only have data from 2004 onwards. Annual information covers the reported use of banking services and access to banks' physical outlets. The data for all countries and time periods can be downloaded as an Excel file. [via economicslinks]

13/09/2010 World Bank data set free: The major World Bank datasets (including WDI, Doing Business and Enterprise Surveys) are now all accessible (and in some cases directly downloadable) from one single website. [via economicslinks]

09/09/2010 Financial Development Resources: Huang Yongfu at Cambridge's Land Economy department has some links to datasets on Financial Development as well as other resources on the topic (researchers in the area, papers).

25/08/2010 Two important data depositories: BREAD (Bureau for Research in Economic Analysis of Development) provides links to a large number of survey as well as aggregate-level data. This includes for instance the district level data for India in the analysis of weather and mortality in rural versus urban India carried out by Robin Burgess and co-authors (presented at the Glasgow EEA, August 2010). The LSE's development department STICERD (The Suntory and Toyota International Centres for Economics and Related Disciplines) has a "virtual center" for fieldwork in Development Economics. This not only includes datasets and related materials (questionnaires etc.) but also resources related to methodology, including 'The Basics of Developing Questionnaires'.

18/08/2010 Firm-level data on management practices: The World Bank enterprise surveys division provides updated raw survey data (free registration) for Belarus, Bulgaria, Germany, India, Kazakhstan, Lithuania, Poland, Romania, Russia, Serbia, Ukraine, and Uzbekistan. These countries are part of the new Management, Organization and Innovation survey work. In total, 1,777 firms were surveyed. 'The purpose of the survey is to measure and compare management practices across countries; to assess the constraints to private sector growth and enterprise performance resulting from management practices; and to stimulate policy dialogue about management practices and innovation.' Interesting work in the area of cross-country analysis of management practices (and their impact on productivity) has been carried out by John Van Reenen (LSE) with Nick Bloom (Stanford) and various co-authors. On the latter's webpage there are links to a number of the large datasets they have created.

11/08/2010 Vietnamese and Eastern European firm data: Chris Woodruff (Warwick/UCSD) has links to firm-level manufacturing data from Hanoi and Ho-Chi-Minh-City from the mid-1990s. Detailed documentation is provided. There is also data for manufacturing firms surveyed in five Eastern European countries, Poland, Slovakia, Romania, Russia and Ukraine.

11/08/2010 Townsend Thai Data Project: The Townsend Thai Project (initiated and headed by Robert Townsend at MIT) data include both annual and monthly panels, in addition to the collection of environmental data. Originally the Townsend Thai survey focused on villages in four provinces, two in the Northeast and two in the Central region. The baseline survey was conducted in 1997. To date, the Townsend Thai project continues to resurvey the annual and monthly panels. In 2006, the annual surveys extended to include urban areas in the same four provinces. In 2003, an annual survey of villages in the South was added and in 2004, two provinces in the north were included in the annual survey. The project emerged as a means to understand the broader economic and social context in which policies are enacted and research is conducted. Its goal is to build a bridge between policy and research by providing rich data from which academics and policy-makers alike can better understand household activities and behavior, as well as their relationship to the broader regional and national economy.

11/08/2010 BICS data: Usually referred to as the BRICS (Brazil, Russia, India, China, South Africa), the fortunes of a group of emerging economies are of particular interest to many development economists. As part of the Pathfinder project the UK ESRC (research council for economics and other social sciences) has published Data Discovery - A rough guide to microdata in Brazil, China, India and South Africa. This details datasets from the four countries and discusses some of the issues involved in public access to data. Focus is on micro-data for health, education, firms, labour markets, housing and crime.

09/08/2010 Seminal inequality dataset: The dataset on income inequality compiled by Klaus Deininger and Lyn Squire for the World Bank is one of the most commonly used datasets to investigate any links between inequality and growth at the macro level. The data are unevenly distributed across 138 countries over the period 1890-1996 (with much shorter and more sporadic coverage for the vast majority of countries). For some countries the data include not merely the Gini but also cumulative quintile shares. Everything is available for download in Excel format.
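
Where only cumulative quintile shares are reported, a rough Gini can be backed out by approximating the Lorenz curve with straight-line segments. The following is a minimal sketch of that standard trapezoid calculation, not the method used by Deininger and Squire; the function name and the example shares are made up for illustration. Because it assumes incomes are equal within each quintile, it will tend to understate the true Gini.

```python
def gini_from_cumulative_shares(cum_shares):
    """Approximate Gini from cumulative income shares of equal-sized groups.

    cum_shares holds the cumulative income share (as a fraction) at the top
    of each group, e.g. five values for quintiles, and must end at 1.0.
    """
    n = len(cum_shares)
    if n == 0 or abs(cum_shares[-1] - 1.0) > 1e-9:
        raise ValueError("cumulative shares must be non-empty and end at 1.0")
    lorenz = [0.0] + list(cum_shares)  # Lorenz curve starts at the origin
    width = 1.0 / n                    # each group holds 1/n of the population
    # Area under the piecewise-linear Lorenz curve (trapezoid rule).
    area = sum((lorenz[i] + lorenz[i + 1]) * width / 2 for i in range(n))
    return 1.0 - 2.0 * area


# Hypothetical cumulative quintile shares (not taken from the dataset).
example_shares = [0.05, 0.15, 0.30, 0.55, 1.0]
example_gini = gini_from_cumulative_shares(example_shares)
print(round(example_gini, 2))  # 0.38
```

With perfectly equal shares (0.2, 0.4, 0.6, 0.8, 1.0) the same function returns zero, which is a quick sanity check before applying it to real series.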

05/08/2010 Multi-dimensional Poverty Index: The people at OPHI (Oxford Poverty & Human Development Initiative) have developed a new poverty index, which is 'multi-dimensional' (MPI). Sabina Alkire and Maria Emma Santos designed the MPI using a technique for multidimensional measurement created by Sabina Alkire and James Foster. OPHI analysed poverty across 78% of the world’s people in 104 developing countries using the MPI and released the results in advance of the 2010 HDR. For now this is sort of a cross-section, available for download in Excel format.
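
For readers new to the Alkire-Foster technique, the core arithmetic is short: each household gets a weighted deprivation score, households at or above a cutoff count as poor, and the index is the headcount ratio times the average score among the poor. The sketch below illustrates this under stated assumptions: the one-third cutoff is the value commonly associated with the MPI but is not given in the blurb above, and the three indicators, weights and households are invented for the example rather than taken from OPHI's files.

```python
def alkire_foster_mpi(deprivations, weights, cutoff=1 / 3):
    """Return (headcount ratio H, intensity A, index H * A).

    deprivations: one row per household, 1 = deprived in that indicator.
    weights: indicator weights summing to 1 (hypothetical here).
    cutoff: minimum weighted score to count as poor (assumed to be 1/3).
    """
    if not deprivations:
        raise ValueError("need at least one household")
    scores = [sum(w * d for w, d in zip(weights, row)) for row in deprivations]
    poor_scores = [s for s in scores if s >= cutoff]
    headcount = len(poor_scores) / len(scores)
    # Intensity is the average deprivation score among the poor only.
    intensity = sum(poor_scores) / len(poor_scores) if poor_scores else 0.0
    return headcount, intensity, headcount * intensity


# Four hypothetical households, three hypothetical indicators.
weights = [0.5, 0.25, 0.25]
households = [
    [1, 0, 0],  # score 0.50 -> poor
    [0, 1, 0],  # score 0.25 -> not poor
    [1, 1, 1],  # score 1.00 -> poor
    [0, 0, 0],  # score 0.00 -> not poor
]
H, A, mpi = alkire_foster_mpi(households, weights)
print(H, A, mpi)  # 0.5 0.75 0.375
```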

05/08/2010 Venezuelan firm data: The data for one of the seminal papers in the FDI spillover literature (firm-level), Aitken & Harrison (1999, AER) is available on Ann Harrison's website at UC Berkeley. This covers over 10,000 Venezuelan firms in the period 1976-1989 with an average of 4 waves of data per firm (41,000 observations). Variables include KLEM with two types of labour, plus a number of expenditures.

05/08/2010 Political freedom datasets: Hard to comprehend, really, but I seem to have so far missed out on linking to one of the most frequently used resources when it comes to 'freedom in the world'. Freedom in the World Comparative and Historical Data by Freedom House provides country-level scores for political rights and civil liberties from 1973 onwards, plus a dataset on electoral democracy which they started collecting in the late 80s. All are available free for download, unlike the Political Risk Services Group's International Country Risk Guide (ICRG), which is $425. You should also have a look at the links provided by Freedom House in the 'Resources' tab. [Thanks to Nalan Basturk at the Erasmus School of Economics in Rotterdam for pointing this out]

14/07/2010 Cross-country Macro datasets: Bill Easterly at NYU provides a number of macro data series (mostly WDI, also PWT among other sources) called the 'Global Development Network Growth Database'. There are also dataseries for 'fixed factors' (geographical data) and government finance.

04/06/2010 (South) African Micro datasets: DataFirst is a Survey Data Archive and training facility at the University of Cape Town, South Africa. The Archive’s holdings include the datasets from all major South African surveys, as well as survey data from other African countries. But: Due to copyright restrictions, the datasets themselves are not downloadable from the site but survey data from surveys conducted by the University of Cape Town are available from DataFirst's website via the Public Access Catalogue.

04/06/2010 Micro datasets: An excellent new resource for household or firm-level data from LDCs is OpenMicroData. I do like their approach: 'OpenMicroData is run by a network of empirical researchers who believe that microdata should be freely available.' Good thinking, guys. So far I can see some of the CSAE African firm and hh datasets linked, as well as some data from randomised experiments in education from Burkina Faso. The site has only been up for a few months. [Gunilla Pettersson featured the new site on her excellent devdata website]

20/05/2010 Micro-panel on HIV in Malawi: The Malawi Diffusion and Ideational Change Project (MDICP) is a collaboration by people at UPenn and two medical colleges in Malawi. The focus of the study is on the roles of social interactions in (1) the acceptance (or rejection) of modern contraceptive methods and of smaller ideal family size; and (2) the diffusion of knowledge of AIDS symptoms and transmission mechanisms and the evaluation of acceptable strategies of protection against AIDS. The website provides a great deal of information about this and a sister project in Kenya, including papers, qualitative surveys and the quants data. [featured by Masa on Devecondata]

06/05/2010 Cross-Country Inequality Data from LAC: The Socio-Economic Database for Latin America and the Caribbean (SEDLAC) provides statistics on poverty and other distributional and social variables from 25 Latin American and Caribbean countries, based on microdata from household surveys. [Masa featured the new site on her excellent devdata website]

05/05/2010 Time Series Data website: Rob Hyndman at Monash University in Australia has a dedicated 'Time Series Data Library' which is organised by topic. Rob has a brilliant motto printed at the bottom of the site: "In God we trust. All others must have data." (W. Edwards Deming)

05/05/2010 Econometrics Links Data website: My favourite Royal Economic Society website, econometricslinks.org has a dedicated data website. Focus here is on time series and in particular finance data but there are a few good links also for development economists.

30/04/2010 Major overhaul at The World Bank: The World Bank has reorganised access to the major cross-country panel datasets it produces, all of which are now available (for browsing or download) from a single website. [Gunilla Pettersson featured the new site on her excellent devdata website]

29/04/2010 Can't get enough of that cross-country education data: Mauro Caselli, Jörg Mayer and Adrian Wood have compiled a unique extension to the Barro-Lee (2001) and Cohen-Soto (2001) data on average adult years of schooling (attainment) using UNESCO data on literacy rates. Missing values are imputed based on a regression model investigating the link between average adult education and literacy rates in the available data and applied to countries where the attainment variable is missing but literacy rates are available. Of the 133 countries covered, no imputations were needed for 95, imputations for some but not all years for 19, and imputations for all years for 19. The link is for a zipped folder containing Excel and Stata files as well as detailed documentation. The data is applied in a paper by Jörg and Adrian investigating the global impact of China's industrialisation on other LDCs' structural change. [Thanks to Adrian Wood for making the data available.]
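
To make the imputation idea concrete, here is a minimal sketch: fit a regression of attainment on literacy where both are observed, then predict attainment where only literacy is available. The authors' actual regression specification is not described above, so the simple bivariate linear fit, the function name and all the numbers below are assumptions for illustration only.

```python
def impute_schooling(literacy, schooling):
    """Fill missing schooling values from a linear fit on literacy rates.

    literacy: dict mapping country -> literacy rate (percent).
    schooling: dict mapping country -> average years of schooling, or None.
    Observed values are kept as they are; only None entries are predicted.
    """
    observed = [(literacy[c], s) for c, s in schooling.items() if s is not None]
    n = len(observed)
    if n < 2:
        raise ValueError("need at least two observed countries to fit a line")
    mean_x = sum(x for x, _ in observed) / n
    mean_y = sum(y for _, y in observed) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in observed)
    if sxx == 0:
        raise ValueError("literacy rates do not vary across observed countries")
    # Ordinary least squares slope and intercept for one regressor.
    slope = sum((x - mean_x) * (y - mean_y) for x, y in observed) / sxx
    intercept = mean_y - slope * mean_x
    return {
        c: (s if s is not None else intercept + slope * literacy[c])
        for c, s in schooling.items()
    }


# Hypothetical countries and values, purely for illustration.
literacy = {"A": 40.0, "B": 60.0, "C": 80.0, "D": 50.0}
schooling = {"A": 2.0, "B": 4.0, "C": 6.0, "D": None}
filled = impute_schooling(literacy, schooling)
print(round(filled["D"], 2))  # 3.0
```

Anyone using the real files should keep a flag for which observations were imputed, since the blurb notes that 38 of the 133 countries have imputations for some or all years.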

23/04/2010 Yet more cross-country education data: Christian Morrisson and Fabrice Murtin from the OECD have constructed a historical database (entry under 'A century of education') on educational attainment in 74 countries for the period 1870-2010 (decadal estimates), using the perpetual inventory methods before 1960 and then the Cohen and Soto (2007) database. This should be particularly interesting in combination with for instance the Maddison data.

21/04/2010 Barro-Lee update: The seminal dataset on educational attainment, compiled by Robert Barro and Jong-Wha Lee, is available from a new dedicated website. The data is available for download in full for 146 countries by 5-year age group, or for the population aged 15 and over or 25 and over, in 5-year intervals for the period 1950-2010 (in xls, csv, or dta format). The site also links to some previous versions of the dataset and other resources, including Soto and Cohen (2006) and a few select academic papers. [Thanks to Adrian Wood for the pointer.]

25/03/2010 Enterprises in Emerging Markets: The Business Environment and Enterprise Performance Survey (BEEPS) is a joint initiative of the European Bank for Reconstruction and Development (EBRD) and the World Bank. The survey was first undertaken on behalf of the EBRD and World Bank in 1999–2000, when it was administered to approximately 4000 enterprises in 26 countries of Eastern Europe and Central Asia (including Turkey) to assess the environment for private enterprise and business development. There now exist four rounds of this data, which is available in STATA format for the 2002-2009 panel and for individual years. The objective of the survey is to obtain feedback from enterprises in EBRD countries of operation on the state of the private sector as well as to help in building a panel of enterprise data that will make it possible to track changes in the business environment over time. The survey examines the quality of the business environment as determined by a wide range of interactions between firms and the state and as such facilitates research and serves as an input into policy dialogue with countries in Central and Eastern Europe.

25/03/2010 Aid and Development: A new database for all metrics related to foreign aid has been launched with a conference in Oxford this week: AidData has compiled figures "from a range of official sources, including the OECD Creditor Reporting System (CRS) database, donor annual reports, project documents from both bilateral and multilateral aid agencies, and data gathered directly from donor agencies". Crucially, the database covers both commitments and disbursements (which like in the FDI case deviate considerably) and refers to grants, mixed loans and grants, loans at discretionary rates from multilateral agencies, loans/loan guarantees at market rates, technical assistance, and sector program aid transfers in cash or in kind. There's a blog and lots of dedicated tools and information about aid data. All of this is the follow-up to the PLAID Project (a partnership of the College of William and Mary and Brigham Young University) which has now merged with Development Gateway's Accessible Information on Development Activities (AiDA) [thanks to Nic van de Sijpe for the pointer].

24/03/2010 Mix of datasets (cont'd): Without wanting to sound patronising, I applaud anybody's attempts to make data more widely available, so congratulations to a new upstart called Google, offering access to some World Bank, Eurostat and US data on their website. Don't try and google "Google data" as you won't find it that way ;-) This resource is useful primarily for their data visualisation tool: for any individual variable, the country series can be graphed as lines over time, as bars or on maps [thanks to Paddy Carter at Bristol for the pointer].

24/03/2010 Mix of datasets: The Inter-American Development Bank (IADB) has created DataGov, providing governance indicators from key public databases consolidated for all countries in the world. This site has changed quite a bit since I last had a look at it - everything is now in graphs using Flash (I imagine), but there's still the opportunity to download the data to Excel [thanks to Paul Clist for reminding me].

20/03/2010 Cross-country Human Capital data: Marcelo Soto and Daniel Cohen have constructed a rival to the Barro & Lee gold standard of data on average years of schooling across 95 countries. From the abstract of their Journal of Economic Growth paper (Vol.12(1), 2007): "We present a new dataset for years of schooling across countries for the 1960–2000 period. The series are constructed from the OECD database on educational attainment and from surveys published by UNESCO. Two features that improve the quality of our data with respect to other series, particularly for series in first-differences, are the use of surveys based on uniform classification systems of education over time, and an intensified use of information by age groups." [thanks to my man Fabio Manca for pointing me to this resource].

22/02/2010 Macro Panel Data and Tools: The resource website Macro Data 4 Stata homogenises several commonly used macroeconomic datasets and imports them into Stata. The project is run by Giulia Catini, Ugo Panizza and Carol Saade and started uploading .dta files fairly recently. The library at present includes data from the Penn World Table and the Groningen Growth and Development Data Centre. The AAA Codes dataset looks particularly handy for anybody doing cross-country analysis [thanks to Aid-man Nic Van de Sijpe for pointing me to this resource].

05/02/2010 Rural Household Surveys: The Rural Income Generating Activities (RIGA) project has created an internationally comparable database of household income sources from existing household living standards surveys for low and middle-income countries. Most of the surveys used by the RIGA project were developed by national statistical offices in conjunction with the World Bank as part of its Living Standards Measurement Study. The database is maintained by the FAO. At present the database incorporates 27 surveys covering 16 countries in Africa, Asia, Eastern Europe and Latin America. In addition RIGA provides a link to research papers that have used the data [thanks to Alberto Zezza at the FAO for letting me know].

23/01/2010 Yale Longitudinal Surveys: Chris Udry at Yale's Economic Growth Center (EGC) provides access to household survey data. The introduction to the surveys states that "The surveys would begin with a (clustered) random sample of approximately 5,000 households in 200 communities in rural and urban areas of each country. Every three years following the initial survey, a (stratified) random sample of each individual in the original 5,000 households would be followed for re-interviews." Other than the above document there is not much obvious documentation, but there is data for Ghana and Nigeria, some of it in Stata format (with do-files).

23/01/2010 Spatial data: Gridded Population of the World (version 3), constructed by the Center for International Earth Science Information Network (CIESIN) at Columbia University, provides spatial data on population around the world in 1990, 1995, 2000 with 2.5 arc-minute grid resolution [thanks to Masa at DEVECONDATA for reporting this link].
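
A 2.5 arc-minute resolution means 24 cells per degree (each cell is roughly 4.6 km wide at the equator), so matching survey coordinates to grid cells is simple arithmetic. The sketch below assumes a grid whose top-left corner sits at 90 degrees N, 180 degrees W and covers the whole globe; the actual extent and origin of the GPW files may differ, so check the file header before relying on these indices. The function name is made up.

```python
import math

CELLS_PER_DEGREE = 24  # 60 arc-minutes / 2.5 arc-minutes per cell


def grid_cell(lat, lon, north=90.0, west=-180.0):
    """Return the (row, col) of the 2.5 arc-minute cell containing a point.

    Rows count downwards from the northern edge and columns count eastwards
    from the western edge. The default origin is an assumption, not a value
    read from the GPW documentation.
    """
    if not (-90.0 <= lat <= 90.0) or not (-180.0 <= lon <= 180.0):
        raise ValueError("latitude or longitude out of range")
    row = math.floor((north - lat) * CELLS_PER_DEGREE)
    col = math.floor((lon - west) * CELLS_PER_DEGREE)
    return row, col


# Example point at 51.5 N, 0.125 W.
print(grid_cell(51.5, -0.125))  # (924, 4317)
```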

23/01/2010 Consumption measurement: If you are interested in calorie consumption, you need to convert the amounts of food consumption (collected from household surveys) to obtain the data. Annex 1 of the FAO (2001)'s Food Balance Sheets: A Handbook provides the conversion factors (how many kilo calories 100 grams of food contain) for a wide variety of foods for international use. For India consult Gopalan, Sastri, and Balasubramanian's book entitled Nutritive Value of Indian Foods (Hyderabad: National Institute of Nutrition, 1971) [thanks to Masa at DEVECONDATA from which both of these links are lifted].
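
The conversion itself is a lookup and a multiplication: grams consumed, divided by 100, times the kilocalories per 100 grams, summed over foods and then divided by household size and the length of the recall period. A minimal sketch follows; the conversion factors and quantities are hypothetical placeholders rather than the values printed in the FAO handbook, and the function name is invented.

```python
def daily_kcal_per_capita(quantities_g, kcal_per_100g, household_size, days):
    """Convert household food quantities into kcal per person per day.

    quantities_g: dict mapping food -> grams consumed over the recall period.
    kcal_per_100g: dict mapping food -> kilocalories per 100 grams.
    """
    if household_size <= 0 or days <= 0:
        raise ValueError("household_size and days must be positive")
    total_kcal = 0.0
    for food, grams in quantities_g.items():
        if food not in kcal_per_100g:
            # Fail loudly rather than silently dropping an unmatched food item.
            raise KeyError(f"no conversion factor for {food!r}")
        total_kcal += grams / 100 * kcal_per_100g[food]
    return total_kcal / (household_size * days)


# Placeholder factors: look up the real ones in the handbook annex.
factors = {"rice": 350.0, "beans": 340.0}
# Hypothetical 7-day recall for a household of four.
consumed = {"rice": 14000.0, "beans": 3500.0}
kcal = daily_kcal_per_capita(consumed, factors, household_size=4, days=7)
print(kcal)  # 2175.0
```

Raising an error on unmatched food names is a deliberate choice here: survey food codes rarely line up one-to-one with a conversion table, and dropped items would bias calorie totals downwards.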

23/01/2010 Globalisation: The KOF Swiss Economic institute at the Eidgenoessische Technische Hochschule (ETH) in Zurich offers the KOF Index of Globalization, which measures three main dimensions of globalization (economic, social, political) in addition to variables measuring actual economic flows, economic restrictions, data on information flows, data on personal contact and data on cultural proximity. Data are available on an annual basis for 208 countries over the period 1970-2007. This index is based on work by Axel Dreher (Goettingen, affiliated to KOF Swiss Economic Institute at ETH) and co-authors [thanks to the ETH for letting me know].

23/01/2010 Weather data: The NASA Goddard Space Flight Center provides various data for the Global Precipitation Climatology Project (GPCP). The Global Monthly Merged Precipitation Analyses of GPCP, available from 1979 to the present day, should prove the most interesting.

23/01/2010 Data on Innovation and R&D: Bronwyn Hall's (UC Berkeley) website has a number of links to databases, research papers and methodology papers on conducting productivity analysis when analysing innovation. [Thanks to Christian Helmers for these links].

23/01/2010 How to merge databases: A paper by Thoma and co-authors entitled 'Methods and software for the harmonization and combination of datasets: A test based on IP-related data and accounting databases with a large panel of companies at the worldwide level' should be a great resource for anybody wanting to merge firm-level data for productivity analysis. [Thanks to Christian Helmers for pointing out this paper].

12/01/2010 China Micro data: Nancy Qian at Yale has links to a number of Chinese household surveys on her website, including the China Health and Nutrition Survey (CHNS) at University of North Carolina Population Center as well as the familiar CHIP data (China Household Income Project) available through ICPSR.

12/01/2010 Citizenship law data: Graziella Bertocchi and Chiara Strozzi at the Università degli studi di Modena e Reggio Emilia have constructed the Citizenship Laws dataset, which contains information on citizenship laws in 162 countries of the world with reference to the years 1948, 1975, and 2001. "The available information concerns the way in which countries regulate citizenship acquisition at birth, with a distinction among jus soli (i.e., by birthplace), jus sanguinis (i.e., by descent), and mixed regimes. We also collect information about naturalization requirements... The dataset also contains information for the main border changes which have affected the countries in our sample."

12/01/2010 CGD data archive: The Washington-based Center for Global Development (Roodman, Radelet, Subramanian, Birdsall, Clemens and many others) have a link to datasets on their publications website. Highlights include data on 'the fate of young democracies' (since 1960), net-aid transfers (1960-2007), and African Health Professionals Abroad (Gunilla Pettersson worked on this dataset!).

12/01/2010 Commitment to aid: From the same source is David Roodman's Commitment to Development Index (CDI), which "rates 22 rich countries on how much they help poor countries build prosperity, good government, and security. Each rich country gets scores in seven policy areas, which are averaged for an overall score." The CDI was first compiled in 2003.

12/01/2010 Agriculture data: CIMMYT, which stands for International Maize and Wheat Improvement Center (didn't you know?), have currently got three separate datasets on their website. First, some price series for wheat, maize, sorghum, barley, rice, oil, fertilizers, and freight rate for wheat. This is a good dataset to act as reference global market price, since there are monthly observations for e.g. CIF Rotterdam price, but coverage varies a lot. Another interesting dataset is for agricultural production in Mexico: Agricultural information (1980-2008) related to planted area, harvested area, production, and production value of 657 permanent and seasonal crops, per cycle and regime (entries are in Spanish, but it's not too difficult to guess what 'valor ($)' means...). Finally, they report the FAO data but you can pick alternative regional aggregation. [Thanks to Doug Gollin for these links]

12/01/2010 China Micro data: Nancy Qian at Yale has links to a number of Chinese household surveys on her website, including the China Health and Nutrition Survey (CHNS) at University of North Carolina Population Center as well as the familiar CHIP data (China Household Income Project) available through ICPSR.

10/01/2010 Data on Patents and R&D: The World Intellectual Property Organisation (WIPO) publishes the World Intellectual Property Indicators, which include for instance data for "Patent applications by patent office (1883-2008)" (read: country) which can be downloaded as an Excel or CSV file. Similarly of great interest should be "Patent grants by patent office (1883-2008)" and other statistics on 'Patents in Force' and 'Patent Intensity'. WIPO also has further resources on trademarks and plant varieties(!) among others. A second resource for patent data is the European Patent Office (EPO) which has a number of free databases on its website. Thirdly, the OECD maintains $$ OECD.stat which has statistics on R&D, patents and other science & technology topics. Provision is limited to the OECD member states, the BRICS and a small number of other countries. [Thanks to Christian Helmers for these links]

08/01/2010 More data on agricultural price and trade policy: The OECD has a dedicated database for PSE (Producer and Consumer Support Estimates) which covers OECD member states as well as a small number of Eastern European 'Emerging' Economies and the BRICS countries for 1986 to 2008. A recent paper by Kym Anderson compares and contrasts the methodology applied in his own work constructing measures of agricultural trade policy with that of the OECD.

07/01/2010 IFPRI surveys: The International Food Policy Research Institute (IFPRI) offers a wide range of household and community-level surveys on its data website. Chief among these is the set of Ethiopian Rural Household Surveys (ERHS), collected in 6 waves between 1989 and 2004, which is provided with all additional information, questionnaires etc. Note that despite the Amazon-style lingo ('Basket', 'Proceed to checkout') all you need to do is register on the site: then you can access/download all of the datasets featured. The datasets can also be accessed from the IFPRI Dataverse entry.

07/01/2010 Political Particularism: A dataset on electoral systems developed by Ugo Panizza and others at the Inter-American Development Bank provides indicators for the degree to which individual politicians can further their careers by appealing to narrow geographic constituencies on the one hand, or party constituencies on the other. The data covers 183 countries and runs from 1978 to 2001 (unbalanced panel). Excel and Stata files, as well as a working paper describing the data and highlighting its potential use in research exploring the connections between electoral systems and economic outcomes, are available for download.

07/01/2010 Labour: The International Labor Organization (ILO) maintains the LABORSTA database. This provides data for up to 200 countries under the rubrics of (un-)employment, wages, strikes and lockouts, as well as international labour migration (among others).

07/01/2010 Agricultural Policy: The Food and Agricultural Policy Research Institute (FAPRI), with research centers at the Center for Agricultural and Rural Development (CARD) at Iowa State University and the Center for National Food and Agricultural Policy (CNFAP) at the University of Missouri-Columbia, provides data on commodities and agricultural policy (at the product level) for a large number of developing and developed countries (time-series dimension differs widely across variables, products and countries).

07/01/2010 Overseas Development Aid: The OECD maintains the QWIDS Query Wizard for International Development Statistics, which helps when you are selecting and downloading aid-related statistics.

07/01/2010 Health: The WHO maintains WHOSIS (Statistical Information System) which has data on mortality, health services coverage, inequities in health care access among other rubrics. Time series begin in 1990 but are not annual.

05/01/2010 Distortions to Agricultural Incentives: The World Bank recently completed a big data compilation exercise for Distortions to Agricultural Incentives, with a team of researchers headed by Kym Anderson providing various Estimates of Distortions to Agricultural Incentives (1955-2007). A core database provides data for Nominal Rates of Assistance to producers (NRAs), together with a set of Consumer Tax Equivalents (CTEs), for farm products and a set of Relative Rates of Assistance to farmers in 75 focus countries. Note that the variable 'border price' (bp) does however not represent the... how can I say this... 'border price', but a hypothetical producer price in the absence of distortions (domestic producer price divided by (1+NRA) and expressed in USD). The border price (fob) is not contained in the main datafile but can be found in the individual country spreadsheet (rows 37-39 for primary products, or 44-46 for lightly processed products). I am grateful to Kym Anderson and Ernesto Valenzuela for clarification; they also point to an alternative data reporter at Adelaide University where they are both based.
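
To make the definition of 'bp' above concrete, here is a minimal Python sketch of the arithmetic. It assumes that the NRA is expressed as a proportion (0.25 for 25 per cent) and that the domestic producer price is already in USD; the function name and the numbers are purely illustrative and are not taken from the database.

```python
def undistorted_producer_price(domestic_price_usd, nra):
    """Hypothetical producer price in the absence of distortions.

    Follows the description in the text: domestic producer price divided
    by (1 + NRA). Assumes nra is a proportion, e.g. 0.25 for 25 per cent.
    """
    if nra <= -1:
        raise ValueError("NRA must be greater than -1")
    return domestic_price_usd / (1 + nra)


# Illustrative numbers only (not from the database):
# a domestic price of 250 USD with an NRA of 0.25 implies a 'bp' of 200 USD.
print(undistorted_producer_price(250.0, 0.25))
```

A positive NRA (assistance to producers) pushes the domestic price above this hypothetical price, while a negative NRA (taxation of producers) pushes it below.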

05/01/2010 Agricultural Market Access: The Agricultural Market Access Database (AMAD) is a collection of available public data on WTO market access in agriculture. It contains data for over 50 countries. After registration, all files can be downloaded for free (self-extracting zip files) and there is documentation on how to do this.

05/01/2010 Firm-level data: The Development Economics Research Group (DERG) at the Department of Economics, University of Copenhagen has (since 2001) been involved in several enterprise surveys in Vietnam (SMEs) and Mozambique. The former has 3 waves since 2002 with the fourth (2009) one coming on-stream soon, the latter has 2002/2006 data.

05/01/2010 Climate change data: Sea-level rise and storm-surge intensification data compiled by Susmita Dasgupta at the World Bank. The former assesses consequences of continued SLR for 84 coastal developing countries, providing data on the impacted land area, population, GDP, agricultural area, urban area and wetlands if sea-levels rise by 1 to 5 meters (excel worksheets). The latter considers the potential impact of a large (1-in-100-year) storm surge by contemporary standards, and then compares it with its 10% intensification which is expected to occur in this century. Again the impact on land area etc. is provided.

05/01/2010 Gender-related data: The World Bank provides GenderStats, which basically pulls out the relevant variables from the WDI database. Hit "Create your own query" to access the database.

05/01/2010 External debt data: The World Bank provides (in collaboration with the IMF) the Quarterly External Debt Statistics - these come in two variants, the general (GDDS) and special (SDDS) dissemination. Currently, sixty countries have agreed to participate in the SDDS/QEDS database and forty-two Low-Income Countries (LICs) to provide data to the GDDS/QEDS database. Data begins in 1998.

04/01/2010 African and Indian Household surveys: Stefan Dercon at Oxford University provides links to a number of datasets he has helped collect, including a Rural Household Survey for Ethiopia (panel), the Kagera Health and Development Survey (Tanzania, panel) and ICRISAT data, as well as Young Lives (see separate entry). Entirely unrelated, Stefan also provides this gem.

04/01/2010 CGE and DSGE models for African countries: Chris Adam at Oxford University provides the files for the social accounting matrix and GAMS/Matlab program code for his work on aid in African economies.

04/01/2010 Macro-modelling for South Africa: John Muellbauer and Janine Aron at CSAE run a research project on 'Structural Macro-Modelling of the South African Economy' (SMMSAE). They provide a number of indicators and indices they constructed, including FLIB (financial liberalisation) and trade openness indicators in excel/CSV format. The SMMSAE website also links to the papers they have written on macro-modelling for SA.

03/01/2010 Data depository: The William Davidson Institute provides macro and micro data on emerging and transition economies, the Davidson Data Center and Network. When I checked out the website none of the browsing tools worked, but the keyword search delivered a lot of interesting leads. The database also contains links to other databases, such as the China Data Center at U Michigan.

03/01/2010 $$ Chinese macro and micro data: when researching provincial FDI I frequently made use of the China Data Center at U Michigan. Much of the more recent data (primarily statistical yearbooks for various topics as well as provincial statistical yearbooks) is downloadable as Excel worksheets, whereas the earlier data is available in pdf format. There are also the China Survey Data Network and various census datasets. Researchers at Universities may find that their institution has forked out for the annual subscription fee and that they can access these data without additional cost.

03/01/2010 Demographic data: The Minnesota Population Center is the "world’s leading developer" of historical and international census demographic data, most of which are focused on the US and Western Europe, although the IPUMS (Integrated Public Use Microdata Series) International data covers 44 countries using 130 censuses.

03/01/2010 Income data: The Luxembourg Income Study (LIS) is a cross-national Data Archive and a Research Institute located in Luxembourg. The LIS archive contains two primary databases. The LIS Database includes income microdata from a large number of countries at multiple points in time, starting from the early 1980s. The newer LWS Database includes wealth microdata from a smaller selection of countries. Both databases include labour market and demographic data as well.

03/01/2010 Data on values and cultural change: The World Values Survey represents 5 waves of data from the early 1980s to the late 2000s, covering survey data from 87 nations. The data is provided in SPSS, STATA and SAS formats. Variables relate to individuals' happiness, how they feel, what is important in their lives, qualities their children should learn etc.

02/01/2010 Education finance data: IMF data on public spending on education, from 1985-2000 for 147 developing and transition economies (via Gunilla Petterson). There are two indicators in the module: (1) total public spending on education as a percent of GDP; and (2) total public spending on education as a percent of total government spending. The underlying data, in millions of local currency, are provided. The breakdown of total education spending into current and capital spending is provided when available. The technical notes describe differences in coverage.

02/01/2010 Historic Commodity Price data (1835-1950) for 35 countries, which includes some developing and colonised countries such as China, Cuba, Ceylon, among others. The data is provided by Chris Blattman and the link gives a number of papers and references with detailed information on the data (excel/CSV). I got this link off Masa Kudamatsu's blog.

02/01/2010 African Election data: Staffan I. Lindberg at University of Florida provides the Elections and Democracy in Africa (1989-2003) dataset. This includes variables providing information about the voter turnout, whether the incumbent accepted the election and other interesting pol-econ data, described in detail here. I got this link off Masa Kudamatsu's blog.

02/01/2010 Law, Debt, Informal Economy and Labour Regulation data: Andrei Shleifer's website provides links to a number of datasets he has compiled and used with various co-authors. This includes 'Private Credit in 129 Countries' (JFE 2007, with S. Djankov and C. McLiesh), with data from 1978-2002 and data on the 'unofficial economy' (primarily cross-section data).

02/01/2010 Impact of Reforms in Russia: The Russia Longitudinal Monitoring Survey (RLMS) is a series of nationally representative surveys designed to monitor the effects of Russian reforms on the health and economic welfare of households and individuals in the Russian Federation. These effects are measured by a variety of means: detailed monitoring of individuals' health status and dietary intake; precise measurement of household-level expenditures and service utilization; and collection of relevant community-level data, including region-specific prices and community infrastructure data. Data have been collected sixteen times since 1992. The project is based at the University of North Carolina at Chapel Hill and directed by Barry Popkin.

02/01/2010 Taxes: World Tax Database, provided by the Ross School of Business, University of Michigan. Variables include Tax Revenue and Tax Rates. Data is from 1974 to 1999.

2009

16/12/09 Twitter: this is likely going to prove a good resource for regular updates on data. So far, however, I could only identify UNdata, which regularly reports on updates of its data (includes COMTRADE, FAO among others).

16/12/09 Tourism: It sits a little awkwardly in my section on economic migration, but the UN WTO (World Tourism Organisation) provides data on headcount and spending of tourists from 1995-2008 for around 90 countries on the UNdata website.

16/12/09 GIS: Adam Storeygard, an Economics PhD at Brown, has a selection of GIS Global Spatial Datasets on a dedicated website. Categories include administrative boundaries, population and other demographic indicators, economic indicators, data related to agriculture, infrastructure, climate and terrain. He's also put together some Miscellaneous notes and resources on learning GIS for beginners, building on his own experience of working with GIS data.

16/12/09 Economic History: The Economic History Association has links to a number of databases for economic historians. In order to use these you just need to register with EH (free). Just looking at the data titles, this is a great resource: Italy - Florentine Domains and the City of Verona: 1427, French Slave and Long Distance Trading Profits During the 18th Century, Ottoman Economic/Social History: 1600-1900, to name just a few. Naturally, these data are primarily for (now) developed economies, but there are some links to colonial data, e.g. Developing Country Export Statistics: 1840, 1860, 1880 and 1900. Thanks to Mark Koyama for pointing out this and the other historical data sources.

16/12/09 More Economic History: Another astonishing resource for historical data is provided by the Global Price and Income Group at UC Davis. Looking at their datamap, SSA is blank, but there are quite a few sources for Latin America, South Asia and East Asia.

16/12/09 Yet more Economic History: Bob Allen's website at Nuffield has links to historical wage and price data for a number of countries, cities and occupations respectively.

16/12/09 Data depository site: The Global Social Change Research Project, created by Gene Shackman, hosts a wealth of reports, presentations and datasets.

Back up to the Table of Contents

Classifications, Meta-data

World Bank country/iso codes: there is a link to an Excel spreadsheet at the bottom of the page which also provides information on 'country classification' (which income category, lending category, whether it's a heavily indebted poor country (HIPC) and, if you were away for that geography lesson, which region the country is in).

Two particularly useful links provided by Gunilla Petterson (see below) are the full lists of UN country codes (the three-letter abbreviations - 'ISO ALPHA-3 code') and HTS/SIC/SITC/NAICS codes. Note that some of the datasets listed on this website require subscription.

The resource website Macro Data 4 Stata by Giulia Catini, Ugo Panizza and Carol Saade provides an AAA Codes dataset with country isocodes across various institutions (UN, WB) and is particularly handy for anybody doing cross-country analysis [thanks to Aid-man Nic Van de Sijpe for pointing me to this resource].

Gunilla Petterson also offers links to national statistics offices, central banks and finance ministries which often publish up-to-date macro-data.

Bronwyn Hall at Berkeley provides links to a number of industry concordances for merging IP/patent data with productivity data.

The World Bank Development Economics Data Group provides the Bulletin Board on Statistical Capacity, an online database that measures and monitors the statistical capacity of developing countries. The database contains information encompassing various aspects of national statistical systems. It also includes a country-level composite statistical capacity indicator based on evaluation of countries against a set of criteria consistent with international recommendations.

Back up to the Table of Contents

Data Depositories, Other 'Link' Websites

$$ Many UK universities and colleges have subscriptions to the databases maintained by ESDS International (Economic and Social Data Service, based at Essex University). These contain (among others) Eurostat data, IEA (Energy), ILO (labour), IMF, OECD, World Bank WDI and UK National Statistics data. They also have some micro-data, most notably the Young Lives data (see last entry in the micro-household section).

The single most useful website for those searching for datasets for development is maintained by Gunilla Petterson, an economics PhD student who is based at the University of Sussex. Her developmentdata.org website has links to a vast number of datasets for development and is constantly updated.

Another extremely useful link is the DEVECONDATA blog maintained by Masayuki Kudamatsu, an economics lecturer based at the Institute of International Economic Studies (IIES) at the University of Stockholm. This not only provides links and regular updates to existing and new datasets for development but also provides crucial information on some of the nitty-gritty data issues. Note that some of the datasets listed on this website require subscription. Masa also has other very useful links on his personal webpages, such as lecture notes, valuable resources for STATA and a list of regular conferences on Development Economics.

The Norwegian Social Science Data Services (NSD) have compiled The Macro Data Guide, "An International Social Science Resource" covering many sources with data arranged by country or topic. It seems that coverage is particularly strong on topics of political science, including elections, parties, etc (but that's just my perception). For each dataset there is very useful background information on coverage, time span, topics, documentation and when the dataset was last accessed. Definitely a good starting point for any macro data search.

The (deep breath) European Commission Joint Research Centre's Institute for the Protection and Security of the Citizen have a very nice website of Statistical Sources gathering links to various datasets from a wide range of institutions (FAO, IMF, OECD, UN, World Bank). The FAO databases are particularly interesting, as are the SIPRI data (see 21/3/2012).

The World Bank has created a new Central Microdata Catalog for all the micro-level datasets "in catalogs maintained by the World Bank and a number of contributing external repositories." At the moment of writing this repository includes 378 datasets. Slowly, slowly this Open Data malarky is getting serious...

The major World Bank datasets (including WDI, Doing Business and Enterprise Surveys) are now all accessible (and in some cases directly downloadable) from one single website. [via economicslinks]

The wbopendata command (in Stata: findit wbopendata) written by a group of World Bank folk headed by Joao Pedro Azevedo allows Stata users to download thousands of indicators from the World Bank databases, including: Africa Development Indicators; Doing Business; Education Statistics; Enterprise Surveys; Global Development Finance; Gender Statistics; Health Nutrition and Population Statistics; International Development Association - Results Measurement System; Millennium Development Goals; World Development Indicators; Worldwide Governance Indicators. These indicators include information from over 256 countries and regions, since 1960. Instead of downloading one variable at a time you can specify one of 16 'topics' (e.g. Aid Effectiveness, Science & Technology, Infrastructure) to obtain all relevant variables for all countries and time periods. Update 11/03/2011: I added a little Stata 10 do-file to this which allows you to bring a data 'topic' (example here: energy) with multiple variables/'indicators' into Stata long format. Otherwise the data is in wide format, which doesn't allow us to get started immediately.
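
For readers who want to see what the wide-to-long step involves, the following is a minimal Python sketch (it is not the do-file mentioned above). It assumes a hypothetical wide layout with one row per country and indicator and one column per year named yr1960, yr1961 and so on; the key names, the 'yr' prefix, the placeholder country codes and the numbers are all made up for illustration.

```python
def wide_to_panel(rows, year_prefix="yr"):
    """Reshape wide rows (one column per year) into country-year rows.

    rows: list of dicts such as
        {"countrycode": "AAA", "indicator": "x", "yr1960": 1.0, "yr1961": 2.0}
    Returns one dict per country-year with one key per indicator.
    The key names and the 'yr' prefix are assumptions for this sketch.
    """
    panel = {}
    for row in rows:
        for column, value in row.items():
            suffix = column[len(year_prefix):]
            if column.startswith(year_prefix) and suffix.isdigit():
                key = (row["countrycode"], int(suffix))
                # Collect every indicator observed for this country-year
                panel.setdefault(key, {})[row["indicator"]] = value
    # Sort by country, then year, so each country is followed through time
    return [
        {"countrycode": country, "year": year, **values}
        for (country, year), values in sorted(panel.items())
    ]


# Made-up example: two placeholder countries, two indicators, two years
wide = [
    {"countrycode": "BBB", "indicator": "x", "yr1960": 3.0, "yr1961": 4.0},
    {"countrycode": "AAA", "indicator": "x", "yr1960": 1.0, "yr1961": None},
    {"countrycode": "AAA", "indicator": "y", "yr1960": 10.0, "yr1961": 11.0},
]
for record in wide_to_panel(wide):
    print(record)
```

Each output row is a country-year observation with one column per indicator, which is the layout panel estimators expect; missing values (None here) are carried through rather than dropped.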

The data aggregation website Quandl provides access to a vast number (they say 5m) of data series for countries around the world. This resource picks up data from various well-known sources (e.g. IMF, World Bank) and links to them from their own easy-to-use website. Perhaps the best feature is the set of dedicated programs for R, Python, Matlab, Excel, Maple, Julia, Clojure [they're making up these names, or is there really a stats program called Julia?] and also Stata. The latter can be installed by typing "ssc install quandl" in Stata (see helpfile for syntax). The only downside so far seems to be that you cannot download panel data like for instance in the World Bank WDI Stata command wbopendata. Maybe Felix Leung is already busy coding that feature...

The somewhat ominously-named Economics Web Institute provides a great deal of data for download, much of it focused on OECD economies, but also some LDC gems. You need to look through the individual links and perhaps download some stuff to see what's on offer, as there is typically not much info on coverage (countries, years, level of (dis-)aggregation). A link here led me to the Princeton-based International Networks Archive, which also provides lots of data, organised under headlines. Many of the datasets assembled seem dated, but are nevertheless worth a look. If links are broken then the 'Sources' information may offer some leads.

The Guardian newspaper (UK) maintains the Global Development Datastore, a searchable database for data and visualisation tools. They also run a blog on data and development.

The resource website Macro Data 4 Stata homogenises several commonly used macroeconomic datasets and imports them into Stata. The project is run by Giulia Catini, Ugo Panizza and Carol Saade and started uploading .dta files fairly recently. The library at present includes data from the Penn World Table and the Groningen Growth and Development Data Centre. The AAA Codes dataset looks particularly handy for anybody doing cross-country analysis [thanks to Aid-man Nic Van de Sijpe for pointing me to this resource].

Twitter is also likely to prove a good resource for regular updates on data. So far, however, I could only identify UNdata, which regularly reports on updates of its data (includes COMTRADE, FAO among others).

The IQSS Dataverse Network claims to be the world's largest collection of social science research data. As far as I can see this represents primarily the data used in existing papers, although there are also some very interesting 'raw' data links. The project is based at the Institute for Quantitative Social Science at Harvard, so the recent interest in randomized experiments in development is represented quite strongly in this archive. When I accessed it there were over 35,000 studies linking to 640,000 files.

The World Bank Development Research Programs website links to their 11 research topics (e.g. Conflict, International Migration & Development, Macroeconomics & Growth) which have their own data website. Some of the datasets will be presented separately below.

This is not really a datasource, but a data analysis tool: the World Bank is currently developing a data mapper, which will allow subscribers to plot maps for WDI (or their own data) together with other gadgets/tables/graphs. The best thing about this tool is that the 'pictures' can be exported as image, pdf, CSV, which is useful for those who want to avoid spmap in STATA --- I've actually not been able to find any online documentation on this, which is natural since Jose de Bueba suggested this was currently in the development stage. Note that there is also the Data Visualizer, which is yet another gadget to present descriptive statistics (select WDI indicators) and the Online Atlas for the MDGs (Millennium Development Goals).

The following dedicated apps for World Bank data have caught my eye: (1) the winner of the apps competition, StatPlanet, provides tables and maps for individual WDI and other WB data - the most appealing feature is that the maps can be stored as png file, just as if you'd done it with Stata's spmap. (2) Blind Data gives you a quick and easy visual check of the data coverage (years, # of countries) for WDI and other WB data. (3) MDG Maps, which does exactly what it says. (4) Development Timelines provides 'historical context to international development data', i.e. what events took place in the country (Education policy, Economy, Conflict, Other (domestic) and the 'International agenda').

UN Data, formerly the UN Common Database, lets you browse data by sub-organisation. This provides access to topics such as Commodity Trade, Energy, agriculture (FAO), Gender, Greenhouse Gas, labour (ILO), Industrial Commodities, Key Global Indicators, MDGs, National Accounts, Children, literacy (UNESCO), Demographics, health (WHO), population (World Population Prospects), Tourism and data for publications such as the Human Development Report or the UNHCR Yearbook.

Bill Easterly at NYU provides a number of macro data series (mostly WDI, also PWT among other sources) called the 'Global Development Network Growth Database'. There are also dataseries for 'fixed factors' (geographical data) and government finance.

$$ The OECD have really put in a lot of effort to make their data more accessible: the launch of the OECD iLibrary allows for convenient search of keywords in OECD databases, reports, working papers and other publications. Recall that the OECD has some data which goes beyond the 'rich country club', e.g. aid and foreign direct investment statistics. The only downside is that the access to the databases is restricted, so it depends whether your organisation has a subscription.

BREAD (Bureau for Research in Economic Analysis of Development) provides links to a large number of survey as well as aggregate-level data. This includes for instance the district level data for India in the analysis of weather and mortality in rural versus urban India carried out by Robin Burgess and co-authors (presented at the Glasgow EEA, August 2010).

The Groningen Growth & Development Centre (GGDC) gets a few mentions on this page, but I won't cover all of their datasets. With their long tradition in growth accounting the Centre has a lot of productivity datasets which can be found here.

The National Bureau of Economic Research (NBER) has a number of datasets available at their website. These are divided into Macro, Industry, International Trade, Individual, Hospital, Demographics & Vital Statistics, Patent data and other. Most of these datasets are for developed countries.

UNEP has a convenient database offering access to a large number of datasets from the World Bank and UN organisations - the vast majority of these seem to be freely accessible. They have data on Geography & Environment, including some geospatial datasets, Infrastructure, Education, Agriculture, Economy, among others.

The electronic data center at Emory University provides a long list of data sources for economics and political science.

The US Inter-University Consortium for Political and Social Research (ICPSR) is a huge depository for data relevant for development economics. What I really like about ICPSR is their motto: "Please note that ICPSR does not provide publications, reports, or ready-made statistics. What we do supply are the numeric raw data used to create publications, reports, and figures." I wish some of the international organisations would subscribe to this approach... Your university/institution may need to be a member of ICPSR (check here) for you to get access to the data, but this is not necessarily true. Many of the datasets are in STATA or SAS format already.

Jon Temple's growth resources website at the University of Bristol. This has links to the Barro-Lee education data, and some political economy data. Further it presents working papers and published work on growth. These datasets will also be linked to separately below.

The United Nations Economic Commission for Europe (UNECE) provides data on Economics, Transport, Gender and Forestry (interesting mix!) on their website. Coverage varies, but the earliest date seems to be 1990. [via economicslinks]

The Center for International Data at UC Davis has some trade datasets as well as productivity data for South Korea and Taiwan.

The Inter-American Development Bank (IADB) has a data-portal with over 1,000 searchable statistics and indicators for countries in Latin America and the Caribbean (trade, social indicators, macro, capital flows, regional labour markets, among others), but also some global statistics (including DataGob, the global governance indicator database, which makes me smile).

The LSE's development department STICERD (The Suntory and Toyota International Centres for Economics and Related Disciplines) has a "virtual center" for fieldwork in Development Economics. This not only includes datasets and related materials (questionnaires etc.) but also resources related to methodology.

The William Davidson Institute provides macro and micro data on emerging and transition economies through the Davidson Data Center and Network. When I checked out the website none of the browsing tools worked, but the keyword search delivered a lot of interesting leads. The database also contains links to other databases, such as the China Data Center at U Michigan.

Global Advice Network (funded by the Governments of Austria, Denmark, Germany, Netherlands, Norway, Sweden and the UK) provides the Business Anti-Corruption Portal, intended as an information source for SMEs operating in developing countries. Within the 'Country Profiles' you can go to the 'Sources' page to pick out a wealth of WEF, Transparency Intl., World Bank, etc. reports and (importantly) also (micro-)data such as enterprise surveys with relevance for corruption and investment/business climate.

DataFirst is a Survey Data Archive and training facility at the University of Cape Town, South Africa. The Archive’s holdings include the datasets from all major South African surveys, as well as survey data from other African countries. But: due to copyright restrictions, most of the datasets themselves are not downloadable from the site; only data from surveys conducted by the University of Cape Town are available from DataFirst's website, via their Public Access Catalogue.

An excellent new resource for household or firm-level data from LDCs is OpenMicroData. I do like their approach: 'OpenMicroData is run by a network of empirical researchers who believe that microdata should be freely available.' Good thinking, guys. So far I can see some of the CSAE African firm and hh datasets linked, as well as some data from randomised experiments in education from Burkina Faso. The site has only been up for a few months. [Gunilla Patterson featured the new site on her excellent devdata website]

The Global Social Change Research Project, created by Gene Shackman, hosts a wealth of reports, presentations and datasets.

The Economic History Association has links to a number of databases for economic historians. In order to use these you just need to register with EH (free). Just looking at the data titles, this is a great resource: Italy - Florentine Domains and the City of Verona: 1427, French Slave and Long Distance Trading Profits During the 18th Century, Ottoman Economic/Social History: 1600-1900, to name just a few. Naturally, these data are primarily for (now) developed economies, but there are some links to colonial data, e.g. Developing Country Export Statistics: 1840, 1860, 1880 and 1900.

Another astonishing resource for historical data is provided by the Global Price and Income Group at UC Davis. Looking at their datamap, SSA is blank, but there are quite a few sources for Latin America, South Asia and East Asia.

Rob Hyndman at Monash University in Australia has a dedicated 'Time Series Data Library' which is organised by topic. Rob has a brilliant motto printed at the bottom of the site: "In God we trust. All others must have data." (W. Edwards Deming)

Plamen Nikolov, an economics PhD candidate at Harvard, has put together a handout covering a number of micro and macro datasets for development economics. The micro datasets have a distinct focus on household and health-related datasets. [Thanks to Plamen for the pointer]

My favourite Royal Economic Society website, econometricslinks.com, has a dedicated data website. The focus here is on time series and in particular finance data, but there are also a few good links for development economists.

The UK Data Archive at the University of Essex has recently launched its Secure Data Service. Funded by the ESRC, this is intended to promote excellence in research by enabling safe and secure remote access by bona fide researchers to data hitherto deemed too sensitive, detailed, confidential or potentially disclosive to be made available under standard licensing and dissemination arrangements. Upon registration you'll be able to analyse UK data with your statistical software of choice via a remote desktop (i.e. you don't get the data for download, but you can analyse it on the UK SDS server).

Back up to the Table of Contents

Datasets for Specific Journal Articles

The IQSS Dataverse Network claims to be the world's largest collection of social science research data. As far as I can see this represents primarily the data used in existing papers. The project is based at the Institute for Quantitative Social Science at Harvard, so the recent interest in randomized experiments in development is represented quite strongly in this archive. When I accessed it there were over 35,000 studies linking to 640,000 files.

Chris Baum at Boston College provides access to a set of 'Instructional STATA datasets for econometrics', which primarily include data to replicate results in Greene, Hayashi, Stock & Watson and other textbooks.

Many academic journals in economics and development economics now require/request authors to post their data and/or code on dedicated websites (or, less handily, next to the article entry in the table of contents, indicated below as 'browse') for replication and exploration. This includes: The American Economic Journal: Applied Economics (browse), The American Economic Journal: Macroeconomics (browse), The American Economic Journal: Microeconomics (browse), The American Economic Review (browse), Econometrica (from Vol.72, 2004), The Journal of Applied Econometrics (from Vol.3, 1988), The Journal of Development Economics (browse), The Journal of Development Studies (browse), The Review of Economics and Statistics (from Vol.92, 2010) and The Review of Economic Studies (browse). The JAE also has a 'Replication Section'! Note that sadly some 'supplementary materials' for the above journals are merely online appendices and summary statistics, not data (e.g. for the JDS I couldn't find any data). Many of the above-linked sites require a subscription. Sometimes you can get lucky and find the data (ideally in 'raw' format, not after all the observations that would destroy the result have been dropped) on individual academics' websites. Top US departments in particular (Harvard and MIT amongst many others) seem to push their members to make data available, with some UK/European departments following suit.

Back up to the Table of Contents