Micro-data

What are micro-data?

Micro-data contain information on individual persons or firms. In aggregated data, in contrast, the unit of observation is a region (such as a municipality). With such data one can only learn something about the average person or firm within a region. Aggregation thus leads to a loss of potentially interesting variation (=information). In the case of macro data, which aggregate information at the level of nations, one can only study variation across countries or within countries over time. From a researcher's perspective, micro-data provide more opportunities and freedom. Today, the majority of empirical research in the social sciences is based on micro-data. Of course, there are important exceptions such as macroeconomics. (However, even within macroeconomics there is a recent trend towards using more and more micro-data.)
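The loss of variation through aggregation can be illustrated with a minimal sketch in Python. All numbers, identifiers, and municipality labels are invented for illustration and do not come from any real data set. The sketch averages individual incomes by municipality and splits the total variance into a between-region part, which survives aggregation, and a within-region part, which is lost.

```python
from collections import defaultdict
from statistics import mean, pvariance

# Hypothetical micro-data: (person_id, municipality, monthly_income).
micro = [
    (1, "A", 1000), (2, "A", 3000),
    (3, "B", 1900), (4, "B", 2100),
]

def aggregate_by_region(records):
    """Collapse individual records to one average income per region."""
    by_region = defaultdict(list)
    for _person_id, region, income in records:
        by_region[region].append(income)
    return {region: mean(values) for region, values in by_region.items()}

regional_means = aggregate_by_region(micro)

# Total variance of individual incomes.
total_var = pvariance([income for _, _, income in micro])
# Variance that is still visible once each person is replaced by the regional mean.
between_var = pvariance([regional_means[region] for _, region, _ in micro])
# The remainder is variation within regions, which aggregation discards.
within_var = total_var - between_var

print(regional_means)
print(total_var, between_var, within_var)
```

In this toy example both municipalities have the same average income, so the aggregated data show no variation at all, although individual incomes differ considerably.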

What types of micro-data exist?

There are different types of micro-data. One useful distinction is by the original purpose of the data. In many cases, data are produced for a specific measurement task. The most prominent example is survey data, which result from collecting information from a sample of individuals in a systematic way. In contrast, so-called "administrative data are designed, collected, and processed for purposes other than economic measurement [... and statistical analysis]" (Jarmin, JEP, 2019). Administrative data have become very popular among economists (Halla, mimeo, 2020). Examples are social security data, tax registers, or health-insurance records. Of course, this distinction is not always clear-cut. For instance, census data (see also below) probably fit in both categories.

Another important distinction is between cross-sectional and panel data. If information on the units of interest (i.e., persons or firms) is collected at one point (or period) in time, we refer to this as a cross-section. Panel data (or longitudinal data) are multi-dimensional. They typically contain multiple observations of the same units over time. There are also panel data with other or more dimensions. Cross-sectional data can be understood as a special case of panel data with one dimension (=one point in time). The same holds for so-called time series data, which track one unit over time.
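The relationship between these data structures can be sketched as follows. The example is illustrative only: the records, field names, and helper functions are hypothetical. It stores a small panel as one record per person and year, and obtains a cross-section by fixing the year and a time series by fixing the person.

```python
# Hypothetical panel: one record per (person, year) pair.
# Person p3 is observed only once, which real panels also have to deal with.
panel = [
    {"person": "p1", "year": 2018, "wage": 2000},
    {"person": "p1", "year": 2019, "wage": 2100},
    {"person": "p2", "year": 2018, "wage": 3000},
    {"person": "p2", "year": 2019, "wage": 2900},
    {"person": "p3", "year": 2019, "wage": 2500},
]

def cross_section(records, year):
    """All units observed at a single point in time."""
    return [r for r in records if r["year"] == year]

def time_series(records, person):
    """A single unit followed over time, ordered by year."""
    rows = [r for r in records if r["person"] == person]
    return sorted(rows, key=lambda r: r["year"])

cs_2019 = cross_section(panel, 2019)
ts_p1 = time_series(panel, "p1")

print([r["person"] for r in cs_2019])
print([(r["year"], r["wage"]) for r in ts_p1])
```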

Researchers typically prefer panel data over one-dimensional data, since they allow comparisons within units over time. The downside of panel data is their higher cost.
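A within-unit comparison can be sketched in a few lines of Python. This is a minimal illustration under invented assumptions: the wage figures and the helper name within_unit_changes are hypothetical. For each person, the sketch computes the change in the outcome between consecutive observations, something a single cross-section cannot provide.

```python
from itertools import groupby
from operator import itemgetter

# Hypothetical panel records: (person, year, wage).
panel = [
    ("p2", 2019, 2900),
    ("p1", 2018, 2000),
    ("p1", 2020, 2300),
    ("p1", 2019, 2100),
    ("p2", 2018, 3000),
]

def within_unit_changes(records):
    """Return, per unit, the changes between consecutive observations."""
    # Sort by unit and then by year so that groupby sees each unit as one block.
    ordered = sorted(records, key=itemgetter(0, 1))
    changes = {}
    for unit, rows in groupby(ordered, key=itemgetter(0)):
        values = [value for _unit, _year, value in rows]
        changes[unit] = [later - earlier for earlier, later in zip(values, values[1:])]
    return changes

changes = within_unit_changes(panel)
print(changes)
```

Each person serves as his or her own comparison here, so characteristics that differ across persons but stay fixed over time drop out of the computed changes.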

Talk about census and micro-census data here.

Large data archives:

IPUMS: The Integrated Public Use Microdata Series (IPUMS) is the world's largest individual-level population database. It provides not only microdata samples from the United States (IPUMS-USA), but also data from international census records (IPUMS-International). The records are converted into a consistent format, are well documented, and are made available free of charge through a web-based data dissemination system. Currently (March 2020), you can access 443 censuses and surveys from almost 100 countries. Among these are random samples of individual data from the Austrian censuses of 1971, 1981, 1991, 2001, and 2011.

ICPSR: The Inter-university Consortium for Political and Social Research (ICPSR).

Important micro-data sets

Household panel survey data

Household-based panel (=longitudinal) survey data have long been the workhorse of applied microeconometrics.

The most important national panel surveys are:

  • British Household Panel Survey (BHPS, 1991 - today)
  • German Socio-Economic Panel (SOEP, 1984 - today)
  • Household, Income and Labour Dynamics in Australia Survey (HILDA, - today)
  • Panel Study of Income Dynamics (PSID, 1968 - today). This US study is probably the world's longest-running household panel survey.

There are very few international panel surveys:

  • European Community Household Panel (ECHP, 1994 - 2001). Unfortunately, this panel study ran for only 8 years. Since 2003, the EU-SILC survey has covered many of the topics initially included in the ECHP.
  • Survey of Health, Ageing and Retirement in Europe (SHARE, 2004 - today). This is a multidisciplinary panel of respondents aged 50 and older with a focus on health, socio-economic status, and social and family networks.

International cross-sectional surveys

  • Eurobarometer (EB, 19xx - today)
  • International Social Survey Programme (ISSP, 1985 - today)
  • World Values Survey (WVS, 19xx - today). This is a large-scale, cross-national, repeated cross-sectional survey of basic human values, such as the beliefs, preferences, attitudes, and opinions of individuals all over the world.
  • Demographic and Health Surveys (DHS, 1985 - today). These are nationally representative household surveys in low-income countries with a focus on indicators in the areas of population, health, and nutrition. Currently, more than 400 surveys from over 90 countries are available.

Very specialized surveys

  • Time use surveys collect information on how people spend their time. See, for instance, the American Time Use Survey (ATUS, - today).
  • Consumer expenditure surveys use diaries to collect information on the expenditures of consumers. See, for instance, the US Consumer Expenditure Survey (CEX, 1996 - today).

Firm data

  • Orbis is a company database covering around 300 million companies from all countries. It is the flagship product of Bureau van Dijk, a commercial provider of private company data.

Forbes Data

  • Forbes Billionaires
  • Forbes CEO Compensation List

Data from specialized fields:

  • Education Economics
    • PISA
    • PIRLS
    • TIMSS
  • Finance
  • Sports Economics

Data search:


Databases


Resources to get started with Stata:


Academic advice (by clever people):