The World Housing Inequality Database (WHID) is a cross-country harmonized database with housing inequality estimates available at different levels of aggregation. All estimates are constructed using high-quality administrative data based on the universe of properties in a given country.
WHID estimates are publicly available.
The first version of the database includes the following countries: United States, Belgium, and Spain.
Giovanni Paolo Mariani (ULB) is a co-founder and collaborator in this project.
I gratefully acknowledge funding and logistical support from the Fonds Thiepolam (Fondation ULB).
Update (April 2025):
The data for Belgium is now publicly available here.
See this paper for details about the Belgian data.
Update (October 2025):
The data for Spain is ready and available here.
See this paper for details about the Spanish data.
Update (December 2025):
The data for the United States is ready and available here.
See this paper for details about the US data.
The WHID will be hosted at this URL: www.worldhousingdata.com (the link will become active in early January).
Housing is critical to understanding inequality in wealth, income, consumption, and opportunities. First, housing is the most important and most evenly distributed asset across the income and wealth distributions, so it is central to understanding wealth inequality. Second, according to OECD estimates, housing accounts for 10 to 30% of household consumption in OECD countries, so it is central to understanding income and consumption inequality. Finally, owning or renting a home in a neighborhood is the only way to benefit from (or be harmed by) neighborhood effects, which shape outcomes as vital as social mobility. As the "door of entry to neighborhoods," housing is therefore critical to understanding inequality in opportunities.
Cross-country comparability. Compared to other data sources, cadastral data is relatively homogeneous across (at least Western) countries. That translates into high cross-country comparability.
Administrative boundaries are not problematic. Cadastral data is typically geolocated, which implies that arbitrary or changing administrative boundaries do not pose a problem in analyzing the data at any desired level of aggregation.
Top coding is not a problem. Cadastral data typically includes information on the universe of real estate in a given location. Therefore, censoring at the top is not a problem, and imputation methods are less often required.
Analysis over time. Because the year of construction is a variable typically included in cadastres, it is possible to construct a panel of real estate at any level of aggregation at any point in time, with caveats.
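To make the last two points concrete, here is a minimal sketch of how a panel of inequality estimates could be rebuilt from a single cadastral snapshot using the year of construction. It is an illustration under stated assumptions, not the WHID methodology: the record layout, the sample values, the helper names (`stock_in_year`, `gini_panel`), and the choice of the Gini coefficient as the inequality measure are all hypothetical. It also ignores the caveats mentioned above, for example that demolished properties do not appear in a current cadastre and that values are observed only once.

```python
from collections import defaultdict

# Hypothetical cadastral records: (property_id, region, year_built, value).
# Field layout and numbers are invented for illustration only.
RECORDS = [
    ("a", "north", 1950, 100.0),
    ("b", "north", 1980, 300.0),
    ("c", "south", 1975, 200.0),
    ("d", "south", 2005, 200.0),
    ("e", "north", 2010, 600.0),
]


def gini(values):
    """Gini coefficient of non-negative values (0 means perfect equality)."""
    xs = sorted(values)
    n = len(xs)
    total = sum(xs)
    if n == 0 or total == 0:
        return 0.0
    # Rank-based formula: G = 2 * sum(i * x_i) / (n * sum(x)) - (n + 1) / n
    weighted = sum(rank * x for rank, x in enumerate(xs, start=1))
    return 2.0 * weighted / (n * total) - (n + 1.0) / n


def stock_in_year(records, year):
    """Keep only the properties that were already built by `year`."""
    return [rec for rec in records if rec[2] <= year]


def gini_panel(records, years, key=lambda rec: rec[1]):
    """Gini of property values per (group, year); `key` sets the aggregation level."""
    panel = {}
    for year in years:
        groups = defaultdict(list)
        for rec in stock_in_year(records, year):
            groups[key(rec)].append(rec[3])
        for group, values in groups.items():
            panel[(group, year)] = gini(values)
    return panel


regional = gini_panel(RECORDS, [1980, 2010])
national = gini_panel(RECORDS, [1980, 2010], key=lambda rec: "all")
```

Because the grouping is just a function of the record, passing a different `key` (for instance, one that maps coordinates to a grid cell) changes the level of aggregation without touching the rest of the code.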