This day-long workshop will explore new ways of including conventionally excluded demographics (e.g. women and diverse ethnic, racial, and linguistic minorities) when tracing individuals through different historical records using semi- and fully automated methods. Three modules build (1) a shared vocabulary, (2) hands-on experience, and (3) insight into current challenges and future opportunities, emphasising the gains from collaboration between specialists in history, social science, and economics. The workshop will be of interest to researchers who want to design longitudinal experiments which embrace heterogeneity on topics of labour dynamics, human capital formation, and socioeconomic inequality.
In the first part of the workshop, after building a shared vocabulary, we will discuss how to test initial databases for bias and how to design a linking approach which suits the historical context and minimises the impact of irregularities, inconsistencies, and divergences over time and between sources. Together we will evaluate a few typical sets of inclusion/exclusion criteria, drawn from studies which use linked cohorts as evidence, and consider the cohorts’ attrition and relevant inclusion rates. In the second part, participants will work in interdisciplinary teams to test several (simplified) standard and innovative linking techniques on sample microdata provided by the facilitator. Each team will evaluate its results using metrics of its choice, drawn from the previous module or from team members’ own experience. In the final module, we will explore ways of pushing past the boundaries of conventional microdata and tackle a selection of current questions, challenges, and concerns raised by participants. We will draw on our shared vocabulary and experience to suggest fruitful paths forward.
The workshop draws on the facilitator’s expertise in linking women and people with ordinary names through 19th-century English census microdata to study the impact of gender, class, and immigration on social mobility. Its aim is to demystify automatic linking procedures by providing junior researchers with a common vocabulary and experience, and to equip them to assess and improve the historical validity of evidentiary cohorts.