DAIOE tracks how advances in artificial intelligence change the potential impact of AI on different occupations over time. It's a panel index (2010 onwards) that updates annually and distinguishes between nine major AI subdomains (e.g., language modelling, image generation, speech recognition). Think of it as a moving barometer of where AI capabilities could matter most, not as a forecast of job loss or gain. The net employment effect depends, for example, on how organisations adopt and apply AI.
Because DAIOE is dynamic and subdomain-specific, it is better aligned with real technical progress than static, one-shot measures. In essence, it "unpacks" AI, enabling analysis of how exposed occupations are to different types of AI. By combining DAIOE with other data that carry occupation codes, such as industry-level data or linked employer-employee data, researchers and policymakers can trace exposure by occupation, sector, skill level, or country (e.g., Denmark, Portugal, Sweden in our firm-level applications), and test whether AI exposure is associated with up-skilling or reallocation inside firms and industries. In our firm-level evidence, total employment associations are small on average, but exposure is linked to a higher share of high-skilled white-collar work and a lower share of low-skill clerical roles. At the same time, results differ meaningfully across different types of AI, underlining the importance of "unpacking" exposure to AI.
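As an illustration of such a merge, here is a minimal sketch in Python/pandas. The file name and the column names (isco08, occupation_code, daioe_overall, industry) are hypothetical placeholders, not the names used in the released files:

```python
import pandas as pd

# Hypothetical file and column names, for illustration only.
daioe = pd.read_csv("daioe_isco08.csv")                 # occupation code, year, DAIOE indices
workers = pd.read_csv("linked_employer_employee.csv")   # one row per worker-year, with ISCO-08 codes

# Attach exposure to workers by occupation code and year.
merged = workers.merge(
    daioe,
    left_on=["isco08", "year"],
    right_on=["occupation_code", "year"],
    how="left",
)

# Example: mean overall exposure by industry and year.
exposure_by_industry = (
    merged.groupby(["industry", "year"])["daioe_overall"]
    .mean()
    .reset_index()
)
```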
Coverage: 2010 onwards (currently up to 2023)
AI subdomains: 9, related to games, language, and images
Occupations: ISCO-08, SOC 2010, O*NET-SOC 2010, SSYK 96, and SSYK 2012
Granularity: virtually all occupations; sub-indices by subdomain.
Add-ons: an overall index and a generative-AI version.
Total number of indices: 11 (1 overall; 1 genAI; 9 sub-indices)
Meaning: Exposure = potential applicability of AI capabilities to occupational content (not adoption or automation probability).
DAIOE combines (i) AI capability progress by subdomain with (ii) occupational work content. First, we assemble yearly measures of AI performance across nine applications (e.g., language, vision, speech), drawing on standardised evaluation benchmarks and leaderboards. Second, we map those AI capabilities to the skills and abilities that define each occupation. Finally, for each occupation-year, we compute a weighted aggregation (a "match" between capability growth and occupational requirements) to form the exposure index, its nine sub-indices, and two offshoots (the genAI and non-genAI versions).
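In stylised form, the aggregation can be sketched as follows. This is a compact illustration of the logic just described, with our own notation, not the paper's exact formula:

```latex
% Stylised sketch of the DAIOE aggregation (illustrative notation, not the paper's exact formula)
\[
\mathrm{DAIOE}_{o,t} \;=\; s_o \sum_{\tau \le t} \; \sum_{j=1}^{9} \; \sum_{a=1}^{52}
    w_{o,a} \, m_{j,a} \, \Delta c_{j,\tau}
\]
% w_{o,a}:         O*NET importance x level weight of ability a in occupation o
% m_{j,a}:         link between AI subdomain j and ability a (the 9 x 52 matrix)
% Delta c_{j,tau}: measured capability progress in subdomain j in year tau
% s_o:             social-skills discount for occupation o
% A sub-index keeps only the terms for one subdomain j; the genAI version keeps its subdomains.
```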
Capturing AI capability progress
We track yearly progress in nine AI subdomains—games (abstract strategy, real-time), vision (recognition, comprehension, generation), language (reading comprehension, language modelling, translation), and speech recognition—using 140 benchmarks that have been used to test model performance in AI research (sources: EFF and Papers With Code). For each benchmark, we derive a frontier curve, indicating the state-of-the-art at a given time. Metrics are harmonised so changes are comparable, then summarised each year to capture shifts in the technical frontier.
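To illustrate the frontier logic, here is a minimal sketch, assuming a hypothetical tidy table of benchmark results with columns benchmark, subdomain, year, and score (higher is better after harmonisation); none of these names come from the actual source data:

```python
import pandas as pd

# Hypothetical input: one row per reported benchmark result.
results = pd.read_csv("benchmark_results.csv")   # columns: benchmark, subdomain, year, score

# Harmonise: rescale each benchmark's scores to [0, 1] so changes are comparable.
results["score_norm"] = results.groupby("benchmark")["score"].transform(
    lambda s: (s - s.min()) / (s.max() - s.min())
)

# Frontier curve: the best (state-of-the-art) normalised score achieved up to each year.
frontier = (
    results.groupby(["benchmark", "year"])["score_norm"].max()
    .groupby(level="benchmark").cummax()
    .rename("frontier")
    .reset_index()
)

# Yearly subdomain progress: average change in the frontier across a subdomain's benchmarks.
frontier = frontier.merge(results[["benchmark", "subdomain"]].drop_duplicates(), on="benchmark")
frontier["delta"] = frontier.groupby("benchmark")["frontier"].diff()
progress = frontier.groupby(["subdomain", "year"])["delta"].mean().rename("capability_progress")
```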
From capabilities to occupations
Subdomain progress is mapped to 52 O*NET abilities via the Felten et al. (2018) 9×52 matrix. Ability-level exposure is aggregated to occupations with O*NET importance×level weights, yielding an annual exposure change and nine parallel sub-indices. To account for interpersonal roles, we apply a calibrated social-skills discount, and we then cumulate the annual exposure changes from 2010 onwards to form the DAIOE panel.
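A numerical sketch of this mapping with made-up toy arrays; the shapes follow the description above, while the actual matrix, weights, and discount calibration are those in the paper, not these placeholders:

```python
import numpy as np

# Illustrative shapes (all values here are random placeholders, not the released data):
#   progress: (T, 9)   yearly capability progress per AI subdomain
#   M:        (9, 52)  Felten et al. (2018) subdomain-to-ability link matrix
#   W:        (O, 52)  O*NET importance x level weights per occupation, rows normalised
#   s:        (O,)     social-skills discount per occupation
rng = np.random.default_rng(0)
T, O = 14, 400                                  # e.g., 2010-2023, 400 occupations
progress = rng.random((T, 9)) * 0.1
M = rng.random((9, 52))
W = rng.random((O, 52))
W /= W.sum(axis=1, keepdims=True)
s = 0.5 + 0.5 * rng.random(O)

# Annual exposure change per occupation: weight subdomain progress by the
# abilities it touches and by how much the occupation relies on those abilities.
ability_progress = progress @ M                 # (T, 52)
annual_change = (ability_progress @ W.T) * s    # (T, O), social-skills discount applied

# Cumulate annual changes from the first year to obtain the DAIOE panel.
daioe_panel = annual_change.cumsum(axis=0)      # (T, O)

# Sub-index for a single subdomain j: keep only that subdomain's progress.
j = 0
sub_j_panel = (((progress[:, [j]] * M[[j], :]) @ W.T) * s).cumsum(axis=0)
```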
Interpretation
DAIOE measures the potential applicability of AI to occupational content over time, not adoption or automation probabilities. Real-world effects depend, for example, on the development and adoption of AI applications, complementary investments, organisational choices, and policy.
♟️ Abstract strategy games — tests long-horizon planning & search.
🎮 Real-time video games — adaptive control under time pressure.
🖼️🔎 Image recognition — identifying objects & patterns in images.
🧩🖼️ Image comprehension — reasoning about what an image depicts.
🖌️🖼️ Image generation — creating or editing visual content.
📖 Reading comprehension — extracting and using information from text.
✍️🤖 Language modelling — generating and transforming text.
🌐🔤 Translation — converting meaning across languages.
🗣️🎙️ Speech recognition — turning audio into text reliably.
The DAIOE is developed, explained, validated, and applied in the most recent version of the paper by Engberg et al. (2025).
Suggested citation: Engberg, E., Görg, H., Lodefalk, M., Javed, F., Längkvist, M., Monteiro, N., Kyvik Nordås, H., Pulito, G., Schroeder, S., & Tang, A. (2024). "AI Unboxed and Jobs: A Novel Measure and Firm-Level Evidence from Three Countries" (IZA DP 16717).
This Google Drive folder contains the index files and documentation (2010 to latest available year).
We provide DAIOE in five different "flavours", each according to a specific occupational classification: international (ISCO-08), US (SOC 2010 and O*NET-SOC 2010), and Sweden (SSYK 96 and SSYK 2012).
Each "flavour"--aka folder--contains the DAOIE in three file formats, for convenience, namely: Excel, .xlsx; Stata, .dta; and general, .csv.
Each file has the following columns/variables:
Col A: the occupational code
Col B: the occupational title
Col C: the year
After this follow two groups of DAIOE variables. The first group contains the actual index values, in Cols D–N. The index is a score that enables comparisons of AI exposure between occupations and over time; a one-unit change in the index does not have a specific interpretation.
The second group, in Cols O–Y, contains the percentile rankings of occupations within each year, enabling comparisons of where an occupation sits in the distribution of AI exposure.
Both groups of the DAIOE have the same structure. In the index value version (percentile ranking version), we provide the following:
Col D (O): the general or overall DAIOE index (ranking).
Cols E–M (P–X): the nine subdomain DAIOE indices (rankings).
Col N (Y): the generative AI DAIOE index (ranking).
Note: A few initial rows of each file are empty. They correspond to armed forces occupations and legislators, for which the data needed to compute the DAIOE are lacking because these occupations are not observed in O*NET.
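A minimal loading sketch in Python/pandas, selecting columns by position according to the layout above; the file name is a hypothetical placeholder, so adjust it to the file in the relevant "flavour" folder:

```python
import pandas as pd

# Hypothetical file name; use the .csv file from the classification you need.
df = pd.read_csv("daioe_isco08.csv")

# Columns by position, per the layout above (A=0, B=1, ...).
code, title, year = df.columns[0], df.columns[1], df.columns[2]
index_cols = df.columns[3:14]     # Cols D-N: overall, nine subdomain indices, genAI
rank_cols = df.columns[14:25]     # Cols O-Y: the corresponding within-year percentile rankings

# Drop the empty rows (armed forces occupations / legislators, not observed in O*NET).
df = df.dropna(subset=index_cols, how="all")

# Example: overall exposure and its percentile rank for one occupation over time.
example = df[df[code] == df[code].iloc[0]][[year, index_cols[0], rank_cols[0]]]
print(example)
```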
No. DAIOE captures the potential applicability of AI capabilities to occupational content. Whether this changes jobs depends, for example, on adoption, complements, and organisational choices. Thus, the DAIOE is "agnostic" about job displacement, augmentation, and other potential effects.
DAIOE captures the potential applicability of AI in occupations. If, when, and how AI will be applied in an occupation depends, for example, on the cost of acquiring, adapting, and using the technology in that occupation, the institutional setting (e.g., laws and regulations), and so on.
DAIOE is dynamic and grounded in measured AI benchmark progress. It tracks nine subdomains over time and explicitly adjusts for social interaction. Thus, it "unpacks" AI, enabling analysis of how different areas within AI expose occupational content to their capabilities.
Cognitive, non-physical roles with lower social-interaction intensity—often white-collar—tend to rank higher.
Across Denmark, Portugal, and Sweden, we find no systematic change in total headcount, but consistent up-skilling: higher DAIOE relates to higher high-/low-skill ratios and a shift from low-skill clerical toward high-skill white-collar work. Blue-collar associations average near zero, with heterogeneity by subdomain.
We compute a social-skills index from O*NET (importance × level across six social skills) and discount exposure accordingly.
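For illustration, a sketch of how such an index could be computed from a long-format O*NET extract; the file and column names are hypothetical, and the discount calibration shown here is illustrative rather than the paper's:

```python
import pandas as pd

# Hypothetical long-format O*NET extract: one row per occupation x social skill,
# with "importance" and "level" columns (six social skills per occupation).
onet_social = pd.read_csv("onet_social_skills.csv")

onet_social["imp_x_lvl"] = onet_social["importance"] * onet_social["level"]
social_index = onet_social.groupby("occupation_code")["imp_x_lvl"].mean()

# Normalise to [0, 1] and turn it into a discount factor: the more socially
# intensive the occupation, the larger the discount on exposure.
social_norm = (social_index - social_index.min()) / (social_index.max() - social_index.min())
discount = 1.0 - 0.5 * social_norm   # illustrative calibration only
```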
Games (abstract strategy; real-time video); Vision (recognition, comprehension, generation); Language (reading comprehension, language modelling, translation, speech recognition).
Yes. Combine occupation-level DAIOE with local workforce structures (firms, industries, regions) to assess exposure distributions.
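For example, firm-level (or region- or industry-level) exposure can be computed as an employment-weighted average of occupation-level DAIOE. A minimal sketch, with hypothetical file and column names:

```python
import pandas as pd

# Hypothetical inputs: firm-occupation employment counts and the DAIOE panel.
emp = pd.read_csv("firm_occupation_employment.csv")   # firm_id, year, occupation_code, n_workers
daioe = pd.read_csv("daioe_isco08.csv")               # occupation_code, year, daioe_overall, ...

merged = emp.merge(daioe, on=["occupation_code", "year"], how="left")

# Employment-weighted mean exposure per firm and year.
merged["weighted"] = merged["daioe_overall"] * merged["n_workers"]
grouped = merged.groupby(["firm_id", "year"])
firm_exposure = (
    (grouped["weighted"].sum() / grouped["n_workers"].sum())
    .rename("firm_daioe")
    .reset_index()
)
```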
We provide a generative-AI sub-index that combines language modelling and image generation.