How to match IBES vintages without the landmines
A short tutorial for researchers who want to merge IBES vintages.
Why this tutorial exists
Comparing two IBES Unadjusted Detail History drops (say, a 2023 vintage vs. a 2025 vintage) looks trivial. In practice, it’s where many projects quietly go off the rails. Small, harmless changes in values, currency flags, names, and time-of-day stamps can make the same forecast event look “different” across vintages, so a naive merge will “discover” millions of spurious non-matches. That, in turn, can lead to the mistaken conclusion that IBES anonymized those forecasts.
The fix is simple:
(1) put both vintages on the same information set (an as-of window), and
(2) merge on a stable event key that identifies the event, not its attributes.
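The two steps can be sketched in pandas as follows. This is a minimal illustration, not a definitive implementation: the column names (ticker, analys, fpedats, anndats, anntims, value), the choice of key columns, and the helper align_and_merge are assumptions made for the example, so adapt them to the columns in your own extract.

```python
import pandas as pd

# Hypothetical event key: columns assumed to identify the forecast event.
# Attributes such as value or time-of-day stamp are deliberately excluded.
EVENT_KEY = ["ticker", "analys", "fpedats", "anndats"]

def align_and_merge(old, new, asof, key=EVENT_KEY):
    """Cut both vintages to forecasts announced on or before `asof`,
    then outer-merge on the event key only."""
    asof = pd.Timestamp(asof)
    frames = []
    for df in (old, new):
        df = df.assign(anndats=pd.to_datetime(df["anndats"]))
        df = df.loc[df["anndats"] <= asof]           # step (1): as-of window
        frames.append(df.drop_duplicates(subset=key))
    return frames[0].merge(                           # step (2): key-only merge
        frames[1], on=key, how="outer",
        suffixes=("_old", "_new"), indicator=True,
    )

# Toy data: the same three events, with small attribute drift in the new vintage,
# plus one forecast that only the later vintage could contain.
old = pd.DataFrame({
    "ticker":  ["AAA", "AAA", "BBB"],
    "analys":  [101, 102, 201],
    "fpedats": ["2022-12-31"] * 3,
    "anndats": ["2022-03-01", "2022-03-05", "2022-04-01"],
    "anntims": ["09:30:00", "09:30:00", "10:00:00"],
    "value":   [1.50, 1.40, 2.00],
})
new = pd.DataFrame({
    "ticker":  ["AAA", "AAA", "BBB", "CCC"],
    "analys":  [101, 102, 201, 301],
    "fpedats": ["2022-12-31"] * 3 + ["2024-12-31"],
    "anndats": ["2022-03-01", "2022-03-05", "2022-04-01", "2024-06-01"],
    "anntims": ["09:30:00", "09:31:00", "10:00:00", "11:00:00"],
    "value":   [1.5001, 1.40, 2.00, 3.00],
})

# Naive merge on every shared column: attribute drift breaks the match.
naive = old.merge(new, how="inner")

# Keyed merge inside a common as-of window.
merged = align_and_merge(old, new, asof="2023-06-30")
match_rate = (merged["_merge"] == "both").mean()

print(len(naive), "of", len(old), "events match in the naive merge")
print(f"keyed match rate: {match_rate:.0%}")
```

In this toy case the naive merge keeps only the one row whose attributes are identical in both drops, while the keyed merge matches all three events. Rows flagged left_only or right_only in the _merge column are the residual worth inspecting.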
This workflow is grounded in what we know about how IBES evolved in recent years:
Since 2017Q1, IBES anonymization has primarily affected non-US forecasts, with negligible impact on US EPS forecasts. Recommendations were largely unaffected, which is why they can help diagnose or reverse-engineer anonymized IDs. (See Figure 1 for the time-series pattern and Tables 2-3 for US vs. non-US splits in the JFR paper.)
Across two distant vintages (e.g., 2015 vs. 2021), large-scale ID reshuffling is not the norm: average reassignments are only a few percent, concentrated where analyst IDs were anonymized and later backfilled. (See Table 5 and discussion.)
For cross-vintage identity checks, use Unadjusted files. Using Adjusted detail to match vintages mechanically creates “differences” after splits and will overstate non-matches. (Recommendations section and footnote discussion.)
Together, these facts justify the design choices below and explain why a careful merge often yields >99% overlap after basic hygiene. (The small residual is real and diagnosable, rather than a keying artifact.)
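To see why the Unadjusted recommendation above matters, consider a toy calculation. It assumes a hypothetical 2-for-1 split that occurs between the two drops and a simple divide-by-factor adjustment; the numbers and variable names are illustrative only.

```python
# EPS forecast as originally issued; the Unadjusted value is the same in both vintages.
unadjusted_forecast = 2.00

# Cumulative split factor as seen by each vintage (hypothetical 2-for-1 split
# that happens after the old drop and before the new one).
adj_factor_old = 1.0
adj_factor_new = 2.0

# Adjusted values restate the forecast on each vintage's own share basis.
adjusted_old = unadjusted_forecast / adj_factor_old
adjusted_new = unadjusted_forecast / adj_factor_new

# The same forecast event now carries two different Adjusted values.
print("unadjusted equal:", unadjusted_forecast == unadjusted_forecast)
print("adjusted equal:  ", adjusted_old == adjusted_new)
```

Any comparison that includes the Adjusted value would flag this event as a non-match even though nothing about the forecast itself changed.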