Documentation for the data set from the paper
can be found in the documentation description file.
Earnings call data
The core earnings call data is here.
There are two mapping files. The first is the one we use in the paper. The second is from WRDS and contains more detailed information (validity dates, company names, and SIC codes).
This mapping file, from ciqCompanyId (earnings calls, Capital IQ) to gvkey (Compustat), is the one we use in the paper and was provided to us by S&P Global. The file can be downloaded here.
The second file is another mapping from ciqCompanyId to gvkey. It also contains dates indicating when each mapping is valid, as well as company names and SIC codes. This mapping file was obtained from WRDS and augmented by us to add SIC codes. The file can be downloaded here. We did not use this file in the paper.
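For readers who want to use the date-aware WRDS mapping, the following is a minimal sketch of looking up the gvkey that is valid for a given ciqCompanyId on a given call date. Only the identifiers ciqCompanyId and gvkey come from the description above. The column names startDate and endDate, the sample rows, and the helper names load_mapping and gvkey_on are assumptions made for illustration, and the actual file layout may differ.

```python
import csv
import io
from datetime import date

# Hypothetical sample rows. Only ciqCompanyId and gvkey are named in the
# data description; startDate and endDate are assumed column names.
MAPPING_CSV = """ciqCompanyId,gvkey,startDate,endDate
1001,012345,2005-01-01,2012-12-31
1001,067890,2013-01-01,
1002,011111,2000-06-15,
"""


def load_mapping(text):
    """Parse mapping rows, keeping gvkey as a string to preserve leading zeros."""
    rows = []
    for row in csv.DictReader(io.StringIO(text)):
        rows.append({
            "ciqCompanyId": row["ciqCompanyId"],
            "gvkey": row["gvkey"],
            "start": date.fromisoformat(row["startDate"]),
            # Assumption: an empty end date means the mapping is still valid.
            "end": date.fromisoformat(row["endDate"]) if row["endDate"] else date.max,
        })
    return rows


def gvkey_on(mapping, ciq_company_id, call_date):
    """Return the gvkey valid for this company on call_date, or None."""
    for row in mapping:
        if (row["ciqCompanyId"] == ciq_company_id
                and row["start"] <= call_date <= row["end"]):
            return row["gvkey"]
    return None


mapping = load_mapping(MAPPING_CSV)
print(gvkey_on(mapping, "1001", date(2010, 3, 4)))
print(gvkey_on(mapping, "1001", date(2015, 3, 4)))
```

Reading gvkey as a string rather than an integer is a deliberate choice in this sketch, since Compustat gvkeys are commonly stored with leading zeros that an integer conversion would drop.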
The topic-word distributions from the Presentation and Q&A topic models:
List of regulatory words:
We thank the Columbia Data Science Institute for their financial support for this project. We also thank S&P Global Market Intelligence for allowing us to share the regulatory measures derived from their earnings call transcripts.