Dataset and Code
Citation Format: Yue LQ, Zheng J, Mao K (2024). Firms’ Rhetorical Nationalism: Theory, Measurement, and Evidence from a Computational Analysis of Chinese Public Firms. Management and Organization Review 20, 161–203. https://doi.org/10.1017/mor.2024.6
Dataset of our paper: (updated on 12/26/2023)
This dataset contains nationalism score data for approximately 41,000 MD&A sections in the annual reports of Chinese listed companies from 2000 to 2020.
You are welcome to download the data. We kindly request that you cite our paper as the source when incorporating our dataset into your work.
Dropbox: [download link] Google Drive: [download link]
Our constructed nationalism dictionary: [download link] (updated on 11/18/2023)
The dictionary file contains the selected words for each nationalism dimension, along with the document frequency of each word.
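To illustrate how a dimension-based dictionary of this kind can be applied to text, here is a minimal sketch. It is not the scoring method used in our paper or package: the dimension names, the words, and the `count_dimension_hits` helper are all hypothetical placeholders, and the matching is plain substring counting with no word segmentation or frequency weighting.

```python
# Hypothetical toy dictionary: dimension name -> list of words.
# These names and words are placeholders, not entries from the real dictionary file.
toy_dictionary = {
    "dimension_a": ["national", "homeland"],
    "dimension_b": ["self-reliance"],
}


def count_dimension_hits(text, dictionary):
    """Count non-overlapping substring matches of each dimension's words in text."""
    return {
        dimension: sum(text.count(word) for word in words)
        for dimension, words in dictionary.items()
    }


sample = "national pride and self-reliance for the homeland, a national goal"
counts = count_dimension_hits(sample, toy_dictionary)
print(counts)
```

Raw counts like these would normally be normalized (for example by document length) before comparing documents; that step is left out here.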
Package for analyzing the nationalism of annual reports:
We provide a Python package to extract the MD&A section from a given annual report and calculate its nationalism score. [download link] (updated on 11/18/2023)
After installing the packages listed in requirements.txt, you can use our package to generate the nationalism score for any annual report in PDF format with just three lines of code.
Note: We use the tika package to extract textual content from PDF files. You need Java 7+ installed on your system, because tika-python starts the Tika REST server in the background. Please refer to [link] for details.
The following demo shows how to use our package to analyze a given annual report.
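For readers who want to check their Java and tika setup independently of our package, the text-extraction step can be sketched as follows. This is an assumption-laden illustration, not our package's code: `extract_pdf_text` and `normalize_whitespace` are hypothetical helper names, and only `tika.parser.from_file` comes from tika-python itself.

```python
import re


def normalize_whitespace(text):
    """Collapse runs of whitespace left over from PDF extraction."""
    return re.sub(r"\s+", " ", text).strip()


def extract_pdf_text(pdf_path):
    """Return the plain text of a PDF via tika-python.

    Requires `pip install tika` and a Java runtime; the first call
    starts the Tika server in the background.
    """
    # Imported lazily so normalize_whitespace works without tika installed.
    from tika import parser

    parsed = parser.from_file(pdf_path)
    # "content" may be None when Tika finds no text (e.g. a scanned PDF).
    return normalize_whitespace(parsed.get("content") or "")


# Example call (the file name is a placeholder):
# text = extract_pdf_text("annual_report.pdf")
```

If this call fails or hangs, the cause is usually a missing Java installation or a blocked download of the Tika server, rather than a problem with the report itself.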