Dataset and Code

Paper Link
- Citation Format: Yue LQ, Zheng J, Mao K (2024). Firms’ Rhetorical Nationalism: Theory, Measurement, and Evidence from a Computational Analysis of Chinese Public Firms. Management and Organization Review 20, 161–203. https://doi.org/10.1017/mor.2024.6
Dataset of our paper:
- This dataset contains nationalism scores for approximately 61,800 Management Discussion and Analysis (MD&A) sections in the annual reports of Chinese listed companies from 2000 to 2024.
- You are welcome to download the data. We kindly request that you cite our paper as the source when incorporating our dataset into your work.
- Dropbox: [download link] Google Drive: [download link]
Our constructed nationalism dictionary: [download link]
- The dictionary file contains the selected words for each nationalism dimension, along with the document frequency of each word.
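
To illustrate how a dictionary of this kind can be applied, here is a minimal sketch of dictionary-based scoring. It is not the scoring procedure from our paper or package: the dimension names, the toy word lists, and the functions `count_dictionary_hits` and `score_per_thousand_chars` are all hypothetical, and the sketch uses plain substring matching instead of word segmentation. It also ignores the document frequencies, which could serve as weights in a more careful version.

```python
from collections import Counter

# Hypothetical toy dictionary: dimension name -> list of words.
# The real dictionary file has its own dimensions and word lists.
TOY_DICTIONARY = {
    "dimension_a": ["民族", "自主"],
    "dimension_b": ["振兴"],
}


def count_dictionary_hits(text, dictionary):
    """Count non-overlapping substring matches of each dimension's words."""
    hits = Counter()
    for dimension, words in dictionary.items():
        hits[dimension] = sum(text.count(word) for word in words)
    return hits


def score_per_thousand_chars(text, dictionary):
    """Normalize hit counts by text length (hits per 1,000 characters)."""
    length = len(text)
    if length == 0:
        # Avoid division by zero for empty documents.
        return {dimension: 0.0 for dimension in dictionary}
    hits = count_dictionary_hits(text, dictionary)
    return {dimension: 1000.0 * hits[dimension] / length for dimension in dictionary}
```

Normalizing by length keeps long and short MD&A sections comparable; other normalizations (for example, by token count after segmentation) are equally plausible.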

Package for analyzing the nationalism of annual reports:

We provide a Python package to extract the MD&A section from a given annual report and calculate its nationalism score. [download link]
After installing the packages listed in requirements.txt, you can use our package to generate the nationalism score for any annual report in PDF format with just three lines of code.
- Note: we use the tika package to extract text from PDF files. Because tika-python starts the Tika REST server in the background, you may need Java 7+ installed on your system. Please refer to [link] for details.
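
For readers unfamiliar with tika-python, the sketch below shows the general shape of the PDF text extraction step. It is an illustration under assumptions, not the code inside our package: the helper names `extract_pdf_text` and `normalize_whitespace` are hypothetical. The only tika call used is `parser.from_file`, which returns a dictionary whose `"content"` entry holds the extracted text (and may be `None`, for example for scanned PDFs without a text layer).

```python
import re


def extract_pdf_text(pdf_path):
    """Return the raw text of a PDF via tika-python (requires Java)."""
    # Imported lazily so normalize_whitespace works without tika installed.
    from tika import parser

    parsed = parser.from_file(pdf_path)
    # "content" can be None when no text could be extracted.
    return parsed.get("content") or ""


def normalize_whitespace(text):
    """Collapse runs of whitespace left over from PDF layout into single spaces."""
    return re.sub(r"\s+", " ", text).strip()
```

A typical use would be `normalize_whitespace(extract_pdf_text("report.pdf"))`, where `report.pdf` stands in for your own file. The first call may take a while because the Tika server has to start.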

The following demo shows how to use our package to analyze a given annual report.
