U.S. Census Bureau Statistics 1985-2019
The U.S. Census Bureau posts and updates its foreign trade data at regular intervals with about a two-month delay (for example, the March 2019 data is posted on May 9th, 2019). Because of this delay, I decided to end my dataset at December 2018 so that every year in it is complete.
U.S. Department of Commerce Report FT 990
The Highlights of Exports and Imports report, published by the U.S. Department of Commerce and the U.S. Bureau of the Census, included trade statistics for all of the countries that the United States traded with. Each volume is about 110 pages and includes domestic and foreign trade statistics for the months that the specific volume covers.
I went through each volume, searched for the Chinese import and export data, and logged it month by month into a .csv that I created. It is important to note that none of the volumes from 1970-1984 contained "China" specific data; they contained only data categorized as "Communist Areas in Asia."
Though "Communist Areas in Asia" and China are clearly not interchangeable, my research led me to conclude that the data for these years effectively reflects trade with China alone. From 1970-1984, "Communist Areas in Asia" included Vietnam, China, Mongolia, and North Korea, but China was the only one of these with which the US had trade relations during that period: Mongolia and the US did not have any relations until 1987, Vietnam was under a trade embargo until 1994, and to this day, there are no trade relations with North Korea.
Creating a complete dataset from 1970-2018
After finding all of the data from the US Census Bureau and from the FT 990 reports, I created my own .csv files and entered the monthly data into two files: 1970to1984.csv and 1985toNow.csv. I also used the Consumer Price Index information from the United States Department of Labor Bureau of Labor Statistics (CPI Databases) to adjust all of the data for inflation, so that every value is expressed in 2018 USD.
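An inflation adjustment of this kind scales each nominal amount by the ratio of the 2018 CPI level to the CPI level of the month in which the trade occurred. Below is a minimal sketch of that arithmetic, written in Python for illustration rather than in R. The function name `to_2018_dollars` and the CPI levels are hypothetical placeholders, not real BLS figures and not the values used in this project.

```python
def to_2018_dollars(nominal, cpi_then, cpi_2018):
    """Convert a nominal dollar amount to 2018 dollars using a CPI ratio.

    nominal  -- value in the dollars of the original month
    cpi_then -- CPI level for that month
    cpi_2018 -- CPI level chosen to represent 2018
    """
    if cpi_then <= 0:
        raise ValueError("CPI level must be positive")
    return nominal * cpi_2018 / cpi_then


# Placeholder index levels for illustration only (NOT real BLS data).
example_cpi_then = 40.0
example_cpi_2018 = 250.0

# A hypothetical $10 million monthly figure expressed in 2018 dollars.
adjusted = to_2018_dollars(10.0, example_cpi_then, example_cpi_2018)
print(adjusted)
```

The same ratio applies to every monthly import and export value; only the CPI level for the original month changes from row to row.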
Click here to learn more about how I cleaned and changed the dataset to use it in R.
Though I cannot post the .csv files of the complete dataset, here is a compiled list of the data and links that I gathered.
As I went through the archived data (1970-1984), I realized that it was organized by terms that I didn't fully understand but that differentiated the data in an important way. So that the data could be analyzed uniformly, I decided to use the F.A.S. valuation for both export and import data. All of the export data was reported as F.A.S., but import data was reported as F.A.S., C.I.F., or Customs Value Basis.
F.A.S. (Free Alongside Ship): This term means that the seller delivers when the goods are placed alongside the vessel at the named port of shipment. The seller is required to clear the goods for export. The buyer has to bear all costs and risks of loss or damage to the goods from that moment. This term can be used for sea transport only.
C.I.F. (Cost, Insurance, and Freight): The F.A.S. value plus the aggregate charges (such as freight and insurance) incurred in bringing the goods to the port of entry.
The United Nations Department of Economic and Social Affairs Statistics Division had an expert group on International Merchandise Trade Statistics give a country presentation on the United States of America. This presentation, from December 3-6, 2007, demonstrated that Customs Value Basis and F.A.S. were the same measurement. For the years of my data that only provided Customs Value Basis statistics, I therefore used those figures, so that all of my import data is comparable.
Ultimately, all of my data (both import and export) is measured in either F.A.S. value or Customs Value Basis, and these two measurements are comparable.
Some of the archived data used "Z" values, which I changed to 0 because a column that contains the character "Z" cannot be treated as numeric in R. The majority of the 0 values in 1970 and 1971 are actually "Z" values, which are defined as "less than one half of rounded unit," where each unit is $1 million. Thus, most of these values were not exactly $0, but fell between $0 and $0.5 million and were too small to be reported in the FT 990 volumes.
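The "Z" replacement can be sketched as a small parsing step that maps the marker to 0 and converts everything else to a number. This is a minimal illustration in Python rather than R. The column names, the sample rows, and the helper `parse_value` are all hypothetical and are not taken from the actual .csv files.

```python
import csv
import io


def parse_value(cell):
    """Return a float for a table cell, treating the "Z" marker as 0.

    "Z" means "less than one half of rounded unit", so 0 is an
    approximation of a small positive amount, not an exact value.
    """
    cell = cell.strip()
    if cell.upper() == "Z":
        return 0.0
    return float(cell)


# Made-up rows in a hypothetical layout (values in millions of USD).
sample = io.StringIO(
    "month,exports,imports\n"
    "1970-01,Z,Z\n"
    "1970-02,1,Z\n"
)

rows = [
    {
        "month": r["month"],
        "exports": parse_value(r["exports"]),
        "imports": parse_value(r["imports"]),
    }
    for r in csv.DictReader(sample)
]

for row in rows:
    print(row)
```

Because the substitution discards the distinction between a true zero and a "Z", recording which cells were originally "Z" in a separate column is one way to preserve that information.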
Below is a snapshot of what the data looked like with "Z" values.