Data Biography
Going into this project, I assumed that I would be overwhelmed with data and that I would have to sift through hordes of information about Chinese-American trade to find the statistics that I needed.
I was wrong. The United States Census Bureau has a public database of country-specific trade, including Chinese data (US Census Bureau Trade in Goods with China). This was a vital dataset for my project, but it only dates back to January 1985.
To cover the earlier years, 1970 through 1984, I searched through annual publications of the U.S. Foreign Trade Highlights of Imports and Exports (FT 990). Finding the few lines of data that I needed within each volume, across several volumes per year, was an arduous process, but the HathiTrust Digital Library had most of the volumes that I needed and makes them available to the public for free.
I started looking for data on December 14th, 2018, and completed my dataset on April 16th, 2019. I am incredibly grateful to Barbara Levergood for her help and patience, and to the Bowdoin College Library and Interlibrary Loan for assistance and access to information.
Seasonal adjustments
When compiling the archived data, I had the choice of logging the seasonally adjusted statistics or the non-seasonally adjusted statistics. I decided to create my set without the seasonal adjustments because I could not find a source that explained the methodology behind the U.S. Census Bureau’s modifications. In the data visualizations below, you can see the clear seasonality of imports and exports within a year.
Both graphics show imports and exports for 1970-1979. The first figure plots the annual sum of imports and the annual sum of exports for each year; the second plots the monthly data within each year.
The fluctuations that are apparent in the second graph can be explained by predictable seasonal events and influences, which economists refer to as the seasonality of data.
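The difference between the two views comes down to one aggregation step: the annual figure is the sum of that year's monthly values. The sketch below illustrates that step in Python. The dictionary layout, the name `annual_totals`, and the numbers are all hypothetical placeholders, not values from my dataset.

```python
from collections import defaultdict

# Hypothetical monthly import values keyed by (year, month).
# Placeholder numbers for illustration only, not figures from the dataset.
monthly_imports = {
    (1970, 1): 1.0, (1970, 2): 2.0, (1970, 3): 3.0,
    (1971, 1): 4.0, (1971, 2): 5.0, (1971, 3): 6.0,
}


def annual_totals(monthly):
    """Sum monthly values into one total per year."""
    totals = defaultdict(float)
    for (year, _month), value in monthly.items():
        totals[year] += value
    return dict(totals)


imports_by_year = annual_totals(monthly_imports)
```

Plotting `imports_by_year` would give the smoother annual view, while plotting the monthly values directly keeps the within-year swings visible.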
The Bureau of Labor Statistics describes how seasonal factors affect labor data: "the size of the labor force, the levels of employment and unemployment, and other measures of labor market activity undergo fluctuations due to seasonal events including changes in weather, harvests, major holidays, and school schedules" (BLS). The factors behind seasonality fall into four major categories: calendar, timing decisions, weather, and expectation (Granger 1978, 33).
Economists Dagum and Bianconcini also provide examples of seasonal factors: in winter, rates of construction and productivity are lower whereas retail sees a seasonal peak. These scholars also emphasize the importance of analyzing both seasonally adjusted and unadjusted data:
Because seasonality ultimately results mainly from noneconomic forces (climatic and institutional factors), external to the economic system, its impact on the economy as a whole cannot be modified in a short period of time. Therefore, it is of interest to decision makers to have the seasonal variations removed from the original series to obtain a seasonally adjusted series. Seasonal adjustment means the removal of seasonal variations in the original series jointly with trading day variations and moving holiday effects. The main reason for seasonal adjustment is the need of standardizing socioeconomic series because seasonality affects them with different timing and intensity (Dagum and Bianconcini 2016, 64).
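To make the idea of "removing seasonal variations" concrete, here is a deliberately simplified sketch that estimates one multiplicative factor per period and divides it out. This is an assumption-laden toy, not the U.S. Census Bureau's procedure (whose methodology I could not find) and not the method Dagum and Bianconcini describe. The function names and the quarterly values are hypothetical.

```python
from statistics import mean


def seasonal_factors(series):
    """Estimate one multiplicative seasonal factor per period.

    `series` maps year -> list of period values (same length every year).
    Each value is divided by its own year's mean, and the resulting
    ratios are averaged across years for each period.
    """
    n_periods = len(next(iter(series.values())))
    ratios = [[] for _ in range(n_periods)]
    for values in series.values():
        year_mean = mean(values)
        for i, value in enumerate(values):
            ratios[i].append(value / year_mean)
    return [mean(r) for r in ratios]


def seasonally_adjust(series):
    """Divide each value by the factor estimated for its period."""
    factors = seasonal_factors(series)
    return {
        year: [value / f for value, f in zip(values, factors)]
        for year, values in series.items()
    }


# Hypothetical quarterly values with the same within-year pattern each year.
toy = {1970: [8.0, 10.0, 12.0, 10.0], 1971: [16.0, 20.0, 24.0, 20.0]}
factors = seasonal_factors(toy)
adjusted = seasonally_adjust(toy)
```

In this toy series the within-year pattern is identical in both years, so dividing by the factors flattens each year to its average level. Real series also contain trend, trading-day, and holiday effects that this sketch ignores.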
As I created data visualizations from the import and export statistics, I realized the importance of seeing the seasonality and the variations across years and decades. At the same time, I recognize that plotting and comparing the annual sums provides a holistic understanding of the trade relationship. I therefore use both sets of data in my research in an effort to be fully transparent.
Adjusting for inflation
My dataset runs from 1970 through 2018 and covers nearly five decades of information. The value of $1 in 1970 is drastically different from the value of $1 in 2018, so in order to have comparable statistics, I needed to take the rate of inflation into account.
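A common way to make dollar amounts from different years comparable is to rescale each nominal value by the ratio of a price index in a chosen base year to the index in the year the value was recorded. The sketch below shows that arithmetic. The function name, the choice of 2018 as base year, and the index values are hypothetical and chosen only to keep the numbers easy to follow; they are not official index figures.

```python
def to_constant_dollars(nominal, index_in_year, index_in_base_year):
    """Convert a nominal dollar amount into base-year dollars.

    real = nominal * (price index in base year / price index in data year)
    """
    return nominal * index_in_base_year / index_in_year


# Hypothetical price-index values (placeholders, not official figures).
price_index = {1970: 40.0, 2018: 250.0}

# Express $100 recorded in 1970 in 2018 dollars under these assumed values.
value_1970_in_2018_dollars = to_constant_dollars(
    100.0, price_index[1970], price_index[2018]
)
```

Under these placeholder values, $100 in 1970 corresponds to $625 in 2018 dollars. Applying the same conversion to every year puts the whole series on one scale.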
Here are two graphs that demonstrate how adjusting for inflation changes the data: