CO2 Emissions by Year and Political Affiliation

Abigail Driggers, Erica Flores, Juliamaria Coromac

PSYC 500 Final Project

INTRODUCTION

Background

  • 1970 Clean Air Act

      • The Clean Air Act (CAA) was originally issued to protect human health against known dangerous pollutants. The U.S. Environmental Protection Agency (EPA) became responsible for regulating 6 initial pollutants (carbon monoxide, nitrogen dioxide, sulfur dioxide, particulate matter, hydrocarbons and photochemical oxidants).

      • Section 202(a)(1) requires the EPA to regulate "any air pollutant" from motor vehicles or motor vehicle engines "which in his judgment cause[s], or contribute[s] to, air pollution which may reasonably be anticipated to endanger public health or welfare."

    • 2007 Massachusetts vs EPA

      • In 2003, the EPA announced it did not have authority under the CAA to regulate greenhouse gases, likely because they have no direct effect on human health (a key defining point of the original bill), and because of "uncertainty about whether man-made carbon dioxide emissions cause global warming". Essentially, the EPA was saying there was no definitive rationale proving CO2 should be classified as a dangerous pollutant.

      • In 2007, 12 states (Massachusetts included) filed a lawsuit against the EPA demanding that the EPA further research the toxicity of CO2. The Supreme Court ruled the EPA must file an endangerment finding that describes its research and its conclusion on whether CO2 can be classified as a dangerous pollutant.

    • 2009 Findings

      • Under the Obama administration in 2009, the EPA released its endangerment finding, which classified CO2 as a dangerous pollutant and concluded that, under the CAA, the agency had the right and duty to regulate CO2.

      • Although the finding originally applied only to Section 202(a)(1), and therefore only to motor vehicles, a wrinkle in the CAA allowed it to be applied to power plants as well. If a pollutant is covered by any part of the CAA, it automatically becomes subject to the Prevention of Significant Deterioration (PSD) provision, which mandates that new or modified stationary pollution sources that emit more than 100 tonnes of that pollutant per year MUST also be regulated.

      • It is important to note this amendment was enacted in January 2011.

    • Political Affiliation

      • Although climate change was once a bipartisan issue, climate policy has become much more polarized in recent decades. Today, although there are individuals in both parties advocating for climate action, it is more commonly viewed as a "left issue".

      • This is perhaps because, in recent decades, the Republican Party has received significantly more campaign funding from large fossil fuel companies (e.g., Koch Industries). These companies have also funded large-scale climate denial propaganda.

      • Individuals in the positions to regulate this law are often controlled by the interest of these large corporate groups. Many fossil fuel companies leverage funding as a tool to get elected officials to behave on their behalf.

      • Seeing the complexities within legislation, it is important to consider how the power and influence of fossil fuel interests may affect implementation of this amendment. This leads to our scientific questions and hypothesis.

Scientific Questions and Hypothesis

  1. Generally, which states (red or blue) have decreased carbon emissions post 2011 (after enactment of the amendment) compared to pre 2011?

  2. Do blue states have a sharper rate of decrease in carbon emissions post 2011 compared to red states?

We hypothesize all states will have a decrease in CO2 emissions post 2011. Additionally, we also hypothesize this rate of decrease to be more aggressive for blue states compared to red states.


Data Curation and Ethics

  • State, year, and CO2 Emissions were obtained from US Energy Information Administration (governmental data)

      • This is public data at the state level (no individual privacy issues)

    • The analyses (of which the results may be polarizing) must be responsibly conducted and interpreted objectively

    • The consequences of our findings are also an ethical concern: do the benefits of knowing how to make better policy changes outweigh the risks of increasing the polarization of politics?

Data Preparation and Exploration

Data Frames and Variables

  • Data Frame: co2

    • Obtained from the U.S. Energy Information Administration

    • Consists of state, year, and CO2 emission data

    • Time frame: 1992-2017

    • Data types of the variables

      • state = object

      • year = float64

    • co2 shape: (51 rows, 29 columns)

    • co2 size depicted below

  • red_blue_status

    • Manually calculated by considering electoral votes for the elections between 1992-2016

    • Included states with votes that went for the same party every election, or had only one vote against their majority over the time period

      • 12 states did not meet these criteria (8 majority red, 4 majority blue)

        • Arkansas, Colorado, Florida, Iowa, Kentucky, Louisiana, Missouri, Nevada, Ohio, Tennessee, Virginia, and West Virginia

    • Data types of the variables

      • state = object

      • general status = object

    • red_blue_status shape: (51 rows, 2 columns)

    • red_blue_status size depicted to the right

  • df

    • Merged result of co2 + red_blue_status, used for all subsequent analyses

    • Data types of the variables

      • state = object

      • year = int64

      • CO2 emission = float64

      • general status = object

    • Size and shape are depicted on the right
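The red/blue rule described for red_blue_status (same party in every election, or at most one vote against the majority) can be sketched as a small function. This is only an illustration: the project assigned statuses manually, and the function name `classify_state`, the 'R'/'D' input format, and the "swing" label are assumptions, not part of the original analysis.

```python
from collections import Counter

def classify_state(winners, max_defections=1):
    """Label a state 'red', 'blue', or 'swing' from its presidential results.

    winners: list of 'R' or 'D', one entry per election
             (seven entries would cover 1992-2016).
    """
    counts = Counter(winners)
    majority_party, majority_count = counts.most_common(1)[0]
    defections = len(winners) - majority_count
    if defections > max_defections:
        return "swing"  # excluded from the red/blue comparison
    return "red" if majority_party == "R" else "blue"

# Hypothetical voting histories, for illustration only
print(classify_state(["R"] * 7))                             # red
print(classify_state(["D"] * 6 + ["R"]))                     # blue
print(classify_state(["D", "D", "R", "R", "D", "D", "R"]))   # swing
```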

Reshaping Data Frames: Melted and Merged

Melted: co2→ co2_melt

  • co2_melt data frame is organized by state, year, and CO2 Emissions, respectively

Merged: co2_melt + red_blue_status = df

  • Merging the co2_melt and red_blue_status data frames gives the new df data frame

  • Formatted the column "Year" from an object to an integer for our future analyses
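A minimal sketch of the melt-and-merge step, assuming the wide co2 table has one column per year and a key column named "State". The toy values and the "State" column name are placeholders; "Year", "CO2 Emission", and "General Status" follow the column names used elsewhere in this project.

```python
import pandas as pd

# Toy wide-format table standing in for co2 (values are made up)
co2 = pd.DataFrame({
    "State": ["Alabama", "California"],
    "2010": [130.0, 360.0],
    "2011": [128.0, 345.0],
})
red_blue_status = pd.DataFrame({
    "State": ["Alabama", "California"],
    "General Status": ["Red", "Blue"],
})

# Wide -> long: one row per (state, year) pair
co2_melt = co2.melt(id_vars="State", var_name="Year", value_name="CO2 Emission")

# Attach each state's status; an inner join drops states without a status
df = co2_melt.merge(red_blue_status, on="State", how="inner")

# Year labels come out of melt as strings, so convert them to integers
df["Year"] = df["Year"].astype(int)

print(df.dtypes)
```

An inner join is one reasonable choice here because states without a red or blue label are not used in the later comparisons.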

Distinguishing df

df_red

  • Consists of only red state data

df_blue

  • Consists of only blue state data

Pre_2011

  • Both parties involved with CO2 emissions before the amendment

  • Broken into two more specific data frames:

    • Pre2011_red

    • Pre2011_blue

Post_2011

  • Both parties involved with CO2 emissions after the amendment

  • Broken into two more specific data frames:

    • Post2011_red

    • Post2011_blue
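The subsets above could be built with boolean masks, roughly as follows. The toy rows, the "Red"/"Blue" labels, and the choice to count 2011 itself as "post" are assumptions made for this sketch; the "Pre 2011?" flag mirrors the column used in the boxplot call.

```python
import pandas as pd

# Toy long-format table standing in for df (values are made up)
df = pd.DataFrame({
    "State": ["Alabama", "Alabama", "California", "California"],
    "Year": [2010, 2011, 2010, 2011],
    "CO2 Emission": [130.0, 128.0, 360.0, 345.0],
    "General Status": ["Red", "Red", "Blue", "Blue"],
})

# Flag rows from before the January 2011 change
df["Pre 2011?"] = df["Year"] < 2011

# Split by political status
df_red = df[df["General Status"] == "Red"]
df_blue = df[df["General Status"] == "Blue"]

# Split each status group by time period
pre2011_red = df_red[df_red["Pre 2011?"]]
post2011_red = df_red[~df_red["Pre 2011?"]]
pre2011_blue = df_blue[df_blue["Pre 2011?"]]
post2011_blue = df_blue[~df_blue["Pre 2011?"]]

print(len(pre2011_red), len(post2011_red), len(pre2011_blue), len(post2011_blue))
```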

Exploratory Data Analysis

  • Used histograms built from df, df_red, and df_blue to show how often each range of CO2 emissions (in million metric tons) occurs

  • A violin plot, using df, further visualized the distributions of CO2 emissions among different states and their corresponding political status

  • CDFs for 1990-2017 = used df_red and df_blue

  • CDFs for Red and Blue States = pre2011_red, pre2011_blue, post2011_red, post2011_blue

  • Utilized pre2011 and post2011 data in order to compare the CDFs for CO2 emissions

  • Medians of red and blue states are very similar - it is unclear whether there is a significant difference

  • There are some significant outliers with high CO2 emissions, especially among red states

  • There is a small decrease in the median of blue states after 2011

_ = sns.boxplot(x='General Status', y='CO2 Emission', data=df, palette="Set2", hue="Pre 2011?", hue_order=[True, False])

_ = plt.hist(df_red['CO2 Emission'], bins=n_bins_red, color="red")

_ = plt.hist(df_blue['CO2 Emission'], bins=n_bins_blue, color="blue")

x_red_pre, y_red_pre = cdf(pre2011_red['CO2 Emission'])

x_blue_pre, y_blue_pre = cdf(pre2011_blue['CO2 Emission'])

x_red_post, y_red_post = cdf(post2011_red['CO2 Emission'])

x_blue_post, y_blue_post = cdf(post2011_blue['CO2 Emission'])
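The `cdf` helper called above is not shown in these slides. A common way to write such a function, offered here as an assumption about what it does, is to return the sorted values together with their cumulative proportions:

```python
import numpy as np

def cdf(data):
    """Return x (sorted values) and y (cumulative proportions) for an empirical CDF."""
    x = np.sort(np.asarray(data, dtype=float))
    # The i-th smallest value has i/n of the data at or below it
    y = np.arange(1, len(x) + 1) / len(x)
    return x, y

# Tiny made-up sample to show the shape of the output
x, y = cdf([3.0, 1.0, 2.0, 4.0])
print(x)
print(y)
```

Plotting `y` against `x` as unconnected points gives the step-like curves used to compare the red and blue groups.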

  • 2 Clear Outliers: Texas and California

  • Large variation

  • No clear linear trend

Summary Statistics

Model Building and Validation

Assessing Normality

  • Normality of CO2 Emissions distribution for Red States and Blue States

    • Plots of the cumulative distribution functions (CDFs) of CO2 Emissions for red states and blue states were created and overlain on a CDF of a normal distribution with the same mean and standard deviation

    • Neither the CDF for red state CO2 emissions nor the CDF for blue state CO2 emissions fit the normal CDF. Neither distribution is normal.
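The comparison against a normal CDF can be sketched numerically, without plotting, by measuring the largest vertical gap between the empirical CDF and a normal CDF with the same mean and standard deviation. The data below are synthetic and right-skewed purely for illustration, and the helper names `ecdf` and `normal_cdf` are hypothetical.

```python
import math
import numpy as np

def ecdf(data):
    """Sorted values and cumulative proportions of the sample."""
    x = np.sort(np.asarray(data, dtype=float))
    y = np.arange(1, len(x) + 1) / len(x)
    return x, y

def normal_cdf(x, mean, std):
    """CDF of a normal distribution, evaluated at each value in x."""
    return np.array([
        0.5 * (1.0 + math.erf((v - mean) / (std * math.sqrt(2.0))))
        for v in x
    ])

# Synthetic right-skewed sample standing in for state CO2 emissions
rng = np.random.default_rng(0)
emissions = rng.lognormal(mean=4.0, sigma=1.0, size=200)

x, y = ecdf(emissions)
y_normal = normal_cdf(x, emissions.mean(), emissions.std())

# A large gap means the normal curve is a poor description of the data
max_gap = np.max(np.abs(y - y_normal))
print(f"largest gap between empirical and normal CDF: {max_gap:.3f}")
```

Overlaying `y` and `y_normal` on the same axes reproduces the visual check described above; the gap statistic is just a compact summary of the same comparison.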

Linear Regression: Relationship between Year and CO2 Emissions (Research Q2)

    • Modeling the Relationship Between Year and CO2 Emission: Linear Regression (Research Question 2)

      • Four linear regressions were created using NumPy's "polyfit" function. The slopes and intercepts were plotted as lines over a scatterplot with x = Year and y = CO2 Emission.

      • The four regressions were of year versus CO2 for red states (2005-2010), red states (2011-2017), blue states (2005-2010), and blue states (2011-2017). The regressions for the same political affiliations were plotted on the same scatterplot for ease of comparison.

      • For red states, the 2005-2010 slope of -1.8 indicated an average decrease of 1.8 million metric tons in CO2 emission for each state per year. Likewise, the slope from 2011-2017 (-0.67) indicated a decrease of about 670,000 metric tons in CO2 emission for each state per year. The slope after January 2011 was slightly less sharp.

      • For blue states, the 2005-2010 slope of -2.2 indicated an average decrease of 2.2 million metric tons in CO2 emission for each state per year. Likewise, the slope from 2011-2017 (-0.49) indicated a decrease of about half a million metric tons in CO2 emission for each state per year. The slope after January 2011 was less steep.

      • However, the 95% confidence intervals for each slope (red 2005-2010: [-18.27, 13.87]; red 2011-2017: [-13.31, 12.43]; blue 2005-2010: [-12.09, 7.74]; blue 2011-2017: [-7.76, 6.74]) all cover a wide range of possible values, including 0. This indicates that the data do not provide enough evidence of a linear relationship between year and CO2 emission.

      • The failure to find a significant linear relationship between year and CO2 emission makes answering Research Question 2 difficult. However, the slope estimates suggest that, contrary to our expectation, neither red nor blue state CO2 emissions decreased more sharply after January 2011. In fact, they decreased less sharply, especially for blue states.
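A sketch of one such regression with a confidence interval for its slope. The slides do not say how the intervals were computed, so the pairs bootstrap used here is an assumption; the function name `slope_with_ci` and the synthetic data are also hypothetical.

```python
import numpy as np

def slope_with_ci(x, y, n_boot=2000, seed=0):
    """Fit y = slope * x + intercept and bootstrap a 95% CI for the slope."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    slope, intercept = np.polyfit(x, y, 1)

    rng = np.random.default_rng(seed)
    indices = np.arange(len(x))
    boot_slopes = np.empty(n_boot)
    for i in range(n_boot):
        # Resample (year, emission) pairs with replacement and refit
        sample = rng.choice(indices, size=len(indices))
        boot_slopes[i] = np.polyfit(x[sample], y[sample], 1)[0]

    low, high = np.percentile(boot_slopes, [2.5, 97.5])
    return slope, intercept, (low, high)

# Synthetic data: 20 states with very different baselines, 2005-2010,
# each declining by about 2 units per year (values are made up)
rng = np.random.default_rng(1)
years = np.tile(np.arange(2005, 2011), 20)
baseline = np.repeat(rng.uniform(20, 300, size=20), 6)
emission = baseline - 2.0 * (years - 2005) + rng.normal(0, 3, size=years.size)

slope, intercept, (low, high) = slope_with_ci(years, emission)
print(f"slope = {slope:.2f}, 95% CI = [{low:.2f}, {high:.2f}]")
```

When the spread between states is much larger than the yearly change, as in this toy data, the interval tends to be wide even though every state is declining, which is the same pattern reported for the real regressions.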

Bootstrap Hypothesis Testing: Median Differences Pre and Post Jan2011 (Research Q1)

    • Significance of Median Differences Before and After January 2011: Bootstrap Hypothesis Testing (Research Question 1)

      • Two bootstrap hypothesis tests were conducted: the first to find whether the difference of medians for red states 2005-2010 and 2011-2017 was significant, and the second to find whether the difference of medians for blue states 2005-2010 and 2011-2017 was significant.

        1. Bootstrap Hypothesis Test for Red States (Difference of Medians)

          • Null hypothesis: The median CO2 Emission for red states is the same for the time period 2005-2010 and the time period 2011-2017.

          • Empirical difference of medians (pre Jan 2011 median - post Jan 2011 median): 4.42 million metric tons

          • The dataframes for CO2 emissions for red states from 2005-2010 and 2011-2017 were concatenated, and the median of the concatenated array was found. Both of the original arrays were shifted so that their medians were equal to the median of the concatenated array.

          • The CDFs of the original and shifted arrays were all plotted

          • 10,000 bootstrap replicates were drawn for each time period from the shifted arrays, using np.median as the function.

          • Replicates of the difference of medians were created

          • A p-value was calculated by dividing the number of times the replicate median difference was greater than the empirical median difference by 10,000

          • It was found that the difference of medians was significant, p < .001

        2. Bootstrap Hypothesis Test for Blue States (Difference of Medians)

          • The same process was conducted for blue states, with null hypothesis that "The median CO2 Emission for blue states is the same for the time period 2005-2010 and the time period 2011-2017".

          • Empirical difference of medians (pre Jan 2011 median - post Jan 2011 median): 11.54 million metric tons

          • It was found that the difference of medians was significant, p < .001
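The procedure above can be sketched as follows. The function names and the synthetic samples are assumptions for illustration, and the sketch counts replicates greater than or equal to the empirical difference, a common convention that differs slightly from the strict "greater than" described above.

```python
import numpy as np

def draw_bs_reps(data, func, size, rng):
    """Apply func to `size` bootstrap resamples of data."""
    n = len(data)
    return np.array([func(rng.choice(data, size=n)) for _ in range(size)])

def median_diff_test(pre, post, n_reps=10_000, seed=0):
    """One-sided bootstrap test of H0: both periods share the same median."""
    pre = np.asarray(pre, dtype=float)
    post = np.asarray(post, dtype=float)
    empirical = np.median(pre) - np.median(post)

    # Shift both samples so each has the median of the pooled data,
    # which makes the null hypothesis true by construction
    pooled_median = np.median(np.concatenate([pre, post]))
    pre_shifted = pre - np.median(pre) + pooled_median
    post_shifted = post - np.median(post) + pooled_median

    rng = np.random.default_rng(seed)
    reps_pre = draw_bs_reps(pre_shifted, np.median, n_reps, rng)
    reps_post = draw_bs_reps(post_shifted, np.median, n_reps, rng)
    diff_reps = reps_pre - reps_post

    # Fraction of null replicates at least as large as the observed difference
    p_value = np.sum(diff_reps >= empirical) / n_reps
    return empirical, p_value

# Synthetic samples with clearly different medians (values are made up)
rng = np.random.default_rng(42)
pre = rng.normal(100, 5, size=60)
post = rng.normal(60, 5, size=60)

empirical, p_value = median_diff_test(pre, post)
print(f"difference of medians = {empirical:.2f}, p = {p_value:.4f}")
```

With 10,000 replicates, a result of zero exceedances is best reported as p < 1/10,000 rather than p = 0.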

Discussion

Summary

The objective of this project was to explore data on CO2 emissions in the United States and answer research questions on how an amendment that took effect in January 2011 (treating CO2 as a dangerous pollutant subject to regulation) affected CO2 emissions. Specifically, we aimed to answer the research questions "Generally, which states (red or blue) have decreased carbon emissions post 2011 (after enactment of the amendment) compared to pre 2011?" and "Do blue states have a sharper rate of decrease in carbon emissions post 2011 compared to red states?". To do this, we curated a dataset on CO2 emission by state and year from the US Energy Information Administration, and created a dataset on state red or blue status based on voting history in presidential elections between 1992 and 2016. The CO2 emission dataset was cleaned and melted so that year was a column. The two datasets were merged on state name, and the values in "Year" were converted to integers. Subsets of the data relevant to our research questions were created. Next, exploratory data analysis was conducted, in which the distributions of CO2 emissions were visualized for red and blue states (PDFs, CDFs, and histograms). Various boxplots, violin plots, and scatterplots were created for further visualization, and descriptive statistics were calculated.

Next, models for the data were created and validated. Normality was checked for the distribution of CO2 emissions for red and blue states (neither distribution was normal; the Poisson distribution would likely fit the data much better). A linear regression was still conducted, however, between year and CO2 emission for four sets of data: red states from 2005-2010 and from 2011-2017 and blue states from 2005-2010 and 2011-2017. All four slopes were negative, with the slope from 2011-2017 in both distributions being slightly less sharp. However, the confidence intervals for the slope of all four regressions included zero, indicating that we could not conclude with 95% confidence that there was a relationship between year and CO2 emission. This was likely due to the large variability in CO2 emission.

Finally, two bootstrap hypothesis tests were conducted to check for a significant difference in median CO2 emissions between the time periods 2005-2010 and 2011-2017, one for red states and one for blue states. The difference was found to be significant in both tests (p < .001), leading us to reject the null hypothesis that the medians were the same.

Insights and Implications

Contrary to our predictions, the slopes of the regression lines did not become more sharply negative after the 2011 policy change for either red or blue states. Taken at face value, this could suggest that policy changes like this one do not have the desired effect, since the decrease in CO2 emissions actually leveled off somewhat. However, the linear regressions were likely not a good model, because the data were not normally distributed and the confidence intervals for all four slopes included 0. Thus, policymakers should not take this as evidence that bills such as these are not effective. More data must be gathered, or a different model should be used to answer this research question. Because of the difficulties with the linear regression, the answer to research question two is still not clear.

The results of our bootstrap hypothesis tests have much clearer implications. According to the p-values for the difference of medians before and after the 2011 policy change (both p < .001), it is highly unlikely that the observed differences could have been due to chance. Thus, we reject the null hypothesis that the medians for the two time periods are the same for both red and blue states. The medians decrease significantly after the 2011 policy change, with the decrease being somewhat larger for blue states. This is not clear evidence for the efficacy of the 2011 bill, since CO2 emissions have been decreasing since 2005, but it does indicate that blue states may be more impacted by policy changes such as these. If policymakers specifically want to effect change in red states, they may want to channel their efforts into other strategies, such as advertising or awareness efforts. They may also attempt policy changes more specifically targeted to the sources of high CO2 emission in red states. In summary, it is important for policymakers to recognize that this type of policy change may be more effective in blue states (which already have lower mean CO2 emissions) than in red states.

Overall, it is not clear how effective the 2011 policy change was. Though both red and blue state medians decreased afterwards, it is not known what would have happened without the 2011 bill. Blue state medians did decrease slightly more, possibly indicating that the 2011 bill was more effective in those states.

Limitations and Future Directions

There were many other factors besides the January 2011 amendment that could have accounted for the changes in CO2 emission that were observed. Although the results of our bootstrap analysis suggest median carbon emissions decreased post 2011, this decrease cannot necessarily be attributed to the amendment. For example, the fracking boom and the shift to natural gas have caused a decrease in CO2 unrelated to this amendment (CO2 emissions have declined overall since 2005). Because the linear model did not fit the data well, we did not determine whether the slope of CO2 emission decrease changed significantly post-2011. However, future studies may be able to test these predictions with a non-linear model, such as one based on a Poisson distribution.

Our classification of red and blue states left a few "swing" states unaccounted for in the data. Future studies should observe how these states fall in terms of CO2 emissions and emission changes over time.

The results of this study are inconclusive about the future directions policymakers should take in terms of policy changes such as that of January 2011, but they do indicate that red and blue states seem to have different CO2 emissions and reactions (in CO2 emissions) to these types of policy changes - perhaps suggesting that the goal of decreasing red state CO2 emissions and blue state CO2 emissions should be approached differently. Additionally, more studies should be performed to confirm whether the January 2011 policy change was effective, to what extent, and what other factors (such as state industrialism, population, etc.) contribute to success.

THANK YOU