Background
Australia has revised target of reducing emissions by 43 per cent below 2005 levels by 2030. Australia's quarterly emissions have fallen by 21% since 2005. Australia has reduced its emissions at a faster rate than many similar developed countries.
Source: Prime Minister of Australia
Datasets
This project utilizes greenhouse gas emissions data sourced from the Organisation for Economic Co-operation and Development (OECD).
Link to datasets: OECD Stats - Greenhouse gas emissions by source
Tools
Microsoft Excel to load and transform data.
Microsoft Power BI to create visualization.
R to create model and predict CO2 value.
Analysis
After analyzing data Australia GHG data, below are charts showing total emissions, LULUCF (Land Use, Land-Use Change and Forestry), and pollutants, from 1990 to 2020
Based on analysis above, I built a linear regression model to predict the maximum amount of CO2 emission the country has to keep each year in order to achieve the GHG emission target in 2030. As we know from the report, there are 6 gasses that contribute to total GHG emission, plus LULUCF factor. Among the 6 gasses, the main contributor is CO2.
GHG Emissions
First, calculate the target GHG emission for year 2030. It is 43% below emission in 2005. Hence, the emission value is 57% of 2005 emission (608646.542 thousands tons).
2030 GHG emission = 57% * 608646.542 = 346928.5289 thousands tons
Calculate the difference between 2030 emission and 2020 emission (487591.186 thousands tons).
487591.186 - 346928.5289 = 140662.66 thousands tons
From 2020 to 2030, there are 10 years interval i.e. 2021, 2022, ... 2030. So, I divide the difference value above with 10. 140662.66 / 10 = 14066.266
For LULUCF value, I used constant value and it is same with the 2020 LULUCF value, -40145.861 thousands tons, assuming that Australia Government is consistent to keep the forest area and green space in the country.
Calculate the GHG emission excluding LULUCF, by substracting the GHG emission including LULUCF above with LULUCF value.
Here are the complete values of GHG emission based on calculation above:
Predicting CO2 Emission with R
After retrieving the values of GHG emission for 2021 to 2030, next step is building the linear regeression model with R to predict the CO2 emission for each of the years. This values are useful for all sectors to measure their CO2 emission limit in order to achieve the country's GHG emission target in 2030.
I used 3 libraries here:
readxl to read Excel files,
tidyverse for general analysis,
chemCal to use inverse.predict() function, because the CO2 values (x) will be predicted based on GHG values (y).
Plot the CO2 and GHG emission (values in thousands tons):
ggplot(data = NULL, aes(x = co2_value, y = emission_value)) + geom_point() + geom_smooth(method = 'lm')
Based on plot above, we can seen that CO2 and GHG emission can be modeled with Linear Regression.
Calculate the correlation value between CO2 and GHG:
cor(emission_value, co2_value)
> 0.9974014
The correlation value shows that CO2 and GHG emission have strong positive correlation. It is 0.9974014 and almost +1.
Build the linear regression model:
emission_co2_model <- lm(emission_value ~ co2_value)
summary(emission_co2_model)
Next step, enter the values of GHG emission (without LULUCF) from table above, I assign them as ynew variable:
ynew <- c(513670.8, 499604.5, 485538.2, 471472, 457405.7, 443339.5, 429273.2, 415206.9, 401140.7, 387074.4)
Predict the values of CO2, xpred variable, based on the GHG emission values above:
xpred <- t(sapply(ynew, function(y) inverse.predict(emission_co2_model, y)[1:3]))
Here are the predicted CO2 values (in thousands tonnes) with 95% confidence interval for the next 10 years (2021 to 2030):
Since we want the maximum limit of CO2, sum the prediction values and the confidence values for each year. Here is the final table:
Visualize the prediction values of CO2 with Power BI:
Link to github repository