On this page, you will find examples of my work. It does NOT contain all the code necessary for the project. If you are still curious after looking at this page, congratulations!
Let's start out easy with some typical data wrangling processes. First we need to load some data, filter it for our study area and remove those pesky NAs! We also seperate our study area in two zones to counteract the effect of different sampling efforts.
A <- readMat('GLODAPv2.2020_Atlantic_Ocean.mat')
A <- A %>%
filter(G2latitude >= (min(ABasin$lat)) & G2latitude <= (max(ABasin$lat))) %>%
filter(G2longitude >= (min(ABasin$lon)) & G2longitude <= (max(ABasin$lon))) %>%
mutate_all(.funs = funs(ifelse(. == "NaN", NA, .))) %>%
filter((!is.na(G2salinity))) %>%
filter((!is.na(G2theta))) %>%
mutate(source = "GLODAP")
# Creating the zone variable based on latitude
A$zone <- ifelse(A$G2latitude >= 39, "zone1",
ifelse(A$G2latitude < 39, "zone2", NA))
Let us see what we are working with!
A <- A %>%
mutate(layer = ifelse(G2pressure >= 100 & G2sigma0 <= 27.2, "NACW",
ifelse(G2sigma0 > 27.2 & G2sigma1 <= 32.35,"MW",
ifelse(G2sigma1 > 32.35 & G2sigma2 <= 37, "LSW",
ifelse(G2sigma2 > 37, "NADW", NA))))) %>%
filter(!is.na(layer)) # Delete the NA values in layer (surface points)
ord <- c("NACW", "MW", "LSW", "NADW")
A$layer <- factor(A$layer, levels = ord)
You need carbon system variables such as pH, total alkalinity, or dissolved inorganic carbon BUT you lost most of them?! Fear not! You just need two of them to compute the rest. Isn't that amazing?
A <- A %>% # Adding flags for seacarb
mutate(flag = ifelse(G2talkf == 2 & G2tco2f == 2, 15, # Flag 15 Alk & DIC
ifelse(G2talkf == 2 & G2tco2f == 0 & G2phtsinsitutpf == 2, 8, # Flag 8 pH & Alk
ifelse(G2talkf == 0 & G2tco2f == 2 & G2phtsinsitutpf == 2, 9, NA)))) # Flag 9 pH & DIC
A <- bind_cols(A[, 1:ncol(A)-1], # To delete the flag column
carb(flag = A$flag,
var1 = ifelse(A$flag == 15, A$G2talk / 10^6,
ifelse(A$flag == 9, A$G2phtsinsitutp,
ifelse(A$flag == 8, A$G2phtsinsitutp, NA))),
var2 = ifelse(A$flag == 15, A$G2tco2 / 10^6,
ifelse(A$flag == 9, A$G2tco2 / 10^6,
ifelse(A$flag == 8, A$G2talk / 10^6, NA))),
A$G2salinity,
A$G2theta,
A$G2pressure / 10, # Pressure in bar!
Patm = 1.0,
Pt = A$G2phosphate / 10^6, # Nutrients in µmols/Kg
Sit = A$G2silicate / 10^6,
pHscale = "T", kf = "pf", k1k2 = "l", ks = "d", b = "u74"))
A <- mutate(A, xcCO3 = CO3 - (CO3 / OmegaAragonite)) # Compute xcCO3
A <- filter(A, !is.na(flag))
Data is nothing without errors... so let's compute them!
A_errors <- seacarb::errors(flag = A$flag,
var1 = ifelse(A$flag == 15, A$G2talk / 10^6,
ifelse(A$flag == 9, A$G2phtsinsitutp,
ifelse(A$flag == 8, A$G2phtsinsitutp, NA))),
var2 = ifelse(A$flag == 15, A$G2tco2 / 10^6,
ifelse(A$flag == 9, A$G2tco2 / 10^6,
ifelse(A$flag == 8, A$G2talk / 10^6, NA))),
A$G2salinity,
A$G2theta,
A$G2pressure / 10, # Pressure in bar!
Patm = 1.0,
Pt = A$G2phosphate / 10^6, # Nutrients in µmols/Kg
Sit = A$G2silicate / 10^6,
evar1 = ifelse(A$flag == 15, 2/10^6,
ifelse(A$flag == 9, 0.0055,
ifelse(A$flag == 8, 0.0055, 0))),
evar2 = ifelse(A$flag == 15, 2/10^6,
ifelse(A$flag == 9, 2/10^6,
ifelse(A$flag == 8, 2/10^6, 0))),
pHscale = "T", kf = "pf", k1k2 = "l", ks = "d", b = "u74")
# rename columns, and join to the main data.frame
A_errors <- A_errors %>%
rename(eH = H, epH = pH, eCO2 = CO2, efCO2 = fCO2, epCO2 = pCO2, eHCO3 = HCO3, eCO3 = CO3, eDIC = DIC, eALK = ALK, eOmegaAragonite = OmegaAragonite, eOmegaCalcite = OmegaCalcite)
A <- bind_cols(A, A_errors)
After some more data wrangling, we can finally compute a robust linear model. In this case we compute the model for pH data. What do we need? The mean pH per year and the mean error per year.
RLMmodel_pH <- MASS::rlm(pH ~ G2year, data = final_pH, weights = (1 / final_pH_error$epH))
tableRLMmodel_pH <- summary(RLMmodel_pH)
trend_pH <- tableRLMmodel_pH[["coefficients"]][2, 1] # This is the rate of change
etrend_pH <- tableRLMmodel_pH[["coefficients"]][2, 2] # etrend stands for "error in the trend"
pvalue_pH <- pnorm(
abs(
broom::tidy(RLMmodel_pH)
[2, "statistic", drop = TRUE]), lower.tail = FALSE) * 2
After all this code, finally some results.
Presented here is the pH trend between 1971 and 2016 in subsurface waters (100 m - 500 m) of the Portuguese Margin. The measurements are coloured according to their source, GLODAP in red, ICES in green, and old cruise data in blue. The black points represent the means of each year. The pH value decreased at a steady rate of -0.0017 ± 0.0002 per year. The associated p-value is < 0.05. Through adding the pH values from the "rescued" data, we were able to fill knowledge gaps and increase the time frame of the linear model by about one decade.
You still have not enough of R code? Are you curious about the intermediate steps of the script and want to see more trends for xcCO3 and anthropogenic carbon? You are lucky, feel free to explore more.