SEANOE database: Sea scientific open data edition: https://doi.org/10.17882/80022
We fitted a series of linear mixed-effects regression (LMER) models, with parameters estimated by restricted maximum likelihood, to published in situ coral reef ecosystem calcification rates. Models were fitted in R [98] using the lmer function in the lme4 package (version 1.1-21) [99]. The LMER models included fixed and random effects and followed a standard, widely accepted statistical approach, providing a framework for data interpretation that can be replicated from our metadata and updated as more field data become available. By integrating multiple quantitative and qualitative controls, LMER models provide deeper insight than conventional linear models [99–101]. Because explanatory variables were frequently missing throughout the dataset, we adopted a backward-selection process. The process used the Akaike Information Criterion (AIC) and the Bayesian Information Criterion (BIC) as a guide [102]: one variable was removed at a time between successive models until a final model was reached that could not be improved by removing any further variable. After each removal of a covariate, the data frame was reassigned, so each subsequent model included more data points as covariates became fewer.
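The growth in usable data points during backward selection can be illustrated with a minimal Python sketch. It assumes that a row is usable only when every variable in the current model is non-missing (complete-case filtering). The toy rows, the column names and the helper names `complete_cases`, `aic` and `bic` are hypothetical and are not taken from the study's data or code. The AIC and BIC helpers use the standard textbook definitions.

```python
import math


def aic(log_lik, k):
    """Akaike Information Criterion: 2k - 2 ln(L), with k estimated parameters."""
    return 2 * k - 2 * log_lik


def bic(log_lik, k, n):
    """Bayesian Information Criterion: k ln(n) - 2 ln(L), with n observations."""
    return k * math.log(n) - 2 * log_lik


def complete_cases(rows, variables):
    """Return only the rows with no missing (None) value in the listed variables."""
    return [r for r in rows if all(r.get(v) is not None for v in variables)]


# Invented toy records (not real measurements); None marks an unreported value.
rows = [
    {"gnet": 120.0, "pnet": 30.0, "depth": 2.0, "omega_ar": 3.4},
    {"gnet": 95.0, "pnet": None, "depth": 5.0, "omega_ar": 3.1},
    {"gnet": 60.0, "pnet": 12.0, "depth": None, "omega_ar": None},
    {"gnet": 150.0, "pnet": 41.0, "depth": 1.5, "omega_ar": None},
]

# Hypothetical model sequence: one covariate is removed at each step,
# and the usable data set is rebuilt from the full table every time.
model_sequence = [
    ["gnet", "pnet", "depth", "omega_ar"],
    ["gnet", "pnet", "depth"],
    ["gnet", "pnet"],
]

for variables in model_sequence:
    usable = complete_cases(rows, variables)
    print(variables, "->", len(usable), "usable rows")
```

In this sketch the filter is re-applied to the full table at each step rather than to the previous subset, which is what allows rows excluded from an earlier model to re-enter a later one.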
We tested whether Gnet was significantly influenced by any of the following explanatory variables: Pnet, latitude (degrees), wave action (exposed, moderate or protected), duration of study (days), seasonal heat type (summer–autumn or winter–spring), study methodology, reef state, Ωar, temperature, calcifiers (% benthic cover) and depth (m). A random intercept term for location was included in the model to account for site-specific variability. Data on nutrients were collected but not included in the LMER because of low reporting (n ≤ 10). Additionally, because the variance of sampled Gnet was under-reported, we were unable to weight Gnet in the model, for example by inverse-variance weighting. Latitude (in decimal degrees) was converted to absolute values to represent relative distance from the equator. Reef state was reduced to a categorical factor with two levels (healthy/unspecified, or suffering some level of degradation), as reported in the various publications. The calcifier cover variable was log-transformed because this gave a stronger correlation with Gnet (-0.95) than the untransformed variable (-0.75). Default parameters were used in the lme4 package, with the full statistical model taking the form:
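A form consistent with the variables described above can be written as follows. The notation (the coefficients β, the random intercept u, the residual ε, and the indices i for observation and j for location) is assumed here and may differ from the authors' own:

```latex
\begin{aligned}
G_{\mathrm{net},ij} ={}& \beta_0 + \beta_1 P_{\mathrm{net},ij}
  + \beta_2 \lvert \mathrm{latitude}_{ij} \rvert
  + \beta_3\,\mathrm{wave\ action}_{ij}
  + \beta_4\,\mathrm{duration}_{ij} \\
& + \beta_5\,\mathrm{season}_{ij}
  + \beta_6\,\mathrm{methodology}_{ij}
  + \beta_7\,\mathrm{reef\ state}_{ij}
  + \beta_8\,\Omega_{\mathrm{ar},ij} \\
& + \beta_9\,\mathrm{temperature}_{ij}
  + \beta_{10}\log(\mathrm{calcifiers}_{ij})
  + \beta_{11}\,\mathrm{depth}_{ij}
  + u_j + \varepsilon_{ij}, \\
& u_j \sim \mathcal{N}(0, \sigma_u^2), \qquad
  \varepsilon_{ij} \sim \mathcal{N}(0, \sigma^2)
\end{aligned}
```

Each categorical term (wave action, season, methodology, reef state) stands for a set of dummy-coded coefficients rather than a single slope. In lme4 formula syntax, the random intercept u_j corresponds to a term of the form (1 | location).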
Tomer, A. S., McKenzie, T., Majtenyi-Hill, C., Cabral, A., Yau, Y. Y. Y., Henriksson, L., Bonaglia, S., Call, M., Chen, X., Correa, R. E., Davis, K., Jeffrey, L., Sadat-Noori, M., Tait, D., Webb, J., Maher, D. T., Zhao, S., Cardenas, M. B., & Santos, I. R. (2024). Global data for SGD and CO2 [Data set]. Zenodo. https://doi.org/10.5281/zenodo.10491455
Liu, M., Raymond, P. A., Lauerwald, R., Zhang, Q., Trapp-Müller, G., Davis, K. L., Moosdorf, N., Xiao, C., Middelburg, J. J., Bouwman, A. F., Beusen, A. H. W., Peng, C., Lacroix, F., Tian, H., Wang, J., Li, M., Zhu, Q., Cohen, S., Hoek, W. J. van, … Regnier, P. (2024). Supplementary Data for Global riverine land-to-ocean carbon export constrained by observations and multi-model assessment (Version 1). figshare. https://doi.org/10.6084/m9.figshare.24883290.v1
mlr3: Machine Learning in R - Next Generation
Efficient, object-oriented programming on the building blocks of machine learning. Provides 'R6' objects for tasks, learners, resamplings, and measures. The package is
geared towards scalability and larger datasets by supporting parallelization and out-of-memory data-backends like databases. While 'mlr3' focuses on the core
computational operations, add-on packages provide additional functionality.
DALEX: moDel Agnostic Language for Exploration and eXplanation
Any unverified black box model is the path to failure. Opaqueness leads to distrust. Distrust leads to ignoration. Ignoration leads to rejection. DALEX package xrays any
model and helps to explore and explain its behaviour. Machine Learning (ML) models are widely used and have various applications in classification or regression.
Models created with boosting, bagging, stacking or similar techniques are often used due to their high performance. But such black-box models usually lack direct
interpretability. DALEX package contains various methods that help to understand the link between input variables and model output. Implemented methods help to
explore the model on the level of a single instance as well as a level of the whole dataset. All model explainers are model agnostic and can be compared across
different models. DALEX package is the cornerstone for 'DrWhy.AI' universe of packages for visual model exploration.
Wilson, Stephanie; Moody, Amy; McKenzie, Tristan; Cardenas, M Bayani; Luijendijk, Elco; Sawyer, Audrey; Wilson, Alicia M; Michael, Holly; Bochao, Xu; Knee, Karen; Cho, Hyung-Mi; Weinstein, Yishai; Paytan, Adina; Moosdorf, Nils; Chen, Chen-Tung Arthur; Beck, Melanie; Lopez, Cody; Murgulet, Dorina; Kim, Guebuem; Charette, Matt; Waska, Hannelore; Ibánhez, J Severino; Chaillou, Gwénaëlle; Oehler, Till; Onodera, Shin-ichi; Saito, Mitsuyo; Rodellas, Valenti; Dimova, Natasha; Montiel, Daniel; Dulai, Henrietta; Du, Jinzhou; Petermann, Eric; Chen, Xiaogang; Davis, Kay L; Lamontagne, Sebastien; Sugimoto, Ryo; Wang, Guizhi; Li, Hailong; Torres, Américo I; Demir, Cansu; Bristol, Emily; Connolly, Craig; McClelland, Jim; Silva, Brenno Januario; Tait, Douglas R; Kumar, BSK; Viswanadham, R; Sarma, Vedula V S S; Silva-Filho, Emmanoel Vieira; Shiller, Alan; Lecher, Alanna; Tamborski, Joe; Bokuniewicz, Henry; Rocha, Carlos; Reckhardt, Anja; Böttcher, Michael Ernst; Jiang, Shan; Stieglitz, Thomas; Charbonnier, Céline; Anschutz, Pierre; Hernandez-Terrones, Laura M; Babu, Suresh; Szymczycha, Beata; Sadat-Noori, Mahmood; Niencheski, Luis Felipe Hax; Null, Kimberly; Tobias, Craig R; Song, Bongkeun; Anderson, Iris; Santos, Isaac R (2023): Global coastal groundwater and subterranean estuary nutrients [dataset]. PANGAEA, https://doi.org/10.1594/PANGAEA.955032