Scaling curves for differentiable parameter learning
Tsai, WP., Feng, D., Pan, M. et al. From calibration to parameter learning: Harnessing the scaling effects of big data in geoscientific modeling. Nat Commun 12, 5988 (2021). https://doi.org/10.1038/s41467-021-26107-z
Evaluating multiscale LSTM models based on cross-validation tests on sparse in situ soil moisture networks across the conterminous United States
Liu, J., Rahmani, F., Lawson, K., & Shen, C. (2022). A multiscale deep learning model for soil moisture integrating satellite and in situ data. Geophysical Research Letters, 49, e2021GL096847. https://doi.org/10.1029/2021GL096847
Bayesian Spillover Graphs for learning temporal relationships, uncovering indirect spillovers, and quantifying risk.
Deng, G. and Matteson, D.S., 2022. Bayesian Spillover Graphs for Dynamic Networks. arXiv preprint arXiv:2203.01912.
Modeling animal-related outages by accounting for multiple outage-prone species activity patterns and their unique relationships with seasonality and habitat availability. R scripts for data processing and analysis are available on GitHub at https://github.com/mefeng7/Bird_Outages_MA
Feng, M.L.E., Owolabi, O.O., Schafer, T.L., Sengupta, S., Wang, L., Matteson, D.S., Che-Castaldo, J.P. and Sunter, D.A., 2021. Analysis of animal-related electric outages using species distribution models and community science data. arXiv preprint arXiv:2112.12791
Role of Variable Renewable Energy Penetration on Electricity Price and its Volatility
Owolabi, O.O., Schafer, T.L., Smits, G.E., Sengupta, S., Ryan, S.E., Wang, L., Matteson, D.S., Sherman, M.G. and Sunter, D.A., 2021. Role of Variable Renewable Energy Penetration on Electricity Price and its Volatility Across Independent System Operators in the United States. arXiv preprint arXiv:2112.11338
A survey of critical risk indicators (CRIs) in diverse domains (climate, ecology, hydrology, finance, space weather, and agriculture), how they influence risks to electric grid reliability, and their convergence to explore possible systemic risk. Jupyter notebooks for processing domain CRIs available on GitHub.
Che-Castaldo, J.P., R. Cousin, S. Daryanto, G. Deng, M.-L.E. Feng, R.K. Gupta, D. Hong, R.M. McGranaghan, O.O. Owolabi, T. Qu, W. Ren, T.L.J. Schafer, A. Sharma, C. Shen, M.G. Sherman, D.A. Sunter, L. Wang, D.S. Matteson. (2021). Critical Risk Indicators (CRIs) for the electric power grid: A survey and discussion of interconnected effects. Environment Systems and Decisions, 41(4), pp.594-615. https://link.springer.com/article/10.1007/s10669-021-09822-2
Predicted change in abundance of 1101 mammal species in the year 2070 due to global climate change.
Davidow, M., C. Merow, J.P. Che-Castaldo, T.L.J. Schafer, M.-C. Duker, D. Corcoran, D.S. Matteson. (2021). Clustering future scenarios based on predicted range maps. Preprint at https://arxiv.org/abs/2101.07408
Copula quadrant density fit to pair of anomaly scores
Davidow, M., & Matteson, D. S. (2021). Copula Quadrant Similarity for Anomaly Scores. https://arxiv.org/abs/2101.02330
Novel kurtosis weighted embedding for unsupervised anomaly detection
Davidow, M., & Matteson, D. S. (2020). Factor Analysis of Mixed Data for Anomaly Detection. https://arxiv.org/abs/2005.12129
We estimate annual relative abundance (Osprey (Pandion haliaetus) pictured in figure) across four relative abundance indices using eBird and Breeding Bird Survey (BBS) datasets and two different modeling approaches within each dataset. A. Indices share similar directions in their multi-year trends (2005-2018). B. However, the finer-resolution, inter-annual changes differ in direction and magnitude across indices. Based on these inconsistencies, we suggest that multiple datasets and modeling methods should be considered when estimating species population dynamics at finer scales and resolutions.
M.-L. E. Feng and J. Che-Castaldo. (2021). Reliability of relative bird abundance metrics at fine-scale temporal resolutions. Accepted: PLOS ONE.
This figure shows the dynamics of the mid-quote and the inventory of FIIs, MFs, and other traders (Other) at a one-minute frequency during the two crash days: May 19 and May 22, 2006.
Jagannathan, R., Pelizzon, L., Schaumburg, E., Getmansky Sherman, M., & Yuferova, D. (2020). Recovery from Fast Crashes: Role of Mutual Funds. https://papers.ssrn.com/sol3/papers.cfm?abstract_id=3239440
This figure depicts the total expected default loss per gross notional cleared for different networks, per dollar of total gross notional cleared across networks.
Kubitza, C., Pelizzon, L., & Sherman, M. G. (2021). Loss Sharing in Central Clearinghouses: Winners and Losers (No. 066). University of Bonn and University of Cologne, Germany. https://papers.ssrn.com/sol3/papers.cfm?abstract_id=3278582
This figure shows observed and simulated stream water temperature for Black River at Elyria, Ohio. Our model obtained exceptional performance for modeling stream temperature, which is a critical variable for aquatic ecological communities and thermal power production.
Rahmani, F.*, K. Lawson*, WY. Ouyang**, A. Appling, S. Oliver and CP. Shen, Exploring the exceptional performance of a deep learning stream temperature model and the value of streamflow data, Environmental Research Letters, doi: 10.1088/1748-9326/abd501 (2020) https://doi.org/10.1088/1748-9326/abd501
Ma, K., Feng, D., Lawson, K., Tsai, W.P., Liang, C., Huang, X., Sharma, A. and Shen, C., 2020. Transferring hydrologic data across continents--leveraging US data to improve hydrologic prediction in other countries. Earth Space Sci. Open Arch. 28. https://doi.org/10.1002/essoar.10504132.1
The dataset ‘DMSP Particle Precipitation AI-ready Data’ accompanies the manuscript “Next generation particle precipitation: Mesoscale prediction through machine learning (a case study and framework for progress)” submitted to AGU Space Weather Journal and used to produce new machine learning models of particle precipitation from the magnetosphere to the ionosphere. Note that we have attempted to make these data ready to be used in artificial intelligence/machine learning explorations following a community definition of ‘AI-ready’ provided at https://github.com/rmcgranaghan/data_science_tools_and_resources/wiki/Curated-Reference|Challenge-Data-Sets
McGranaghan, R.M., Ziegler, J., Bloch, T., Hatch, S., Camporeale, E., Lynch, K., Owens, M., Gjerloev, J., Zhang, B. and Skone, S., 2020. DMSP Particle Precipitation AI-ready Data. doi:10.5281/zenodo.4281122
Space weather is the impact of solar energy on society, and a key to understanding it is the way that regions of space between the Sun and the Earth’s surface are connected. One of the most important and most challenging connections to model is the way that energy is carried into the upper atmosphere (100-1000 km altitude). We have produced a new model, using machine learning, that better captures the dynamics of this energy from a large volume of data.
McGranaghan, R.M., Ziegler, J., Bloch, T., Hatch, S., Camporeale, E., Lynch, K., Owens, M., Gjerloev, J., Zhang, B. and Skone, S., 2020. Next generation particle precipitation: Mesoscale prediction through machine learning (a case study and framework for progress). arXiv preprint arXiv:2011.10117.
The magnetosphere, ionosphere and thermosphere (MIT) act as a coherently integrated system (geospace), driven in part by solar influences and characterized by variability and complexity. Among the most important and yet uncertain aspects of the geospace system is energy and momentum coupling between regions. Our presentation helps illustrate the trends in the application of data science for the geospace system.
McGranaghan, R., Camporeale, E., Lynch, K., Gjerloev, J., Bloch, T., Hatch, S., Zhang, B., Riley, P., Owens, M., Shprits, Y., Zhelavskaya, I., Skone, S., 2020. Novel approaches to geospace particle transfer in the digital age: Progress through data science (other). Geophysics. https://doi.org/10.1002/essoar.10501929.
Ouyang, W., Lawson, K., Feng, D., Ye, L., Zhang, C., & Shen, C. (2021). Continental-scale streamflow modeling of basins with reservoirs: a demonstration of effectiveness and a delineation of challenges. arXiv preprint arXiv:2101.04423.
This paper presents new data on female representation in the academic finance profession. In our sample of finance faculty from the top-100 U.S. business schools during 2009–2017, only 16.0% are women.
Getmansky Sherman, M., & Tookes, H. (2020). Female representation in the academic finance profession. Forthcoming in the Journal of Finance. Available at SSRN 3438653. https://papers.ssrn.com/sol3/papers.cfm?abstract_id=3438653
Temporal changes in global cropland soil organic carbon (SOC) simulated by the DLEM-Ag model. The black line and light grey area represent simulated estimates of crop SOC from different simulation experiments. The line represents the estimate from simulation experiment ALL, which considers all environmental factors including climate, atmospheric CO2, nitrogen deposition, land cover and land management practices (top). Temporal changes in accumulated crop SOC as influenced by multiple global changes in climate (CLM), atmospheric CO2 (CO2), nitrogen deposition (Ndep), land management practices (LMPs: nitrogen fertilizer, irrigation, harvest, rotation, etc.) and land conversion (LC) from forests, grassland, wetland, etc. to croplands. The dark grey area shows the accumulated SOC storage in global croplands during 1901-2010 (bottom).
Ren, W., K. Banger, B. Tao, J. Yang, Y. Huang, and H. Tian (2020), Global pattern and change of cropland soil organic carbon during 1901-2010: Roles of climate, atmospheric chemistry, land use and management, Geography and Sustainability, https://doi.org/10.1016/j.geosus.2020.03.001
Despite significant advances, continual learning models still suffer from catastrophic forgetting when exposed to incrementally available data from non-stationary distributions. Rehearsal approaches alleviate the problem by maintaining and replaying a small episodic memory of previous samples, often implemented as an array of independent memory slots. In this work, we propose to augment such an array with a learnable random graph that captures pairwise similarities between its samples, and use it not only to learn new tasks but also to guard against forgetting. Empirical results on several benchmark datasets show that our model consistently outperforms recently proposed baselines for task-free continual learning.
Tang, B., & Matteson, D. S. (2020). Graph-Based Continual Learning. https://arxiv.org/abs/2007.04813 https://openreview.net/forum?id=HHSEKOnPvaO
New confirmed cases of COVID-19 by day for New Jersey, along with the fitted model from our learned-delay analysis. Under our definition, New Jersey began social distancing on March 16th, and a clear changepoint is observed around 12 days later. A 12-day delay between infection and confirmed test is consistent with other states. Note that the number of new cases per day appears to have merely plateaued as a result of the intervention.
Wagner, Aaron B and Hill, Elaine L and Ryan, Sean E and Sun, Ziteng and Deng, Grace and Bhadane, Sourbh and Martinez, Victor Hernandez and Wu, Peter and Li, Dongmei and Anand, Ajay and Matteson, David S (2020). Social Distancing Has Merely Stabilized COVID-19 in the US. Stat. 2020; 9:e302. https://doi.org/10.1002/sta4.302
Lan Wang, Bo Peng, Jelena Bradic, Runze Li & Yunan Wu (2020) A Tuning-free Robust and Efficient Approach to High-dimensional Regression, Journal of the American Statistical Association, 115:532, 1700-1714, DOI: 10.1080/01621459.2020.1840989
The plot shows an example of simulated data with two added outliers (colored in red and orange). The anomaly scoring shows the local adaptivity of the algorithm: the first (left) outlier (above a low volatility region) has the highest score, while the second (right) outlier (above a high volatility region) has a lower score. The cyan lines indicate the posterior mean of the signal component; vertical blue lines indicate predicted change points, and gray bands indicate 95% point-wise credible bands for the data excluding the anomaly process component.
Wu, Haoxuan and Matteson, David S (2020). Adaptive Bayesian Changepoint Analysis and Local Outlier Detection. SBIES 2020, https://arxiv.org/abs/2011.09437
Wu, Y., & Wang, L. (2020). Resampling‐based confidence intervals for model‐free robust inference on optimal treatment regimes. Biometrics.
This paper finds evidence that the U.S. was central to the global financial system into 2018, but that the U.S.-China trade war of 2018–2019 diminished its centrality, and the Covid-19 outbreak of 2019–2020 increased the centrality of China.
Billio, M., Lo, A. W., Pelizzon, L., Getmansky Sherman, M., & Zareei, A. (2021). Global realignment in financial market dynamics: Evidence from ETF networks. https://papers.ssrn.com/sol3/papers.cfm?abstract_id=3779127
Zhang, W., Griffin, M., & Matteson, D. S. (2020). Modeling Nonlinear Growth Followed by Long-Memory Equilibrium with Unknown Change Point. https://arxiv.org/abs/2007.09417