Data Assimilation +
Machine Learning =
Data Learning
What is a Data Learning model? Why choose Data Learning models? How to use Data Learning models?
#AI4Good
Hi, my name is Rossella Arcucci, I am an Assistant Professor in Data Science and Machine Learning at Imperial College London (ICL) where I lead the Data Assimilation and Machine Learning (Data Learning) Group.
I am the elected AI Speaker of the AI Network of Excellences at ICL where I represent approx. 270 academics working on different aspects of AI.
I am an elected member of the World Meteorological Organization, where I am part of the WMO working group on Data Assimilation and Observing systems.
I represent the Imperial AI community at the World Economic Forum (Global AI Action Alliance) and at the Confederation of Laboratories for Artificial Intelligence Research in Europe (CLAIRE).
I am an investigator on EU and EPSRC grants with a total value of over 15.2M, and I am a board member of the new ICL AI initiative named Imperial-X.
I collaborate with the Leonardo Centre at Imperial College Business School, where I contribute to the development of integrative, just and sustainable models of economic and social development by discovering, testing and diffusing new logics of business enterprise.
I hold a degree and a master’s degree in mathematics, and I completed a PhD in Computational and Computer Science in February 2012. I was recognised as a Marie Sklodowska-Curie fellow by the European Commission Research Executive Agency in Brussels on 27 November 2017.
The models I have developed have had impact in many applications, including engineering (optimising sensor placement and reducing costs), geoscience (improving forecast accuracy), finance (estimating optimal parameters of economic models), social science (merging Twitter and polling data to better estimate public sentiment) and climate change, among others. I develop accurate and efficient models that combine data analysis, data fusion and data assimilation with machine learning and deep learning for incomplete, noisy or Big Data problems, always including uncertainty quantification and minimisation.
The Data Learning working group holds weekly meetings with invited speakers. Join us!
Data Learning Events
Organised Events:
Workshop on Machine Learning and Data Assimilation for Dynamical Systems, International Conference on Computational Science (ICCS), every year:
The second edition was a virtual event: https://www.youtube.com/watch?v=DZlNe9bfFK0&t=27s
The third edition was a virtual event:
https://www.youtube.com/watch?v=LQSoxz2txZA&list=PLBF13Iq67RMc4QmOSpD-QYrCFNzllyMis
Weekly invited speakers at Data Learning working group (every Tuesday at 16:00 UK time):
https://sites.google.com/view/rossella-arcucci/home/calendar-datalearning?authuser=0
If you are interested in attending our meetings, you are very welcome to join us!
Selected invited Talks
26 February 2021 - Invited Speaker - Talk Data Learning: Integrating Data Assimilation and Machine Learning at Euro-Mediterranean Centre on Climate Change (CMCC) - virtual event - https://www.youtube.com/watch?v=86eCVRJjMto&t=2885s
06 October 2020 - Invited Speaker - Talk Artificial Neural Network at the service of Data Assimilation (and vice versa) at the European Centre for Medium-Range Weather Forecasts (ECMWF) - ESA Workshop on Machine Learning for Earth System Observation and Prediction - virtual event - https://vimeo.com/465348878
20 November 2019 - Invited Speaker - Talk Machine learning and data assimilation for expert systems at Ocean of Knowledge 2019, Royal Society, London, UK - https://www.youtube.com/watch?v=l-603fZUlB8&t=33s
Some Applications of Data Learning
The first Data Learning paper
Caterina Buizza, César Quilodrán Casas, Philip Nadler, Julian Mack, Stefano Marrone, Zainab Titus, Clémence Le Cornec, Evelyn Heylen, Tolga Dur, Luis Baca Ruiz, Claire Heaney, Julio Amador Díaz Lopez, K.S. Sesh Kumar, Rossella Arcucci, Data Learning: Integrating Data Assimilation and Machine Learning, Journal of Computational Science, Volume 58, 2022, 101525, ISSN 1877-7503, https://doi.org/10.1016/j.jocs.2021.101525
R Arcucci, D Xiao, F Fang, IM Navon, P Wu, CC Pain, YK Guo, A reduced order with data assimilation model: Theory and practice, Computers & Fluids 257, 105862, https://doi.org/10.1016/j.compfluid.2023.105862
Applications of Data Learning - selected publications
Air Pollution, Air quality:
R. Arcucci, L. Mottet, C. Pain and Y. Guo - Optimal reduced space for Variational Data Assimilation - Journal of Computational Physics, Vol 379, pag: 51-69 https://doi.org/10.1016/j.jcp.2018.10.042
R. Arcucci, C. Pain, Y. Guo, Effective variational data assimilation in air-pollution prediction, Big Data Mining and Analytics, Vol 1, Issue 4 pag: 297 - 307, 2018 https://doi.org/10.26599/BDMA.2018.9020025
J. Mack, R. Arcucci, M. Molina, Y. Guo - Attention-based Convolutional Autoencoders for 3D-Variational Data Assimilation - Computer Methods in Applied Mechanics and Engineering Vol. 372 https://doi.org/10.1016/j.cma.2020.113291
C. Quilodran Casas, R. Arcucci, Y. Guo - Urban Air Pollution Forecasts Generated from Latent Space Representations - ICLR 2020 Workshop on Integration of Deep Neural Models and Differential Equations, 2020, https://openreview.net/forum?id=VY1hqB5Z7V
R. Arcucci, L. Mottet, C. A. Quilodran Casas, F. Guitton, C. Pain and Y. Guo - Adaptive Domain Decomposition for Effective Data Assimilation - EuroPar 2019, Lecture Notes in Computer Science, vol 11997 https://doi.org/10.1007/978-3-030-48340-1_45
C. Quilodran Casas, R. Arcucci, P. Wu, C. Pain, Y. Guo - A Reduced Order Deep Data Assimilation model - Physica D: Nonlinear Phenomena, Vol 412 https://doi.org/10.1016/j.physd.2020.132615
E. Lim, R. Arcucci, M. Molina Solana, C. Pain, Y. Guo - Hybrid Data Assimilation: an Ensemble-Variational Approach - 15th International Conference on Signal-Image Technology and Internet-Based Systems (SITIS), IEEE. https://doi.org/10.1109/SITIS.2019.00104
R. Arcucci, C. Quilodran Casas, D. Xiao, L. Mottet, F. Fang, P. Wu, C. Pain and Y. Guo - A Domain Decomposition Reduced Order Model with Data Assimilation (DD-RODA) - Advances in Parallel Computing Volume 36, pag. 189-198, https://doi.org/10.3233/APC200040
E. Aristodemou, R. Arcucci, L. Mottet, A. Robins, C. Pain, Y. Guo, Enhancing CFD-LES air pollution predictions using data assimilation - Building and Environment, Volume 165, https://doi.org/10.1016/j.buildenv.2019.106383
C. Quilodran, R. Arcucci, C. Pain and Y. Guo - Adversarially trained LSTMs on reduced order models of urban air pollution simulations - Machine Learning and the Physical Sciences at NeurIPS 2020 https://arxiv.org/abs/2101.01568
Fan, H., Cheng, S., de Nazelle, A. J., & Arcucci, R. (2023). An Efficient ViT-Based Spatial Interpolation Learner for Field Reconstruction. In International Conference on Computational Science (pp. 430-437). Cham: Springer Nature Switzerland. https://doi.org/10.1007/978-3-031-36027-5_34
Wildfires:
Cheng, S., Guo, Y., & Arcucci, R. (2023). A generative model for surrogates of spatial-temporal wildfire nowcasting. IEEE Transactions on Emerging Topics in Computational Intelligence. https://doi.org/10.1109/TETCI.2023.3298535
Lever, J., Cheng, S., & Arcucci, R. (2023, June). Human-Sensors & Physics Aware Machine Learning for Wildfire Detection and Nowcasting. In International Conference on Computational Science (pp. 422-429). Cham: Springer Nature Switzerland. https://doi.org/10.1007/978-3-031-36027-5_33
Zhong, C., Cheng, S., Kasoar, M., & Arcucci, R. (2023). Reduced-order digital twin and latent data assimilation for global wildfire prediction. Natural Hazards and Earth System Sciences, 23(5), 1755-1768. https://doi.org/10.5194/nhess-23-1755-2023
J Lever & R Arcucci, Sentimental wildfire: a social-physics machine learning model for wildfire nowcasting - Journal of Computational Social Science, 1-39 https://doi.org/10.1007/s42001-022-00174-8
J Lever & R Arcucci, Towards Social Machine Learning for Natural Disasters, International Conference on Computational Science, 756-769, 2022. Springer, Cham. https://doi.org/10.1007/978-3-031-08757-8_62
Cheng S., Jin Y., Harrison S.P., Quilodran-Casas C., Prentice I.C., Guo Y.-K., Arcucci R., Parameter Flexible Wildfire Prediction Using Machine Learning Techniques: Forward and Inverse Modelling. Remote Sens. 2022, 14, 3228. https://doi.org/10.3390/rs14133228
Lever J., Arcucci R., & Cai J. (2022, June). Social Data Assimilation of Human Sensor Networks for Wildfires. In Proceedings of the 15th International Conference on PErvasive Technologies Related to Assistive Environments (pp. 455-462), https://doi.org/10.1145/3529190.3534735
S Cheng & R Arcucci, Machine learning based surrogate modelling and parameter identification for wildfire forecasting, ICLR, AI for Earth and Space Science, https://hal.archives-ouvertes.fr/hal-03680833
Cheng, S., Prentice, I. C., Huang, Y., Jin, Y., Guo, Y. K., & Arcucci, R. (2022). Data-driven surrogate model with latent data assimilation: Application to wildfire forecasting. Journal of Computational Physics, 111302, https://doi.org/10.1016/j.jcp.2022.111302
Medicine:
R. Arcucci, L. Moutiq and Y. Guo - Neural Assimilation - Lecture Notes in Computer Science book series Vol 12142, pp 155-168, https://doi.org/10.1007/978-3-030-50433-5_13
R Arcucci, CQ Casas, A Joshi, A Obeysekara, L Mottet, YK Guo, C Pain - Merging Real Images with Physics Simulations via Data Assimilation - European Conference on Parallel Processing, 255-266, https://doi.org/10.1007/978-3-031-06156-1_21
Chen, Y., Liu, C., Huang, W., Cheng, S., Arcucci, R., & Xiong, Z. (2023). Generative Text-Guided 3D Vision-Language Pretraining for Unified Medical Image Segmentation. arXiv preprint arXiv:2306.04811. https://doi.org/10.48550/arXiv.2306.04811
Wan, Z., Liu, C., Zhang, M., Fu, J., Wang, B., Cheng, S., ... & Arcucci, R. (2023). Med-UniC: Unifying Cross-Lingual Medical Vision-Language Pre-Training by Diminishing Bias. arXiv preprint arXiv:2305.19894. https://doi.org/10.48550/arXiv.2305.19894
Li, J., Liu, C., Cheng, S., Arcucci, R., & Hong, S. (2023). Frozen Language Model Helps ECG Zero-Shot Learning. arXiv preprint arXiv:2303.12311. https://doi.org/10.48550/arXiv.2303.12311
Weather:
Bonavita, M., Schneider, R., Arcucci, R., Chantry, M., Chrust, M., Geer, A., ... & Vitolo, C. (2023). 2022 ECMWF-ESA workshop report: current status, progress and opportunities in machine learning for Earth System observation and prediction. npj Climate and Atmospheric Science, 6(1), 87. https://doi.org/10.1038/s41612-023-00387-2
Schneider, R., Bonavita, M., Geer, A., Arcucci, R., Dueben, P., Vitolo, C., ... & Mathieu, P. P. (2022). ESA-ECMWF Report on recent progress and research directions in machine learning for Earth System observation and prediction. npj Climate and Atmospheric Science, 5(1), 1-5, https://doi.org/10.1038/s41612-022-00269-z
M. Bonavita, R. Arcucci, A. Carrassi, P. Dueben, A. J. Geer, B. Le Saux, N. Longepe, P. Mathieu, L. Raynaud - Machine Learning for Earth System Observation and Prediction - Bulletin of the American Meteorological Society, BAMS-D-20-0307, https://doi.org/10.1175/BAMS-D-20-0307.1
Covid-19:
R Arcucci, CQ Casas, A Joshi, A Obeysekara, L Mottet, YK Guo, C Pain, Merging Real Images with Physics Simulations via Data Assimilation, European Conference on Parallel Processing, 255-266, https://doi.org/10.1007/978-3-031-06156-1_21
S. Cheng, R. Arcucci, C. Pain, Y. Guo, Optimal vaccination strategies for COVID-19 based on dynamical social networks with real-time updating, https://arxiv.org/abs/2103.00485
C. Quilodran Casas, V. Santos Silva, R. Arcucci, C. E. Heaney, Y-K. Guo, C. C. Pain - Digital twins based on bidirectional LSTM and GAN for modelling COVID-19 - Neurocomputing, 470, 11-28, https://doi.org/10.1016/j.neucom.2021.10.043
P. Nadler, R. Arcucci and Y. Guo - A Neural SIR Model for Global Forecasting - ML4H: Machine Learning for Health Workshop at NeurIPS 2020, PMLR 136:254-266, 2020, http://proceedings.mlr.press/v136/nadler20a/nadler20a.pdf
S. Wang, X. Yang, L. Li, P. Nadler, R. Arcucci, Y. Huang, Z. Teng, Y. Guo - A Bayesian Updating Scheme for Pandemics: Estimating the Infection Dynamics of COVID-19 - IEEE Computational Intelligence Magazine vol. 15, no. 4, pp. 23-33, Nov. 2020, https://doi.org/10.1109/MCI.2020.3019874
P. Nadler, S. Wang, R. Arcucci, X. Yang, Y. Guo - An epidemiological modelling approach for COVID-19 via data assimilation - European Journal of Epidemiology, Vol. 35, 2020 https://link.springer.com/article/10.1007/s10654-020-00676-7
Economic Models:
P. Nadler, R. Arcucci and Y. Guo - An Econophysical Analysis of the Blockchain Ecosystem - International Conference on Mathematical Research for Blockchain Economy 2020, https://doi.org/10.1007/978-3-030-53356-4_3
P. Nadler, R. Arcucci and Y. Guo - A Scalable Approach to Econometric Inference - Advances in Parallel Computing Volume 36, pag.59-68 https://doi.org/10.3233/APC200025
P. Nadler, R. Arcucci and Y. Guo - Data Assimilation for Parameter Estimation in Economic Modelling - 15th International Conference on Signal-Image Technology and Internet-Based Systems (SITIS), IEEE https://doi.org/10.1109/SITIS.2019.00106
P. Khandelwal, P. Nadler, R. Arcucci, W. Knottenbelt, Y. Guo - A Scalable Inference Method For Large Dynamic Economic Systems - ML for Economic Policy 2020 at NeurIPS 2020 http://www.mlforeconomicpolicy.com/papers/MLEconPolicy20_paper_16.pdf
Energy or control systems:
A. Dmitrewski, M. Molina-Solana, and R. Arcucci. - CntrlDA: A building energy management control system with real-time adjustments. Application to indoor temperature. - Building and Environment (2022): 108938. https://doi.org/10.1016/j.buildenv.2022.108938
L.G.B. Ruiz, M.C. Pegalajar, R. Arcucci, M. Molina-Solana - A time-series clustering methodology for knowledge extraction in energy consumption data - Expert Systems with Applications Vol. 160 https://doi.org/10.1016/j.eswa.2020.113731
Engineering:
T. Dur, R. Arcucci, L. Mottet, M. Molina Solana, C. Pain, Y. Guo - Weak Constraint Gaussian Process for optimal sensor placement - Journal of Computational Science vol 42, pag.101-110 https://doi.org/10.1016/j.jocs.2020.101110
G. Tajnafoi, R. Arcucci, L. Mottet, C. Vouriot, M. Molina Solana, C. Pain, Y. Guo - Variational Gaussian Processes for optimal sensor placement - Applications of Mathematics, pp. 1-31, https://doi.org/10.21136/AM.2021.0307-19
Chen, J., Anastasiou, C., Cheng, S., Basha, N. M., Kahouadji, L., Arcucci, R., ... & Matar, O. K. (2023). Computational fluid dynamics simulations of phase separation in dispersed oil-water pipe flows. Chemical Engineering Science, 267, 118310. https://doi.org/10.1016/j.ces.2022.118310
Social Science:
J Lever & R Arcucci, Sentimental wildfire: a social-physics machine learning model for wildfire nowcasting - Journal of Computational Social Science, 1-39 https://doi.org/10.1007/s42001-022-00174-8
Lever, J., & Arcucci, R. (2022). Towards Social Machine Learning for Natural Disasters. In International Conference on Computational Science (pp. 756-769). Springer, Cham. https://doi.org/10.1007/978-3-031-08757-8_62
R. Hendrickx, R. Arcucci, J. Amador Díaz Lopez, Y. Guo, and M. Kennedy - Correcting public opinion trends through Bayesian data assimilation, https://arxiv.org/abs/2105.14276
Ocean:
R. Arcucci, L. Carracciuolo, R. Toumi, Toward a preconditioned scalable 3DVAR for assimilating Sea Surface Temperature collected into the Caspian Sea, Journal of Numerical Analysis, Industrial and Applied Mathematics 12(1-2), pag: 9-28, 2018 http://jnaiam.org/exit.php?url_id=215&entry_id=128
R. Arcucci, D. Basciano, A. Cilardo, L. D'Amore, F. Mantovani - Energy Analysis of a 4D Variational Data Assimilation Algorithm and Evaluation on ARM-Based HPC Systems - Springer Nature 2018, LNCS 10778, pp. 37-47, 2018 https://doi.org/10.1007/978-3-319-78054-2_4
R. Arcucci, C. Carracciuolo, G. Scotti, G. Laccetti - A Decomposition of the Tikhonov Regularization Functional oriented to exploit hybrid multilevel parallelism, Journal of Parallel Programming, pp 1-22, 2017 https://doi.org/10.1007/s10766-016-0460-3
R. Arcucci, L. D'Amore, J. Pistoia, R. Toumi, A. Murli - On the Variational Data Assimilation Problem Solving and Sensitivity Analysis, Journal of Computational Physics Volume 335, pp. 311-326, 2017 https://doi.org/10.1016/j.jcp.2017.01.034
Microfluidics:
Zhuang Y., Cheng S., Kovalchuk N., Simmons M., Matar O. K., Guo Y. K., & Arcucci R. (2022). Ensemble latent assimilation with deep learning surrogate model: application to drop interaction in a microfluidics device. Lab on a Chip. https://doi.org/10.1039/D2LC00303A
Xia, Z., Ma, K., Cheng, S., Blackburn, T., Peng, Z., Zhu, K., ... & Arcucci, R. (2023). Accurate identification and measurement of the precipitate area by two-stage deep neural networks in novel chromium-based alloys. Physical Chemistry Chemical Physics, 25(23), 15970-15987. https://doi.org/10.1039/D3CP00402C
Nathanael, K., Cheng, S., Kovalchuk, N. M., Arcucci, R., & Simmons, M. J. (2023). Optimization of microfluidic synthesis of silver nanoparticles: a generic approach using machine learning. Chemical Engineering Research and Design, 193, 65-74. https://doi.org/10.1016/j.cherd.2023.03.007
Zhu, K., Cheng, S., Kovalchuk, N., Simmons, M., Guo, Y. K., Matar, O. K., & Arcucci, R. (2023). Analyzing drop coalescence in microfluidic devices with a deep learning generative model. Physical Chemistry Chemical Physics. https://doi.org/10.1039/D2CP05975D
Nuclear:
Gong, H., Cheng, S., Chen, Z., Li, Q., Quilodrán-Casas, C., Xiao, D., & Arcucci, R. (2022). An efficient digital twin based on machine learning SVD autoencoder and generalised latent assimilation for nuclear reactor physics. Annals of Nuclear Energy, 179, 109431. https://doi.org/10.1016/j.anucene.2022.109431