AI4Science is an emerging field that aims to leverage and develop the interaction between data-driven AI approaches and scientific domains such as biology, climate science, medicine, and robotics. Several large and ambitious programs are currently being launched in different countries; see e.g. the recent reports by the US DOE [DOE 2023], the Australian national science agency (CSIRO) [Aust-NSA 2022-2024], and the EU [EU 2024]. AI4Science entails a major change in the way we address scientific problems.
In this context, Physics-Aware Machine Learning aims to leverage the potential of machine learning and AI methods for modeling complex physical phenomena while integrating physical principles as prior knowledge. This fast-growing topic holds high potential in many application domains, such as climate science, earth science, fluid dynamics, and surface engineering. The spectrum of open problems and possible developments remains wide, and in this proposal we suggest constituting a working group with a data-driven perspective, organized around three research axes that could pave the way for new paradigms in physics-aware research: the development of foundation models, the design of few-shot learning approaches, and the quantification of uncertainties. Our ambitious objective is to provide a framework for scientific monitoring and forecasting of recent developments in this area and, possibly, to help structure the community.
Foundation models refer to a class of ML models that are pre-trained on very large and diverse data and then applied to a wide range of downstream tasks. Up to now, their main successes pertain to semantic data (text, vision). The recent availability of large datasets in a wide range of scientific applications (from climate to medicine and engineering) makes it possible to adapt the principle of these models to general physics modeling. Initial developments in fields like weather forecasting [Lam 2023] or materials science [Batatia 2024] already show impressive results. In physics, this direction is still at an early stage, but recent developments [Subramanian 2023, McCabe 2023, Herde 2024] already suggest that learning jointly from multiple steady-state or time-dependent PDEs could enhance prediction performance on each individual PDE, which could represent the next paradigm for data-driven PDE modeling. After a few years of exploration, we believe that developing foundation models for physics will be the next big move in the field of physics-aware deep learning. These models could incorporate prior physics, for example in the form of general conservation laws, as sketched below. Given the growing importance of the topic in diverse scientific domains, we believe it is essential to mobilize the French community in order to develop collective efforts towards the development of foundation models.
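To make the pretraining idea concrete, here is a minimal sketch in PyTorch in which a single surrogate is trained jointly on several PDE families, with a soft conservation-law penalty as a physics prior. Everything in it is an illustrative assumption of ours: the `Surrogate` architecture, the random placeholder datasets, and the mass-conservation term are not taken from the cited works, which rely on much richer operator-learning architectures.

```python
import torch
import torch.nn as nn

# Hypothetical surrogate: maps a discretized state u_t on a 1D grid
# to the next state u_{t+1}.
class Surrogate(nn.Module):
    def __init__(self, grid_size=64, hidden=256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(grid_size, hidden), nn.GELU(),
            nn.Linear(hidden, hidden), nn.GELU(),
            nn.Linear(hidden, grid_size),
        )

    def forward(self, u):
        return self.net(u)

def conservation_penalty(pred, u):
    # Soft physics prior: the spatial integral ("mass") of the state
    # should be preserved, as it is e.g. for advection with periodic
    # boundary conditions.
    return ((pred.sum(-1) - u.sum(-1)) ** 2).mean()

# Placeholder data: random (u_t, u_{t+1}) batches standing in for
# trajectories of several PDE families.
datasets = {
    name: [(torch.randn(16, 64), torch.randn(16, 64)) for _ in range(10)]
    for name in ("advection", "heat", "burgers")
}

model = Surrogate()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)

# Pretraining: one set of weights is updated on batches drawn from
# all PDE families, with the conservation prior as an extra loss term.
for epoch in range(5):
    for name, batches in datasets.items():
        for u, u_next in batches:
            pred = model(u)
            loss = ((pred - u_next) ** 2).mean() + 0.1 * conservation_penalty(pred, u)
            opt.zero_grad()
            loss.backward()
            opt.step()
```

The key point is that a single set of weights sees batches from all PDE families, and the physics prior enters as a soft loss term rather than a hard architectural constraint.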
Adaptation to new domains and few-shot learning. Generalization to new domains and new dynamics is an important bottleneck for the wide deployment of data-driven models in physics, notably when the amount of data is limited, in particular because of costly data acquisition. In this context, few-shot learning offers interesting perspectives to address this issue [Chen 2024, Penwarden 2023]. One appealing direction is the fine-tuning of pre-trained models on new tasks (see the sketch below), which directly connects to the development of foundation models and enables fast adaptation to new tasks. More generally, we will consider the construction of hybrid models that integrate physical knowledge, as a context for developing few-shot learning approaches.
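As an illustration of this fine-tuning direction, the hypothetical sketch below freezes a pretrained backbone and adapts only a small output head on a handful of examples from a new dynamics. The architecture, shapes, and the eight-example budget are assumptions chosen for illustration; parameter-efficient variants (e.g. low-rank adapters) would follow the same pattern.

```python
import torch
import torch.nn as nn

# Few-shot adaptation: reuse a pretrained backbone, freeze it, and
# fine-tune only a small output head on a few examples from a new,
# unseen dynamics.
backbone = nn.Sequential(
    nn.Linear(64, 256), nn.GELU(),
    nn.Linear(256, 256), nn.GELU(),
)  # in practice, loaded from a pretrained checkpoint
head = nn.Linear(256, 64)  # the only trainable part

for p in backbone.parameters():
    p.requires_grad_(False)  # keep pretrained weights fixed

# The "few shots": 8 placeholder (state, next-state) pairs from the new PDE.
u = torch.randn(8, 64)
u_next = torch.randn(8, 64)

opt = torch.optim.Adam(head.parameters(), lr=1e-3)
for step in range(100):
    loss = ((head(backbone(u)) - u_next) ** 2).mean()
    opt.zero_grad()
    loss.backward()
    opt.step()
```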
Uncertainty quantification. An area that has received comparatively little attention in deep learning as a whole is the assessment of uncertainties in the predictions of data-driven models [Abdar 2021, Mouli 2024]. These uncertainties can stem from noise in the data (aleatoric or data uncertainty) or from a lack of physical knowledge (epistemic or model uncertainty), the latter being much more difficult to estimate (see the sketch below). Identifying both sources of uncertainty is crucial for the applicability of such models in scientific ML, as it provides the necessary confidence level in their predictions; it also has important implications for practical learning settings such as active learning. The quantification of uncertainties in data-driven models therefore represents the last objective of our working group.
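As a minimal illustration of how the two sources can be separated in practice, the sketch below uses one common recipe, a deep ensemble with heteroscedastic outputs; the toy data, network size, and training budget are our own assumptions, not a method from the cited works. The aleatoric part is read off as the average predicted variance, the epistemic part as the disagreement between ensemble members.

```python
import torch
import torch.nn as nn

# Toy deep ensemble separating the two sources of uncertainty.
# Each member outputs a mean and a log-variance and is trained with
# the Gaussian negative log-likelihood.
def make_member():
    return nn.Sequential(nn.Linear(1, 64), nn.Tanh(), nn.Linear(64, 2))

# Invented 1D regression data with known observation noise.
x = torch.linspace(-1, 1, 128).unsqueeze(-1)
y = torch.sin(3 * x) + 0.1 * torch.randn_like(x)

ensemble = [make_member() for _ in range(5)]
for net in ensemble:
    opt = torch.optim.Adam(net.parameters(), lr=1e-2)
    for step in range(200):
        mean, log_var = net(x).chunk(2, dim=-1)
        nll = 0.5 * (log_var + (y - mean) ** 2 / log_var.exp()).mean()
        opt.zero_grad()
        nll.backward()
        opt.step()

with torch.no_grad():
    out = torch.stack([net(x) for net in ensemble])  # (members, points, 2)
    means, log_vars = out[..., :1], out[..., 1:]
    aleatoric = log_vars.exp().mean(0)  # average predicted data noise
    epistemic = means.var(0)            # disagreement between members
```

Here `aleatoric` captures the irreducible data noise, while `epistemic` is expected to shrink as more data or physical knowledge is injected into the model.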
In summary, the following three axes will be explored within this GT:
Foundation models,
Few-shot learning,
Uncertainty quantification.
General references
[Aust-NSA 2022-2024] Artificial Intelligence & foundation models report, CSIRO, Australia, https://www.csiro.au/en/research/technology-space/ai/AI-foundation-models-report
[DOE 2023] AI for Science, Energy, and Security report, US DOE, 2023, https://www.anl.gov/ai/reference/AI-for-Science-Energy-and-Security-Report-2023
[EU 2024] EU reports on AI4Science 2024, https://scientificadvice.eu/advice/artificial-intelligence-in-science/
Scientific references
[Abdar 2021] Abdar M. et al. (2021) A review of uncertainty quantification in deep learning: Techniques, applications and challenges. Information Fusion 76:243-297.
[Batatia 2024] Batatia I., Benner P., Chiang Y., et al. (2024) A foundation model for atomistic materials chemistry. arXiv:2401.00096.
[Chen 2024] Chen W. et al. (2024) Data-Efficient Operator Learning via Unsupervised Pretraining and In-Context Learning. arXiv:2402.15734.
[Herde 2024] Herde M., Raonić B., Rohner T., et al. (2024) Poseidon: Efficient Foundation Models for PDEs. arXiv:2405.19101.
[Lam 2023] Lam R., Sanchez-Gonzalez A., Willson M., et al. (2023) Learning skillful medium-range global weather forecasting. Science 382(6677):1416-1422. doi:10.1126/science.adi2336.
[McCabe 2023] McCabe M. et al. (2023) Multiple Physics Pretraining for Physical Surrogate Models. arXiv:2310.02994.
[Mouli 2024] Mouli S.C., Maddix D.C., Alizadeh S., et al. (2024) Using Uncertainty Quantification to Characterize and Improve Out-of-Domain Learning for PDEs. In: ICML 2024. arXiv:2403.10642.
[Penwarden 2023] Penwarden M. et al. (2023) A metalearning approach for Physics-Informed Neural Networks (PINNs): Application to parameterized PDEs. Journal of Computational Physics.
[Subramanian 2023] Subramanian S. et al. (2023) Towards Foundation Models for Scientific Machine Learning: Characterizing Scaling and Transfer Behavior. arXiv:2306.00258.