Location: CRAN Polytech, project DATA
Funding: Andra (French National Agency for Radioactive Waste Management)
Research theme: physics-informed statistical learning of reactive transport phenomena
Motivation/Subject: Deep geological disposal is one of the main solutions considered for the long-term management of high-level radioactive waste, which contains long-lived radioactive isotopes. In France, the Cigéo project aims to construct an underground repository 500 meters deep and covering an area of 250 km², located between the Meuse and Haute-Marne departments. This site was chosen for its favorable geological properties (Callovo-Oxfordian clay formation), which are expected to confine radionuclides and prevent their migration. However, it is crucial to ensure that long-term radioactivity levels do not exceed those naturally present in the environment. Demonstrating this requires accurate predictions of reactive transport phenomena.
This thesis focuses on the accurate modeling of reactive transport in porous media, accounting for diffusion, advection, and chemical reactions in the geological medium. The French National Agency for Radioactive Waste Management (Andra) uses a numerical simulator to solve the governing partial differential equations (PDEs). However, these equations are computationally expensive to solve, especially for large-scale, long-term simulations. This cost has motivated the exploration of more efficient alternatives, in particular machine learning methods guided by physical laws. Such approaches build the physics directly into the learning process, which reduces both data requirements and computational cost.
One promising method is the physics-informed neural network (PINN), which adds the residual of the governing PDEs to the training loss of a neural network. PINNs have recently been applied successfully to simple reactive transport cases, modeling the system dynamics without an explicit mesh. However, they still face challenges, including limited generalization to complex systems, sensitivity to hyperparameter tuning, difficulty in handling heterogeneous media, and high computational demands. Furthermore, these methods often lack robust uncertainty quantification, which is critical for assessing the reliability of predictions.
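The physics-informed idea described above, penalizing the PDE residual at collocation points instead of fitting mesh-based solver output, can be illustrated on a toy problem. The sketch below is not a full PINN and not Andra's simulator: it assumes a steady 1D advection-diffusion equation with first-order decay, D u'' - v u' - k u = 0 on [0, 1] with u(0) = 1 and u(1) = 0, and it freezes the hidden layer of a one-layer tanh network at random values so that only the output weights are fitted by linear least squares. The coefficients, feature count, and boundary weight are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative transport coefficients (assumed values, not Andra data).
D, v, k = 1.0, 1.0, 1.0

# Frozen random hidden layer: u(x) = sum_j w_j * tanh(a_j * x + b_j).
n_features = 60
a = rng.uniform(-4.0, 4.0, n_features)
b = rng.uniform(-4.0, 4.0, n_features)

def features(x):
    """Return tanh features and their first and second x-derivatives."""
    t = np.tanh(np.outer(x, a) + b)
    dphi = a * (1.0 - t**2)
    d2phi = a**2 * (-2.0 * t * (1.0 - t**2))
    return t, dphi, d2phi

# PDE residual rows at collocation points (no mesh, no solver data).
x_col = np.linspace(0.0, 1.0, 100)
phi, dphi, d2phi = features(x_col)
A_pde = D * d2phi - v * dphi - k * phi

# Boundary-condition rows, weighted relative to the PDE rows.
bc_weight = 10.0
phi_bc = features(np.array([0.0, 1.0]))[0]
A = np.vstack([A_pde, bc_weight * phi_bc])
rhs = np.concatenate([np.zeros(len(x_col)), bc_weight * np.array([1.0, 0.0])])

# "Training": minimize the squared PDE residual plus boundary misfit.
w = np.linalg.lstsq(A, rhs, rcond=None)[0]

# Closed-form solution u = c1*exp(r1*x) + c2*exp(r2*x) for comparison.
s = np.sqrt(v**2 + 4.0 * D * k)
r1, r2 = (v + s) / (2.0 * D), (v - s) / (2.0 * D)
c1, c2 = np.linalg.solve([[1.0, 1.0], [np.exp(r1), np.exp(r2)]], [1.0, 0.0])

x_test = np.linspace(0.0, 1.0, 50)
u_pred = features(x_test)[0] @ w
u_exact = c1 * np.exp(r1 * x_test) + c2 * np.exp(r2 * x_test)
max_err = np.max(np.abs(u_pred - u_exact))
print(f"max abs error vs. closed-form solution: {max_err:.2e}")
```

A genuine PINN would also train the hidden-layer parameters by gradient descent with automatic differentiation, which is where the hyperparameter sensitivity and computational cost mentioned above arise.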
Another approach, explored in a recent Andra-funded thesis, uses Gaussian processes to model nonlinear geochemical systems. Gaussian processes are flexible and provide probabilistic predictions with uncertainty estimates. Active learning algorithms have been developed to identify the most informative data subsets, thereby reducing data needs while maintaining accuracy. Despite these advantages, Gaussian processes scale poorly: exact inference costs O(n³) time and O(n²) memory in the number n of training points, which makes them impractical for large datasets.
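The two properties discussed above, predictive uncertainty and cubic cost, both show up in a few lines of exact Gaussian process regression. The following is a minimal sketch, assuming a squared-exponential kernel with hand-picked hyperparameters and a toy 1D target; the kernel, the data, and the "query where the variance is largest" rule are illustrative assumptions and not the algorithms developed in the cited thesis.

```python
import numpy as np

def rbf_kernel(x1, x2, length_scale=0.3, variance=1.0):
    """Squared-exponential covariance between two sets of 1D inputs."""
    d2 = (x1[:, None] - x2[None, :]) ** 2
    return variance * np.exp(-0.5 * d2 / length_scale**2)

def gp_posterior(x_train, y_train, x_query, noise=1e-6):
    """Exact GP posterior mean and variance at x_query (prior variance 1)."""
    K = rbf_kernel(x_train, x_train) + noise * np.eye(len(x_train))
    L = np.linalg.cholesky(K)  # the O(n^3) step that limits scalability
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_train))
    K_s = rbf_kernel(x_train, x_query)
    mean = K_s.T @ alpha
    V = np.linalg.solve(L, K_s)
    var = 1.0 - np.sum(V**2, axis=0)
    return mean, np.maximum(var, 0.0)

# Toy data standing in for a few expensive simulator runs (hypothetical).
x_train = np.array([0.0, 0.15, 0.3])
y_train = np.sin(2.0 * np.pi * x_train)

# Uncertainty collapses at observed inputs...
mean_train, var_train = gp_posterior(x_train, y_train, x_train)

# ...and grows away from them, which a simple active-learning rule can
# exploit by requesting the candidate with the largest predictive variance.
candidates = np.linspace(0.0, 1.0, 101)
mean, var = gp_posterior(x_train, y_train, candidates)
x_next = candidates[np.argmax(var)]
print(f"next input to evaluate: {x_next:.2f}")
```

Maximum-variance selection is only one possible acquisition rule; it is used here because it makes the link between uncertainty estimates and data selection easy to see.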
To address this, Marie-Edith Savino proposed the use of B-splines, piecewise polynomial functions that enable local function approximation. B-splines offer reduced computational complexity and improved numerical stability, making them well-suited for modeling complex geochemical systems with satisfactory accuracy. However, unlike Gaussian processes, B-splines do not provide predictive uncertainty, which limits the assessment of their performance relative to the full Andra simulator. Moreover, both statistical methods (Gaussian processes and B-splines) only partially incorporate the physical equations. Integrating this physical knowledge more fully is essential to reduce data requirements and lower computational costs.
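The locality that makes B-splines cheap can be seen directly in the basis matrix: at any input, only degree + 1 basis functions are nonzero. The sketch below is an illustration under stated assumptions, not the method of the cited work: it evaluates a clamped cubic basis with the Cox-de Boor recursion and fits a toy saturation curve x / (1 + x) by ordinary least squares. The knot layout and the target function are hypothetical choices.

```python
import numpy as np

def bspline_basis(x, knots, degree):
    """Evaluate all B-spline basis functions at x (Cox-de Boor recursion)."""
    x = np.asarray(x, dtype=float)
    n_spans = len(knots) - 1
    # Degree 0: indicator function of each half-open knot span.
    B = np.zeros((len(x), n_spans))
    for i in range(n_spans):
        B[:, i] = (knots[i] <= x) & (x < knots[i + 1])
    # Assign the right endpoint to the last non-empty span.
    last = max(i for i in range(n_spans) if knots[i] < knots[i + 1])
    B[x == knots[-1], last] = 1.0
    # Raise the degree one step at a time, skipping zero-width spans.
    for d in range(1, degree + 1):
        B_new = np.zeros((len(x), n_spans - d))
        for i in range(n_spans - d):
            left = knots[i + d] - knots[i]
            right = knots[i + d + 1] - knots[i + 1]
            if left > 0:
                B_new[:, i] += (x - knots[i]) / left * B[:, i]
            if right > 0:
                B_new[:, i] += (knots[i + d + 1] - x) / right * B[:, i + 1]
        B = B_new
    return B

degree = 3
breakpoints = np.linspace(0.0, 1.0, 9)  # 8 spans on [0, 1]
knots = np.concatenate([[0.0] * degree, breakpoints, [1.0] * degree])

x = np.linspace(0.0, 1.0, 200)
y = x / (1.0 + x)  # toy saturation curve, chosen only for illustration

B = bspline_basis(x, knots, degree)
coef = np.linalg.lstsq(B, y, rcond=None)[0]
max_fit_err = np.max(np.abs(B @ coef - y))

# Each row has at most degree + 1 nonzero entries (local support).
max_active = int((B != 0).sum(axis=1).max())
print(f"basis functions: {B.shape[1]}, active per point: {max_active}")
print(f"max abs fit error: {max_fit_err:.2e}")
```

Because the design matrix is banded, the least-squares problem stays cheap as the number of data points grows. The fit returns point predictions only, which is the missing-uncertainty limitation noted above.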
Approach: This applied thesis therefore pursues three main objectives:
— the development of several physics-informed learning techniques for simplified reactive transport simulations in porous media, with uncertainty quantification;
— the design of active learning algorithms tailored to these methods;
— a comparative evaluation of these approaches on datasets produced by Andra.
Summary of the results: