Trainee Resource
Core Courses
Introduces students to the fundamental mathematical principles of data science that underlie its algorithms, processes, methods, and data-centric thinking, and to algorithms and tools based on these principles.
Recommended background: CMSE 802 or equivalent experience. Differential equations at the level of MTH 235/255H/340+442/347H+442. Linear algebra at the level of MTH 390/317H. Probability and statistics at the level of STT 231.
Offered every spring semester.
Model reduction and scientific machine-learning (CMSE 890-002, Fall 2024)
The course focuses on scientific ML methods designed to construct reduced models of multiscale systems, with an emphasis on direct connections to computational mathematics and natural science applications. Potential topics include:
Model reduction theories such as the bottom-up Mori-Zwanzig and the top-down GENERIC formalism, and the design of proper physics-compatible parametric forms preserving physics constraints. Hands-on materials include continuum laws of balance equations with a connection to physics-informed learning, sparse identification of nonlinear bases, and symmetry-preserving neural networks (connected with mechanical engineering).
Variational inference approaches with applications to probabilistic modeling and uncertainty quantification in the presence of high-dimensional randomness. Hands-on materials include Hamiltonian dynamics and the Liouville equation, with a connection to popular Bayesian posterior sampling methods such as MCMC and Stein variational gradient dynamics (connected with math and physics).
Generative models such as energy-based models (EBMs), flow-based models, and graph-embedding NNs, with applications to the reduced modeling of quasi-equilibrium multiphysics systems. Hands-on materials include their connection to Langevin dynamics (LD) and examples of training an EBM using LD for a 2D lattice with complex interactions (connected with chemistry and biology).
Dynamic models such as autoregressive models, non-parametric kernel embeddings, and recurrent NNs, with applications to learning the reduced dynamics of multiscale systems. Hands-on materials include examples of complex systems such as Lorenz models, the Kuramoto–Sivashinsky equation, and molecular dynamics (connected with chemical engineering).
(3 credits) Lead Instructor: Prof. Huan Lei; Offered from Fall 2024
Computational Inverse Problems: From Regularization to Machine Learning (Spring 2025)
In this course, we will discuss the fundamentals of inverse problems encountered in science and engineering. We will explore traditional approaches for solving these problems, including linear regression, Tikhonov regularization, Lasso, iterative methods, Fourier techniques, and Bayesian methods. We will also study contemporary machine learning (ML) techniques, such as neural networks and generative priors, used in various reconstruction algorithms. Emphasis will be placed on understanding the theory and mathematics behind standard and ML methods for inverse problems, as well as on practical implementation details. Our primary focus will be on imaging applications, specifically natural image processing and medical imaging.
(3 credits) Offered from Spring 2025
Generating and using high fidelity data for ML/AI: practical and theoretical perspectives (Spring 2026)
This course covers the practical and theoretical aspects of generating high-fidelity data for machine learning and artificial intelligence (e.g., for multiscale and multilevel models of physical systems) through in silico high-fidelity computational modeling. Students will learn how to: efficiently sample parameter spaces; create workflows to instantiate, run, and analyze large numbers of simulations and large volumes of data; reduce that data to manageable outputs using a variety of proven techniques; train ML/AI models with these reduced data outputs; and verify and validate the results using established techniques such as the method of manufactured solutions. This will all be done within the context of workflows that promote reproducible research in scientific computing (i.e., FAIR research principles). In addition, students will read and discuss papers from multiple application fields as case studies that demonstrate these principles in use. All of the computation will be done using MSU’s Institute for Cyber-Enabled Research and the MSU Data Machine, an NSF-funded, data science-oriented supercomputer.
List of major topics:
Efficient sampling of high-dimensional model parameter spaces
Creating modern research computing workflows for simulation ensembles (e.g., tools for workflow automation such as Snakemake)
Dimensionality reduction and data reduction techniques as applied to multiphysics simulation data
Training of ML/AI models using high-performance computing tools (e.g., PyTorch, TensorFlow)
Verification and validation of ML/AI models (robustness, boundedness, out-of-distribution data detection, method of manufactured solutions)
Computation on modern HPC platforms
(3 credits) Lead Instructor: Prof. Brian O'Shea; Offered from Spring 2026
Effective Communication
Internships
DOE National Laboratory/Facility Office of Science Graduate Student Research (SCGSR) program POCs
The Office of Science Graduate Student Research (SCGSR) Program Application Page (Due date Nov 8th, 5pm EST)
Los Alamos National Laboratory (LANL) Graduate Research Assistant Program
LANL Information Science and Technology Institute (ISTI) Summer Schools
Acknowledgement Statement
Trainees with fellowship support must acknowledge the AIDMM-NRT program in their posters, presentations, and publications, and must include the NSF disclaimer. Please use the sample text below:
For MSU-funded trainees (international students):
This work is supported in part by Michigan State University and the National Science Foundation Research Traineeship Program (DGE-2152014) to (your name).
For NRT-funded trainees (domestic students):
This work is supported in part by the National Science Foundation Research Traineeship Program (DGE-2152014) to (your name).
Disclaimer:
Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the funding organizations.
Useful forms
W-9 Form (US citizen) Request for Taxpayer Identification Number and Certification
W-8BEN Form (International student) Certificate of Foreign Status of Beneficial Owner for United States Tax Withholding and Reporting (Individuals)