Modern scientific discovery and engineering innovation are bottlenecked by the complexity of numerical modeling. Simulating extreme physical phenomena—such as fusion energy, aerodynamic fluid flows, or climate patterns—requires modeling and solving sophisticated partial differential equations (PDEs) and massive mathematical optimization problems.
Historically, this has created a severe dual dependency:
The Human Bottleneck: Relying on specialized human experts to spend months manually tuning grids, designing discretization schemes, and debugging fragile simulation code.
The Black-Box Failure: Turning to modern neural network surrogates which, while flexible, operate as uninterpretable black boxes that offer no mathematical guarantees, can fail in out-of-distribution scenarios, and may violate fundamental laws of physics.
AutoNumerics changes this paradigm by introducing Agentic AI for Computational Science and Engineering (CSE). Operating in the "Space of Language," AutoNumerics bridges high-level human intent, abstract mathematical reasoning, and classical numerical analysis. Its core significance lies in automating the full lifecycle of scientific computing while keeping every step transparent and mathematically grounded. It does not replace classical physics or mathematics; it constructs classical numerical solvers autonomously, making computational engineering accessible, trustworthy, and scalable.
The AutoNumerics pipeline is an automated, PDE-agnostic system engineered to transform high-level natural language problem descriptions into high-performance, transparent numerical solvers [pdf]. Rather than treating physics as a statistical pattern to be learned by network weights, AutoNumerics mimics an expert team of computational mathematicians to construct explicit, verifiable code.
Structural Property Extraction: The system autonomously extracts the intrinsic mathematical properties of a given PDE.
Classical Numerical Grounding: The framework selects the most rigorous classical numerical scheme suited to the exact physics.
Closed-Loop Self-Verification: The pipeline runs as a closed loop of formulation, implementation, verification, and explanation.
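One common way to make the verification stage concrete is a convergence test against a manufactured solution. The sketch below is illustrative only and is not taken from the AutoNumerics code: the model problem (a 1D Poisson equation), the function names, and the tolerance are all assumptions. It solves the problem with a second-order finite-difference scheme on two meshes and accepts the solver only if the measured convergence order matches the theoretical value of 2.

```python
import math

def solve_poisson_1d(n):
    """Solve -u'' = f on (0, 1) with u(0) = u(1) = 0 using second-order
    central differences on n interior points (Thomas algorithm)."""
    h = 1.0 / (n + 1)
    x = [(i + 1) * h for i in range(n)]
    # Manufactured solution u(x) = sin(pi x) gives f(x) = pi^2 sin(pi x).
    rhs = [h * h * math.pi ** 2 * math.sin(math.pi * xi) for xi in x]
    # Tridiagonal system with diagonal 2 and off-diagonals -1.
    c_prime = [0.0] * n
    d_prime = [0.0] * n
    c_prime[0] = -1.0 / 2.0
    d_prime[0] = rhs[0] / 2.0
    for i in range(1, n):
        denom = 2.0 + c_prime[i - 1]
        c_prime[i] = -1.0 / denom
        d_prime[i] = (rhs[i] + d_prime[i - 1]) / denom
    # Back substitution.
    u = [0.0] * n
    u[-1] = d_prime[-1]
    for i in range(n - 2, -1, -1):
        u[i] = d_prime[i] - c_prime[i] * u[i + 1]
    return x, u

def max_error(n):
    """Maximum nodal error against the manufactured solution."""
    x, u = solve_poisson_1d(n)
    return max(abs(ui - math.sin(math.pi * xi)) for xi, ui in zip(x, u))

def observed_order(n):
    """Estimate the convergence order by halving the mesh width h = 1/(n+1)."""
    coarse = max_error(n)
    fine = max_error(2 * n + 1)
    return math.log2(coarse / fine)

def verify(n=15, expected_order=2.0, tol=0.1):
    """Accept the solver only if the measured order matches the theory."""
    return abs(observed_order(n) - expected_order) < tol

if __name__ == "__main__":
    print(f"observed order: {observed_order(15):.3f}")
    print("verification passed:", verify())
```

In a closed loop, a failed check of this kind would send the solver back to the implementation or formulation stage rather than returning it to the user.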
AutoNumerics removes the need for manual, trial-and-error code development. By generating transparent, state-of-the-art numerical solvers rather than black-box models, it provides engineers with human-readable, fully auditable code. This makes physical consistency, such as the conservation of mass, momentum, and energy, something that can be built into the scheme and checked directly, which is what mission-critical engineering requires.
Extending agentic computing to operations research and decision-making, OptimAI bridges the immense gap between informal, text-based engineering goals and rigid, mathematically complex optimization backends [pdf].
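To illustrate what an auditable conservation property looks like in explicit solver code, here is a minimal sketch, assuming a model problem that does not come from the source: linear advection u_t + a u_x = 0 with a > 0 on a periodic domain, discretized with a first-order conservative upwind scheme. All names and parameters are hypothetical. Because each cell loses exactly what its neighbor gains, the total mass is preserved up to floating-point round-off, and that can be checked in a few lines.

```python
import math

def upwind_step(u, cfl):
    """One conservative upwind step for u_t + a u_x = 0 with a > 0.
    u[i - 1] wraps around at i = 0, which gives periodic boundaries."""
    return [u[i] - cfl * (u[i] - u[i - 1]) for i in range(len(u))]

def total_mass(u, dx):
    """Discrete integral of u over the domain (fsum limits round-off)."""
    return math.fsum(u) * dx

def advect(u0, cfl, steps):
    """Advance the initial cell averages u0 by the given number of steps."""
    u = list(u0)
    for _ in range(steps):
        u = upwind_step(u, cfl)
    return u

def mass_drift(n=100, cfl=0.5, steps=200):
    """Return the absolute change in total mass for a square pulse."""
    dx = 1.0 / n
    u0 = [1.0 if 0.25 <= (i + 0.5) * dx < 0.5 else 0.0 for i in range(n)]
    u_final = advect(u0, cfl, steps)
    return abs(total_mass(u_final, dx) - total_mass(u0, dx))

if __name__ == "__main__":
    print(f"mass drift after 200 steps: {mass_drift():.3e}")
```

A learned surrogate offers no comparable line of code to inspect; here the conservation property follows from the flux-difference form of the update.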
Natural Language to Formal Formulation: OptimAI features an autonomous Formulator Agent that interprets informal project descriptions and translates them into precise mathematical optimization models (explicitly identifying decision variables, objective functions, and constraints).
Dynamic Strategic Planning: A specialized Planner Agent maps out the algorithmic execution strategy and dynamically selects the optimal computational backend or solver interface before execution.
Environment-Aware Choice Scheduling: Using an Upper Confidence Bound (UCB) strategy to schedule debugging, the framework actively samples alternative mathematical formulations and backend configurations. It balances exploring new solver pathways with exploiting known successful strategies in order to recover from syntax and logic failures.
In industrial settings, formulating optimization problems correctly is often harder than solving them. OptimAI democratizes mathematical optimization, allowing non-experts to accurately model complex supply chain, structural design, or operational challenges in plain language. Its closed execution loop tailors the generated code to the selected backend, substantially shortening the path from problem description to working solution.
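The exploration and exploitation trade-off described above can be sketched with the textbook UCB1 rule. This is a generic illustration, not OptimAI's actual scheduler: the strategy names, the success probabilities, and the binary reward are all hypothetical. Each candidate strategy is treated as a bandit arm, and the next one to try is the arm with the highest empirical success rate plus a confidence bonus that shrinks as the arm is tried more often.

```python
import math
import random

def ucb1_select(counts, totals, c=math.sqrt(2)):
    """Pick the next arm with the UCB1 rule.
    counts[i] is how often arm i was tried, totals[i] its summed reward."""
    # Try every arm once before comparing confidence bounds.
    for arm, n in enumerate(counts):
        if n == 0:
            return arm
    t = sum(counts)
    scores = [
        totals[i] / counts[i] + c * math.sqrt(math.log(t) / counts[i])
        for i in range(len(counts))
    ]
    return scores.index(max(scores))

def run_bandit(success_prob, rounds, seed=0):
    """Simulate repeated debugging attempts with hypothetical success rates."""
    rng = random.Random(seed)
    counts = [0] * len(success_prob)
    totals = [0.0] * len(success_prob)
    for _ in range(rounds):
        arm = ucb1_select(counts, totals)
        reward = 1.0 if rng.random() < success_prob[arm] else 0.0
        counts[arm] += 1
        totals[arm] += reward
    return counts

if __name__ == "__main__":
    # Hypothetical strategies and success rates, for illustration only.
    strategies = ["reformulate_model", "switch_backend", "patch_code"]
    counts = run_bandit([0.2, 0.6, 0.4], rounds=200)
    for name, n in zip(strategies, counts):
        print(f"{name}: tried {n} times")
```

The confidence bonus is what keeps a strategy that failed early from being abandoned permanently: an arm tried only once keeps a large bonus until it has been sampled enough to trust its success rate.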
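As a small illustration of the step from plain language to a formal model, consider a hypothetical request: "A workshop earns 30 per chair and 50 per table. A chair needs 2 labor hours and 1 unit of wood, a table needs 3 labor hours and 4 units of wood, and 40 labor hours and 32 units of wood are available. Maximize profit." The sketch below writes this down as decision variables, an objective, and constraints, then solves it by exhaustive enumeration. The problem, the `OptimizationModel` class, and the solver are assumptions made for illustration and are not part of OptimAI; a real pipeline would hand the model to a dedicated optimization backend.

```python
from dataclasses import dataclass
from itertools import product
from typing import Callable, Dict, List, Tuple

@dataclass
class OptimizationModel:
    """A tiny integer program: maximize objective subject to constraints."""
    variables: Dict[str, Tuple[int, int]]         # name -> (lower, upper) bounds
    objective: Callable[[Dict[str, int]], float]  # value to maximize
    constraints: List[Callable[[Dict[str, int]], bool]]

def solve_by_enumeration(model):
    """Check every integer point within the bounds and keep the best feasible one."""
    names = list(model.variables)
    ranges = [range(lo, hi + 1) for lo, hi in model.variables.values()]
    best_point, best_value = None, float("-inf")
    for values in product(*ranges):
        point = dict(zip(names, values))
        if all(check(point) for check in model.constraints):
            value = model.objective(point)
            if value > best_value:
                best_point, best_value = point, value
    return best_point, best_value

# Formal model of the hypothetical workshop description.
workshop = OptimizationModel(
    variables={"chairs": (0, 20), "tables": (0, 8)},
    objective=lambda p: 30 * p["chairs"] + 50 * p["tables"],
    constraints=[
        lambda p: 2 * p["chairs"] + 3 * p["tables"] <= 40,  # labor hours
        lambda p: p["chairs"] + 4 * p["tables"] <= 32,      # wood units
    ],
)

if __name__ == "__main__":
    plan, profit = solve_by_enumeration(workshop)
    print(plan, profit)  # {'chairs': 14, 'tables': 4} 620
```

The solving step here is trivial; the part that requires care is choosing the variables and writing each resource limit as the right inequality, which is the work the formulation stage is meant to take over.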
To ensure that the AutoNumerics ecosystem can scale reliably across diverse scientific enterprises, our framework includes robust evaluation and data-handling capabilities:
HardNumerics (The Stress-Test Framework): To guarantee reliability, our systems are validated against HardNumerics, a benchmark specifically designed to expose failure modes in AI-generated code. Instead of using basic textbook equations, it stress-tests solvers across axes of extreme physics—including irregular geometries, moving boundaries, shocks, and stiff multiscale dynamics—ensuring that AutoNumerics is battle-tested for real-world chaos.
ReSearch (Intention-Driven Data Discovery): Unlocking scientific discovery requires data. ReSearch acts as an intelligent data discovery feature that translates high-level scientific queries into high-precision dataset matching [pdf]. By resolving the massive metadata heterogeneity found across modern satellite and Earth Science data products, it ensures that data-intensive modeling pipelines are fed with the exact physical inputs they require.
Supported by pioneering national agencies—including the National Science Foundation (NSF), Department of Energy (DOE), Office of Naval Research (ONR), and DARPA—AutoNumerics represents the infrastructure for the future of automated science.
By marrying the efficiency of generative AI with the mathematical rigor of formal verification, our platform dramatically reduces the manual development lifecycle of simulations. This introduces unprecedented agility to fields demanding extreme precision, safely accelerating automated workflows in fluid dynamics, combustion, fusion energy, climate modeling, and structural mechanics. AutoNumerics marks the transition from human-dependent code building to autonomous, trustworthy scientific discovery.
[6] K. V. Bodla, H. Yang*. Principal Prototype Analysis on Manifold for Interpretable Reinforcement Learning. [pdf]
[5] R. Bhatnagar, Y. Sun, C. A. Zhang, Y. Wen*, H. Yang*. HALT: Hallucination Assessment via Latent Testing. [pdf]
[4] K. V. Bodla, R. Bhatnagar, H. Yang. Manifold-based Sampling for In-Context Hallucination Detection in Large Language Models. [pdf]
[3] K. V. Bodla, H. Yang*. Protocode: Prototype-Driven Interpretability for Code Generation in LLMs. [pdf]
[2] R. Thind^, Y. Sun^, L. Liang, H. Yang*. OptimAI: Optimization from Natural Language Using LLM-Powered AI Agents. [pdf]
[1] J. Du^, Y. Sun^, H. Yang*. AutoNumerics: An Autonomous, PDE-Agnostic Multi-Agent Pipeline for Scientific Computing. AI&PDE: ICLR 2026 Workshop on AI and Partial Differential Equations [pdf]