Gilles Barthe (MPI Security and Privacy): Verifying probabilistic programs
Probabilistic programs are programs with access to primitives for sampling values from distributions. They have applications in many areas, in particular security and machine learning. The course will review approaches to verify probabilistic programs. We will cover several classes of properties, including expectation-based properties (what is the expected value of the output?), distribution-based properties (is the output uniformly distributed?), and relational properties (does the output leak secret inputs?).
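As a minimal sketch of an expectation-based property, consider this toy probabilistic program (my own illustrative example, not drawn from the course): flip a fair coin until it lands heads and output the number of flips. The expectation-based question is "what is the expected value of the output?" (analytically, 2), which we can approximate by sampling.

```python
# Toy probabilistic program: flip a fair coin until heads, output the
# number of flips. Its only probabilistic primitive is rng.random().
import random

def geometric(rng: random.Random) -> int:
    flips = 1
    while rng.random() < 0.5:  # tails: flip again
        flips += 1
    return flips

def estimate_expectation(trials: int = 100_000, seed: int = 0) -> float:
    """Monte Carlo estimate of E[output]; the true value is 2."""
    rng = random.Random(seed)
    return sum(geometric(rng) for _ in range(trials)) / trials

print(f"estimated E[output] = {estimate_expectation():.3f}")
```

Verification techniques covered in the course aim to establish such expectations exactly, rather than estimate them by sampling.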
Mike Dodds (Galois, Inc.): Formal Methods: What Works, What's Next
This series is about the engineering reality of formal methods: the practical economics of getting them adopted, Galois's approach to designing a production formal toolchain, and how AI is rapidly changing what's possible in the field.
Lecture 1: Selling formal methods: what works, what doesn't: Formal tools are best understood as engineering tools. As such, we need to ask whether a given tool delivers proportionate benefits at reasonable cost. In practice, this turns out to be surprisingly difficult, and many projects don't 'pencil out'. I'll talk about a few ways in which scoping and executing formal methods projects is hard, and propose some tentative solutions.
Lecture 2: Formal methods the Galois way (Part I): SAW, the Software Analysis Workbench, is the workhorse tool we use at Galois. I'll introduce SAW, its specification language Cryptol, and its proof orchestration language SAWscript, all by example. I'll also cover the verification technology that underpins SAW: symbolic simulation and SMT solving.
Lecture 3: Formal methods the Galois way (Part II): Scale and complexity are the main barriers to formal verification in practice. I'll discuss why SAW proofs are hard, when they're hard, and walk through SAW's suite of tools for controlling difficulty. In the second half, I'll discuss a practical SAW case study: Galois's verification of the AWS s2n TLS library.
Lecture 4: Formal methods and AI: today, tomorrow, next year: AI is getting good at mathematical reasoning, and this is fundamentally changing the cost-benefit landscape for formal methods. I'll show what 2026-era AI models can do, and ask what comes next: (maybe) math AIs more capable than any human, and (maybe) serious risks that we'll need formal methods to mitigate.
Jean-Christophe Filliâtre (CNRS / Université Paris-Saclay): The Why and How of Deductive Program Verification
This lecture introduces fundamental concepts and techniques of deductive program verification, such as function contracts, loop invariants, termination proofs, ghost code, modeling of data structures, and weakest preconditions. Particular focus is placed on the use of automated theorem provers in the verification process, and on building a standard library for verification. The lecture includes many examples using the Why3 tool (https://www.why3.org/) and provides many small exercises in program verification.
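To give a flavor of function contracts and loop invariants, here is a sketch (my own example, in Python rather than WhyML) of the kind of annotated program a deductive verifier handles. Why3 would discharge the contract and invariant statically with automated provers; below they are merely checked at runtime with assertions.

```python
# Summing 0..n-1, annotated in the style of deductive verification.
# Precondition, loop invariant, and postcondition are runtime-checked
# here; a tool like Why3 proves them once and for all.

def sum_upto(n: int) -> int:
    assert n >= 0                       # precondition (function contract)
    total, i = 0, 0
    while i < n:
        # loop invariant: total is the sum of 0..i-1, and i stays in range
        assert total == i * (i - 1) // 2 and 0 <= i <= n
        total += i
        i += 1
    assert total == n * (n - 1) // 2    # postcondition
    return total

print(sum_upto(10))  # 45
```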
Orna Grumberg (Technion): Efficient Solving of Constrained Horn Clauses (CHC) for Program Verification
Constrained Horn Clauses (CHCs) have become a powerful tool in program verification, particularly for deriving loop invariants in safety proofs. Yet, the effectiveness of CHC satisfiability checking strongly depends on the underlying background theory, as different CHC solvers vary widely in their performance across theories.
In this lecture, I will introduce CHCs and their role in program verification, and then present Mosaic, a new theory-modular framework for CHC solving. Mosaic enables a given set of CHCs to be decomposed into fragments, each associated with a different background theory. It further provides an algorithm that coordinates reasoning across these fragments, allowing it to decide satisfiability with respect to the original set of clauses and their original theory.
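To make the connection to loop invariants concrete, here is a standard textbook-style encoding (my own illustrative example) of a simple counting loop as CHCs over linear integer arithmetic, where the unknown predicate Inv plays the role of the loop invariant:

```latex
% Program: x := 0; while x < n do x := x + 1; assert x = n  (assuming n >= 0)
% Encoded as constrained Horn clauses with unknown invariant Inv(x, n):
n \ge 0 \;\rightarrow\; \mathit{Inv}(0, n)
\mathit{Inv}(x, n) \land x < n \;\rightarrow\; \mathit{Inv}(x+1, n)
\mathit{Inv}(x, n) \land x \ge n \;\rightarrow\; x = n
% One satisfying interpretation: Inv(x, n) := 0 <= x <= n
```

A CHC solver that finds such an interpretation has, in effect, synthesized the loop invariant needed for the safety proof.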
John Mitchell (Stanford University): Evaluating Agentic AI Systems
There is an ongoing explosion in software development of AI-based applications that rely on LLMs and other foundation models. Starting with prompt-based systems, system architectures have evolved into Retrieval-Augmented Generation (RAG) systems and most recently Agentic AI. For all of these architectures, it is often easy to build an initial prototype simply by prompting a powerful model to behave in certain ways. However, refining a prototype into a system that is useful and reliable requires iterative improvement based on clear evaluation metrics. We thus focus on evaluation methods as the key to successful system development, considering also methods such as prompt optimization that rely on automated metrics. Drawing on a number of worked examples, we will look at frameworks such as LangGraph and DSPy, evolving system architectures, and evaluation methods and their effectiveness, setting the stage for further development of the science and engineering of AI-based systems.
Leonardo de Moura (Amazon Web Services): Lean 4 for Program Verification in the Age of AI
Lean 4 is both a functional programming language and an interactive theorem prover built on dependent type theory. Programs and their proofs coexist in the same language, share the same toolchain, and are checked by the same kernel. In this lecture series, we introduce Lean 4 from the perspective of program verification. We cover the type system and its use for specifying correctness properties, the tactic framework for constructing proofs, and the metaprogramming API that allows users to extend Lean in ways that go well beyond custom tactics. Lean's extensibility enables the construction of entire domain-specific verification frameworks, including custom specification languages, proof strategies, and automation pipelines, all within Lean itself, with no need for external tools or trusted code generators. We illustrate this through concrete examples drawn from ongoing verification efforts. We also discuss how AI is beginning to change the landscape of interactive theorem proving, and what this may mean for program verification in the near future.
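As a minimal sketch of a program and its proof coexisting in the same language (my own example, using the `omega` tactic from the Lean 4 standard tactic suite):

```lean
-- A program...
def double (n : Nat) : Nat := n + n

-- ...and a proof about it, checked by the same kernel.
theorem double_eq (n : Nat) : double n = 2 * n := by
  unfold double
  omega  -- linear arithmetic over Nat
```

The same kernel that type-checks `double` also checks the proof of `double_eq`, which is the point of the shared toolchain described above.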
Peter Müller (ETH Zurich): Automating Separation Logic Proofs with Viper
Separation logic is the prevalent program logic for the verification of heap-manipulating and concurrent programs. It uses a notion of ownership to determine which memory locations may be accessed by a method or thread. Ownership allows one to reason modularly about the possible side effects of an operation (framing) and to prove the absence of interference from other threads. This lecture series will introduce the Viper verification system and explain how it automates separation logic proofs. Viper offers an intermediate verification language that allows one to express a wide range of verification problems. Its verification backends automate proof search using an SMT solver. The lectures will also give an overview of applications of Viper, such as verification tools built on top of Viper (e.g., for Go, Python, and Rust).
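The modular reasoning about side effects (framing) mentioned above is captured by separation logic's frame rule, which lifts a local specification of a command $C$ to any larger heap described by an untouched frame $R$:

```latex
\frac{\{P\}\; C \;\{Q\}}{\{P * R\}\; C \;\{Q * R\}}
\qquad (C \text{ does not modify variables free in } R)
```

Here $*$ is the separating conjunction: $P * R$ asserts that $P$ and $R$ hold over disjoint parts of the heap, which is why $R$ is guaranteed to survive the execution of $C$.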
Alexander Pretschner (TU München and fortiss)
TBD
Jonathan Protzenko (Google): From Linear Type Systems to Rust
In 1990, Phil Wadler wrote his seminal paper "Linear Types can Change the World". But it wasn't until the Rust programming language took off that linear types did change the world. A thirty-year delay might not seem too bad in the grand scheme of things, but it raises the question of what exactly took so long.
In this lecture series, I will focus on the essence of linearity, and the gap between theory and practice. We will start with a gentle introduction to linearity, with a simple lambda-calculus and the implementation of its type-checker. Then, we will review more advanced systems of linearity, studying the tradeoffs they inevitably make for the sake of usability. Indeed, in the real world, not all objects can be tracked within the type system, and one must eventually give up and trade static checks for dynamic checks (think: Rust's RefCell). Finally, I will show one of the reasons why we care about linearity: making reasoning about programs easier, and specifically, making program verification more tractable. I will illustrate this final point with the Aeneas program verification framework, and how it leverages linearity in the Rust semantics to yield a translation from Rust to a pure lambda-calculus.
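The trade from static to dynamic checks mentioned above can be sketched in a few lines (my own illustrative example, in Python rather than Rust): a "linear" value that must be consumed exactly once. Rust's type system rejects a second use at compile time; in the spirit of RefCell, the check here fires at runtime instead.

```python
# A dynamically checked "linear" value: usable exactly once.
# Where Rust enforces single use statically via ownership, this
# sketch defers the linearity check to runtime.

class Linear:
    def __init__(self, value):
        self._value = value
        self._consumed = False

    def consume(self):
        if self._consumed:
            raise RuntimeError("linear value used more than once")
        self._consumed = True
        return self._value

handle = Linear("resource")
print(handle.consume())      # first use: fine
try:
    handle.consume()         # second use: dynamic check fires
except RuntimeError as e:
    print(e)
```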
Sukyoung Ryu (KAIST): Mechanized Specifications Adopted by Real-World Programming Languages
Real-world programming languages have multiple implementations and continue to evolve, often leading to inconsistencies across implementations. To mitigate this issue, language mechanization frameworks have been proposed that guarantee the same behavior between the executable semantics described in a mechanized specification and the tools, such as interpreters, automatically generated from it. Recently, real-world programming languages have adopted mechanized specifications. In 2022, the TC39 committee decided to integrate ESMeta into the CI systems of the JavaScript language specification and its official test suite. In 2025, the WebAssembly (Wasm) Community Group voted to adopt SpecTec for authoring future versions of the Wasm specification. The P4 Language Design Working Group is considering adopting P4-SpecTec for authoring the P4 specification. For each case, we share the challenges faced by the language community, how the mechanization framework addressed them, and how the language community’s governance helped promote adoption. We discuss which programming languages are promising candidates for mechanization, survey various mechanization frameworks, and address frequently asked questions about adopting mechanized specifications for new programming languages.