Welcome to the second annual KAIST-Mila Prefrontal AI Workshop. This event highlights our continued research partnership through the Prefrontal AI Research Center, supported by the National Research Foundation of Korea (NRF).
The workshop brings together leading researchers to explore breakthrough developments in System 2 AI, Safe AI, and AI for Science, and offers an ideal setting for knowledge sharing, networking, and building lasting research partnerships.
Connect with an international community and discover opportunities for collaboration, research exchanges, and academic visits to South Korea through meaningful discussions that shape the future of AI.
⏳When: Wednesday, July 9th, 2:00 pm - 5:30 pm (EDT)
📍Where: Mila's Agora (6650 rue Saint-Urbain, Montréal, QC H2S 3G9)
📹Google Meet: https://meet.google.com/chk-cuky-xig
🤝Registration for Individual Meetings:
If you are interested in scheduling an individual meeting with a member of the KAIST team, please register here:
https://docs.google.com/spreadsheets/d/1_VLtmD5f50mQMMHSuoGnHO4OFLM7_O_lOfs8slekgAw/edit?gid=0#gid=0
14:00 - 14:30: Sungjin Ahn (KAIST) Monte Carlo Tree Diffusion for System 2 Planning
Abstract: Diffusion models have recently emerged as a powerful tool for planning. However, unlike Monte Carlo Tree Search (MCTS)—whose performance naturally improves with additional test-time computation (TTC)—standard diffusion-based planners offer only limited avenues for TTC scalability. In this talk, I introduce Monte Carlo Tree Diffusion (MCTD), a novel framework that integrates the generative strength of diffusion models with the adaptive search capabilities of MCTS. Our method reconceptualizes denoising as a tree-structured process, allowing partially denoised plans to be iteratively evaluated, pruned, and refined. By selectively expanding promising trajectories while retaining the flexibility to revisit and improve suboptimal branches, MCTD achieves the benefits of MCTS such as controlling exploration-exploitation trade-offs within the diffusion framework. Empirical results on challenging long-horizon tasks show that MCTD outperforms diffusion baselines, yielding higher-quality solutions as TTC increases.
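For a concrete picture before the talk, here is a minimal sketch of the MCTD loop, assuming hypothetical `denoise_step` and `score_plan` stand-ins (a real planner would use a learned diffusion model and a task reward); it is an illustration of the idea, not the authors' implementation:

```python
import math
import random

def denoise_step(plan, rng):
    # Hypothetical stand-in for one stochastic diffusion denoising step.
    return [x + rng.gauss(0.0, 0.1) for x in plan]

def score_plan(plan):
    # Hypothetical stand-in for evaluating a partially denoised plan.
    return -sum(x * x for x in plan)

class Node:
    def __init__(self, plan, parent=None):
        self.plan, self.parent = plan, parent
        self.children, self.visits, self.value = [], 0, 0.0

def ucb(node, c=1.4):
    # Unvisited nodes are explored first; otherwise trade value vs. novelty.
    if node.visits == 0:
        return float("inf")
    return node.value / node.visits + c * math.sqrt(
        math.log(node.parent.visits) / node.visits)

def mctd(init_plan, budget=200, branching=3, seed=0):
    rng = random.Random(seed)
    root = Node(init_plan)
    for _ in range(budget):                      # test-time compute budget
        node = root
        while node.children:                     # selection: descend by UCB
            node = max(node.children, key=ucb)
        for _ in range(branching):               # expansion: denoising variants
            node.children.append(Node(denoise_step(node.plan, rng), node))
        leaf = rng.choice(node.children)
        reward = score_plan(leaf.plan)           # evaluate a partial plan
        while leaf is not None:                  # backpropagation
            leaf.visits += 1
            leaf.value += reward
            leaf = leaf.parent
    return max(root.children, key=lambda n: n.visits).plan

print(mctd([1.0, -2.0, 0.5]))
```

The `budget` parameter is the test-time-compute knob: more iterations buy deeper search, which is where the TTC scalability discussed in the abstract comes from.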
14:30 - 15:00: Jinwoo Kim (KAIST) Extrinsic Symmetries for Neural Networks
Abstract: Geometric deep learning constrains neural networks to respect symmetries in order to improve generalization. The standard approach is to design and stack intrinsically equivariant layers. Recently, there have been advances in extrinsic approaches, which take an entire unrestricted neural network and make it equivariant. In this talk, I will present two such methods: probabilistic symmetrization and random walk neural networks. These methods achieve equivariance through stochastic transformations of the network's inputs and outputs. They allow flexible and scalable architectural choices, enable knowledge transfer from language or vision domains to symmetric domains by transferring model weights, and their performance can be boosted at test time by ensembling stochastic predictions.
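As a rough illustration of the extrinsic idea, the sketch below makes an arbitrary network permutation-equivariant (in expectation) by averaging g⁻¹ f(g x) over sampled permutations g. Uniform sampling is used here for brevity; probabilistic symmetrization itself learns an input-conditional distribution over g, and the toy model `f` is a placeholder:

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.standard_normal((5, 5))

def f(x):
    # Placeholder "unrestricted" network: not permutation-equivariant on its own.
    return np.tanh(W @ x)

def symmetrized_f(x, num_samples=2000):
    # Average g^{-1} f(g x) over random permutations g: equivariant in expectation.
    n = len(x)
    out = np.zeros(n)
    for _ in range(num_samples):
        g = rng.permutation(n)        # sample a group element (here: uniformly)
        inv = np.argsort(g)           # indices realizing g^{-1}
        out += f(x[g])[inv]
    return out / num_samples

x = rng.standard_normal(5)
perm = rng.permutation(5)
# The two outputs below agree up to Monte Carlo error; more samples tighten it.
print(symmetrized_f(x)[perm])
print(symmetrized_f(x[perm]))
```

The final two lines are the equivariance check: permuting the input should permute the output, and the Monte Carlo ensembling at the end mirrors the test-time boosting mentioned in the abstract.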
15:00 - 15:30: Siva Reddy (Mila & McGill University) Reasoning Models and Their Implications for Safety
Abstract: TBD
15:30 - 15:50: Break (20 min), coffee & snacks
15:50 - 16:20: Sungsoo Ahn (KAIST) Generative Models for Metal-Organic Frameworks
Abstract: Metal-organic frameworks (MOFs) are a class of crystalline materials with promising applications in many areas such as carbon capture and drug delivery. In this talk, I will present our recent developments on generative models for structure prediction and de novo design of MOFs. Existing computational approaches, including ab initio calculations and even deep generative models, struggle with the complexity of MOF structures due to the large number of atoms in the unit cells. To overcome this, we propose to exploit the compositional nature and train a generative model that can propose building blocks (organic linkers and metal nodes) and assemble the blocks using Riemannian flow matching. Our experiments demonstrate improved reconstruction accuracy, the generation of valid, novel, and unique MOFs, and the ability of our model to create novel building blocks.
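One ingredient that lends itself to a small sketch is flow matching on a periodic cell: fractional coordinates of building blocks live on a flat torus, so the regression target is the geodesic (wrapped) velocity. The linear model and random data below are placeholders under that assumption, not the authors' architecture:

```python
import numpy as np

rng = np.random.default_rng(0)

def wrapped_diff(x1, x0):
    # Shortest displacement on the unit torus (periodic fractional coordinates).
    return (x1 - x0 + 0.5) % 1.0 - 0.5

def flow_matching_batch(x1):
    # Sample noise x0 and time t, interpolate along the torus geodesic;
    # the regression target is the constant geodesic velocity.
    x0 = rng.uniform(0.0, 1.0, size=x1.shape)
    t = rng.uniform(0.0, 1.0, size=(len(x1), 1))
    d = wrapped_diff(x1, x0)
    xt = (x0 + t * d) % 1.0
    return xt, t, d

x1 = rng.uniform(0.0, 1.0, size=(64, 3))   # toy "data": block positions
W = np.zeros((3, 4))                        # linear velocity model v(xt, t)
lr = 0.1
for _ in range(200):
    xt, t, target = flow_matching_batch(x1)
    feats = np.concatenate([xt, t], axis=1)
    pred = feats @ W.T
    W -= lr * 2.0 * (pred - target).T @ feats / len(x1)
print("flow-matching loss:", float(np.mean((pred - target) ** 2)))
```

In the talk's setting the same recipe operates on richer manifolds and is combined with a proposal model for the discrete building blocks; this sketch only shows the geodesic-interpolation-and-regression core.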
16:20 - 16:50: Alex Hernandez-Garcia (Mila & UdeM) Multi-Fidelity Active Learning for Materials and Drug Discovery
Abstract: Science plays a fundamental role in tackling the most pressing challenges for humanity, such as the climate crisis, the threat of pandemics and antibiotic resistance. Meanwhile, the increasing capacity to generate large amounts of data, the progress in computer engineering and the maturity of machine learning methods offer an excellent opportunity to assist scientific progress. In this talk, I would like to offer an overview of our recent work on multi-fidelity active learning with GFlowNets, inspired by applications in materials and drug discovery. First, I will discuss why GFlowNets are a suitable generative framework for scientific discovery, and present examples in both materials and drug discovery. Then, I will present our recent algorithm for multi-fidelity active learning with GFlowNets, designed to efficiently explore combinatorially large, high-dimensional and mixed spaces (discrete and continuous).
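The overall loop can be sketched in a few lines, with toy stand-ins for the GFlowNet sampler, the uncertainty proxy, and the oracles; the costs and per-fidelity information gains below are illustrative assumptions only:

```python
import random

random.seed(0)
COSTS = {0: 1.0, 1: 10.0}   # assumed: fidelity 0 is cheap/noisy, 1 costly/exact
GAINS = {0: 0.3, 1: 1.0}    # assumed information value of one query per fidelity

def propose_candidates(n=32):
    # Stand-in for sampling candidates from a trained GFlowNet.
    return [random.uniform(-2.0, 2.0) for _ in range(n)]

def oracle(x, fidelity):
    # Toy oracles: low fidelity is a noisy version of the expensive one.
    true = -(x - 1.0) ** 2
    return true + (random.gauss(0.0, 0.5) if fidelity == 0 else 0.0)

def uncertainty(x, dataset):
    # Toy epistemic proxy: distance to the nearest already-labeled point.
    return min((abs(x - xi) for xi, _, _ in dataset), default=1.0)

dataset, budget = [], 50.0
while budget > 0:
    pool = [(x, f) for x in propose_candidates() for f in COSTS]
    # Acquisition: information per unit cost decides both which candidate
    # to query and at which fidelity.
    x, f = max(pool,
               key=lambda p: GAINS[p[1]] * uncertainty(p[0], dataset) / COSTS[p[1]])
    dataset.append((x, f, oracle(x, f)))
    budget -= COSTS[f]

best = max(dataset, key=lambda r: r[2])
print(f"queried {len(dataset)} points; best observed x = {best[0]:.2f}")
```

The design point this illustrates is that the acquisition step selects a (candidate, fidelity) pair jointly, so cheap noisy queries can cover the space while expensive exact ones are spent where they pay off.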
16:50 - 17:20: Yoshua Bengio (Mila & LawZero & UdeM) Estimating Computational Uncertainty
Abstract: GFlowNets are trained to approximately satisfy an exponentially large number of constraints. The result is a policy, i.e., a set of conditional probabilities, and a possibly conditional distribution over generated objects. How much should we trust all these estimated conditional probabilities? We first argue that these approximation errors should be treated differently from the epistemic and aleatoric uncertainty usually considered in the analysis of learning machines: they arise from finite computational resources rather than finite data or missing information in the input. Hence we call this computational uncertainty. Computational uncertainty could be used to sample "safe" objects that have both a high reward and a low uncertainty, or in active learning, where the effective reward is increased when the uncertainty is larger. It can also be used to guide training to spend more time where uncertainty is larger. How can we estimate the computational uncertainty of specific probabilistic predictions? An EM-like algorithm is proposed to train an amortized estimator of computational uncertainty, parametrized as a conditional Dirichlet: the constraints are treated as observations, and after visiting a constraint during GFlowNet training, an SGD update of the Dirichlet neural network is performed for all the conditionals involved in that constraint.
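One piece of this that can be sketched compactly is the Dirichlet-fitting step: each visited constraint is treated as yielding an observed conditional distribution, and an SGD step is taken on the Dirichlet log-likelihood. The amortizing neural network is omitted here; a single state's concentration parameters stand in for it, and the training distribution is synthetic:

```python
import numpy as np
from scipy.special import digamma

K = 4                       # actions available at one state
log_alpha = np.zeros(K)     # parametrize concentration alpha > 0 via exp

def sgd_step(p_observed, lr=0.05):
    # Gradient ascent on log Dir(p_observed | alpha) w.r.t. log_alpha:
    # d/d alpha_k = digamma(sum alpha) - digamma(alpha_k) + log p_k.
    global log_alpha
    alpha = np.exp(log_alpha)
    grad = digamma(alpha.sum()) - digamma(alpha) + np.log(p_observed)
    log_alpha += lr * grad * alpha          # chain rule through the exp

rng = np.random.default_rng(0)
for _ in range(2000):
    # Each "constraint visit" is treated as observing a conditional distribution.
    sgd_step(rng.dirichlet([8.0, 4.0, 2.0, 1.0]))

alpha = np.exp(log_alpha)
# The Dirichlet mean estimates the conditionals; a larger total concentration
# signals lower computational uncertainty about them.
print("mean:", (alpha / alpha.sum()).round(3),
      "concentration:", round(float(alpha.sum()), 1))
```

Under this reading, the concentration (sum of the alphas) plays the role of the uncertainty signal that the abstract proposes to exploit for safe sampling, active learning, and training-time prioritization.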
Minsu Kim (Mila & KAIST)
Junyeob Baek (KAIST)
Sungjin Ahn (KAIST)
Seunghoon Hong (KAIST)
Sungsoo Ahn (KAIST)
Yoshua Bengio (Mila & LawZero & UdeM)