Speakers

Panel

Nora Belrose  (EleutherAI)

Nora Belrose is the head of interpretability research at EleutherAI. Some of her recent work includes LEACE, a method for erasing concepts from neural network activations with provable guarantees, and the Tuned Lens, which enables visualization and analysis of the next-token predictions of language models across the depth of the network. She also founded the AI Optimism movement with Quintin Pope.

Yoshua Bengio (University of Montreal / Mila)

Yoshua Bengio is a Full Professor at Université de Montréal, as well as the Founder and Scientific Director of Mila and the Scientific Director of IVADO. He also holds a Canada CIFAR AI Chair. Considered one of the world’s leaders in artificial intelligence and deep learning, he is a recipient, with Geoffrey Hinton and Yann LeCun, of the 2018 A.M. Turing Award, known as the Nobel Prize of computing. He is a Fellow of both the Royal Society of London and the Royal Society of Canada, an Officer of the Order of Canada, a Knight of the Legion of Honor of France, and a Member of the UN’s Scientific Advisory Board for Independent Advice on Breakthroughs in Science and Technology.


Julia Bossmann (CERC AAI / Mila)

Julia Bossmann is the alignment coordinator at CERC-AAI and the director of AI at Rights Intelligence, an initiative by the Human Rights Measurement Initiative. Her research focuses on the societal impacts of AI. She is an Edmund Hillary Fellow and a contributor at the World Economic Forum.


Nick Bostrom (University of Oxford)

Nick Bostrom is a Professor at Oxford University, where he heads the Future of Humanity Institute as its founding director. He is the author of more than 200 publications, including Anthropic Bias (2002), Global Catastrophic Risks (2008), Human Enhancement (2009), and Superintelligence: Paths, Dangers, Strategies (2014), which became a New York Times bestseller and sparked a global conversation about the future of AI.  His academic work has been translated into more than 30 languages, and he is the world’s most cited philosopher aged 50 or under. He is a repeat main-stage TED speaker and he has been interviewed more than 1,000 times by various media.  He has been on Foreign Policy’s Top 100 Global Thinkers list twice and was included in Prospect’s World Thinkers list, the youngest person in the top 15.  Some of his recent work has focused on the ethics of digital minds.  He has a book in the works on a topic yet to be disclosed.


Ethan Caballero (McGill University/Mila)

Ethan Caballero is a PhD student at Mila working mostly with David Krueger, Irina Rish, and Blake Richards. His research focuses on out-of-distribution generalization, robustness, invariance and, most recently, neural scaling laws for forecasting the capabilities and alignment properties of neural networks with increasing data, model size, and compute. He proposed Broken Neural Scaling Laws as a universal functional form for forecasting the scaling behavior of neural networks.


Jenia Jitsev (LAION)

Jenia Jitsev is a computer scientist, machine learning researcher, and neuroscientist who is a co-founder and scientific lead of LAION e.V., the German non-profit research organization committed to open science around large-scale foundation models (openCLIP, openFlamingo) and datasets (LAION-400M/5B, DataComp). He also leads the Scalable Learning & Multi-Purpose AI (SLAMPAI) lab at the Juelich Supercomputing Centre of the Helmholtz Association, Germany. In LAION and in his lab, Dr. Jitsev's current focus is on driving and democratizing research on scalable, generalist, transferable multi-modal learning, leading to foundation models capable of strong transfer, with behavior that is predictable from the corresponding scaling laws and therefore easily adaptable to various conditions and tasks. He is a driving force behind uniting researchers from various labs to conduct large-scale machine learning experiments on publicly funded supercomputers in facilities such as the Juelich Supercomputing Centre in Germany and Oak Ridge National Laboratory in the USA. For his work, Dr. Jitsev received the Best Paper Award at IJCNN 2012, an Outstanding Paper Award at NeurIPS 2022, and the Falling Walls Scientific Breakthrough Award 2023.

Yann LeCun (Meta)

Yann LeCun is a Turing Award-winning computer scientist working in the fields of machine learning, computer vision, mobile robotics, and computational neuroscience. He is Silver Professor at the Courant Institute of Mathematical Sciences at New York University and the Chief AI Scientist at Meta.



Percy Liang (Stanford/Together.ai)

Percy Liang is an Associate Professor of Computer Science at Stanford University (B.S. from MIT, 2004; Ph.D. from UC Berkeley, 2011) and the director of the Center for Research on Foundation Models.  His research spans many topics in machine learning and natural language processing, including robustness, interpretability, semantics, and reasoning.  He is also a strong proponent of reproducibility through the creation of CodaLab Worksheets.  His awards include the Presidential Early Career Award for Scientists and Engineers (2019), IJCAI Computers and Thought Award (2016), an NSF CAREER Award (2016), a Sloan Research Fellowship (2015), a Microsoft Research Faculty Fellowship (2014), and multiple paper awards at ACL, EMNLP, ICML, and COLT.


Irina Rish (University of Montreal / CERC-AAI / Mila)

Irina Rish is a Full Professor in the Computer Science and Operations Research department at the Université de Montréal (UdeM) and a core member of Mila – Quebec AI Institute. She holds a Canada CIFAR AI Chair and the Canada Excellence Research Chair in Autonomous AI. Dr. Rish's research focuses on continual learning, out-of-distribution generalization, robustness, and understanding neural scaling laws and emergent behaviors (with respect to both capabilities and alignment) in foundation models, a vital stride towards achieving maximally beneficial Artificial General Intelligence (AGI).

Max Tegmark (MIT)

Max Tegmark is a professor doing AI research at MIT as part of the Institute for Artificial Intelligence & Fundamental Interactions and the Center for Brains, Minds and Machines. He advocates for safe and beneficial AI as president of the Future of Life Institute and in his book “Life 3.0”. His current AI research focuses on mechanistic interpretability and formal verification.

Talks

Quentin Anthony (EleutherAI / University of Montreal - CERC-AAI)

Quentin Anthony is the head of HPC at EleutherAI, a PhD student at the Ohio State University, and an HPC consultant for CERC-AAI. His research focuses on the intersection of deep learning frameworks and high-performance computing, with an emphasis on practical bottlenecks such as framework/model co-design, parallelism, model checkpointing, and model/optimizer compression. Quentin is the lead developer of GPT-NeoX, a leading framework for parallel transformer training.



Arjun Ashok (ServiceNow Research / Mila / CERC-AAI)

Arjun Ashok is a Visiting Researcher at ServiceNow Research, Montreal (advised by Alexandre Drouin) and a PhD student at Mila – Quebec AI Institute in the CERC-AAI Lab (advised by Irina Rish). His research interests are in time series forecasting and decision making.



Matthias Bethge (Tübingen AI Center)


Charlie Catlett (Argonne National Laboratory)

Charlie Catlett is a senior computer scientist at Argonne National Laboratory and a visiting senior fellow at the Mansueto Institute for Urban Innovation at the University of Chicago. From 2020 to 2022 he was a senior research scientist at the University of Illinois Discovery Partners Institute. He was previously a senior computer scientist at Argonne National Laboratory, a senior fellow in the Computation Institute, a joint institute of Argonne National Laboratory and the University of Chicago, and a senior fellow at the University of Chicago's Harris School of Public Policy.


Maxence Ernoult (RAIN)

Maxence Ernoult's work experience spans from 2013 to 2022. They began their career in 2013 as an Undergraduate Teaching Fellow at Lycée Sainte-Geneviève. In 2015, they became a Graduate Research Fellow at the Harvard John A. Paulson School of Engineering and Applied Sciences, where they conducted research in plasmonics in the Capasso Group and earned the 'Grand prix de recherche' in physics from École Polytechnique. In 2016, they were a Graduate Research Fellow at the Université Pierre et Marie Curie (Paris VI) and a Graduate Teaching Fellow at École Polytechnique. Maxence also held a research fellowship at the Department of Engineering at the University of Cambridge, where they earned the best poster award at the European Aerosol Conference in Tours. In 2017, they began their Ph.D. at Sorbonne Université, working on neuromorphic computing under the supervision of Prof. Julie Grollier (CNRS/Thalès). In 2020, they became a Research Fellow at Mila - Institut Québécois d'Intelligence Artificielle, working (remotely) under the supervision of Yoshua Bengio and Blake Richards on biologically plausible deep learning. Most recently, in 2021, they became a Research Staff Member at IBM, working on AI safety in the areas of uncertainty quantification, out-of-distribution detection, model calibration, and object detection.


Maxence Ernoult completed a Master of Advanced Studies (MASt) in Applied Mathematics and Theoretical Physics at the University of Cambridge between 2015 and 2016. Prior to that, they attended École Polytechnique from 2012 to 2015, and Lycée Sainte-Geneviève from 2009 to 2012. Maxence also holds several certifications from Coursera, including Structuring Machine Learning Projects (November 2017), Improving Deep Neural Networks: Hyperparameter tuning, Regularization and Optimization (October 2017), Neural Networks and Deep Learning (September 2017), Neural Networks for Machine Learning (July 2017), and Machine Learning (April 2017).

Dirk Groeneveld (Allen Institute for AI)


Andrey Gromov (University of Maryland/Meta)

Andrey Gromov is an assistant professor of Physics at the University of Maryland, College Park and the Condensed Matter Theory Center. His physics research has focused on emergent macroscopic properties in classical and quantum strongly correlated systems, including the quantum Hall effect, topological phases, liquid crystals, and hydrodynamics. He is currently interested in developing a quantitative understanding of emergent properties in neural networks, such as grokking, scaling laws, abrupt changes in performance, neural network pruning, and more.


Kshitij Gupta (University of Montreal/Mila)

Kshitij Gupta is an MSc student at Mila through the Université de Montréal (UdeM) under the supervision of Prof. Irina Rish and Prof. Sarath Chandar. He completed his undergraduate degree in Computer Science at the University of Illinois Urbana-Champaign. He is working towards building highly multimodal, generally intelligent agents and is researching topics on multimodal models, reasoning, and memory-augmented neural networks. He is passionate about AGI, and his research interests include scaling laws and embodied agents.

He has previously gained valuable industry experience through his work at Microsoft and DeepMind and has been recognized for his contributions, receiving awards such as the esteemed Henry Ford Scholar Award.


Julien Launay (Adaptive ML)


Huu Nguyen (Ontocord)

Huu Nguyen is a co-founder of ontocord.ai, a startup focusing on responsible Artificial Intelligence. He is a former BigLaw partner with a strong background in transactional and IP work. He focuses his practice on commercial and corporate transactions in the technology and venture space, and advises on artificial intelligence matters, licensing, outsourcing, complex commercial arrangements, strategic relationships, regulatory matters, privacy and security matters, cyber law, and intellectual property rights matters. He has served as Vice-Chair of the Artificial Intelligence and Robotics Committee of the ABA.

Before practicing corporate, commercial, and technology law, Huu focused on intellectual property litigation and patent prosecution. Prior to becoming a BigLaw partner, Huu was seconded as in-house counsel with a global financial services company. Huu was a board member of the Vietnamese-American Bar Association of Washington and has also worked on IP and transactional matters pro bono for nonprofits.


Alexis Roger (University of Montreal / Mila)

Alexis Roger is a master's student at Université de Montréal and Mila. He works mainly on large multimodal models, with a specialization in text and image models. He is highly interested in the ethics and morality of these models, particularly how they can be evaluated and improved. Lately, he has invested his time in the Robin project, an initiative to develop new open-source SOTA text-image multimodal models, made possible by the US DoE's 2023 INCITE compute grant.


Luke Sernau (Google)

Luke Sernau is an AI researcher at Google, where he studies memory, reasoning, and scale. His work spans the mathematical foundations of learning and model architecture, bringing theory to bear on practical problems. Most recently, he has been leading an effort to find 10x wins in parameter efficiency. He is a strong proponent of open source and believes that addressing scaling is one of the most important ways to make AI accessible to everyone.


Benjamin Thérien (University of Montreal - CERC-AAI / Mila)

Benjamin Thérien is a Ph.D. student at Mila and the University of Montreal (UdeM) co-advised by Irina Rish and Eugene Belilovsky, where he holds the prestigious FRQNT doctoral scholarship. Prior to his current studies, Benjamin completed his Master's of Mathematics in Computer Science at the University of Waterloo, specializing in 3D Computer Vision. His academic journey began with a Bachelor's degree in Computer Science Co-op at Concordia University, with professional experience at Accedian Networks and Morgan Stanley as well as research experience at Concordia's CLaC Lab and Mila. His published academic research has focused on Natural Language Processing, 3D Computer Vision, and Geometric Deep Learning. Benjamin's current research interests lie in improving the efficiency of pre-training algorithms for large-scale model development.


Vishaal Udandarao (Tübingen AI Center)

Vishaal Udandarao is an ELLIS PhD student at the University of Tübingen (supervised by Matthias Bethge) and the University of Cambridge (supervised by Samuel Albanie). He is interested in the scaling properties of foundation models, primarily along the data axis: how high-quality data informs model properties, and what the implications of high-quality data curation, filtering, and scaling are for model generalisation, robustness, compositionality, and lifelong learning/adaptation.


Tejas Vaidhya (University of Montreal - CERC-AAI/Mila/Nolano)

Tejas is a graduate student in computer science at Mila and the Université de Montréal, supervised by Prof. Irina Rish. He received his undergraduate degree from the Indian Institute of Technology Kharagpur.

Tejas's research interests include scaling laws and large language model compression. The goal of his research is to develop technologies and agents that can perceive their environment, reason about it, and communicate their understanding via natural language!

In general, Tejas is curious about everything. Recently, he has developed an interest in economics, psychology, and philosophy. 


Natalia Vassilieva (Cerebras)


Abhinav Venigalla (Databricks)

Abhinav Venigalla is an NLP Architect at Databricks, where he leads the development of LLMs and training recipes that enable customers to efficiently build custom models on their own private data. He also investigates and writes about emerging ML hardware. In the past, Abhinav was a researcher at Cerebras Systems and an early employee at MosaicML.


Luke Zettlemoyer (University of Washington/Meta)

Luke is a Professor in the Allen School of Computer Science & Engineering at the University of Washington and a Research Scientist at Facebook. His honors include multiple paper awards, being named a PECASE Awardee, and being an Allen Distinguished Investigator. Previously, Luke did postdoctoral research at the University of Edinburgh and earned a Ph.D. at MIT.


Ce Zhang (Together AI)

Ce is currently the CTO of Together, building a cloud tailored for artificial intelligence, and the incoming Neubauer Associate Professor of Data Science in the Department of Computer Science at the University of Chicago. Before that, he was an Associate Professor at ETH Zurich. He is interested in the fundamental tension between data, model, computation, and infrastructure, and the goal of his research is to democratize machine learning for everyone who wants to use it to make our world a better place.

Ce finished his PhD at the University of Wisconsin-Madison and spent another year as a postdoctoral researcher at Stanford, both advised by Chris Ré. He did his undergraduate study at Peking University, advised by Bin Cui.