NEW LOCATIONS - Find us in UComm 2-108 (regular weekly seminars) or UComm 2-350 (monthly Fellows seminars)
Monthly Amii Fellow Seminar
Speaker
Dr. Michael Bowling, Amii Fellow & Professor in the Department of Computing Science, University of Alberta
Title
Distribution Embedding: From Donuts to Cell Dynamics
Abstract
Machine learning researchers love to embed things. Given a complex entity, we often seek a compact representation that captures its essence and its relationship to other entities. For example, we embed words, users, images, and text prompts. But how might we find compact representations of entities made up of other entities, such as a distribution? This has been substantially underexplored! Parametric representations of a distribution provide one approach to a compact description (e.g., mean and covariance), but require knowing the family in advance. Empirical distributions can be agnostic to the family, but may need many samples to provide descriptive coverage, at which point they cease to be compact or to give meaning to the representation. In this talk I will present an approach to learning a compact representation of a distribution from an unknown family. I will show how the learned embeddings respect the underlying family, encode the relationships between members of the family, and can be used in downstream tasks. I will also show applications of this work in both games and biology, where the goal is to model the dynamics of cell populations. Note that there will be many donuts in the talk, but the edible ones will need to be consumed before we start.
Presenter Bio
Michael is a Professor at the University of Alberta, a Canada CIFAR AI Chair, a Fellow of the Alberta Machine Intelligence Institute, and an AAAI Fellow. While remaining a professor, he was also a Senior Staff Research Scientist at DeepMind and site lead for DeepMind’s first research office outside of London. At the University of Alberta, Michael led the Computer Poker Research Group, which built some of the best poker-playing artificial intelligence programs in the world, including the first to beat professional players at both limit and no-limit variants of the game. He was also behind the use of Atari 2600 games to evaluate the general competency of reinforcement learning algorithms, now a ubiquitous benchmark suite of domains for reinforcement learning.
Speaker
Weijie Sun, PhD student, University of Alberta, supervised by Dr. Russ Greiner & Dr. Abram Hindle
Title
Rethinking False Positives: Can ECG-Based Diagnostic ML Models Predict Future Heart Failure Risk?
Abstract
Diagnostic machine learning (ML) models are typically judged by cross-sectional accuracy at the time of testing, and their "errors" (false positives/negatives) are discarded. We ask whether ECG-based diagnostic models also encode prognostic information about future disease, such as heart failure (HF). Using two large real-world cohorts, we stratify patients into TP/FP/FN/TN groups based on model predictions and ground-truth diagnosis labels, and follow them for future HF events and all-cause mortality. To interpret longitudinal HF risk under censoring and competing death, we present the following analyses: (i) conditional Kaplan-Meier survival curves for HF with death censored, (ii) composite survival time to HF or death, and (iii) Aalen-Johansen cumulative incidence under competing risks, with Gray's test and Fine-Gray subdistribution hazards. We further quantify discrimination with time-dependent AUROC and Harrell's C-index, and cross-validate biological plausibility using natriuretic peptide biomarkers (BNP/NT-proBNP). Finally, we compare against a classical prognosis pipeline (ResNet + Cox): the prognostic model achieves a better C-index and long-term dynamic AUROC, while our diagnostic model performs better on short-term dynamic AUROC.
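The Kaplan-Meier estimator underlying analysis (i) is standard and can be sketched in a few lines; the patient times and event indicators below are synthetic, not drawn from the study's cohorts, and this is an illustration rather than the study's pipeline.

```python
import numpy as np

def kaplan_meier(times, events):
    """Kaplan-Meier survival curve.

    times:  follow-up time for each patient
    events: 1 if the event of interest (e.g. HF) occurred, 0 if censored
    Returns the distinct event times and the survival probability
    just after each one.
    """
    times = np.asarray(times, dtype=float)
    events = np.asarray(events, dtype=int)

    surv = 1.0
    event_times, survival = [], []
    for t in np.unique(times[events == 1]):
        at_risk = np.sum(times >= t)              # still under observation
        d = np.sum((times == t) & (events == 1))  # events at time t
        surv *= 1.0 - d / at_risk                 # product-limit update
        event_times.append(float(t))
        survival.append(surv)
    return event_times, survival

# Synthetic "false positive" stratum followed for future HF events.
t, s = kaplan_meier([2, 3, 3, 5, 8, 8, 9], [1, 1, 0, 1, 1, 0, 0])
```

In practice a survival library such as lifelines or scikit-survival would be used, along with the competing-risks estimators the abstract mentions.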
Presenter Bio
Weijie Sun is a Ph.D. student in the Department of Computing Science at the University of Alberta, under the supervision of Prof. Russ Greiner and Prof. Abram Hindle. His research primarily focuses on predicting future clinical events using multi-modality clinical features, such as electrocardiogram (ECG) data and electronic health records. Weijie earned his Master’s degree at the University of Alberta (supervised by Prof. Russ Greiner and Prof. Padma Kaul), where his thesis focused on machine learning models for diagnosis and prognosis from ECG data and led to several publications in reputable journals and conferences, including npj Digital Medicine.
Watch on YouTube (coming soon)
Speaker
Shi-ang Qi, PhD student at the University of Alberta, supervised by Dr. Russ Greiner
Title
Explicitly Modeling Censoring Yields Sharper Estimates for Survival Analysis
Abstract
In survival analysis, it is standard to assume that an instance's event time and censoring time are conditionally independent given the features. Under this assumption, a simple linear survival model does not benefit from modeling the censoring distribution. However, in the era of deep learning and non-linear models, we propose that explicitly modeling the censoring distribution can lead to more expressive latent representations, thereby improving time-to-event estimation. To address more complex scenarios, our system learns four disjoint sets of internal, latent representations: those affecting only the event distribution, those affecting only the censoring distribution, those influencing both, and those relevant to neither. Building on these insights, we propose a framework that learns decomposed latent representations corresponding to the first three categories, mitigating biases introduced by censoring. We apply the proposed method to four popular deep-learning survival models and compare it to existing SOTA models on nine datasets to demonstrate its superior performance.
Presenter Bio
Shi-ang Qi is a Ph.D. candidate in the Department of Computing Science at the University of Alberta, supervised by Dr. Russell Greiner. His research focuses on machine learning for healthcare, survival analysis, causal inference, and recommendation systems. He collaborates with multidisciplinary teams to develop and deploy AI-driven solutions for real-world challenges across healthcare, finance, and technology. His work blends theory and application, showcasing his leadership in translating research into impactful, practical innovations.
Watch on YouTube (coming soon)
Speaker
Dagmar Loftsdottir, PhD student at the University of Alberta, supervised by Dr. Matthew Guzdial
Title
Process-Centered Design for Creativity Support Tools
Abstract
The design of creativity support tools can be driven by a variety of factors. One consideration for such tools is what practices are already in place for the intended user. In this seminar, Dagmar will explore their findings in designing systems with existing processes in mind. The scope of these systems ranges from texture synthesis to assistive tools for animating sprites.
Presenter Bio
Dagmar is a PhD student at the University of Alberta working with Dr. Matthew Guzdial. Their research focuses on designing co-creative tools for illustration. Dagmar received their MSc from the University of Alberta and their BSc from the University of Reykjavík and previously worked as a research engineer at Nox Medical.
Speaker
Dr. Ti-Rong Wu, Academia Sinica, Taipei, Taiwan, hosted by Dr. Martin Mueller
Title
Exploring Game AI through Planning
Abstract
Planning has proven to be a powerful approach in building game AI, as demonstrated by the success of AlphaZero and MuZero algorithms. In this talk, I will present my research journey on planning methods for game AI, focusing on three key directions. First, pursuing efficiency, I will introduce methods that incorporate options into planning to improve the scalability of AlphaZero-style algorithms. Second, pursuing better human interaction, I will describe approaches for estimating and adjusting playing strength, enabling game AI to dynamically match human opponents and styles, thereby creating more engaging experiences. Finally, pursuing perfection, I will present approaches that combine heuristics with exact methods to tackle goal-achieving problems and game solving. Together, these directions demonstrate how planning can make game AI more efficient, adaptive, and powerful, with potential applications extending beyond board games.
Presenter Bio
Dr. Ti-Rong Wu is an Assistant Research Fellow/Professor at the Institute of Information Science, Academia Sinica, Taiwan. He received his Ph.D. in Computer Science from National Chiao Tung University in 2020 and joined Academia Sinica in 2022. His research interests include planning, reinforcement learning, computer games, and deep learning. He led the team that developed the computer Go program CGI, which has won multiple gold medals at the Computer Olympiad and took second place in the first World AI Open in 2017. His research has been published at top-tier AI conferences and in journals, including AAAI, IJCAI, ICLR, NeurIPS, IEEE CIM, and IEEE TAI.
Website
homepage.iis.sinica.edu.tw/pages/tirongwu/
Watch on YouTube (coming soon)
Speaker
Esraa Elelimy, PhD student at the University of Alberta, supervised by Dr. Martha White
Title
Deep Reinforcement Learning with Gradient Eligibility Traces
Abstract
Achieving fast and stable off-policy learning in deep reinforcement learning (RL) is challenging. Most existing methods rely on semi-gradient temporal-difference (TD) methods for their simplicity and efficiency, but are consequently susceptible to divergence. More principled approaches like Gradient TD (GTD) methods have strong convergence guarantees, but they have rarely been used in deep RL. Recent work introduced the Generalized Projected Bellman Error (GPBE), enabling GTD methods to work efficiently with nonlinear function approximation. However, that work is limited to one-step methods, which are slow at credit assignment and require a large number of samples. In this paper, we extend the GPBE objective to support multistep credit assignment based on the λ-return and derive three gradient-based methods that optimize this new objective. We provide both a forward-view formulation compatible with experience replay and a backward-view formulation compatible with streaming algorithms. Finally, we evaluate the proposed algorithms and show that they outperform both PPO and StreamQ in MuJoCo and MinAtar environments, respectively.
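For background, the λ-return target that this multistep extension builds on can be computed with a simple backward recursion; the rewards and value estimates below are synthetic, and this sketch is not one of the paper's proposed algorithms.

```python
import numpy as np

def lambda_returns(rewards, values, gamma, lam):
    """Forward-view lambda-return targets for a finite trajectory.

    rewards[t] is the reward after step t; values[t] is the estimated
    value of the state reached after step t.  Uses the recursion
        G_t = r_t + gamma * ((1 - lam) * v_t + lam * G_{t+1}),
    bootstrapping from the final state value.
    """
    T = len(rewards)
    G = np.zeros(T)
    g = values[-1]                       # bootstrap target at the end
    for t in reversed(range(T)):
        g = rewards[t] + gamma * ((1 - lam) * values[t] + lam * g)
        G[t] = g
    return G

# lam = 0 recovers one-step TD targets; lam = 1 approaches Monte Carlo.
targets = lambda_returns([1.0, 1.0, 1.0], [0.5, 0.5, 0.5],
                         gamma=0.9, lam=0.9)
```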
Presenter Bio
Esraa is a second-year PhD student in the RLAI lab, advised by Martha White. Her research interests lie primarily in online reinforcement learning, real-time recurrent learning, and continual learning.
Watch on YouTube (coming soon)
Speaker
Daniel Haight, President & Co-Founder, Darkhorse Analytics
Title
Analytics and AI in the Real World
Abstract
In this talk, we discuss lessons from a thirty-year career in analytics, AI, and consulting. We'll share strong opinions (that are subject to change) about how and why (and why not) people adopt analytical insights.
Presenter Bio
Daniel Haight is the President and co-founder of Darkhorse Analytics. He is a Certified Analytics Professional and an award-winning lecturer at the University of Alberta School of Business. His work has spanned healthcare, energy, marketing, professional sports and transportation. His current work focuses on predictive analytics and data visualization. His goal is to help managers make better decisions by combining their experiences with the power of analytics.
Watch on YouTube (coming soon)
Speaker
Jake Tuero, PhD student, University of Alberta, supervised by Dr. Levi Lelis & Dr. Michael Buro
Title
Subgoal-Guided Policy Heuristic Search with Learned Subgoals
Abstract
Policy tree search is a family of tree search algorithms that use a policy to guide the search. These algorithms provide guarantees, based on the quality of the policy, on the number of expansions required to solve a given problem. While they have shown promising results, training the policy requires complete solution trajectories, which are obtained through a trial-and-error search process. When the training problem instances are hard, learning can be prohibitively costly, especially when starting from a randomly initialized policy, and search samples are wasted in failed attempts to solve these hard instances. This talk introduces a novel method for learning subgoal-based policies for policy tree search algorithms. The subgoals, and the policies conditioned on them, are learned from the trees that the search expands while attempting to solve problems, including the search trees of failed attempts. We empirically show that our policy formulation and training method improve the sample efficiency of learning a policy and heuristic function in this online setting.
Presenter Bio
Jake Tuero is a PhD student at the University of Alberta, under the supervision of Michael Buro and Levi Lelis. His primary research interest lies in policy tree search, which combines policy learning from reinforcement learning methods with tree search algorithms. His work has been published at several top conferences and journals, including AAAI and ICML.
Speaker
Ahmed Hendawy, PhD Student at LiteRL and IAS, TU Darmstadt
Title
Multi-Task Reinforcement Learning via Mixture of Experts
Abstract
This talk explains the role of mixture of experts in Multi-Task Reinforcement Learning (MTRL). We show how to acquire diverse representations to enable effective knowledge sharing among different tasks. In addition, we introduce a novel approach to enhance the inference-time efficiency of mixture of experts-based MTRL methods.
Presenter Bio
Ahmed Hendawy is a fourth-year Ph.D. student affiliated with the LiteRL and IAS research groups at TU Darmstadt, Germany, under the supervision of Prof. Carlo D’Eramo and Prof. Jan Peters. His research focuses on Reinforcement Learning (RL), with a particular interest in multi-task learning. Ahmed aims to boost the learning process of agents by leveraging knowledge across multiple tasks through a mixture of models. He develops RL algorithms that are application-agnostic, though he leans towards robot learning applications.
Watch on YouTube (coming soon)
Speaker
Hon Tik (Rick) Tse, PhD student at the University of Alberta, supervised by Dr. Marlos Machado
Title
Reward-Aware Proto-Representations in Reinforcement Learning
Abstract
This talk introduces the default representation, a reward-aware analogue of the successor representation. We will explain how the default representation captures reward dynamics, how it is similar to and different from the successor representation, and how its reward-awareness can be beneficial in several settings.
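For context, the successor representation (SR) that the default representation parallels is standard: for a fixed policy with state-transition matrix P and discount γ, the SR is M = (I - γP)^(-1), the expected discounted future occupancy of each state, and it turns any reward vector r into values via v = Mr. The three-state chain below is a made-up example, not from the talk.

```python
import numpy as np

gamma = 0.9
P = np.array([[0.0, 1.0, 0.0],    # state 0 always moves to state 1
              [0.0, 0.0, 1.0],    # state 1 always moves to state 2
              [0.0, 0.0, 1.0]])   # state 2 is absorbing

# Successor representation: expected discounted future state occupancies.
M = np.linalg.inv(np.eye(3) - gamma * P)

# The SR separates dynamics from reward: values for any reward vector
# come from a single matrix-vector product.
r = np.array([0.0, 0.0, 1.0])     # reward only in the absorbing state
v = M @ r                         # -> [8.1, 9.0, 10.0]
```

A reward-aware representation like the one in the talk would fold reward information into the representation itself, rather than supplying it only at evaluation time.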
Presenter Bio
Hon Tik Tse is a PhD student at the University of Alberta, focusing on reinforcement learning. His current research focuses on learning representations to facilitate exploration, transfer, and credit assignment.
Speaker
Aidan Bush, recent graduate, University of Alberta, supervised by Dr. Ioanis Nikolaidis
Title
An Empirical Investigation of Multi-Agent Contextual Bandits for Deflection Routing
Abstract
In this seminar from the Alberta Machine Intelligence Institute and the Department of Computing Science, Aidan Bush, a recent University of Alberta graduate, describes the deflection routing problem before presenting his solution. The deflection routing problem involves selecting paths through a non-stationary network while minimizing loss and delay. In the presented solution, paths are formed through individual routing decisions made by contextual bandit agents at each network switch.
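As a rough sketch of the kind of learner involved (not the presented system itself), a per-switch agent can be written as an epsilon-greedy contextual bandit over discrete contexts, such as binned congestion levels, with one arm per outgoing link:

```python
import random

class EpsilonGreedyBandit:
    """Per-switch routing agent: one value estimate per (context, arm)."""

    def __init__(self, n_contexts, n_arms, epsilon=0.1):
        self.epsilon = epsilon
        self.counts = [[0] * n_arms for _ in range(n_contexts)]
        self.values = [[0.0] * n_arms for _ in range(n_contexts)]

    def select(self, context):
        """Pick an outgoing link: explore with prob. epsilon, else greedy."""
        if random.random() < self.epsilon:
            return random.randrange(len(self.values[context]))
        vals = self.values[context]
        return max(range(len(vals)), key=vals.__getitem__)

    def update(self, context, arm, reward):
        """Incremental mean of observed rewards (e.g. negative loss/delay)."""
        self.counts[context][arm] += 1
        n = self.counts[context][arm]
        self.values[context][arm] += (reward - self.values[context][arm]) / n
```

In a non-stationary network, a constant step size would typically replace the 1/n average so that stale congestion patterns are forgotten.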
Presenter Bio
Aidan Bush is a recent graduate of the University of Alberta and a member of RLAI. His research has focused on the practical application of Reinforcement Learning and Multi-Agent Reinforcement Learning, encompassing domains such as communication network routing, congestion control, and building energy management. Aidan is interested in developing robust and scalable AI solutions for real-world systems.
Speaker
Dr. Reza Sabbagh, CEO of iClassifier, Adjunct Professor in Mechanical Engineering at the U of A
Title
iPLF System: Implementing AI and Computer Vision for Livestock Assessment
Abstract
In this seminar from the Alberta Machine Intelligence Institute, Technology Alberta, and the Department of Computing Science, Dr. Reza Sabbagh, CEO of iClassifier, explains how integrating AI and computer vision through intelligent Precision Livestock Farming (iPLF) systems can transform livestock health and behavior monitoring. By automating assessments through image analysis, this approach improves accuracy, reduces labor, and supports animal welfare, enabling smarter, data-driven decision-making in modern agriculture.
Presenter Bio
Dr. Reza Sabbagh, CEO at iClassifier Inc. and Adjunct Professor in the Department of Mechanical Engineering at the University of Alberta, is an expert in business development and project management. He holds a PhD in mechanical engineering from the University of Alberta and has 20 years of experience in design, manufacturing, and product development. He is also experienced in imaging techniques, visualization, image processing, and optical diagnostics, and conducts research in microfluidics and biomedical applications related to flow in high-curvature geometries.
Speaker
Dr. Khaled Barakat, CEO of Thoth Biosimulations Inc. & Professor in the Faculty of Pharmacy and Pharmaceutical Sciences, University of Alberta
Title
AI in Drug Discovery: From Co-Pilot to Driver’s Seat?
Abstract
Artificial intelligence (AI) is no longer a futuristic add-on in drug discovery. It’s a real and rapidly evolving force. What began as a supportive tool for data analysis and hit identification is now being positioned to lead critical stages of the discovery pipeline. From generative molecular design and predictive modeling to multi-objective optimization and decision-making, AI is moving closer to the driver’s seat. But are we truly ready to hand over the wheel?
In this talk, Dr. Barakat will explore the evolving role of AI in drug discovery, examining its current capabilities, transformative successes, and key limitations. He will draw on real-world case studies and lessons from both academia and industry to assess whether AI can—and should—lead the innovation process. We’ll also discuss the importance of human oversight, domain expertise, and experimental validation in making AI-driven discovery reliable, ethical, and effective. Ultimately, this session will provoke critical reflection on the future of AI-human collaboration in one of the most high-stakes arenas of science.
Presenter Bio
Dr. Khaled Barakat is a Full Professor at the Faculty of Pharmacy and Pharmaceutical Sciences at the University of Alberta, and the CEO and Chief Scientific Officer of Thoth Biosimulations Inc., a Canadian startup pioneering AI-driven drug discovery. With a multidisciplinary background in engineering, biophysics, and pharmaceutical sciences, Dr. Barakat is internationally recognized for his research in computational drug design, immunotherapy, and ion channel modulation. He has led groundbreaking work in the development of small-molecule inhibitors for cancer immunotherapy, antiviral therapies, and cardioprotective agents. His team integrates molecular modeling, AI, and bioinformatics to design novel therapeutics and decode complex biological systems. Dr. Barakat has secured major funding from CIHR, NSERC, Alberta Innovates, Alberta Cancer Foundation, Li Ka Shing Institute of Virology, and the Kidney Foundation to support his translational research. He also leads funded collaborations with academic and industry partners to validate AI-designed therapeutics. In addition to his research, Dr. Barakat is a dedicated mentor and educator, advancing training in molecular modeling and AI applications in drug discovery. His work bridges academia and industry, driving innovation that addresses urgent biomedical challenges.
Speaker
Prabhat Nagarajan, PhD student, supervised by Dr. Marlos Machado & Dr. Martha White
Title
Recent Insights in Value-based Deep Reinforcement Learning
Abstract
In this seminar from the Alberta Machine Intelligence Institute and the Department of Computing Science, Prabhat Nagarajan, a PhD student at the University of Alberta under Marlos C. Machado, explores recent findings in value-based deep reinforcement learning. These findings include the tandem effect, the phenomenon of policy churn, and the curse of diversity in ensemble-based exploration.
Presenter Bio
Prabhat Nagarajan is a PhD student in reinforcement learning (RL), working with Marlos C. Machado. He received undergraduate and master's degrees in Computer Science from the University of Texas at Austin, and has interned at Sony AI and Microsoft Research. His research has touched on a variety of topics in RL, including deep RL, reproducibility, inverse RL, reward shaping, RL for language models, temporal difference learning, and goal-conditioned RL.
Speaker
Dr. Ostap Okhrin, Professor at TU Dresden, Germany, hosted by Dr. Rich Sutton
Title
Two-Sample Testing in Reinforcement Learning
Abstract
Overestimation bias is a known threat to value-based reinforcement learning algorithms and can sometimes lead to dramatic performance decreases or even complete algorithmic failure. We frame the bias problem statistically and consider it an instance of estimating the maximum expected value (MEV) of a set of random variables. We propose the T-Estimator (TE), based on two-sample testing for the mean, which flexibly interpolates between over- and underestimation by adjusting the significance level of the underlying hypothesis tests. We introduce modifications of Q-Learning and the Bootstrapped Deep Q-Network (BDQN) using the TE, and prove convergence in the tabular setting.
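The bias in question is easy to reproduce numerically: the maximum of several sample means is a positively biased estimator of the maximum expected value. The sketch below only demonstrates the bias; it does not implement the proposed T-Estimator.

```python
import numpy as np

rng = np.random.default_rng(0)

# Five actions with identical true value 0, so the true MEV is 0.
n_actions, n_samples, n_trials = 5, 10, 2000

estimates = []
for _ in range(n_trials):
    # Each empirical mean of n_samples unit-variance rewards is
    # approximately N(0, 1/n_samples); take the best-looking one.
    sample_means = rng.normal(0.0, 1.0 / np.sqrt(n_samples), size=n_actions)
    estimates.append(sample_means.max())

bias = float(np.mean(estimates))   # clearly positive, despite true MEV = 0
```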
Presenter Bio
Dr. Ostap Okhrin is a Professor of Econometrics and Statistics, especially in Transportation, at the Institute of Transport and Economics, TU Dresden, Germany. He has co-authored nearly 100 publications in the fields of mathematical and applied statistics, econometrics, and reinforcement learning, with applications to finance, economics, and autonomous driving.
Website
RL Dresden - Reinforcement Learning Research Group
Speaker
Sheila Schoepp, PhD student, supervised by Dr. Osmar Zaïane & Dr. Matthew Taylor
Title
The Evolving Landscape of LLM- and VLM-Integrated Reinforcement Learning
Abstract
Large language models (LLMs) and vision-language models (VLMs) enrich reinforcement learning (RL) with knowledge, communication, and reasoning—capabilities that standard RL lacks. Our survey, "The Evolving Landscape of LLM- and VLM-Integrated Reinforcement Learning", classifies the role of the foundation model (FM) into three categories: Agent, where the FM selects actions; Planner, where the FM decomposes long-horizon tasks into sub-goals; and Reward, where the FM supplies preference feedback or autonomously proposes, tests, and refines reward functions.
Presenter Bio
Sheila Schoepp is a PhD student in Computing Science at the University of Alberta, supervised by Drs. Osmar R. Zaïane and Matthew E. Taylor. Her work bridges reinforcement learning and foundation models, and she has co-authored a survey on their intersection. Sheila's research now focuses on cooperative multi-agent systems driven by foundation models (FMs), aiming to create FM agents that learn to cooperate with one another, as well as with humans, in both simulated and real-world settings.
Speaker
Reginald McLean, PhD Candidate, Toronto Metropolitan University, hosted by Dr. Marlos Machado
Title
Multi-Task Reinforcement Learning Enables Parameter Scaling
Abstract
Multi-task reinforcement learning (MTRL) aims to endow a single agent with the ability to perform well on multiple tasks. Recent works have focused on developing novel sophisticated architectures to improve performance, often resulting in larger models; it is unclear, however, whether the performance gains are a consequence of the architecture design itself or the extra parameters. We argue that gains are mostly due to scale by demonstrating that naively scaling up a simple MTRL baseline to match parameter counts outperforms the more sophisticated architectures, and these gains benefit most from scaling the critic over the actor. Additionally, we explore the training stability advantages that come with task diversity, demonstrating that increasing the number of tasks can help mitigate plasticity loss. Our findings suggest that MTRL's simultaneous training across multiple tasks provides a natural framework for beneficial parameter scaling in reinforcement learning, challenging the need for complex architectural innovations.
Presenter Bio
Reginald (Reggie) McLean (he/him) is a final-year PhD Candidate in Computer Science at Toronto Metropolitan University, where he has studied the skill-acquisition capabilities of autonomous agents, with interests in multi-task, continual, and inverse reinforcement learning. Throughout his doctoral studies, Reggie has taught at post-secondary institutions at both the college and university levels, while also collaborating with researchers from Intel and Google DeepMind. Prior to his PhD, Reggie received his Master of Science degree from Brock University, where he researched the ability of swarm-based algorithms to train neural networks. Approaching dissertation completion, Reggie is seeking postdoctoral opportunities to further develop the learning capabilities of autonomous agents and to expand his collaborative research. In his free time, Reggie enjoys spending time outside with his dogs, golfing, or reading fantasy/sci-fi novels.
Speaker
Dr. Sébastien Gambs, Université du Québec à Montréal, hosted by Dr. Bailey Kacsmar
Title
Understanding and Addressing Fairwashing in Machine Learning
Abstract
Fairwashing refers to the risk that an unfair black-box model can be explained by a fairer model through post-hoc explanation manipulation. In this talk, I will first discuss how fairwashing attacks can transfer across black-box models, meaning that other black-box models can perform fairwashing without explicitly using their predictions. This generalization and transferability of fairwashing attacks imply that their detection will be difficult in practice. Finally, I will nonetheless review some possible avenues of research on how to limit the potential for fairwashing.
Presenter Bio
Dr. Sébastien Gambs has held the Canada Research Chair in Privacy and Ethical Analysis of Massive Data since December 2017 and has been a professor in the Department of Computer Science at the Université du Québec à Montréal since January 2016. His main research theme is privacy in the digital world. He is also interested in solving long-term scientific questions such as the existing tensions between massive data analysis and privacy as well as ethical issues such as fairness, transparency and algorithmic accountability raised by personalized systems.
NO SEMINAR - Upper Bound
Speaker
Rohan Saha, PhD student, supervised by Dr. Alona Fyshe
Title
Assessing the impact of curriculum pre-training in small-scale multimodal models
Abstract
Inspired by human pedagogical strategies, curriculum learning methods order training samples according to a predefined difficulty metric. In this talk, we will present an empirical study of curriculum pre-training for small-scale vision-language models under limited-data conditions. Our work, as part of the BabyLM Challenge 2024, demonstrates that curriculum pre-training can improve model performance in limited-data multimodal settings.
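The core curriculum mechanic can be sketched in a few lines: sort samples by a difficulty metric, then train on a growing easy-first prefix. The caption data and the length-based difficulty metric below are stand-ins, not the actual BabyLM setup.

```python
def curriculum_order(samples, difficulty):
    """Order training samples easiest-first by a difficulty metric."""
    return sorted(samples, key=difficulty)

def pacing(ordered, step, total_steps):
    """Linear pacing: expose a growing easy-first prefix of the data."""
    frac = min(1.0, 0.25 + 0.75 * step / total_steps)
    return ordered[:max(1, int(frac * len(ordered)))]

# Toy stand-in: shorter captions are treated as "easier" examples.
captions = ["a dog chasing a red ball", "a dog", "a dog on grass"]
ordered = curriculum_order(captions, difficulty=len)
```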
Presenter Bio
Rohan Saha is a PhD student at the University of Alberta specializing in Natural Language Processing. His research focuses on efficient training methods and compositional learning in small language models. He is currently investigating how modern language models understand and solve problems involving the composition of sub-tasks. During his master’s degree, he explored human language processing through the lens of embedding models.
Speaker
Emily Halina, PhD student, supervised by Dr. Matthew Guzdial
Title
Generating Valid Virtual Environments via Path-based Reconstructive Partitioning
Abstract
This talk discusses the challenges in generating levels for video games via machine learning with strong playability and cohesion guarantees. We will give an overview of two state-of-the-art approaches, tree-based and path-based reconstructive partitioning, and discuss applications to domains including reinforcement learning and game design.
Presenter Bio
Emily Halina is a PhD student at the University of Alberta researching human-centered artificial intelligence. Her research interests include procedural content generation for games, machine learning, and reinforcement learning. She has published work at multiple top AI in games venues, including IJCAI and AIIDE. Before academia, Emily worked as an indie game developer, and as a result she is particularly interested in helping independent developers make cool, weird games using her research.
Speaker
Shay Sharma, Founder of Duologue Systems
Title
Use of artificial intelligence in the justice sector
Abstract
The talk will feature a novel Canadian solution to assist with complex resource navigation and mental health supports for folks in the justice system.
Content Advisory
This presentation discusses sensitive topics that may be distressing to some attendees, including prison and justice systems, suicide, mental health struggles, depression, addictions, substance abuse, and Fetal Alcohol Spectrum Disorder. For mental health support, see the UofA "Supports available during a crisis" page.
Presenter Bio
Shay Sharma, Founder of Duologue Systems, brings 8+ years of experience in product development and digital transformation across government and non-profits. With an engineering degree from the University of Alberta and a product development certification from PDMA, Shay excels in user-centric design and stakeholder management, with a particular focus on social impact initiatives. Shay has been central to the design, development, and deployment of several customized digital platforms and AI models across Canadian public entities, including digitizing licensure for internationally trained physicians and building multi-modal social engagement platforms for non-profits.
Speaker
Zijun Wu, PhD student, supervised by Dr. Lili Mou
Title
The emergence of latent structures in natural language processing
Abstract
This talk introduces symbolic bottleneck networks, our proposed concept for improving interpretability in neural models. We present two works: the first applies fuzzy logic for explainable phrase-level reasoning in NLI under weak supervision; the second uses a hierarchical RNN to induce phrase-level chunking structures through downstream tasks. Symbolic bottleneck networks highlight an interpretability-performance trade-off in NLP.
Presenter Bio
Zijun Wu is a PhD student in NLP and Machine Learning at the University of Alberta, supervised by Dr. Lili Mou. His research focuses on the emergence of symbolic structures in neural networks, aiming to improve model interpretability and communication. He proposed the concept of symbolic bottleneck networks and has published in ICLR and Computational Linguistics.
NO SEMINAR - Good Friday
Speaker
Justin Stevens, University of Alberta, supervised by Dr. Vadim Bulitko
Title
Intelligent Tutoring Systems for Math, CS, and Games
Abstract
Justin will detail a path toward creating intelligent tutoring systems for math education, computer science education, and video games. In particular, Justin hopes to create tutoring systems that can give hints to learners that help guide them toward solving a tricky math problem, coding challenge, or puzzle in a game. The first step of Justin’s research plan is figuring out the right time to give a hint to a learner, which involves user modelling and goal recognition. The second step is to figure out the best hint to give a learner, which involves generative AI, verification, and explainable AI. The third step is to figure out how much the hints help the learners compared to other pedagogical strategies.
In the second half of this talk, Justin will detail several of his papers that stemmed from his time in Edmonton. In his master’s work, Justin wrote several papers on explainable AI applied to games. Furthermore, while working at Amii, Justin received the New & Future AI Educator Award at AAAI 2024. There, Justin presented his blue sky idea for the future of AI in education. He’ll give a quick version of this 5-minute presentation to wrap up the seminar.
Presenter Bio
Justin Stevens is an incoming doctoral student at Washington University in St. Louis. He previously earned his M.Sc. in Computing Science from the University of Alberta and worked as a Machine Learning Educator at Amii. Justin is known for his extensive teaching experience and curriculum development in AI education. Justin is a prominent figure in Edmonton's tech community, having co-founded the Undergraduate AI Society and volunteered actively with the CS Graduate Students’ Association and Technology Alberta.
NO SEMINAR - due to illness
Speaker
Dr. Mamatha Bhat, University Health Network, Ajmera Transplant Centre, hosted by Dr. Russ Greiner
Title
Development of AI tools in Transplantation: Current and Future Prospects
Abstract
Artificial Intelligence (AI) tools have been increasingly applied to clinical questions in transplant medicine in recent years. As we continue to push the limits of transplantation, there are many challenges throughout transplant medicine that must be better addressed including equity and objectivity in decision-making. Various factors affect liver transplant pathology and outcomes, including sex, ethnicity, genetics, BMI, diabetes, and immunosuppressive regimens. There exist complex, non-linear patterns in laboratory tests that must be considered in conjunction with the complex clinical variables to predict outcome. Additionally, electronic health record data, imaging technologies, histology, and ‘omics data have continued to expand the types of data available. These complex data points, their hidden patterns and interrelationships can be uniquely leveraged with the use of AI tools. Longitudinal changes in these variables are also being examined to provide a continuous reassessment of risk along the timeline. Applications of AI in transplant medicine include waitlist prioritization, donor-recipient matching, and short-term/long-term outcome prediction. In this talk, I will go over these considerations with respect to application of AI in transplant medicine. I will additionally discuss integration of ‘omics data, as well as perspectives regarding clinical deployment of AI tools.
Presenter Bio
Dr. Mamatha Bhat is a Hepatologist and Clinician-Scientist at the University Health Network's Ajmera Transplant Centre, Scientist at TGHRI and Associate Professor of Medicine at the University of Toronto. Dr. Bhat completed her medical school and residency training at McGill University. She then completed a Transplant Hepatology fellowship at the Mayo Clinic in Rochester, Minnesota, followed by a CIHR Fellowship for Health Professionals, through which she completed a PhD in Medical Biophysics. The goal of Dr. Bhat's research program is to improve long-term outcomes of liver transplantation by developing Artificial Intelligence tools that integrate clinical and 'omics data; her program has been funded by CIHR, the Terry Fox Research Institute, the Canadian Liver Foundation, and the American Society of Transplantation, among others. She has published over 155 papers in journals such as Lancet Digital Health, Journal of Hepatology, JAMA Surgery, Gut and Hepatology.
Speaker
Tian Tian, PhD student, supervised by Dr. Rich Sutton
Title
Confident Natural Policy Gradient for Local Planning in q_pi-realizable Constrained MDPs
Abstract
The constrained Markov decision process (CMDP) framework emerges as an important reinforcement learning approach for imposing safety or other critical objectives while maximizing cumulative reward. However, the current understanding of how to learn efficiently in a CMDP environment with a potentially infinite number of states remains under investigation, particularly when function approximation is applied to the value functions. In this paper, we address the learning problem given linear function approximation with $q_{\pi}$-realizability, where the value functions of all policies are linearly representable with a known feature map, a setting known to be more general and challenging than other linear settings. Utilizing a local-access model, we propose a novel primal-dual algorithm that, after $\tilde{O}(\mathrm{poly}(d)\,\epsilon^{-3})$ queries (where $\tilde{O}(\cdot)$ hides logarithmic factors), outputs with high probability a policy that strictly satisfies the constraints while nearly optimizing the value with respect to a reward function. Here, $d$ is the feature dimension and $\epsilon > 0$ is a given error. The algorithm relies on a carefully crafted off-policy evaluation procedure to evaluate the policy using historical data, which informs policy updates through policy gradients and conserves samples. To our knowledge, this is the first result achieving polynomial sample complexity for CMDP in the $q_{\pi}$-realizable setting.
Presenter Bio
Tian Tian is a PhD student in Computing Science at the University of Alberta, working under the supervision of Rich Sutton and collaborating with Ling F. Yang from UCLA and Csaba Szepesvári. She completed her master's degree at the University of Alberta under Rich Sutton's guidance, following her bachelor's degree in Computer Engineering and Statistics from the same institution. Tian Tian's primary research interest lies in reinforcement learning theory.
Speaker
Shi-ang Qi, PhD student, supervised by Dr. Russ Greiner
Title
Enhancing Survival Analysis: Improving Calibration without Compromising Discrimination
Abstract
While survival models have traditionally focused on discrimination -- accurately ranking patient risks (e.g., prioritizing patients for a rare transplant) -- there is a growing interest in improving calibration performance, reflecting the alignment of predicted probabilities with the actual distribution of observations. As these two properties measure distinct aspects of a model, it is hard to optimize both objectives simultaneously -- indeed, many previous studies found that improving calibration tends to diminish discrimination performance. This talk introduces two novel approaches, which use conformal prediction to disentangle calibration from discrimination, allowing us to improve a model's calibration without degrading its discrimination. We provide theoretical guarantees for this claim and demonstrate its effectiveness across diverse scenarios.
Presenter Bio
Shi-ang Qi is a Ph.D. candidate in the Department of Computing Science at the University of Alberta, supervised by Dr. Russell Greiner. His research focuses on machine learning for healthcare, survival analysis, causal inference, and recommendation systems. He collaborates with multidisciplinary teams to develop and deploy AI-driven solutions for real-world challenges across healthcare, finance, and technology. His work blends theory and application, showcasing his leadership in translating research into impactful, practical innovations.
Speaker
Nafaa Haddou, Founder of Firesafe AI
Title
FireSafe AI: Applying AI and Machine Learning to Solving Critical Issues in Wildfire Mitigation
Abstract
FireSafe AI empowers communities, emergency responders, and policymakers with actionable insights to build smarter, data-driven wildfire prevention strategies, protecting both lives and ecosystems. FireSafe AI leverages advanced artificial intelligence and machine learning to revolutionize wildfire mitigation. By analyzing vast datasets—weather patterns, vegetation density, historical fires, and real-time sensor data—FireSafe AI predicts fire risks, identifies vulnerable areas, and enhances early detection. Its predictive modeling helps allocate resources more effectively, improving response times and reducing environmental and economic damage.
Presenter Bio
Nafaa Haddou is the visionary co-founder and CEO of Nu Terra Labs, the innovative company behind FireSafe AI. With over a decade of experience in the information technology sector, Nafaa has channeled his expertise into developing cutting-edge solutions for one of the most pressing environmental challenges of our time: wildfire management.
As the driving force behind FireSafe AI, Nafaa leads a team dedicated to wildfire detection and mitigation through the application of artificial intelligence and machine learning. Under his guidance, Nu Terra Labs has created a comprehensive system that combines real-time surveillance, predictive analytics, and resource optimization to combat wildfires more effectively.
Nafaa's entrepreneurial journey began in 2021 when he co-founded Nu Terra Labs with his brother Ismail. Initially focused on vertical farming and automation, Nafaa's keen insight into emerging technologies and environmental needs led to a strategic pivot towards wildfire management.
His innovative approach has garnered significant attention in the tech and environmental sectors. Nafaa's leadership has steered FireSafe AI through successful pilot projects, including a collaboration with the Municipal District of Bighorn, and has secured letters of intent from various potential clients. Notably, FireSafe AI is part of Cohort 6 of the Community Safety and Wellness Accelerator supported by Alberta Innovates, further validating the potential impact of their technology.
A testament to his forward-thinking vision, Nafaa was selected for the inaugural THRIVE Academy agrifood pre-accelerator program in 2022, further solidifying his position as a leader in agritech innovation.
In his presentation, "FireSafe AI: Applying AI and Machine Learning to Solving Critical Issues in Wildfire Mitigation," Nafaa will share insights into the technology behind FireSafe AI and discuss the potential of AI to address urgent environmental challenges.
Speaker
Fei Wang, PhD student, supervised by Dr. Russ Greiner & Dr. David Wishart
Title
Towards Reference-Free Metabolite Identification: Part 1 – Predicting How Molecules Blow Up
Abstract
Identifying chemicals from mass spectra is essential in medicine, food, and environmental science. Traditional methods match spectra to libraries, an approach that is limited by incomplete library coverage. Predicting spectra can improve identification by augmenting real libraries, yet existing models struggle with resolution, scalability, and interpretability. We introduce FraGNNet, a probabilistic approach that efficiently and accurately predicts high-resolution spectra. FraGNNet leverages a structured latent space to reveal underlying processes that define the spectrum, enhancing interpretability and insight into chemical fragmentation.
Presenter Bio
Fei Wang is a PhD student in the Computing Science Department of the University of Alberta, interested in machine learning models for small molecules and mass spectra.
NO SEMINAR - Reading Week
Speaker
Dr. Michael Bowling, Amii Fellow, University of Alberta
Title
Rethinking the Foundations for Continual Reinforcement Learning
Abstract
Efforts to build systems that continually learn and adapt in an ever changing, complex world, often build on the foundations and standard practices of traditional reinforcement learning. In this talk, I will argue that sacredly held concepts, such as optimal policies and Markov decision processes, may collectively be holding us back more than helping us move forward. I will then propose an alternative set of foundations. I hope to spur on others to rethink these traditional foundations, but also start thinking about new algorithms enabled by alternative, better-suited foundations.
Presenter Bio
Dr. Michael Bowling is a Professor at the University of Alberta, a Canada CIFAR AI Chair, Fellow in the Alberta Machine Intelligence Institute, and a AAAI Fellow. He led the Computer Poker Research Group, which built some of the best poker playing artificial intelligence programs in the world, including being the first to beat professional players at both limit and no-limit variants of the game. He was also behind the use of Atari 2600 games to evaluate the general competency of reinforcement learning algorithms, which is now a ubiquitous benchmark suite of domains for reinforcement learning. He has published well over 150 peer-reviewed conference and journal articles, including two articles in Science. His research has been featured on the television programs Scientific American Frontiers, National Geographic Today, and Discovery Channel Canada, as well as appearing in the New York Times, Wired, on CBC and BBC radio, and twice in exhibits at the Smithsonian Museums.
Speaker
Dr. John McDougall, CEO & COO, SynBioBlox Innovations Ltd.
Title
The Next Industrial Revolution: The Use of AI and ML to Advance Synthetic Biology
Abstract
A problem like excess greenhouse gases can be solved by “enhancing mother nature”. Biological processes with higher productivity and smaller footprints become possible with synthetic biology. However, the time, cost and risk of laboratory-based trial and error approaches are major hurdles. In 2023 “Nature” stated, “genetic engineering and accelerated evolution — with AI to speed the effort will likely be the most effective way forward.” Learn how SynBioBlox is applying AI and ML to support data annotation, data analytics, modelling and prediction to design effective and efficient biological organisms.
Presenter Bio
Dr. John McDougall is President and CEO of SynBioBlox, Chair of McDougall & Secord, Limited and a director of several for-profit and not-for-profit enterprises. Honoured as one of Alberta's 50 most influential citizens, John's career spans the globe across many sectors, with a far-reaching range of accomplishments and roles related to technology, innovation and business recognized by awards, medals and fellowships from organizations including the Canadian Academy of Engineers, Engineers Canada, Mexican College of Civil Engineers and PICMET.
He began as a petroleum engineer with Imperial Oil, evolved into the ownership and management of an international engineering consulting firm and subsequently a private merchant bank before becoming President of the Alberta Research Council and then the National Research Council. John has served, often as Chair, at the local, provincial, federal and international level on numerous public and private agencies, not-for-profits and advisory committees related to trade, education, technology, innovation, engineering, economic development and employment.
John is a graduate in engineering from the University of Alberta where he also completed post graduate courses in computing science and environment engineering. He was the Inaugural Poole Chair in Management for Engineers at the UA and the founder of “Innovation School” in collaboration with leading research and technology organizations across Canada.
Speaker
Montaser Mohammedalamen, PhD candidate, supervised by Dr. Michael Bowling
Title
Generalization in Monitored Markov Decision Processes (Mon-MDPs)
Abstract
Reinforcement learning (RL) typically models the interaction between the agent and environment as a Markov decision process (MDP), where rewards guide behavior. However, many real-world scenarios involve environments where rewards are not always observable. The monitored MDP (Mon-MDP) framework addresses this challenge by modeling interactions in settings where agents cannot always observe rewards. Prior studies on Mon-MDPs have been limited to simple, tabular cases, restricting their applicability to real-world problems. This work explores Mon-MDPs in non-tabular settings using function approximation (FA) and investigates the challenges involved. We show that combining FA with a learned reward model enables agents to generalize from monitored states, where rewards are observable, to unmonitored states, where they are not, achieving near-optimal policies in environments originally deemed unsolvable. However, we identify a critical limitation of FA: its tendency to induce "over-generalization", where agents incorrectly extrapolate reward signals to unmonitored states, resulting in undesirable behaviors. To address this, we propose a cautious learning method that incorporates reward uncertainty, enabling agents to avoid undesirable outcomes. Our approach begins to bridge the gap between Mon-MDP theory and real-world applications, advancing RL in environments with partially observable rewards.
Presenter Bio
Montaser Mohammedalamen is a PhD candidate advised by Dr. Michael Bowling, exploring how AI systems can learn in environments where rewards are not always observable. His research focuses on designing autonomous agents that can act cautiously in uncertain scenarios, contributing to advancements in reinforcement learning for partially observable settings.
Before starting his PhD, Montaser worked as an AI engineer at SonyAI, where he was part of a team developing multi-agent robotic systems. His work included training agents using self-play and goal-conditioned reinforcement learning, transferring learned behaviors from simulations to real-world settings, and integrating them with vision systems and robot control methods.
Montaser is passionate about bridging theoretical research with practical applications to create adaptive and intelligent systems for complex environments.
Speaker
Dr. Randy Goebel, Amii Fellow, University of Alberta
Title
Debugging foundation models: the elephant in the room
Abstract
The scientific field of artificial intelligence (AI) has never had a stronger public presence, which can be both positive and negative. Rather than get caught up in the naive controversies about the spectrum of good and bad potential consequences of AI, we focus on some foundational scientific challenges which seem to get ignored. We claim one such challenge is the largest "elephant in the room," one that raises a fundamental question about debugging so-called "foundation models." What constitutes a foundation model is increasingly complicated by an emerging spectrum of neurosymbolic foundation models which seem to span a broad scope of representations from deep neural networks, reinforcement learning policies, Bayesian probability, and a collection of logical, non-monotonic, and belief revision representation models. We explore how several of these foundation representation systems might participate in a more general AI foundation framework, by considering the role of explainability (XAI) and debuggability. An immediate observation is that a coordinated "stack" of foundation models may provide the basis for a new era of AI.
Presenter Bio
R.G. (Randy) Goebel is currently professor of Computing Science in the Department of Computing Science, adjunct Professor in the Department of Medicine in the Faculty of Medicine and Dentistry, at the University of Alberta. He is also a Fellow and co-founder of the Alberta Machine Intelligence Institute (Amii). He received the B.Sc. (Computer Science), M.Sc. (Computing Science), and Ph.D. (Computer Science) from the Universities of Regina, Alberta, and British Columbia, respectively.
Professor Goebel's theoretical work on abductive and hypothetical reasoning and belief revision is internationally well known; his recent research is focused on the formalization of visualization and explainable artificial intelligence (XAI), especially for applications in autonomous driving, legal reasoning, and precision health. He has worked on optimization, algorithm complexity, systems biology, natural language processing, and automated reasoning.
Randy has previously held faculty appointments at the University of Waterloo, University of Tokyo, Multimedia University (Kuala Lumpur), Hokkaido University (Sapporo), visiting researcher engagements at National Institute of Informatics (Tokyo), DFKI (Germany), and NICTA (Australia); he is actively involved in collaborative research projects in Canada, Japan, Germany, France, the UK, and China.
NO SEMINAR - no speaker available
Speaker
Dr. Bingshan Hu, Postdoctoral Fellow at University of British Columbia, hosted by Dr. Nidhi Hegde
Title
Efficient and Adaptive Thompson Sampling Algorithms for Bandits
Abstract
The multi-armed bandit problem is a classical sequential decision-making problem in which the goal is to accumulate as much reward as possible. In this learning model, only a limited amount of information is revealed in each round. This imperfect feedback places the learning algorithm in a dilemma between exploration (gaining information) and exploitation (accumulating reward). Thompson Sampling, one of the oldest randomized learning algorithms, achieves a good balance between exploration and exploitation and consistently shows very competitive empirical performance. We study Thompson Sampling-based algorithms for stochastic bandits with bounded rewards. As the existing problem-dependent regret bound for Thompson Sampling with Gaussian priors is vacuous when the learning horizon T ≤ 6235149080811616882909238708, we derive a more practical bound. Additionally, motivated by large-scale real-world applications that require scalability, adaptive computational resource allocation, and a balance of utility and computation, we propose a parameterized Thompson Sampling-based algorithm, Exploration-driven Thompson Sampling, in which a parameter α ∈ [0, 1] trades off utility and computation.
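For readers new to the topic, the exploration-exploitation balance described in the abstract can be illustrated with a minimal sketch of classical Beta-Bernoulli Thompson Sampling (the textbook algorithm, not Dr. Hu's Exploration-driven variant; the two-armed environment and all parameters below are illustrative):

```python
import random

def thompson_sampling(arms, horizon, seed=0):
    """Classical Beta-Bernoulli Thompson Sampling on a Bernoulli bandit.

    arms: list of true success probabilities (unknown to the learner).
    Returns the total reward accumulated over `horizon` rounds.
    """
    rng = random.Random(seed)
    # Beta(1, 1) uniform prior over each arm's unknown mean reward.
    successes = [1] * len(arms)
    failures = [1] * len(arms)
    total = 0
    for _ in range(horizon):
        # Sample a plausible mean for each arm from its posterior, then
        # play the arm whose sample is largest: randomization does the
        # exploring, while concentrated posteriors drive exploitation.
        samples = [rng.betavariate(successes[i], failures[i])
                   for i in range(len(arms))]
        a = max(range(len(arms)), key=lambda i: samples[i])
        reward = 1 if rng.random() < arms[a] else 0
        successes[a] += reward
        failures[a] += 1 - reward
        total += reward
    return total
```

With a clearly better arm (e.g., `thompson_sampling([0.2, 0.8], 2000)`), the posterior for the better arm concentrates quickly and the algorithm plays it almost exclusively, so the accumulated reward approaches what always playing the best arm would yield.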