CAIS 2026 Trustworthy Healthcare AI Tutorial

Trustworthy Multi-Agent AI Systems for Healthcare (Tutorial)

Overview

The rapid rise of large language models and multi-agent AI systems is transforming healthcare at an unprecedented pace — reshaping how patients access information, how clinicians document care, and how health systems deliver services at scale. Yet as deployment accelerates, so do the risks. Omission errors, bias, automation complacency, and error propagation across complex agentic systems demand rigorous, principled approaches to evaluation and governance.

This tutorial provides a structured introduction to the key technical and methodological challenges in developing trustworthy agentic AI for healthcare. We examine multi-agent workflow design, pre- and post-deployment evaluation methodology, and governance frameworks, drawing on recent empirical findings and real-world deployments spanning virtual primary care, clinical documentation, and other healthcare workflows. A central theme is the inadequacy of current evaluation practice: benchmarks designed for single-model, single-turn settings systematically underestimate failure modes that emerge in interactive, multi-agent clinical systems. We conclude by surfacing open research problems, including the formal characterization of trust hierarchies in agentic systems, evaluation methodology for multi-agent systems, detection of silent performance degradation, and the operationalization of calibrated autonomy across varying healthcare risk contexts. Our goal is to stimulate further research on multi-agent AI systems for healthcare and to enable researchers and practitioners to build more robust and trustworthy healthcare AI applications.

Contributors

Anitha Kannan (Curai Health, USA)

Krishnaram Kenthapadi (Oracle Health AI, USA)

Tutorial Logistics

ACM Conference on AI and Agentic Systems (CAIS 2026)
- 9 AM - 12:30 PM PT on Tuesday, May 26, 2026 in Santa Clara (Room inside DoubleTree by Hilton San Jose) [CAIS Program Agenda]

Contributor Bios

Dr. Anitha Kannan is part of the founding team and VP of AI at Curai Health. Curai Health, founded in 2017, is an AI-driven virtual primary care clinic focused on making healthcare easily accessible and affordable by deploying AI/machine learning into clinical workflows. Dr. Kannan leads Curai's AI strategy, development, and deployment so that AI solutions are seamlessly integrated into clinician and patient workflows. Before Curai, she was a senior research scientist at Facebook AI Research and Microsoft Research. Her research impacted Microsoft products, for which she has received many technical awards, including multiple Gold Star awards. She received her PhD in machine learning from the University of Toronto in 2006 and was a Darwin Fellow at the University of Cambridge.

Dr. Krishnaram Kenthapadi is the Chief Scientist, Healthcare AI at Oracle Health, where he leads the AI initiatives for Clinical AI Agent and other Oracle Health products. Previously, as the Chief AI Officer & Chief Scientist of Fiddler AI, he led initiatives on generative AI (e.g., Fiddler Auditor, an open-source library for evaluating & red-teaming LLMs before deployment; AI safety, observability & feedback mechanisms for LLMs in production) and on AI safety, alignment, observability, and trustworthiness, as well as the technical strategy, customer-driven innovation, and thought leadership for Fiddler. Prior to that, he was a Principal Scientist at Amazon AWS AI, where he led the fairness, explainability, privacy, and model understanding initiatives in the Amazon AWS AI platform. Prior to joining Amazon, he led similar efforts on the LinkedIn AI team and served as LinkedIn’s representative on Microsoft’s AI and Ethics in Engineering and Research (AETHER) Advisory Board. Previously, he was a Researcher at Microsoft Research Silicon Valley Lab. Krishnaram received his Ph.D. in Computer Science from Stanford University in 2006. His work has been recognized through awards at NAACL, WWW, SODA, CIKM, ICML AutoML workshop, and Microsoft’s AI/ML conference (MLADS). He has published 50+ papers with 7000+ citations, and has filed 150+ patents (75 granted). He has presented tutorials on grounding and evaluation for LLMs, trustworthy generative AI, privacy, fairness, explainable AI, ML model monitoring, and responsible AI in industry at forums such as KDD, WSDM, WWW, FAccT, AAAI, and ICML, and has taught a course on responsible AI at Stanford.

Tutorial Outline and Description

The tutorial will consist of the following parts:

- Introduction: State of Healthcare and Role of AI
- How Clinical/Healthcare AI Actually Fails
- The Evaluation Problem
- Multi-Agent Systems as the Deployment Reality
- Human Factors and Trust
- Governance, Monitoring, and the Road Ahead
- Industry Case Studies

This tutorial is aimed at attendees from academia and industry with a wide range of interests and backgrounds, including researchers interested in learning about trustworthy healthcare AI challenges, techniques, and best practices, as well as practitioners interested in implementing multi-agent systems for healthcare AI applications. We do not assume any prerequisite knowledge. We present the advances, challenges, and opportunities related to trustworthy agentic AI for healthcare by building intuition, so that the material is accessible to all attendees.