Speaker : Eran Halperin (UCLA / United Health)
Abstract: In this talk, I will present an overview of machine learning opportunities in healthcare, emphasizing challenges that are specific to healthcare and not necessarily as critical in other domains. I will describe examples of such challenges and current solutions developed by our team at Optum in a few domains, including applications of large language models, dealing with missing data, and machine learning applied to medically related wearables.
Bio: Dr. Eran Halperin is a senior vice president of AI/ML in Optum AI (UnitedHealth Group) and an adjunct professor in the Department of Computer Science at UCLA. Prior to his current positions, he was a full professor in the departments of Computer Science, Computational Medicine, Anesthesiology, and Human Genetics at UCLA, and previously he held research and postdoctoral positions at the University of California, Berkeley, the International Computer Science Institute in Berkeley, Princeton University, and Tel-Aviv University. At UCLA, Dr. Halperin’s lab develops computational and machine-learning methods for a variety of health-related applications, including different genomic applications (genetics, methylation, microbiome, single-cell RNA) and medical applications (medical imaging, physiological waveforms, and electronic medical records). He has published more than 160 peer-reviewed publications and has received various honors for academic achievements, including the Rothschild Fellowship, the Technion-Juludan Prize for technological contribution to medicine, and the Krill Prize. He was also elected a fellow of the International Society for Computational Biology (ISCB).
Talks in Session 1
Speaker : Qi Liu (FDA)
Abstract: In this talk, we'll explore the role of AI/ML in drug development and regulation, beginning with a landscape analysis of related submissions to US FDA’s Center for Drug Evaluation and Research (CDER). Some review examples will be highlighted, followed by a discussion of the challenges and regulatory considerations of AI/ML in drug development. The session will conclude with a discussion of AI/ML activities at CDER's Office of Clinical Pharmacology.
Bio: Qi Liu, Ph.D., M.Stat., FCP is the Associate Director for Innovation & Partnership in the Office of Clinical Pharmacology (OCP)/ Office of Translational Sciences, CDER, FDA. She leads OCP’s innovative initiatives through strategic partnership. She has helped develop OCP’s portfolio on machine learning/artificial intelligence, real-world evidence, and digital health technologies, collaborating with internal and external experts. She has served on working groups for the development of FDA guidance documents and the Manual of Policies & Procedures. She is an Associate Editor of Clinical and Translational Science and serves on the editorial boards of five scientific journals. Before joining FDA, Dr. Liu was a senior pharmacokineticist at Merck & Co. Inc. She obtained her Ph.D. degree in Pharmaceutics and a concurrent Master's degree in Statistics from the University of Florida in 2004. In addition, she has a Master's degree in Pharmaceutics and a Bachelor's degree in Clinical Pharmacy from West China University of Medical Sciences.
Speaker : Brian Hill (OptumAI / UnitedHealth Group)
Abstract: Recent advances in large language models (LLMs) have shown that foundation models (FMs) can learn highly complex representations of sequences that can be used for downstream generative and discriminative tasks such as text generation and classification. While most FMs focus on text, recent work has shown that FMs can be learned for sequential medical data, e.g., ICD-10 diagnosis codes associated with specific patient visits. These FMs demonstrate improved performance on downstream discriminative disease classification tasks, but because they utilize BERT-based pre-training they cannot be used for generative tasks such as synthesizing artificial patient visits for data augmentation or privacy-preserving data sharing. In this talk, we introduce CHIRon, the first generative FM for sequential medical data. CHIRon utilizes causal masking during pre-training, enabling generative applications, and incorporates a number of architectural improvements and support for additional medical data types (diagnoses, procedures, medications, lab results, place of service, demographics). We show empirically that CHIRon can be used to generate realistic sequential medical data and also outperforms state-of-the-art FMs for sequential medical data on disease classification tasks.
Speaker : Xiaofeng Lin (UCLA)
Abstract: Owing largely to their extensive and diverse pre-training datasets and versatile transformer architectures, foundation models (FMs), such as large language models (LLMs), can proficiently handle a wide array of data domains and tasks. Despite these advancements, the application of foundation models to tabular data remains a formidable challenge. This difficulty stems from the intrinsic structural complexities of tables, their heterogeneous features, and the variability in data dimensions. Recent methodologies have sought to bridge this gap by employing language-interfacing techniques, which translate table observations into textual strings, thereby allowing large language models to process this converted data. However, our analysis identifies significant limitations in these approaches, particularly in their high computational cost and their modeling of complex numerical data. To address these challenges, we introduce a novel framework for a foundation tabular model that integrates a latent diffusion model on a transferable latent space of tables. Our empirical evidence suggests that this framework exhibits substantial promise in crafting a versatile foundation tabular generative model.
Bio: Xiaofeng Lin is a doctoral candidate at the University of California, Los Angeles (UCLA), specializing in statistics under the mentorship of Dr. Guang Cheng. His research is centered on the generation of synthetic tabular data, as well as enhancing the safety and trustworthiness of Large Language Models (LLMs).
Speaker : Lichen Shen (Medidata)
Abstract: In this session, we offer an alternative view on identifying and evaluating (Gen)AI use cases in clinical trials, the guiding principle we must follow to be regulatory compliant, and a design framework to enable production-level software deployment. We will also deep dive into one or two specific case studies that tie the use case, guiding principle, and design framework together in a practical manner.
Bio: Lichen imagines a world without diseases and monsters. He innovates in commercial tech, national defense intelligence, and the life sciences industry. One of his current roles is co-leading the (Generative) AI Platform as a Service and the next generation clinical trial data platform at Medidata.
Speaker : Wilko Schulz-Mahlendorf (Amazon Health Science)
Abstract: This talk will provide a high-level overview of problems and opportunities confronting health customers, patients, and providers. Examples and anecdotes from ecommerce, pharmacy, and primary care will feature prominently, but solutions to the posed problems are applicable to broader health care contexts.
Bio: Wilko is the Director of Amazon Health Science. His team collaborates with groups across Amazon Health Services to deliver advanced machine learning and economics solutions. Their portfolio includes: (1) delivery of safe and scalable large language models (LLMs) to support customer, patient, and provider applications and (2) programs and initiatives fueled by insights from behavioral economics and health economics. Wilko has recently returned to Amazon from Wayfair, where he spearheaded the company’s generative AI strategy and led Wayfair’s pricing and marketing data science organizations. Prior to Wayfair, in his initial tenure at Amazon, Wilko played a key role in launching Amazon’s Economics Practice. He led interdisciplinary data science teams across Amazon’s retail, logistics, and video and music departments to develop production ML systems powered by economic insights. Wilko holds a PhD in Economics from UCLA, as well as graduate and undergraduate degrees in economics and mathematics from Duke University.
Talks in Session 2
Speaker : Sheng Wang (University of Washington)
Abstract: Medical foundation models have achieved state-of-the-art performance on a variety of biomedical applications. Despite their encouraging performance on artificial biomedical benchmarks, there are still critical gaps that need to be filled before these models can be used in real-world clinics. In this talk, I will first introduce these gaps, including incomplete patient information, accessibility, and privacy. I will introduce BiomedCLIP, a public multi-modal medical foundation model trained on 15 million text-image pairs. BiomedCLIP, as a public model, can be used as a proxy for clinicians to query large language models without exposing their private data. I will then introduce LLaVA-Rad, a 7B-parameter model that outperforms Med-PaLM M (84B), demonstrating the possibility of building small models that reduce the computational resources needed for fine-tuning and inference in clinics.
Bio: Dr. Sheng Wang is an assistant professor in the School of Computer Science and Engineering at the University of Washington. He is interested in developing large-scale models for biomedical applications, with a focus on digital pathology and drug discovery. His research has been published in top journals such as Nature, Science, Nature Biotechnology, Nature Machine Intelligence, and The Lancet Oncology, and used by major biomedical institutes, including Chan Zuckerberg Biohub, UW Medicine, and the National Center for Advancing Translational Sciences.
Speaker : Quanquan Gu (UCLA)
Abstract: Electronic health records (EHRs) are a pivotal data source that enables numerous applications in computational medicine, e.g., disease progression prediction, clinical trial design, and health economics and outcomes research. Despite their wide usability, their sensitive nature raises privacy and confidentiality concerns, which limit potential use cases. To tackle these challenges, we explore the use of generative models to synthesize artificial, yet realistic EHRs. While diffusion-based methods have recently demonstrated state-of-the-art performance in generating other data modalities and overcome the training instability and mode collapse issues that plague previous GAN-based approaches, their applications in EHR generation remain underexplored. The discrete nature of tabular medical code data in EHRs poses challenges for high-quality data generation, especially for continuous diffusion models. To this end, we introduce a novel tabular EHR generation method, EHR-D3PM, which enables both unconditional and conditional generation using the discrete diffusion model. Our experiments demonstrate that EHR-D3PM significantly outperforms existing generative baselines on comprehensive fidelity and utility metrics while maintaining lower membership vulnerability risks. Furthermore, we show EHR-D3PM is effective as a data augmentation method and enhances performance on downstream tasks when combined with real data.
Bio: Quanquan Gu is an Associate Professor of Computer Science at UCLA. His research is in artificial intelligence and machine learning, with a focus on nonconvex optimization, deep learning, reinforcement learning, large language models, and deep generative models. Recently, he has been utilizing AI to enhance scientific discovery in domains such as biology, medicine, chemistry, and public health. He received his Ph.D. degree in Computer Science from the University of Illinois at Urbana-Champaign in 2014. He is a recipient of the Sloan Research Fellowship, the NSF CAREER Award, and the Simons Berkeley Research Fellowship, among other industrial research awards.
Speaker : Ramesh Durvasula (Eli Lilly)
Abstract: TBD
Bio: As the Information Officer for R&D, Ramesh Durvasula is responsible for the technology portfolio (software, data, analytics, etc.) that accelerates the output of Lilly R&D, from drug discovery to clinical, regulatory, pharmacovigilance, etc. His mission at Lilly is to unite science with technology to accelerate the R&D pipeline and deliver innovative therapeutics. Prior to Lilly, Ramesh spent over a decade at BMS, integrating discovery labs and digital systems into seamless capabilities. Prior to BMS, Ramesh spent several years at Tripos, a computational chemistry software firm. He is an active leader in many industry forums such as the Pistoia Alliance, and he is also a member of the Dean’s Advisory Council of Indiana University’s Luddy School of Informatics, Computing, and Engineering. Ramesh earned his BA in Chemistry and PhD in Biophysics from the University of Virginia.
Talks in Session 3
Speaker : Haoda Fu (Eli Lilly)
Abstract: Generative AI is a rapidly evolving technology that has garnered significant interest lately. In this presentation, we'll discuss the latest approaches, organizing them within a cohesive framework using stochastic differential equations to understand complex, high-dimensional data distributions. We'll highlight the necessity of studying generative models beyond Euclidean spaces, considering smooth manifolds essential in areas like robotics and medical imagery, and for leveraging symmetries in the de novo design of molecular structures. Our team's recent advancements in this blossoming field, ripe with opportunities for academic and industrial collaborations, will also be showcased.
Bio: Dr. Haoda Fu is an Associate Vice President and an Enterprise Lead for Machine Learning, Artificial Intelligence, and Digital Connected Care at Eli Lilly and Company. He is a Fellow of the American Statistical Association (ASA) and of the Institute of Mathematical Statistics (IMS). He is also an adjunct professor in the Department of Biostatistics at the University of North Carolina at Chapel Hill and at the Indiana University School of Medicine. Dr. Fu received his Ph.D. in statistics from the University of Wisconsin-Madison in 2007 and joined Lilly after that. Since joining Lilly, he has been very active in statistics and data science methodology research. He has more than 100 publications in areas such as Bayesian adaptive design, survival analysis, recurrent event modeling, personalized medicine, indirect and mixed treatment comparison, joint modeling, Bayesian decision making, and rare events analysis. In recent years, his research has focused on machine learning and artificial intelligence. His research has been published in various top journals including JASA, JRSS, Biometrika, Biometrics, ACM, IEEE, JAMA, Annals of Internal Medicine, etc. He has taught machine learning and AI topics at large industry conferences, including an FDA workshop. He has served on the boards of directors of statistics organizations and as a program chair and committee chair for groups such as ICSA, ENAR, and the ASA Biopharm section. He is a COPSS Snedecor Award committee member for 2022-2026, and will also serve as an associate editor for JASA Theory and Methods from 2023.
Speaker : Hope Johnson (Teladoc Health)
Abstract: Machine learning and artificial intelligence approaches have become table-stakes components of most advanced solutions within the healthcare industry. Yet historically many of these advancements have not served all populations equitably. With more focus on health equity from government agencies like the Centers for Medicare & Medicaid Services, industry players are increasingly being asked how they are ensuring equitable access, experience, and care for their populations, especially in reference to advanced ML solutions. With the continued use of machine learning and now the explosion of generative AI approaches, how do developers ensure that models, especially those trained with real-world data portraying the current healthcare inequities, don’t recapitulate and reinforce this biased state of the industry?
Bio: Dr. Hope Johnson serves as the Director of Data Science at Teladoc Health, where she applies her extensive experience in data science to enhance digital healthcare solutions and promote equity. Her current work includes leading the health equity analysis of Teladoc's offerings and contributing to the Health Equity Data Governance team. With a strong background in both the practical and strategic aspects of data science from her years of experience within healthcare technology, Dr. Johnson has played a pivotal role in optimizing healthcare efficiency and reducing costs through data-driven insights. Holding a PhD in Neurobiology from UCLA, her academic and professional journey reflects a dedicated pursuit of applying sophisticated data analysis to understand and solve complex healthcare challenges. Dr. Johnson combines her technical expertise with a deep commitment to making healthcare accessible and fair. Her approach is characterized by a practical focus on leveraging technology to address inequities in healthcare access and outcomes.