Computational Pathology Foundation Models: Datasets, Adaptation Strategies, and Evaluations
The 29th Pacific-Asia Conference on Knowledge Discovery and Data Mining (PAKDD) Tutorial
June 10th, 2025, Sydney Masonic Centre
Sydney, Australia
Abstract
Computational Pathology Foundation Models (CPathFMs) have emerged as a transformative approach for automating histopathological analysis by leveraging self-supervised learning on large-scale, unlabeled whole-slide images (WSIs). These models, categorized into uni-modal and multi-modal frameworks, facilitate tasks such as segmentation, classification, biomarker discovery, and prognosis prediction. However, the development of CPathFMs faces significant challenges, including limited dataset availability, domain-specific adaptation requirements, and the absence of standardized evaluation benchmarks. This tutorial will provide a comprehensive overview of the current state of CPathFMs, covering key datasets, adaptation strategies such as contrastive learning and multi-modal integration, and a taxonomy of evaluation tasks. We will discuss how these models are trained, fine-tuned, and assessed, addressing the critical gaps in generalization, bias mitigation, and clinical applicability. Additionally, we will explore emerging research directions in fairness, transparency, security, and standardization of evaluation protocols. This tutorial will serve as an essential resource for researchers, clinicians, and AI practitioners looking to advance the field of AI-driven computational pathology.
Materials
Tutorial Slides:
TBA
Survey Paper:
Dong Li, Guihong Wan, Xintao Wu, Xinyu Wu, Ajit J. Nirmal, Christine G. Lian, Peter K. Sorger, Yevgeniy R. Semenov, Chen Zhao. A Survey on Computational Pathology Foundation Models: Datasets, Adaptation Strategies, and Evaluation Tasks. arXiv preprint arXiv:2501.15724, 2025.
Paper Link: https://arxiv.org/abs/2501.15724
Schedule
Time: 11:00 am - 12:30 pm, June 10th, 2025
Location: Tuscan Room (Ground Floor)
Outline
Part I: Introduction to Computational Pathology in Healthcare
Part II: Foundations of Computational Pathology Foundation Models
Part III: Pretraining Challenges and Adaptation Strategies
Part IV: Datasets and Benchmarking for CPathFMs
Part V: Uni-Modal and Multi-Modal CPathFMs
Part VI: Evaluation and Performance Assessment of CPathFMs
Part VII: Future Directions and Open Challenges
Part VIII: Q&A and Interactive Discussion
Tutors' Bios
Dong Li is a Ph.D. student in the Department of Computer Science at Baylor University. His main research directions include graph mining, fairness-aware machine learning, domain generalization, and computational biology. He has received multiple academic scholarships and national competition awards. His work has been published at top international conferences, including KDD, IJCAI, and CIKM.
Chen Zhao is an Assistant Professor in the Department of Computer Science at Baylor University. His research focuses on machine learning, data mining, and computational biology, particularly fairness-aware machine learning, novelty detection, and domain generalization. His work has been published at premier conferences, including KDD, CVPR, IJCAI, ICDE, AAAI, and WWW. Dr. Zhao has served as a PC member for top international conferences, including KDD, NeurIPS, IJCAI, ICML, AAAI, and ICLR. He has organized and chaired multiple workshops on Ethical AI, Uncertainty Quantification, Distribution Shifts, and Trustworthy AI for Healthcare at KDD (2022, 2023, 2024, 2025), AAAI (2023), IEEE BigData (2024), and SDM (2025). He serves as chair of the Challenge Cup at IEEE BigData 2024, tutorial chair for PAKDD 2025 and ICDM 2025, and workshop chair for IEEE BigData 2025.
Guihong Wan is an Instructor of Dermatology at Massachusetts General Hospital and Harvard Medical School. Her research focuses on developing computational methodologies for integrative analyses of biomedical data and building biologically explainable machine learning models for predicting patient outcomes. Her research has been published in AAAI, IJCAI, ICDE, The Lancet Oncology, npj Precision Oncology, Briefings in Bioinformatics, Journal of the American Academy of Dermatology, British Journal of Dermatology, Nature Medicine, and others. She served as the tutorial co-chair for the 16th International Conference on Brain Informatics, 2023.
Xintao Wu is a Professor and the Charles D. Morgan/Acxiom Endowed Graduate Research Chair in Database, and leads the Social Awareness and Intelligent Learning (SAIL) Lab in the Electrical Engineering and Computer Science Department at the University of Arkansas. Dr. Wu is an associate editor or editorial board member of several journals and has served on the program committees of top international conferences as area chair, senior PC member, and PC member. He has served as program co-chair of the ACM EAI-KDD workshops (2022, 2023, 2024, 2025), IEEE BigData 2020, ICLMA 2024, and PAKDD 2025. He has also given multiple tutorials on trustworthy AI at top international conferences, including ACM KDD, IEEE BigData, and SIAM SDM.
Yevgeniy R. Semenov is a practicing dermatologist and an Assistant Professor of Dermatology at Massachusetts General Hospital and Harvard Medical School. He is also the co-director of the Oncodermatology Program. His primary areas of clinical and research interest are in oncodermatology and cutaneous oncology. He received his MD degree from Johns Hopkins University School of Medicine and completed residency training in Dermatology at Washington University in Saint Louis. His research has been published in the Journal of the American Academy of Dermatology, British Journal of Dermatology, Journal of Investigative Dermatology, JAMA Dermatology, The Lancet Oncology, npj Precision Oncology, Nature Medicine, and others.