9.30 a.m.–10.00 a.m.
10.00 a.m.–10.15 a.m.
Haiping Lu
10.15 a.m.–11.00 a.m.
Sarah Gibson
Abstract: Siloed science is slow science. A lack of reproducibility and interoperability across systems makes it difficult to build upon progress. However, breaking down these silos often requires time, effort and intention at the start of a project, which can be overwhelming. During this talk, Sarah will discuss lessons learned through working in Open Science, Open Source Software, and Open Infrastructure projects that put us on pathways to solving reproducibility and interoperability issues in research, in the context of the health sector.
Break 11.00 a.m.–11.15 a.m. (Tea and coffee provided)
11.15 a.m.–11.40 a.m.
Andres Diaz-Pinto
Abstract: The lack of annotated datasets is a major challenge in training new task-specific supervised AI algorithms, as manual annotation is expensive and time-consuming. To address this problem, we present MONAI Label, a free and open-source platform that facilitates the development of AI-based applications aimed at reducing the time required to annotate 3D medical image datasets. Through MONAI Label, researchers can develop annotation applications focusing on their domain of expertise. It allows researchers to readily deploy their apps as services, which can be made available to clinicians via their preferred user interface. Currently, MONAI Label supports locally installed (3D Slicer) and web-based (OHIF) frontends, and offers two active learning strategies to facilitate and speed up the training of segmentation algorithms. MONAI Label allows researchers to make incremental improvements to their labeling apps by making them available to other researchers and clinicians alike. Lastly, MONAI Label provides sample labeling apps, namely DeepEdit and DeepGrow, demonstrating dramatically reduced annotation times.
11.40 a.m.–12.05 p.m.
Fernando Pérez-García
Abstract: Processing of medical images such as MRI or CT presents unique challenges compared to RGB images typically used in computer vision. TorchIO is an open-source Python library to enable efficient loading, preprocessing, augmentation, and patch-based sampling of medical images for deep learning. It provides multiple generic preprocessing and augmentation operations as well as simulation of MRI-specific artifacts. TorchIO was developed to help researchers standardise medical image processing pipelines and allow them to focus on the deep learning experiments. It encourages open science by supporting reproducibility and representing a standard for data augmentation of multidimensional images. In this talk, I will describe the library and its history, its adoption in research and industry, including at Microsoft Research, and what I have learnt developing it.
12.05 p.m.–12.30 p.m.
Benedek Rozemberczki
Abstract: In this talk I discuss ChemicalX, a PyTorch-based deep learning library that provides a range of state-of-the-art models for the drug pair scoring task. The primary objective of the library is to make deep drug pair scoring models accessible to machine learning researchers and practitioners in a streamlined framework. The design of ChemicalX reuses existing high-level model training utilities, geometric deep learning, and deep chemistry layers from the PyTorch ecosystem. Our system provides neural network layers, custom pair scoring architectures, data loaders, and batch iterators for end users. These features are showcased with example code snippets and case studies to highlight the characteristics of ChemicalX. A range of experiments on real-world drug-drug interaction, polypharmacy side effect, and combination synergy prediction tasks demonstrate that the models available in ChemicalX are effective at solving the pair scoring task. Finally, it is shown that ChemicalX can be used to train and score machine learning models on large drug pair datasets with hundreds of thousands of compounds on commodity hardware.
Lunch 12.30 p.m.–1.30 p.m. (Provided)
1.30 p.m.–2.15 p.m.
Gaël Varoquaux
Abstract: Developing data analysis pipelines around open source, open data, and open communities has shown great successes. The most used machine-learning tool to date, scikit-learn, was assembled by a large community, with different contributors bringing different expertise. Can such success be carried over to health data? I will discuss my experience building first scikit-learn, then nilearn in the brain imaging community, and finally more recent work in electronic health records. Spoiler: things are harder with electronic health records.
2.15 p.m.–2.40 p.m.
Yipeng Hu
2.40 p.m.–3.25 p.m.
Mihaela van der Schaar
What are the major bottlenecks of open-source AI software in healthcare?
How do we engage more researchers in open-source AI software development for healthcare?
How will AI standards impact open-source software development in the short/long term?
Closing 3.55 p.m.–4.00 p.m.