PharML 2022

Lung Cancer Survival Prediction Challenge

Discovery Challenge at ECML PKDD 2022

Every year, 1.9 million people are diagnosed with non-small cell lung cancer (NSCLC). Only 25% of them survive 5 years beyond their diagnosis. Prognosis depends on many factors, including demographics, clinical characteristics, and genetic alterations.

Survival machine learning models could help us predict prognosis for individual patients more accurately, which has real-world clinical applications for improving treatment and our understanding of NSCLC. In addition, representations learned by fitting such models on NSCLC data could be used to stratify patients into clinical clusters or phenotypes, giving us insight into how to better categorize NSCLCs.

In this challenge, you will predict the risk of overall death using clinical EHR data from around 75,000 advanced NSCLC patients provided by Flatiron Health. The features consist of patient characteristics such as demographic information, vital sign data, and biomarkers.
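To make the prediction task concrete, the sketch below shows one common way to score risk predictions on right-censored survival data: Harrell's concordance index. This is an illustration only. The challenge's actual evaluation metric is not stated here, and the function name `concordance_index` and the toy values are hypothetical, not drawn from the Flatiron Health data.

```python
from typing import Sequence


def concordance_index(
    times: Sequence[float],
    events: Sequence[bool],
    risks: Sequence[float],
) -> float:
    """Harrell's C-index for right-censored data (illustrative sketch).

    times  : follow-up time per patient
    events : True if death was observed, False if the patient was censored
    risks  : predicted risk score (higher means earlier expected death)

    A pair (i, j) is comparable when patient i died and did so strictly
    before patient j's follow-up time. Pairs with tied times are skipped,
    which is a simplification.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored patient cannot anchor a comparable pair
        for j in range(n):
            if i == j or not times[i] < times[j]:
                continue
            comparable += 1
            if risks[i] > risks[j]:
                concordant += 1.0  # earlier death was ranked as higher risk
            elif risks[i] == risks[j]:
                concordant += 0.5  # tied risk scores count as half
    if comparable == 0:
        raise ValueError("no comparable pairs in the data")
    return concordant / comparable


# Toy example with four hypothetical patients (not challenge data).
toy_times = [5.0, 10.0, 15.0, 20.0]
toy_events = [True, False, True, True]
toy_risks = [0.9, 0.4, 0.6, 0.1]
toy_score = concordance_index(toy_times, toy_events, toy_risks)
```

A score of 1.0 means every comparable pair is ranked correctly, 0.5 corresponds to random ranking, and 0.0 means every pair is ranked in reverse.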


The challenge will take place between the following dates:

  • Start: April 25, 2022 (extended from April 15, 2022)

  • End: June 15, 2022


Because the total number of participants is limited, the organizers ask potential participants to pre-apply for the challenge. Before the official start date, approved applicants will receive a communication regarding a mandatory data usage agreement (DUA) that must be signed with Flatiron Health in order to participate. Once the DUA is executed, participants will receive the login credentials that give access to the challenge execution and data environments.

Pre-applications were accepted through an online form. Pre-registration is now closed.

The winners will be announced on the PharML 2022 website. Winners and selected competitors will be invited to present their solutions at the ECML PKDD 2022 Discovery Challenge. Participants from winning teams who are students or recent graduates will also be considered for a potential internship at Roche.

Data Challenge Organizers

  • James Black (Roche, Switzerland)

  • Selen Bozkurt (Flatiron Health, USA)

  • Lee Cooper (Northwestern University, USA)

  • Naghmeh Ghazaleh (Roche, Switzerland)

  • Jonas Richiardi (Lausanne University Hospital and University of Lausanne, Switzerland)

  • Damian Roqueiro (Roche, Switzerland)

  • Diego Saldana (Novartis, Switzerland)