Overview

The Pipeline for Cancer Inference (PiCnIc) is our attempt to devise an effective pipeline for extracting ensemble-level cancer progression models from cross-sectional data. The pipeline is versatile, modular and customizable, and exploits state-of-the-art data processing and machine learning tools to:
The pipeline was first described in our paper:
The main steps of PiCnIc

Motivation

All these steps are necessary to minimize the confounding effects of inter-tumor heterogeneity, which are likely to lead to wrong results when data are not appropriately pre-processed. In each stage of PiCnIc, different techniques can be employed, alternatively or jointly, according to the specific research goals, input data, and cancer type. Prior knowledge can be easily accommodated in the pipeline, as can appropriate computational tools. The rationale is similar in spirit to the workflows implemented by consortia such as TCGA to analyze huge populations of cancer samples.

One of the main novelties of our approach is the exploitation of groups of exclusive alterations as a proxy to detect fitness-equivalent trajectories of cancer progression. This is made possible only by the hypothesis-testing features of our recently developed CAPRI algorithm, which uniquely addresses this crucial aspect of the ensemble-level progression inference problem.

Which tools

The tools that PiCnIc can exploit are of different kinds, and we plan to extend TRONCO to interface with them as our case studies are developed. We are happy to receive suggestions about tools that you would like to use with this pipeline, and we welcome contributions toward this effort. The current version of TRONCO supports input/output toward these tools: