
MSI/MSS Colorectal Cancers


Overview of this case study

It is currently accepted that colorectal tumors can be classified by their global genomic status into two main types: microsatellite-unstable (MSI) tumors and microsatellite-stable (MSS) tumors, the latter also known as tumors with chromosomal instability. This taxonomy plays a significant role in determining the pathologic, clinical and biological characteristics of colon tumors. MSS tumors are characterized by changes in chromosomal copy number and show a worse prognosis. By contrast, the less common MSI tumors (about 15%) are characterized by the accumulation of a high number of mutations and show predominance in females, proximal colonic localization, poor differentiation, tumor-infiltrating lymphocytes and a better prognosis.

In addition, these subtypes respond differently to chemotherapeutic agents, and it is well established that they arise from distinct molecular mechanisms. While MSS tumors generally follow the classical adenoma-to-carcinoma progression described by Vogelstein and Fearon, MSI tumors result from the inactivation of DNA mismatch repair genes such as MLH1.
We instantiated our Pipeline for Cancer Inference (PiCnIc) to process MSI and MSS colorectal tumors collected from The Cancer Genome Atlas (TCGA) project "Human Colon and Rectal Cancer" (COADREAD).

Instantiation of the PiCnIc pipeline

  • In brief, we split subtypes by the microsatellite status of each tumor as annotated by the TCGA consortium.
  • Once split into groups, the input COADREAD data is processed to retain only samples for which both high-quality curated mutation and CNA data are available; for CNAs we use focal high-level amplifications and homozygous deletions from GISTIC.
  • Then, for each sample we select only alterations (mutations/CNAs) from a list of 33 driver genes annotated to 5 pathways by the consortium, which compiled the list through manual curation and by running MutSigCV.
  • Then, we fetch groups of exclusive alterations. We scanned for these groups with MUTEX and merged its results with the group that TCGA detected with MEMO. Prior knowledge of the potential exclusivity among some input driver genes was exploited as well (e.g., APC/CTNNB1).
  • Groups were then used to create CAPRI's formulas; we also included hypotheses for genes which harbour mutations and homozygous deletions across different samples.
  • CAPRI was run by selecting recurrent alterations from the pool of 33 pathway genes and using both the AIC and BIC regularizers. CAPRI's edges are assessed by requiring significance p<0.05 after 100 non-parametric bootstrap iterations.
  • The significance of the reconstructed models and the input data is assessed by computing: p-values for temporal priority, probability raising and hypergeometric testing; non-parametric and statistical bootstrap scores; and entropy loss, prediction and posterior errors via 10-fold cross-validation.
For clarity, on this page we only show results for MSI-HIGH tumors; the source code deals with both subtypes.
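
The sample and gene selection steps listed above can be illustrated with a small sketch. This is not the R code of the study: the function name, the data layout and the toy values are all hypothetical, and only a handful of genes stand in for the 33 drivers.

```python
# Toy illustration of the selection steps (hypothetical names and data,
# not the TRONCO/PiCnIc implementation).

def select_samples(msi_status, mutations, cnas, driver_genes, subtype):
    """Return {sample: set of (gene, alteration_type)} for one subtype.

    msi_status: {sample: subtype label}
    mutations:  {sample: set of mutated genes}
    cnas:       {sample: {gene: "amplification" or "deletion"}}
    """
    selected = {}
    for sample, status in msi_status.items():
        # Step 1: split by microsatellite status.
        if status != subtype:
            continue
        # Step 2: keep only samples with both mutation and CNA data.
        if sample not in mutations or sample not in cnas:
            continue
        # Step 3: keep only alterations that hit a listed driver gene.
        events = {(g, "mutation") for g in mutations[sample] if g in driver_genes}
        events |= {(g, kind) for g, kind in cnas[sample].items() if g in driver_genes}
        selected[sample] = events
    return selected


msi_status = {"s1": "MSI-HIGH", "s2": "MSS", "s3": "MSI-HIGH", "s4": "MSI-HIGH"}
mutations = {"s1": {"APC", "TTN"}, "s2": {"KRAS"}, "s3": {"BRAF"}}
cnas = {"s1": {"IGF2": "deletion"}, "s2": {}, "s4": {"ERBB2": "amplification"}}
driver_genes = {"APC", "KRAS", "BRAF", "IGF2", "ERBB2"}

msi_high = select_samples(msi_status, mutations, cnas, driver_genes, "MSI-HIGH")
print(msi_high)
```

In this toy input only s1 survives: s2 belongs to the other subtype, s3 lacks CNA data, s4 lacks mutation data, and the TTN mutation in s1 is dropped because TTN is not in the driver list.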

Example data used for MSI-HIGH tumors

  • [A] MSI-HIGH colorectal tumors used for inference. Data from the TCGA COADREAD project, restricted to 27 samples with both somatic mutations and high-resolution CNA data available and a selection out of 33 driver genes annotated to 5 pathways. This dataset is used to infer the model shown below.
  • [B] Altered pathways. Mutations and CNAs in these tumors mapped to pathways confirm heterogeneity even at the pathway level.
  • [C] Mutually exclusive alterations. Groups were obtained from the TCGA reference publication and by the MUTEX/MEMO tools. In addition, previous knowledge about exclusivity among genes in the RAF pathway was exploited.
  • [D] Construction of a formula. A Boolean formula input to CAPRI to test the hypothesis that alterations in the RAF genes KRAS, NRAS and BRAF confer equivalent selective advantage. The formula accounts for hard exclusivity between NRAS mutations and deletions, jointly with soft exclusivity with KRAS and BRAF alterations.

Example progression of MSI-HIGH tumors

  • [A] Selective advantage relations inferred by CAPRI constitute the MSI-HIGH progression. Formulas written on groups of exclusive alterations are expanded. We find interesting relations involving APC mutations, which select for PIK3CA mutations (via BIC), as well as selection of the MEMO group (ERBB2/PIK3CA mutations or IGF2 deletions) predicted by AIC. Similarly, we find a strong selection trend among mutations in ERBB2 and KRAS, although in this case the temporal precedence among those mutations cannot be disentangled because the two events have the same marginal frequency (26%).
  • [B] Branching and confluent evolutionary trajectories of clonal expansion inferred from the selective advantage relations implicit in the data. Such trajectories capture progression trends that are representative of alternative trajectories among patients, as driven by different types of genomic lesions. Note that while the majority of the selectivity inferences are genuine, some of them could be spurious: e.g., the suggestion that APC-mutated clones expand until they acquire further selective advantage via mutations or homozygous deletions in NRAS. Nonetheless, the putatively genuine selectivity relations need to be further validated: e.g., the suggestion that the clones of patients harbouring distinct alterations in ACVR1B (and different upstream events) will enjoy further selective advantage from mutation in the TGFBR2 gene.
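
The remark about equal marginal frequencies can be illustrated with a small sketch of the two conditions named on this page, temporal priority and probability raising, computed as plain point estimates on toy data. The event names and samples are hypothetical, and the sketch omits the bootstrap-based p-values that the pipeline attaches to these quantities.

```python
# Toy point estimates of temporal priority and probability raising
# (hypothetical events A, B, C; no statistical testing is performed here).

def selectivity_check(samples, cause, effect):
    """samples: list of dicts mapping event name -> 0/1."""
    n = len(samples)
    p_cause = sum(s[cause] for s in samples) / n
    p_effect = sum(s[effect] for s in samples) / n
    with_cause = [s for s in samples if s[cause]]
    without_cause = [s for s in samples if not s[cause]]
    if not with_cause or not without_cause:
        raise ValueError("conditional probabilities are undefined")
    p_effect_given_cause = sum(s[effect] for s in with_cause) / len(with_cause)
    p_effect_given_not_cause = sum(s[effect] for s in without_cause) / len(without_cause)
    return {
        # Temporal priority: the earlier event is the more frequent one.
        "temporal_priority": p_cause > p_effect,
        # Probability raising: the cause makes the effect more likely.
        "probability_raising": p_effect_given_cause > p_effect_given_not_cause,
    }


rows = [(1, 1, 1), (1, 1, 1), (1, 1, 0), (1, 0, 1), (1, 0, 0), (1, 0, 0),
        (0, 0, 0), (0, 0, 0), (0, 0, 0), (0, 0, 0)]
samples = [{"A": a, "B": b, "C": c} for a, b, c in rows]

a_to_b = selectivity_check(samples, "A", "B")
b_to_c = selectivity_check(samples, "B", "C")
c_to_b = selectivity_check(samples, "C", "B")
print(a_to_b, b_to_c, c_to_b)
```

Here A (60%) precedes and raises the probability of B (30%), so both conditions hold for A to B. B and C both occur in 30% of the samples: probability raising holds in both directions, but temporal priority holds in neither, so the direction of the relation cannot be resolved from frequencies alone.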

Reference Papers

Source code

  • The R source code that replicates this study is hosted on our GitHub; download it as follows:
git clone

The source code will automatically install TRONCO.