Tumor-Normal Titration

The CGC project hosting tumor-normal titration data is located here

Reheader BAM files with multiple SM tags

MuTect2 identifies the tumor and normal samples by the SM tags in the BAM headers, not by the BAM files themselves. BAM files whose headers carry more than one SM tag therefore had to be reheadered. These are the CGC apps to do so:
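For reference, the same header change can be sketched locally outside of the CGC apps. This is a minimal sketch, not the CGC implementation: the helper name set_sm_tag and the file names in the usage comment are hypothetical, and it assumes every read group in the file should be collapsed to a single sample name.

```shell
# Replace the SM field of every @RG line in a SAM header read from stdin,
# so that all read groups report the same sample name.
# Usage: set_sm_tag SAMPLE_NAME < header.sam
set_sm_tag() {
    awk -v sm="$1" 'BEGIN { FS = OFS = "\t" }
        /^@RG/ { for (i = 2; i <= NF; i++) if ($i ~ /^SM:/) $i = "SM:" sm }
        { print }'
}

# Hypothetical usage (placeholder file names; requires samtools):
#   samtools view -H tumor.bam | set_sm_tag TUMOR > tumor.header.sam
#   samtools reheader tumor.header.sam tumor.bam > tumor.reheadered.bam
```

Only @RG lines are modified; all other header lines pass through unchanged.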

SomaticSeq 5-tool Settings (3-tool for INDEL)

Uploaded two BWA-based classifiers, one for SNVs and one for INDELs. Both were trained on all of the synthetic data used to build the original gold set plus the combined NovaSeq data, i.e., the same approach used for the NeuSomatic training data, with "if_TNscope" excluded from the classifier-building procedure:

10X

30X

50X

80X

100X

200X

300X

Ran on a local cluster because funding temporarily ran out.

SomaticSeq prediction

Then use the SomaticSeq classifiers specified above to predict mutation status, i.e., score the *.Ensemble.sSNV.tsv and *.Ensemble.sINDEL.tsv files, and write the results as VCF:

#!/bin/bash

#$ -o /PATH/Tumor-Normal-Purity/logs
#$ -e /PATH/Tumor-Normal-Purity/logs
#$ -S /bin/bash
#$ -l h_vmem=128G

set -e

# SNVs: score each Ensemble TSV with the SNV classifier, then convert the
# scored TSV (not the unscored Ensemble TSV) to VCF.
for file in /PATH/Tumor-Normal-Purity/*.Ensemble.sSNV.tsv
do
    docker run --rm -v /PATH:/PATH -u "$UID" lethalfang/somaticseq:2.8.1 /opt/somaticseq/r_scripts/ada_model_predictor.R /PATH/Classifiers/GoldSetData.bwa.sSNV.tsv.ntChange.Classifier.RData "$file" "${file/Ensemble/SomaticSeq}"
    docker run --rm -v /PATH:/PATH -u "$UID" lethalfang/somaticseq:2.8.1 /opt/somaticseq/SSeq_tsv2vcf.py -tsv "${file/Ensemble/SomaticSeq}" -vcf "${file%.Ensemble.sSNV.tsv}.SomaticSeq.sSNV.vcf" -pass 0.5 -low 0.1 -all -phred -paired -tools MuTect2 SomaticSniper VarDict MuSE Strelka
done

# INDELs: same two steps with the INDEL classifier and the 3-tool setting.
for file in /PATH/Tumor-Normal-Purity/*.Ensemble.sINDEL.tsv
do
    docker run --rm -v /PATH:/PATH -u "$UID" lethalfang/somaticseq:2.8.1 /opt/somaticseq/r_scripts/ada_model_predictor.R /PATH/Classifiers/GoldSetData.bwa.sINDEL.tsv.ntChange.Classifier.RData "$file" "${file/Ensemble/SomaticSeq}"
    docker run --rm -v /PATH:/PATH -u "$UID" lethalfang/somaticseq:2.8.1 /opt/somaticseq/SSeq_tsv2vcf.py -tsv "${file/Ensemble/SomaticSeq}" -vcf "${file%.Ensemble.sINDEL.tsv}.SomaticSeq.sINDEL.vcf" -pass 0.5 -low 0.1 -all -phred -paired -tools MuTect2 VarDict Strelka
done
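The output names in the script above come from shell parameter expansion on the input path. The sketch below shows which file names result. The helper names scored_tsv_name and vcf_name and the sample path are hypothetical, introduced only for illustration. It uses POSIX suffix and prefix removal, which rewrites only the last ".Ensemble." in the path, whereas the bash form ${file/Ensemble/SomaticSeq} replaces the first "Ensemble" anywhere in the path, including in a directory name.

```shell
# Derive the scored-TSV name: x.Ensemble.sSNV.tsv -> x.SomaticSeq.sSNV.tsv
scored_tsv_name() {
    printf '%s\n' "${1%.Ensemble.*}.SomaticSeq.${1##*.Ensemble.}"
}

# Derive the VCF name from the scored TSV: x.SomaticSeq.sSNV.tsv -> x.SomaticSeq.sSNV.vcf
vcf_name() {
    scored=$(scored_tsv_name "$1")
    printf '%s\n' "${scored%.tsv}.vcf"
}

# Illustrative call on a placeholder path
scored_tsv_name /PATH/Tumor-Normal-Purity/sample.Ensemble.sSNV.tsv
# prints /PATH/Tumor-Normal-Purity/sample.SomaticSeq.sSNV.tsv
```

For the paths used in this project both forms give the same result. The difference only matters if a parent directory name contains "Ensemble".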


NeuSomatic Pre-processing

10X

30X

50X

80X

200X

300X