See more validation strategies in the notes and rulings tabs of the survey spreadsheet, and in the Benchmarks page.
BIC-seq2 2016 CNV
- Validated by generating 100 pseudo-chromosomes from hg18 chr22 with 42 CNVs. MetaSim was used to “sequence” the pseudo-chromosomes, and GC content bias was introduced. Reads were mapped with bwa at different coverage levels.
- Compared to CNVnator, FREEC, and ReadDepth
CNVnator 2011 CNV
- Validated against CEPH and Yoruba trios from the 1000 Genomes Project, including CGH array data.
VarScan2 2012 SNP denovo
- Validation data included 151 tumor-normal pairs from CGA Research Network (dbGaP database http://www.ncbi.nlm.nih.gov/gap).
- Validated CNV variant caller using 5 ovarian tumor samples that were validated using combined SNP array, exome and WG sequencing.
- Validated putative somatic coding SNVs by PCR and deep resequencing.
- Validated sensitivity of mutation detection by comparing to validated somatic mutations reported for 60 tumor–normal pairs from CGA Research Network 2011.
Bambino 2011 SNP
- Validated against 55 deep-sequenced SNPs and validated variants from TCGA
- Seven normal samples from the TCGA ovarian cancer project that have both whole-genome next-generation sequencing and Affymetrix SNP6 array data
DeepSNV 2014 SNV
- Validated against 111 cancer genes from hematological cancers. The genes were amplified, sequenced, and aligned with bwa to GRCh37. Uses 32 normal samples and 20 technical replicates of cancer samples.
- Compares to Caveman, MuTect, and DeepSNV.
EBCall 2013 SNV
- Performs deep sequencing on interesting loci to validate gold standard mutations
- Study compares EBCall to genomon-fisher, VarScan, and SomaticSniper
Mutect 2013 SNV
- Validated with synthetic data and downsampling
Seurat 2013 SNP
- Validation: treats NA19240 and NA12878 as normal and ‘tumor’, then compares variant calls to validated variant calls from HapMap.
- Simulates tissue contamination by sampling randomly from two BAMs
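The contamination step can be illustrated with a minimal sketch. It is not Seurat's actual procedure: the function name `mix_reads`, the `purity` and `total` parameters, and the use of plain read-name lists in place of real BAM records are all assumptions made for illustration.

```python
import random

def mix_reads(tumor_reads, normal_reads, purity, total, seed=0):
    """Draw `total` reads: a `purity` fraction from the tumor sample,
    the remainder from the normal sample (hypothetical helper)."""
    if not 0.0 <= purity <= 1.0:
        raise ValueError("purity must be between 0 and 1")
    rng = random.Random(seed)
    n_tumor = round(total * purity)
    n_normal = total - n_tumor
    if n_tumor > len(tumor_reads) or n_normal > len(normal_reads):
        raise ValueError("not enough reads to sample without replacement")
    # Sample without replacement from each source, then shuffle the mixture.
    mixed = rng.sample(tumor_reads, n_tumor) + rng.sample(normal_reads, n_normal)
    rng.shuffle(mixed)
    return mixed

# Stand-ins for reads pulled from two BAM files (names are made up).
tumor = [f"tumor_read_{i}" for i in range(1000)]
normal = [f"normal_read_{i}" for i in range(1000)]

mixed = mix_reads(tumor, normal, purity=0.75, total=400)
n_from_tumor = sum(name.startswith("tumor") for name in mixed)
print(len(mixed), n_from_tumor)
```

With real data the same idea would apply to alignment records rather than strings; a fixed seed keeps the simulated mixture reproducible.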
Shimmer 2013 SNV
Somatic Sniper 2011 SNV
- Validation: against synthetic data as well as variant calls from cell line COLO-829BL (lymphoblastoid) subtracted from cell line COLO-829, a malignant melanoma pre-treatment
Strelka 2012 SNP
- Validation: against the COLO-829/COLO-829BL variant difference, plus COLO-829 reads mixed with COLO-829BL reads to simulate tumor impurities.
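The tumor-minus-normal subtraction used for the COLO-829 comparisons can be sketched as a set difference, followed by a sensitivity and precision check. This is an illustrative sketch only: the variant tuples are invented, and `somatic_truth` and `score_calls` are hypothetical helpers, not functions from any of the tools listed here.

```python
def somatic_truth(tumor_calls, normal_calls):
    """Variants seen in the tumor line but not in the matched normal line."""
    return set(tumor_calls) - set(normal_calls)

def score_calls(predicted, truth):
    """Return (sensitivity, precision) of predicted calls against a truth set."""
    predicted, truth = set(predicted), set(truth)
    true_pos = len(predicted & truth)
    sensitivity = true_pos / len(truth) if truth else 0.0
    precision = true_pos / len(predicted) if predicted else 0.0
    return sensitivity, precision

# Variants as (chrom, pos, ref, alt); all values are made up.
tumor_calls = {("chr1", 100, "A", "T"), ("chr1", 200, "G", "C"), ("chr2", 300, "T", "G")}
normal_calls = {("chr1", 100, "A", "T")}

truth = somatic_truth(tumor_calls, normal_calls)

# Output of a hypothetical somatic caller under evaluation.
predicted = {("chr1", 200, "G", "C"), ("chr3", 400, "C", "A")}
sensitivity, precision = score_calls(predicted, truth)
print(sorted(truth), sensitivity, precision)
```

Keying variants on the full (chrom, pos, ref, alt) tuple means a different alternate allele at the same position is not counted as a match.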
Virmid 2013 SNV
- Uses synthetic data, hg19 with artificial mutations to validate purity estimation algorithm.
- 15 tumor to normal matched whole exome samples were taken from among 545 available at CGhub (which is currently down).
- Somatic mutations were previously validated with capture/resequencing.
- Another validation dataset of 5 HME normal/tumor paired samples was taken from a study behind a paywall http://www.nature.com/ng/journal/v44/n8/full/ng.2329.html.
Delly 2012 SV/CNVs
- Validated on synthetic data
- Compared to Pindel, Breakdancer, GASV, and HYDRA
- Validated on 635 samples <1kb builds from 1000GP pilot data
- PCR validation experiments on five pilot samples (NA07347, NA10847, NA11831, NA11992 and NA12003) to assess the accuracy of SVs discovered by DELLY. Out of 44 randomly selected deletion loci
Lumpy 2014 SV
- Validation: simulated data and the NA12878 CEPH sample.
- 2500 various artificial SVs, 5516 deletion SVs added artificially to build 37.
- Known SVs in NA12878.
- Artificial reads with variable percent purity made using WGSIM, from NA12878 modified using SVsim, to produce variable MAF and represent heterozygosity.
- Compared to Delly, GASVpro, and Pindel
VarDict 2016 SV/SNP
- Validated using (1) NA12878 and Genome In a Bottle (GIAB) variant calls and (2) DREAM synthetic challenges 3 and 4:
- artificial 60–80× coverage WGS datasets from single sample, randomly sampled into non-overlapping subsets, and introduced typical cancer mutations to half
- 3 exome BAM files of TCGA lung adenocarcinoma (LUAD) from CGHub (https://cghub.ucsc.edu/) for a given set of genes
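The single-sample split described for the synthetic datasets above (random sampling into non-overlapping subsets, with mutations introduced into half) can be sketched as follows. This is a toy illustration under stated assumptions: `split_disjoint` is a hypothetical helper, reads are represented by plain IDs, and the mutation spike-in step itself is not shown.

```python
import random

def split_disjoint(read_ids, n_subsets=2, seed=0):
    """Randomly partition read IDs into non-overlapping subsets."""
    rng = random.Random(seed)
    shuffled = list(read_ids)
    rng.shuffle(shuffled)
    # Striding over the shuffled list gives disjoint, near-equal subsets.
    return [shuffled[i::n_subsets] for i in range(n_subsets)]

# Stand-in for the reads of one deeply sequenced sample (IDs are made up).
reads = [f"read_{i}" for i in range(1000)]

normal_like, tumor_like = split_disjoint(reads)
# Mutations would then be spiked into `tumor_like` only, so that the
# unmodified `normal_like` subset serves as the matched normal.
print(len(normal_like), len(tumor_like))
```

Because both subsets come from the same individual, any variant called as somatic that was not spiked in is a false positive by construction.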