This study compares bcbio variant callers using a small ACMG data set. An NA12878 BAM file with downsampled reads (25-30x) was downloaded from the GiaB FTP site. bedtools intersect was used to extract the reads overlapping the 56 ACMG regions for variant calling.
bcbio with GRCh37 reference data was installed on a CentOS 7 virtual machine with 1 core and 3 GB of memory. After minor tweaks (noted in the configuration file), initial testing showed successful runs with samtools, platypus, freebayes, and vardict as SNV callers and lumpy, delly, and cnvkit as SV callers. SV callers were not compared in this study because no significant structural variants were detected within the ACMG regions. See below for notes on other bcbio-installed variant callers.
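To make the region-extraction step concrete, here is a minimal Python sketch of the overlap test that an interval intersection relies on. It is not how bedtools is implemented and was not part of the study. The `Interval` type, the `reads_in_regions` helper, and all coordinates are hypothetical and only illustrate BED-style half-open intervals.

```python
from typing import List, NamedTuple


class Interval(NamedTuple):
    chrom: str
    start: int  # 0-based, inclusive (BED convention)
    end: int    # 0-based, exclusive (BED convention)


def overlaps(a: Interval, b: Interval) -> bool:
    """True if two half-open intervals on the same chromosome share a base."""
    return a.chrom == b.chrom and a.start < b.end and b.start < a.end


def reads_in_regions(reads: List[Interval], regions: List[Interval]) -> List[Interval]:
    """Keep reads that overlap at least one target region (simple O(n*m) scan)."""
    return [r for r in reads if any(overlaps(r, t) for t in regions)]


# Hypothetical coordinates, for illustration only (not real ACMG targets).
regions = [Interval("1", 100, 200), Interval("2", 500, 600)]
reads = [
    Interval("1", 50, 101),   # overlaps the first region by one base
    Interval("1", 200, 300),  # starts exactly at the region end: no overlap
    Interval("2", 550, 650),  # overlaps the second region
    Interval("3", 100, 200),  # different chromosome: no overlap
]
kept = reads_in_regions(reads, regions)
```

The linear scan is fine for a handful of regions. A real tool would sort or index the intervals rather than compare every read against every region.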
The SNV callers vardict, samtools, platypus, freebayes, and the bcbio ensemble method with minimum calls of 2 or 3 (ensemble2 and ensemble3) were used to generate variant calls for the ACMG reads. These calls were compared to the truth dataset giab-NA12878 provided by bcbio. The bcbio Validate tool (based on RTG tools) was used to produce the following reports for this comparison.
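The idea behind a minimum-calls ensemble can be sketched as simple voting: a variant is kept if at least N callers report it. The Python below is a simplified illustration, not bcbio's actual ensemble implementation. It matches variants only by an exact (chrom, pos, ref, alt) key, and the `ensemble_calls` function and the example variants are hypothetical.

```python
from collections import Counter
from typing import Dict, Set, Tuple

Variant = Tuple[str, int, str, str]  # (chrom, pos, ref, alt)


def ensemble_calls(calls_by_caller: Dict[str, Set[Variant]], min_calls: int) -> Set[Variant]:
    """Return variants reported by at least `min_calls` callers."""
    votes: Counter = Counter()
    for calls in calls_by_caller.values():
        votes.update(set(calls))  # each caller votes at most once per variant
    return {variant for variant, n in votes.items() if n >= min_calls}


# Hypothetical variants, for illustration only.
A = ("1", 100, "A", "G")
B = ("1", 200, "C", "T")
C = ("2", 300, "G", "GA")
D = ("2", 400, "T", "C")

calls_by_caller = {
    "vardict": {A, B, C},
    "samtools": {A, B},
    "platypus": {A, D},
    "freebayes": {A, B, C},
}

ensemble2 = ensemble_calls(calls_by_caller, min_calls=2)  # A, B, C
ensemble3 = ensemble_calls(calls_by_caller, min_calls=3)  # A, B
```

Because raising the threshold can only remove calls, the ensemble3 set is always a subset of the ensemble2 set under this scheme.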
The images show a comparison of the bcbio-installed SNP and indel variant callers with an ensemble run requiring 2 positive calls (ensemble2, top figure) or 3 positive calls (ensemble3, bottom figure). Ensemble2 makes more calls in total, with fewer missed calls (FN, false negatives) and more false positives (FP). This is the expected behavior, as ensemble3 is more stringent.
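For readers unfamiliar with the TP/FP/FN terminology, the sketch below shows how these counts, and the precision and sensitivity derived from them, follow from comparing a call set to a truth set. It assumes exact matching of hypothetical variant identifiers, which is much simpler than the comparison a real validation tool performs. The numbers are made up and are not results from this study.

```python
from typing import Dict, Hashable, Set


def summarize(truth: Set[Hashable], calls: Set[Hashable]) -> Dict[str, float]:
    """Count TP/FP/FN by exact set matching and derive precision and sensitivity."""
    tp = len(truth & calls)   # called and in the truth set
    fp = len(calls - truth)   # called but not in the truth set
    fn = len(truth - calls)   # in the truth set but missed
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    return {"tp": tp, "fp": fp, "fn": fn,
            "precision": precision, "sensitivity": sensitivity}


# Hypothetical variant identifiers, for illustration only.
truth = {"v1", "v2", "v3", "v4"}
lenient = {"v1", "v2", "v3", "v9"}  # more calls: fewer misses, one false positive
strict = {"v1", "v2"}               # fewer calls: no false positives, more misses

lenient_stats = summarize(truth, lenient)
strict_stats = summarize(truth, strict)
```

In this toy example the lenient set trades one false positive for one fewer miss, which is the same kind of trade-off described for ensemble2 versus ensemble3.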
In this study, the ensemble methods perform well on SNPs: ensemble2 makes more true positive calls, with fewer FN and fewer FP, than the individual callers. Ensemble3 misses more of the true variants, but reaches the lowest false positive rate among all methods. For indels, freebayes outperforms every other method on every measure, except that it has a higher FP rate than platypus and ensemble3. Ensemble2 outperforms vardict, samtools, and platypus on every measure except for platypus' lower FP rate.
In conclusion, based on these very simple results, we cannot rule out that ensemble2 and ensemble3 are preferable to single variant callers for SNVs. The minimum-calls setting (2 or 3) provides a built-in scale: ensemble2 gives superior results in normal use, with the option of running in a stringent mode (ensemble3) that reduces the number of false positives. This small study is only a sanity check, though: much larger data sets, more diverse variant caller portfolios, and in-depth analyses would be required to verify these claims.
Note: The following image shows the second BED region and is simply a visual confirmation that everything looks reasonable. The GiaB truth calls are the grey bars in the center. The colored bars above the truth calls show the locations of the bcbio variant caller results. The bottom shows the mapped reads.
Note: Several callers present in bcbio were not included in this study. qsnp, tnhaplotypecaller, cn.mops, and cpgbattenberg may be working but require tumor/normal pairs; varscan as installed does not produce quality scores in its VCFs; mutect and gatk are not yet installed; manta seems to want more memory than my VM can give it at the moment; gridss and seq2c seem to be currently under development.