created by Sheila
on 2015-08-18
You specified -A <some annotation>
in a command line invoking one of the annotation-capable tools (HaplotypeCaller, MuTect2, UnifiedGenotyper and VariantAnnotator), but that annotation did not show up in your output VCF.
Keep in mind that all annotations that are necessary to run our Best Practices are annotated by default, so you should generally not need to request annotations unless you're doing something a bit special.
There can be several reasons why this happens, depending on the tool, the annotation, and you data. These are the four we see most often; if you encounter another that is not listed here, let us know in the comments.
-A
.Updated on 2016-04-13
From johnma on 2016-09-12
I’m not sure whether the behavior mentioned in point (4) is still a MuTect2 behavior in 3.6-0. My experience is that if I supply a list of -A’s to the command line, then the default annotations won’t be written. This means, if I want to output some annotations on top of the default, I’d have to request for all the annotations, including those marked as default.
From shlee on 2016-09-12
Hi @johnma,
Can you give us the list of `-A`s that you are supplying where you see the default annotations being dropped? Thanks.
From Sheila on 2016-09-12
@johnma
Hi,
Yes, I think that must still be an issue in MuTect2. For now, you will need to use the workaround. I will put in a request to have this changed in the future.
-Sheila
From johnma on 2016-09-13
@shlee
Hi,
My command used is as follows:
````java -Xmx64G -jar $GATK_JAR -T MuTect2 -R $REFGENOME -I:tumor $TUMOR_INPUT -I:normal $NORMAL_INPUT —dbsnp “All_20160527.vcf.gz” —cosmic “Cosmic_v78_GRCh38_all_variants.vcf.gz” -A DepthPerAlleleBySample -A BaseQualitySumPerAlleleBySample -A TandemRepeatAnnotator -A OxoGReadCounts -o $MUTECT2_OUTPUT_FILE````
The log does not seem remarkable with the exception of the “Null qss” error that I’d expect from FisherStrand.
````INFO 16:37:40,286 HelpFormatter – —————————————————————————————————————————
INFO 16:37:40,291 HelpFormatter – The Genome Analysis Toolkit (GATK) v3.6-0-g89b7209, Compiled 2016/06/01 22:27:29
INFO 16:37:40,291 HelpFormatter – Copyright © 2010-2016 The Broad Institute
INFO 16:37:40,291 HelpFormatter – For support and documentation go to https://www.broadinstitute.org/gatk
INFO 16:37:40,292 HelpFormatter – [Fri Sep 09 16:37:40 CDT 2016] Executing on Linux 2.6.32-431.23.3.el6.x86_64 amd64
INFO 16:37:40,292 HelpFormatter – Java HotSpot™ 64-Bit Server VM 1.8.0_45-b14 JdkDeflater
INFO 16:37:40,297 HelpFormatter – Program Args: -T MuTect2 -R GRCh38.primary_assembly.genome.fa -I:tumor [redacted]_aligned.sorted.markduplicates.realigned.recal.bam -I:normal [redacted]_aligned.sorted.markduplicates.realigned.recal.bam —dbsnp All_20160527.vcf.gz —cosmic Cosmic_v78_GRCh38_all_variants.vcf.gz -A QualByDepth -A MappingQualityRankSumTest -A ReadPosRankSumTest -A FisherStrand -A StrandOddsRatio -o [redacted]
INFO 16:37:40,311 HelpFormatter – Executing as [redacted] on Linux 2.6.32-431.23.3.el6.x86_64 amd64; Java HotSpot™ 64-Bit Server VM 1.8.0_45-b14.
INFO 16:37:40,311 HelpFormatter – Date/Time: 2016/09/09 16:37:40
INFO 16:37:40,311 HelpFormatter – —————————————————————————————————————————
INFO 16:37:40,312 HelpFormatter – —————————————————————————————————————————
INFO 16:37:40,385 GenomeAnalysisEngine – Strictness is SILENT
INFO 16:37:40,558 GenomeAnalysisEngine – Downsampling Settings: Method: BY_SAMPLE, Target Coverage: 1000
INFO 16:37:40,569 SAMDataSource$SAMReaders – Initializing SAMRecords in serial
INFO 16:37:40,672 SAMDataSource$SAMReaders – Done initializing BAM readers: total time 0.10
WARN 16:37:41,119 IndexDictionaryUtils – Track cosmic doesn’t have a sequence dictionary built in, skipping dictionary validation
WARN 16:37:41,120 IndexDictionaryUtils – Track dbsnp doesn’t have a sequence dictionary built in, skipping dictionary validation
INFO 16:37:41,230 GenomeAnalysisEngine – Preparing for traversal over 2 BAM files
INFO 16:37:41,676 GenomeAnalysisEngine – Done preparing for traversal
INFO 16:37:41,676 ProgressMeter – [INITIALIZATION COMPLETE; STARTING PROCESSING]
INFO 16:37:41,677 ProgressMeter – | processed | time | per 1M | | total | remaining
INFO 16:37:41,678 ProgressMeter – Location | active regions | elapsed | active regions | completed | runtime | runtime
INFO 16:37:41,796 MuTect2 – Using global mismapping rate of 45 => -4.5 in log10 likelihood units
Using un-vectorized C++ implementation of PairHMM
INFO 16:37:44,703 VectorLoglessPairHMM – libVectorLoglessPairHMM unpacked successfully from GATK jar file
INFO 16:37:44,704 VectorLoglessPairHMM – Using vectorized implementation of PairHMM
ERROR 16:37:50,339 MuTect2 – Null qss at 867116
…
INFO 11:18:09,312 VectorLoglessPairHMM – Time spent in setup for JNI call : 64.688102505
INFO 11:18:09,312 PairHMM – Total compute time in PairHMM computeLikelihoods() : 186546.49047094802
INFO 11:18:09,313 MuTect2 – Ran local assembly on 44824 active regions
INFO 11:18:09,576 ProgressMeter – done 3.099750718E9 66.7 h 77.0 s 100.0% 66.7 h 0.0 s
INFO 11:18:09,576 ProgressMeter – Total runtime 240027.90 secs, 4000.46 min, 66.67 hours
INFO 11:18:09,577 MicroScheduler – 25818756 reads were filtered out during the traversal out of approximately 143208294 total reads (18.03%)
INFO 11:18:09,577 MicroScheduler – -> 0 reads (0.00% of total) failing BadCigarFilter
INFO 11:18:09,577 MicroScheduler – -> 25481557 reads (17.79% of total) failing DuplicateReadFilter
INFO 11:18:09,578 MicroScheduler – -> 0 reads (0.00% of total) failing FailsVendorQualityCheckFilter
INFO 11:18:09,578 MicroScheduler – -> 0 reads (0.00% of total) failing MalformedReadFilter
INFO 11:18:09,578 MicroScheduler – -> 0 reads (0.00% of total) failing MappingQualityUnavailableFilter
INFO 11:18:09,579 MicroScheduler – -> 0 reads (0.00% of total) failing NotPrimaryAlignmentFilter
INFO 11:18:09,579 MicroScheduler – -> 337199 reads (0.24% of total) failing UnmappedReadFilter````
From shlee on 2016-09-14
Thanks @johnma.