IMPORTANT: This is the legacy GATK documentation. This information is only valid until Dec 31st, 2019.
created by ebanks on 2012-12-17
GATK 2.3 was released on December 17, 2012. Highlights are listed below. Read the detailed version history overview here: http://www.broadinstitute.org/gatk/guide/version-history
Base Quality Score Recalibration
- Soft clipped bases are no longer counted in the delocalized BQSR.
- The user can now set the maximum allowable cycle with the --maximum_cycle_value argument (see the example command line after this list).
- Minor (5%) run time improvements to the Unified Genotyper.
- Fixed bug in the indel model that occurred when long reads (e.g. Sanger reads) in a pileup led to a read starting after the haplotype.
- Fixed bug in the exact AF calculation where log10pNonRefByAllele should really be log10pRefByAllele.
- Fixed GENOTYPE_GIVEN_ALLELES mode, which often produced incorrect output when passed complex events (see the example command line after this list).
- Fixed the interaction with the allele biased downsampling (for contamination removal) so that the removed reads are not used for downstream annotations.
- Implemented minor (5-10%) run time improvements to the Haplotype Caller.
- Fixed the logic for determining active regions, which was broken when intervals were used.
- The FisherStrand annotation ignores reduced reads (because they are always on the forward strand).
- Can now be run multi-threaded with the -nt argument.
- Fixed bug where sometimes the start position of a reduced read was less than 1.
- ReduceReads now co-reduces bams if they're passed in together with multiple -I arguments (see the example command line after this list).
- Fixed the case where the PRIORITIZE option is used but no priority list is given.
- Fixed bug where the AD wasn't being printed correctly in the MV output file.
- A brand new version of the per site down-sampling functionality has been implemented that works much, much better than the previous version.
- More efficient initial file seeking at the beginning of the GATK traversal.
- Fixed the compression of VCF.gz output, which was too big because of an unnecessary call to flush().
- The allele biased downsampling (for contamination removal) has been rewritten to be smarter; also, it no longer aborts if there's a reduced read in the pileup.
- Made a major performance improvement to the GATK engine by fixing a problem with the NanoSchedule timing code.
- Added checking in the GATK for mis-encoded quality scores.
- Fixed downsampling in the ReadBackedPileup class.
- Fixed the parsing of genome locations that contain colons in the contig names (which is allowed by the spec).
- Made ID an allowable INFO field key in our VCF parsing.
- Multi-threaded VCF to BCF writing no longer produces an invalid intermediate file that fails on merging.
- Picard jar remains at version 1.67.1197.
- Tribble jar updated to version 119.
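For reference, here is a minimal BaseRecalibrator command line showing where the new maximum-cycle argument goes; the file names are placeholders and 1000 is just an illustrative value, so adjust both for your own data:

```
java -jar GenomeAnalysisTK.jar \
   -T BaseRecalibrator \
   -R reference.fasta \
   -I input.bam \
   -knownSites dbsnp.vcf \
   --maximum_cycle_value 1000 \
   -o recal_data.grp
```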
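Similarly, a sketch of running the Unified Genotyper in GENOTYPE_GIVEN_ALLELES mode; again the file names are placeholders, and the VCF passed to --alleles is whatever call set you want to genotype at:

```
java -jar GenomeAnalysisTK.jar \
   -T UnifiedGenotyper \
   -R reference.fasta \
   -I input.bam \
   --genotyping_mode GENOTYPE_GIVEN_ALLELES \
   --alleles known_sites.vcf \
   -glm BOTH \
   -o output.vcf
```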
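And a sketch of co-reducing several bams with ReduceReads by passing multiple -I arguments (placeholder file names again):

```
java -jar GenomeAnalysisTK.jar \
   -T ReduceReads \
   -R reference.fasta \
   -I sample1.bam \
   -I sample2.bam \
   -o reduced.bam
```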
official, release-notes
Updated on 2012-12-18
From severin on 2012-12-20
In the new release, does downsampling do anything for the HaplotypeCaller?
Can we still use `--enable_experimental_downsampling`?
What is your recommendation?
From Geraldine_VdAuwera on 2012-12-20
You no longer need to use `--enable_experimental_downsampling` for anything; the experimental downsampling is now the regular downsampling (see Version history/Version highlights for details) and is used by default by all tools that downsample reads.
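For example, a plain HaplotypeCaller command line like the one below (file names are placeholders) picks up the new downsampling engine automatically and needs no downsampling-related arguments at all:

```
java -jar GenomeAnalysisTK.jar \
   -T HaplotypeCaller \
   -R reference.fasta \
   -I input.bam \
   -o output.vcf
```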