110. Metrics say percent / Doublecheck those decimals / Fractions everywhere

IMPORTANT: This is the legacy GATK documentation. This information is only valid until Dec 31st 2019.

created by shlee

on 2016-03-18

[Picard metrics](https://broadinstitute.github.io/picard/picard-metric-definitions.html) that say ‘percent’ actually mean ‘fraction’. Let’s take metrics from MarkDuplicates as an example. Under PERCENT_DUPLICATION we see 0.134008. If we divide READ_PAIR_DUPLICATES by READ_PAIRS_EXAMINED we get ~1/7 or 14%. Our sanity check makes clear the PERCENT_DUPLICATION metric is a fraction that translates to 13.4%.
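The sanity check above can be sketched in a few lines of Python. The counts used here (18254 duplicate pairs out of 136562 examined pairs) are the ones quoted in the comments for this example; the function name `fraction_to_percent` is only illustrative and is not part of Picard.

```python
def fraction_to_percent(fraction: float) -> float:
    """Convert a Picard-style PERCENT_* value (a fraction in 0..1) to a percentage."""
    return fraction * 100.0


# Value reported under PERCENT_DUPLICATION in the example.
reported_fraction = 0.134008

# Pair counts for the same example, as quoted in the comments.
read_pair_duplicates = 18254
read_pairs_examined = 136562

# Rough sanity check: duplicate pairs over examined pairs.
pair_ratio = read_pair_duplicates / read_pairs_examined

print(f"PERCENT_DUPLICATION as a percentage: {fraction_to_percent(reported_fraction):.1f}%")
print(f"READ_PAIR_DUPLICATES / READ_PAIRS_EXAMINED: {pair_ratio:.6f}")
```

Both numbers land near 0.13, which is what tells us the reported value is a fraction and not 0.13%.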

——

Title is a [haiku](http://www.toyomasu.com/haiku/).

Updated on 2016-03-18

From mzabidi on 2017-12-29

Thanks for this. I never get exactly the same number as PERCENT_DUPLICATION when I compute READ_PAIR_DUPLICATES / READ_PAIRS_EXAMINED, but it’s good enough…

From shlee on 2018-01-02

Hi @mzabidi,

Not getting the same number is concerning. Are you running this on the exact same file multiple times and not getting the same summary metrics?

From mzabidi on 2018-01-12

@shlee ,

What I meant is that when I divide READ_PAIR_DUPLICATES by READ_PAIRS_EXAMINED, I don’t get exactly the same value as PERCENT_DUPLICATION.

Usually they are really close, though.

E.g., in the example above, 18254/136562 is 0.133668 rather than the reported 0.134008.

From Sheila on 2018-01-17

@mzabidi

Hi,

I suspect the tool is not using the filtered reads in its calculation, but the filtered reads are reported in the output.

-Sheila
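Another possible contributor to the small gap, offered as an assumption and not as a confirmed answer: as far as I understand Picard's duplication metrics, PERCENT_DUPLICATION is computed over all examined reads, counting each pair as two reads and adding the unpaired reads, and not over pairs alone. The sketch below assumes that formula. The unpaired counts are made-up placeholders, because the thread does not quote them, so the second result is illustrative only.

```python
def duplication_fraction(unpaired_dups: int, pair_dups: int,
                         unpaired_examined: int, pairs_examined: int) -> float:
    """Fraction of examined reads marked as duplicates.

    Assumed formula: (unpaired dups + 2 * pair dups) /
                     (unpaired examined + 2 * pairs examined).
    """
    total_reads = unpaired_examined + 2 * pairs_examined
    if total_reads == 0:
        return 0.0  # nothing examined, so avoid dividing by zero
    return (unpaired_dups + 2 * pair_dups) / total_reads


# Pairs only: reduces to READ_PAIR_DUPLICATES / READ_PAIRS_EXAMINED.
pairs_only = duplication_fraction(0, 18254, 0, 136562)

# Hypothetical unpaired counts (NOT from the thread) to show how they shift the value.
with_unpaired = duplication_fraction(120, 18254, 500, 136562)

print(f"pairs only:    {pairs_only:.6f}")
print(f"with unpaired: {with_unpaired:.6f}")
```

If unpaired reads have a different duplication rate than the pairs, the reported fraction will drift away from the simple pair ratio, which is the kind of small difference described above.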
