created by shlee
on 2016-03-18
[Picard metrics](https://broadinstitute.github.io/picard/picard-metric-definitions.html) that say ‘percent’ actually mean ‘fraction’. Let’s take metrics from MarkDuplicates as an example. Under PERCENT_DUPLICATION we see 0.134008. If we divide READ_PAIR_DUPLICATES by READ_PAIRS_EXAMINED we get ~1/7 or 14%. Our sanity check makes clear the PERCENT_DUPLICATION metric is a fraction that translates to 13.4%.
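The sanity check can be reproduced in a few lines. This is a minimal sketch, assuming the counts quoted further down this thread for the same example (18254 duplicate pairs out of 136562 examined pairs); the variable names are illustrative and not part of Picard.

```python
# Counts quoted later in this thread for the example above (assumed to be the same run).
read_pair_duplicates = 18254
read_pairs_examined = 136562
reported = 0.134008  # value shown under PERCENT_DUPLICATION

# Dividing duplicates by examined pairs gives a fraction between 0 and 1.
fraction = read_pair_duplicates / read_pairs_examined
print(f"pair-based fraction: {fraction:.6f}")

# Multiplying the reported value by 100 gives the actual percentage.
reported_percent = f"{reported * 100:.1f}%"
print(f"reported value as a percent: {reported_percent}")
```

The pair-based fraction lands close to the reported value, which is only plausible if the reported value is a fraction and not a percentage.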
——
Title is a [haiku](http://www.toyomasu.com/haiku/).
Updated on 2016-03-18
From mzabidi on 2017-12-29
Thanks for this. I never get exactly the same number as PERCENT_DUPLICATION when I compute READ_PAIR_DUPLICATES / READ_PAIRS_EXAMINED, but it’s good enough…
From shlee on 2018-01-02
Hi @mzabidi,
Not getting the same number is concerning. Are you running this on the exact same file multiple times and getting different summary metrics?
From mzabidi on 2018-01-12
@shlee ,
What I meant is that when I divide READ_PAIR_DUPLICATES by READ_PAIRS_EXAMINED, I don’t get exactly the same value as PERCENT_DUPLICATION.
Usually they are really close, though.
For example, in the case above, 18254/136562 is 0.133668 rather than the reported 0.134008.
From Sheila on 2018-01-17
@mzabidi
Hi,
I suspect the tool is not using the filtered reads in its calculation, but the filtered reads are reported in the output.
-Sheila
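One way to explore the small gap is to check whether reads outside the pair counts change the ratio. The sketch below is an assumption, not a confirmed reading of the Picard source: it counts each pair as two reads and adds unpaired reads to both numerator and denominator. The function name and the unpaired counts are hypothetical, since the thread does not give them.

```python
def duplication_fraction(unpaired_dups, pair_dups, unpaired_examined, pairs_examined):
    """Assumed formula: each pair counts as two reads, and unpaired reads are included."""
    total_reads = unpaired_examined + 2 * pairs_examined
    if total_reads == 0:
        return 0.0
    return (unpaired_dups + 2 * pair_dups) / total_reads

# With no unpaired reads, this reduces to the pair-only ratio from the thread.
pairs_only = duplication_fraction(0, 18254, 0, 136562)
print(f"pairs only: {pairs_only:.6f}")

# Hypothetical unpaired counts (not from the thread) shift the result.
with_unpaired = duplication_fraction(50, 18254, 100, 136562)
print(f"with hypothetical unpaired reads: {with_unpaired:.6f}")
```

If the metrics file from the run reports nonzero unpaired counts, plugging them in shows whether they account for the difference.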