created by Geraldine_VdAuwera
on 2015-10-25
Bait bias (single bait bias or reference bias artifact) is a type of artifact that affects data generated through hybrid selection methods.
These artifacts occur during or after the target selection step. They show up as substitution rates that are higher at sites having one base on the reference/positive strand than at sites having the complementary base on that strand. For example, a G>T artifact during the target selection step might result in a higher (G>T)/(C>A) substitution rate at sites with a G on the positive strand (and C on the negative), relative to sites with the flip (C positive)/(G negative). This is known as the “G-Ref” artifact.
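To make the comparison above concrete, here is a minimal Python sketch of how such a bias could be quantified. It is loosely modeled on how Picard's bait bias metrics appear to be defined (the forward-context rate minus the reverse-context rate, floored at 1e-10, then Phred-scaled), and that reading is an assumption. The function and parameter names are hypothetical, and the counts in the usage lines are invented purely for illustration.

```python
import math

MIN_RATE = 1e-10  # assumed floor that keeps the rate positive before taking the log


def substitution_rate(alt_count, ref_count):
    """Fraction of observed bases that show the substitution, floored at MIN_RATE."""
    total = alt_count + ref_count
    if total == 0:
        return MIN_RATE
    return max(MIN_RATE, alt_count / total)


def bait_bias(fwd_ref, fwd_alt, rev_ref, rev_alt):
    """Return (error_rate, qscore) for one substitution type, e.g. G>T.

    fwd_* are counts at sites with the reference base on the positive strand
    (G-ref sites for G>T); rev_* are counts of the complementary substitution
    (C>A) at sites with the complementary base on that strand (C-ref sites).
    """
    fwd_rate = substitution_rate(fwd_alt, fwd_ref)
    rev_rate = substitution_rate(rev_alt, rev_ref)
    # The bias is the excess of the forward rate over the reverse rate.
    error_rate = max(MIN_RATE, fwd_rate - rev_rate)
    # Phred scale: a lower Q-score means a stronger artifact.
    qscore = -10 * math.log10(error_rate)
    return error_rate, qscore


# Invented counts: G>T is seen at 1% of G-ref sites, C>A at 0.1% of C-ref sites.
biased_rate, biased_q = bait_bias(99_000, 1_000, 99_900, 100)

# Invented counts: both orientations show the same rate, so there is no bias.
flat_rate, flat_q = bait_bias(99_900, 100, 99_900, 100)
```

With these made-up numbers the biased case gives a Q-score of roughly 20, while the unbiased case hits the floor and gives a Q-score of about 100. This matches the idea that low Q-scores flag a strong artifact.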
Updated on 2016-03-07
From joneskm4 on 2016-05-05
I think we are seeing this type of artifact in some of our exome sequencing data, but I’m a little unclear on what the actual cause is. What causes the substitution during the target selection step?
From dekling on 2016-05-10
Don’t hold my feet to the fire on this, but I believe the errors are introduced to the bait from sample handling. Essentially, guanines on the bait sequence are sensitive to oxidation from extraction agents, heat, etc. This can cause some guanine nucleotides to become 8-oxoguanine (8-OxoG, OxoG) nucleotides. These modified guanines can base-pair with T instead of C as would normally be expected. Thus, during PCR, this error is propagated. Since the G is sensitive to oxidation, you will likely see a higher frequency of G->A than C->T. Is this helpful?
From aerijman on 2019-04-29
Are “bait-bias artifacts” substitutions that are thought to have happened to the PROBES that were used to fish out or ENRICH the DNA sample?
From joneskm4 on 2019-05-07
Circling back around on this because we are seeing this happen again, and I don’t feel like I ever got a clear answer on what causes this. I also can’t find much in the literature about it. Is the “G-ref” artifact caused by damage to the capture probes/baits? The description says it can happen “during or after the target selection step”, and that a “G>T artifact during the target selection step” can cause it. But that doesn’t really explain, in my mind, when and how the artifact is being introduced.
From joneskm4 on 2019-05-07
And I should clarify: what we are seeing are definitely G>T artifacts, not G>A/C>T, which are OxoG artifacts. Picard is flagging these samples as having low qscores for baitbias/G>T changes, too. So I think we are seeing this artifact; I just don’t understand its origin.
From joneskm4 on 2019-05-08
After reading the pre-adapter bias documentation, it looks like oxidative damage can show up as G>T or C>A changes as well.
So, how do you know if elevated G>T rates are OxoG artifacts, or G-ref artifacts, and what’s the difference?