Can I apply the germline variant joint calling workflow to my RNAseq data?

IMPORTANT: This is the legacy GATK documentation. This information is only valid until Dec 31st 2019. For the latest documentation and forum, see the current GATK website.

created by Geraldine_VdAuwera

on 2016-04-01

We have not yet validated the joint genotyping methods (HaplotypeCaller in `-ERC GVCF` mode per-sample then GenotypeGVCFs per-cohort) on RNAseq data. Our standard recommendation is to process RNAseq samples individually as laid out in the RNAseq-specific documentation.

However, we know that a lot of people have been trying out the joint genotyping workflow on RNAseq data, and there do not seem to be any major technical problems. You are welcome to try it on your own data, with the caveat that we cannot guarantee correctness of results, and may not be able to help you if something goes wrong. Please be sure to examine your results carefully and critically.

If you do pursue this, you will need to pre-process your samples according to our RNA-specific documentation, then switch to the GVCF workflow at the HaplotypeCaller stage. For filtering, it will be up to you to determine whether hard filtering or VQSR produces the best results. We have not tested any of this, so we cannot provide a recommendation. Be prepared to do a lot of analysis to validate the quality of your results.
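For anyone who wants to see what the switch to the GVCF workflow looks like in practice, here is a minimal sketch that only builds and prints the two command lines (per-sample HaplotypeCaller in `-ERC GVCF` mode, then per-cohort GenotypeGVCFs). It assumes GATK 3.x syntax, and the jar path, file names, and helper function names are hypothetical placeholders. The `-dontUseSoftClippedBases` flag is included on the assumption that you are following the RNAseq-specific recommendations; drop it if that does not apply to you. None of this has been validated on RNAseq data.

```python
import shlex
from typing import List

# Assumed location of a GATK 3.x jar; adjust for your installation.
GATK_JAR = "GenomeAnalysisTK.jar"


def haplotypecaller_gvcf_cmd(reference: str, bam: str, gvcf: str) -> List[str]:
    """Per-sample step: run HaplotypeCaller in -ERC GVCF mode."""
    return [
        "java", "-jar", GATK_JAR,
        "-T", "HaplotypeCaller",
        "-R", reference,
        "-I", bam,                    # BAM pre-processed per the RNAseq docs
        "-ERC", "GVCF",               # emit a per-sample GVCF
        "-dontUseSoftClippedBases",   # assumption: RNAseq-specific setting
        "-o", gvcf,
    ]


def genotypegvcfs_cmd(reference: str, gvcfs: List[str], out_vcf: str) -> List[str]:
    """Per-cohort step: joint-genotype all per-sample GVCFs together."""
    cmd = ["java", "-jar", GATK_JAR, "-T", "GenotypeGVCFs", "-R", reference]
    for gvcf in gvcfs:
        cmd += ["--variant", gvcf]
    cmd += ["-o", out_vcf]
    return cmd


if __name__ == "__main__":
    # Hypothetical sample names, for illustration only.
    samples = ["sampleA", "sampleB"]
    gvcfs = []
    for name in samples:
        gvcf = name + ".g.vcf"
        print(shlex.join(haplotypecaller_gvcf_cmd("ref.fasta", name + ".bam", gvcf)))
        gvcfs.append(gvcf)
    print(shlex.join(genotypegvcfs_cmd("ref.fasta", gvcfs, "cohort.vcf")))
```

The commands are printed rather than executed so you can inspect them before running anything. Filtering (hard filters or VQSR) would come after the GenotypeGVCFs step and is deliberately left out, since which method works best here is untested.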

Good luck!

From blueskypy on 2016-12-06

Thanks Geraldine! Could you explain what the potential problems would be, in theory, with applying joint calling to RNA-seq data as compared to DNA-seq?

Thanks!

From shlee on 2016-12-06

Hi @blueskypy,

Please take a look at for a brief description. You can also use the Search box at the top right of the forum page to search the term RNA-seq for related discussion. And finally, [here](http://gatkforums.broadinstitute.org/gatk/discussion/comment/32331#Comment_32331)’s one use case that comes to my mind.

From blueskypy on 2016-12-07

Thanks so much, @shlee!

From blueskypy on 2016-12-19

Hi, shlee and Geraldine

To test the benefit of joint calling on RNA-seq data, I’d need a cohort and a gold standard. While 1KG and Genome in a Bottle serve that purpose for DNA-seq, I wonder where such data can be found for RNA-seq. Do you have any recommendations?

Also, joint calling assumes that the samples in a cohort are similar, but I think mRNA reads are more variable than DNA reads, so that assumption might be harder to satisfy for RNA-seq data.

Thanks,

From Geraldine_VdAuwera on 2016-12-21

@blueskypy Sorry, we don’t provide RNAseq resources. There may be something coming down the road, but we’re just not in a position to offer that right now.

We haven’t looked at joint calling in RNAseq so can’t provide any guidance at this point.
