created by GATK_Team
on 2017-12-28
It sure seems like everyone has a need for speed these days. So, there are two main ways to get your analysis results faster:
Technically there's also a third option (which I think of as the turd option, personally): cut corners by skipping steps and/or compromising on quality. But that's a topic for another time, another doc...
Due to the extreme variety of infrastructure and use cases out in the world, we don't give specific guidelines for the type and configuration of hardware setup you should use to run GATK, because that's outside the scope of what we can reasonably provide with the resources we have.
We do however share the WDL workflows that we use in production to run the GATK Best Practices pipelines. These scripts feature the parallelization strategies that we chose to implement for each pipeline, and the accompanying example input JSON files include the parameter settings for hardware resources that we use on the Google Cloud Platform. You can even run these workflows for yourself the same way we do through FireCloud. FireCloud is a secure, freely accessible cloud-based analysis portal developed at the Broad Institute. It includes preconfigured GATK Best Practices pipelines as well as tools for building your own custom pipelines (with any command line tool you want, not just GATK).
Alternatively, our collaborators at the Intel-Broad Center for Genomic Data Engineering have done a ton of benchmarking and can provide you with recommended hardware configurations for local infrastructure based on your planned usage. Let us know in the comment thread if you'd like us to introduce you.
From mikedamour on 2018-07-31
Hi BITeam,
I have read through and like the idea of the free cloud portal (Thanks!). Right now I don’t have a project to charge GCP time, so need to use my Mac (i7/4core, plenty of RAM/SSD) for some .org cancer work. I see the HaplotypeCallerSpark beta – nice work! Any multithread work on Mutect2 for MacOSX?
Glad to be beta on that. Any timeline?
Best, Mike D’Amour
From Sheila on 2018-08-06
@mikedamour
Hi Mike,
There are talks of “Sparkifying” Mutect2, but it has not happened yet. Have a look at [this issue ticket](https://github.com/broadinstitute/gatk/issues/4325) for more information. Perhaps if you post there, it may resurrect the discussion.
-Sheila
From mikedamour on 2018-10-09
Oops, just noticed your reply. Thanks, Sheila. Have gotten a billing account and like using FireCloud and GCP better than waiting on my machine.
Best, Mike D’