created by Geraldine_VdAuwera
on 2017-06-29
The GATK4 beta version command-line tools are provided as a single executable jar file. You can download a zipped package containing the jar file from this Github link (GATK4 Download page coming soon). Once you unzip the package, you will find four files inside the resulting directory:
gatk-launch gatk-package-4.beta.x-local.jar gatk-package-4.beta.x-spark.jar README.md
where x is the minor release version in the jar file names.
Now you may ask, why are there two jars? As the names suggest, gatk-package-4.beta.x-spark.jar is the jar for running Spark tools on a Spark cluster, while gatk-package-4.beta.x-local.jar is the jar that is used for everything else (including running Spark tools "locally", i.e. on a regular server or cluster).
So does that mean you have to specify which one you want to run each time? Nope! See the gatk-launch file in there? That's an executable wrapper script that you invoke and that will choose the appropriate jar for you based on the rest of your command line. You can still invoke a specific jar if you want, but using gatk-launch is easier, and it will also take care of setting some parameters that you would otherwise have to specify manually. We'll talk about that in a minute.
There is no installation necessary in the traditional sense, since the precompiled jar files should work on any POSIX platform (NOT Microsoft Windows!) equipped with the appropriate version of Java (see below). You'll simply need to unzip the downloaded package and place the folder containing the jar files in a convenient directory on your hard drive (or server). The jars themselves cannot simply be added to your PATH, but you can add the directory that contains the gatk-launch wrapper script. Please look up instructions depending on the terminal shell you use; in bash the typical syntax is
export PATH=$PATH:/path/to/gatk
where /path/to/gatk is the directory that contains the gatk-launch executable (PATH entries are directories, not individual files). Note that the jars must remain in the same directory as gatk-launch for it to work.
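As a minimal sketch of the PATH setup described above, the snippet below adds the package directory to PATH and checks whether the shell can find the wrapper. The GATK_DIR variable and its default location are assumptions for illustration only; substitute the directory where you actually unzipped the package.

```shell
# Hypothetical location of the unzipped GATK4 beta package; adjust to your system.
GATK_DIR="${GATK_DIR:-$HOME/gatk-4.beta.x}"

# PATH entries are directories, so add the folder that contains gatk-launch,
# not the gatk-launch file itself.
export PATH="$PATH:$GATK_DIR"

# Report whether the shell can now resolve the wrapper script.
if command -v gatk-launch >/dev/null 2>&1; then
    echo "gatk-launch found at: $(command -v gatk-launch)"
else
    echo "gatk-launch not found; check that $GATK_DIR contains it and that it is executable"
fi
```

To make the change permanent in bash, the export line is typically placed in a startup file such as ~/.bashrc.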
Important note about Java version
For the tools to run properly, you must have Java 8 / JDK or JRE 1.8 installed. To check your java version, open your terminal application and run the following command:
java -version
If the output looks something like java version "1.8.x_y", you are good to go. If not, you may need to change your version. You can download a suitable upgrade either from Oracle or from OpenJDK. To be clear, OpenJDK is now fully supported.
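If you want to script this check, the following is a rough sketch rather than an official GATK utility. The is_java8 helper is a hypothetical name introduced here; it only looks for the 1.8 prefix in the version string, which is how both Oracle and OpenJDK builds of Java 8 report themselves.

```shell
# Hypothetical helper: succeeds if the given `java -version` output
# reports a 1.8 (Java 8) runtime.
is_java8() {
    case "$1" in
        *'version "1.8.'*) return 0 ;;
        *) return 1 ;;
    esac
}

# java -version writes to stderr, so redirect it to capture the first line.
if command -v java >/dev/null 2>&1; then
    version_output="$(java -version 2>&1 | head -n 1)"
    if is_java8 "$version_output"; then
        echo "Java 8 detected: $version_output"
    else
        echo "Not Java 8: $version_output"
    fi
else
    echo "java not found on PATH"
fi
```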
To test that you can run GATK tools, run the following command in your terminal application from the directory that contains gatk-launch (if you have added that directory to your PATH, you can omit the leading ./ and run the command from anywhere):
./gatk-launch --help
This will output a summary of the GATK4 invocation syntax, options for listing tools and invoking a specific tool's help documentation, and main Spark options.
Tools are invoked as follows:
./gatk-launch ToolName -OPTION1 value1 -OPTION2 value2
If you have previously used older GATK versions, you'll notice that ToolName is no longer passed with -T and that it is now positional: the tool name must always be the first thing you write after the ./gatk-launch part (or the jar file if you're invoking the jar directly).
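To make the argument order concrete, here is a small dry-run sketch. The show_gatk_cmd function is a hypothetical helper written for this illustration: it only prints the command line it would build and does not run GATK. ToolName and the options are the same placeholders used above.

```shell
# Hypothetical dry-run helper: prints the command line instead of executing it,
# so the argument order can be inspected without GATK installed.
show_gatk_cmd() {
    tool="$1"
    shift
    # The tool name comes first, directly after the launcher, with no -T flag.
    echo "gatk-launch $tool $*"
}

# Older GATK versions passed the tool with -T, along the lines of:
#   java -jar GenomeAnalysisTK.jar -T ToolName -OPTION1 value1
# In GATK4 the tool name is positional:
show_gatk_cmd ToolName -OPTION1 value1 -OPTION2 value2
```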
Available tools are all listed in the Tool Documentation section, which is versioned; on the website, use the orange dropdown menu button to switch between versions. This provides a complete list of tools with usage recommendations, options, and example commands.
Docker images for GATK4 releases can be found at https://hub.docker.com/r/broadinstitute/gatk/
Updated on 2017-06-29
From Yanjane on 2017-07-11
Hi,
Happy to see the first release of GATK4 beta here.
I downloaded it and ran some tests, which performed well. However, when I ran BaseRecalibrator and ApplyBQSR with -nt or -nct, I got this:
http://gatkforums.broadinstitute.org/gatk/utility/thumbnail/3061/s3://uploads/FileUpload/ed/47cbb3540756032bf4099b3290807f.png
It runs correctly if I remove this parameter, but very slowly.
Has -nt/-nct changed, or does this release of GATK4 not support this function?
Thanks, and looking forward to the formal release of GATK4.
From Geraldine_VdAuwera on 2017-07-11
We have removed the -nt/-nct multithreading functionality and replaced it with Spark support. There is some preliminary documentation about the Spark functionality in the README on GitHub; we’ll write up something more detailed in the coming weeks.
From mcvu on 2018-12-03
Hi,
I’ve just downloaded GATK-4.0.11.0. Within the zip file there is no "gatk-launch".
Please advise.
Thanks
From mcvu on 2018-12-05
@Geraldine_VdAuwera any thoughts as to what is going on?
From Geraldine_VdAuwera on 2018-12-05
That was the beta syntax. It was changed to just gatk in the full release, a year ago.