created by Geraldine_VdAuwera
on 2013-07-02
Objective
Install all software packages required to follow the GATK Best Practices.
Prerequisites
To follow these instructions, you will need to have a basic understanding of the meaning of the following words and command-line operations. If you are unfamiliar with any of the following, you should consult a more experienced colleague or your systems administrator if you have one. There are also many good online tutorials you can use to learn the necessary notions.
You will also need to have access to an ANSI compliant C++ compiler and the tools needed for normal compilations (make, shell, the standard library, tar, gunzip). These tools are usually pre-installed on Linux/Unix systems. On MacOS X, you may need to install the Xcode tools. See https://developer.apple.com/xcode/ for relevant information and software downloads. The Xcode tools are free but an Apple ID may be required to download them.
Starting with version 3.6, the GATK requires Java Runtime Environment version 1.8 (Java 8). Previous versions down to 2.6 required JRE 1.7, and earlier versions required 1.6. All Linux/Unix and MacOS X systems should have a JRE pre-installed, but the version may vary. To test your Java version, run the following command in the shell:
java -version
This should return a message along the lines of "java version 1.8.0_25" as well as some details on the Runtime Environment (JRE) and Virtual Machine (VM). If you have a version that does not match the requirements stated above for the version of GATK you are running, the GATK may not run correctly or at all. The simplest solution is to install an additional JRE and specify which you want to use at the command-line. To find out how to do so, you should seek help from your systems administrator.
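As a rough illustration of the version check, the snippet below extracts the major Java version from a version string like the one reported by java -version. The function name java_major is made up for this sketch, and it assumes the two common numbering styles (1.8.0_25 for Java 8 and earlier, 9.0.1 or 11.0.2 for later releases).

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of GATK or Java): print the major Java
# version for a version string such as 1.8.0_25 or 11.0.2.
java_major() {
    local version="$1"
    case "$version" in
        1.*) version="${version#1.}" ;;   # old style: 1.8.0_25 -> 8.0_25
    esac
    echo "${version%%[._-]*}"             # keep what precedes the first separator
}

# Example with a hard-coded string. On a real system the string would come
# from `java -version 2>&1`, since Java writes this message to stderr.
java_major "1.8.0_25"
```

For GATK 3.6 and later, the number printed here would need to be 8.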
Software packages
Note that the version numbers of packages you download may be different than shown in the instructions below. If so, please adapt the number accordingly in the commands.
Read the overview of the BWA software on the BWA project homepage, then download the latest version of the software package.
Unpack the tar file using:
tar xvjf bwa-0.7.12.tar.bz2
This will produce a directory called bwa-0.7.12 containing the files necessary to compile the BWA binary. Move to this directory and compile using:
cd bwa-0.7.12
make
The compiled binary is called bwa. You should find it within the same folder (bwa-0.7.12 in this example). You may also find other compiled binaries; at time of writing, a second binary called bwamem-lite is also included. You can disregard this file for now. Finally, just add the BWA binary to your path to make it available on the command line. This completes the installation process.
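Adding a directory to the path can be sketched as follows. The directory $HOME/my_tools/bwa-0.7.12 and the function name add_to_path are assumptions for illustration; substitute the folder where you compiled BWA. To make the change permanent, lines like these would typically go in your shell profile (for example ~/.bashrc if you use bash).

```shell
# Hypothetical helper: prepend a directory to PATH unless it is already there.
add_to_path() {
    case ":$PATH:" in
        *":$1:"*) ;;                      # already present, nothing to do
        *) PATH="$1:$PATH" ;;
    esac
    export PATH
}

# Assumed install location; adjust to wherever bwa was compiled.
add_to_path "$HOME/my_tools/bwa-0.7.12"
```

Checking for an existing entry first keeps the path from growing every time the profile is re-read.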
Open a shell and run:
bwa
This should print out some version and author information as well as a list of commands. As the Usage line states, to use BWA you will always build your command lines like this:
bwa <command> [options]
This means you first make the call to the binary (bwa), then you specify which command (method) you wish to use (e.g. index), then any options (i.e. arguments such as input files or parameters) used by the program to perform that command.
Read the overview of the SAMtools software on the SAMtools project homepage, then download the latest version of the software package.
Unpack the tar file using:
tar xvjf samtools-0.1.2.tar.bz2
This will produce a directory called samtools-0.1.2 containing the files necessary to compile the SAMtools binary. Move to this directory and compile using:
cd samtools-0.1.2
make
The compiled binary is called samtools. You should find it within the same folder (samtools-0.1.2 in this example). Finally, add the SAMtools binary to your path to make it available on the command line. This completes the installation process.
Open a shell and run:
samtools
This should print out some version information as well as a list of commands. As the Usage line states, to use SAMtools you will always build your command lines like this:
samtools <command> [options]
This means you first make the call to the binary (samtools), then you specify which command (method) you wish to use (e.g. index), then any options (i.e. arguments such as input files or parameters) used by the program to perform that command. This is the same convention as used by BWA.
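One way to confirm that both binaries are reachable from any directory is to ask the shell where it finds them. The sketch below relies on the standard command -v builtin; the function name check_on_path is invented for this example.

```shell
# Hypothetical helper: report, for each named tool, whether the shell can
# find it on PATH. Returns non-zero if any tool is missing.
check_on_path() {
    local missing=0 tool location
    for tool in "$@"; do
        if location="$(command -v "$tool")"; then
            echo "found:   $tool -> $location"
        else
            echo "missing: $tool"
            missing=1
        fi
    done
    return "$missing"
}

# Prints one line per tool; '|| true' keeps a script going if one is missing.
check_on_path bwa samtools || true
```

A "missing" line usually means the directory holding the binary has not been added to the path in the current shell session.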
Read the overview of the Picard software on the Picard project homepage, then download the latest version (currently 2.4.1) of the package containing the pre-compiled program file (the picard-tools-2.x.y.zip file).
Unpack the zip file using:
unzip picard-tools-2.4.1.zip
This will produce a directory called picard-tools-2.4.1 containing the Picard jar files. Picard tools are distributed as a pre-compiled Java executable (jar file) so there is no need to compile them.
Note that it is not possible to add jar files to your path to make the tools available on the command line; you have to specify the full path to the jar file in your java command, which would look like this:
java -jar ~/my_tools/jars/picard.jar <Toolname> [options]
This syntax will be explained in a little more detail further below.
However, you can set up a shortcut called an "environment variable" in your shell profile configuration to make this easier. The idea is that you create a variable that tells your system where to find a given jar, like this:
export PICARD="$HOME/my_tools/jars/picard.jar"
So then when you want to run a Picard tool, you just need to call the jar by its shortcut, like this:
java -jar $PICARD <Toolname> [options]
The exact way to set this up depends on what shell you're using and how your environment is configured. We like this overview and tutorial which explains how it all works; but if you are new to the command line environment and you find this too much to deal with, we recommend asking for help from your institution's IT support group.
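For bash, one possible setup is sketched below; it is an illustration rather than the only way to do it. The jar location matches the hypothetical ~/my_tools/jars path used above, and the wrapper function run_picard is an invented name. Note that the shell does not accept spaces around the = sign, and that $HOME is used because ~ is not expanded inside double quotes.

```shell
# Lines like these could go in ~/.bashrc (assumed profile file for bash).
export PICARD="$HOME/my_tools/jars/picard.jar"   # assumed jar location

# Hypothetical wrapper: fail early with a clear message if the jar is
# missing, otherwise pass all arguments through to Picard.
run_picard() {
    if [ ! -f "$PICARD" ]; then
        echo "Picard jar not found at: $PICARD" >&2
        return 1
    fi
    java -jar "$PICARD" "$@"
}
```

With this in place, a call would look like run_picard <ToolName> [options]. The early check turns Java's "Unable to access jarfile" error into a message that shows exactly which path was tried.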
This completes the installation process.
Open a shell and run:
java -jar picard.jar -h
This should print out some usage information as well as a list of the tools included in Picard. At this point you will have noticed an important difference between BWA and Picard tools. To use BWA, we called on the BWA program and specified which of its internal tools we wanted to apply. To use Picard, we called on Java itself as the main program, then specified which jar file to use, and only then named the tool we wanted from within that jar. This applies to all Picard tools; to use them you will always build your command lines like this:
java -jar picard.jar <ToolName> [options]
This means you first make the call to Java itself as the main program, then specify the picard.jar file, then specify which tool you want, and finally you pass whatever other arguments (input files, parameters etc.) are needed for the analysis.
Note that the command-line syntax of Picard tools has recently changed from java -jar <ToolName>.jar to java -jar picard.jar <ToolName>. We are using the newer syntax in this document, but some of our other documents may not have been updated yet. If you encounter any documents using the old syntax, let us know and we'll update them accordingly. If you are already using an older version of Picard, either adapt the commands or, better, upgrade your version!
Next we will see that GATK tools are called in essentially the same way, although the way the options are specified is a little different. The reasons for how tools in a given software package are organized and invoked are largely due to the preferences of the software developers. They generally do not reflect strict technical requirements, although they can have an effect on speed and efficiency.
Hopefully if you're reading this, you're already acquainted with the purpose of the GATK, so go ahead and download the latest version of the software package.
In order to access the downloads, you need to register for a free account on the GATK support forum. You will also need to read and accept the license agreement before downloading the GATK software package. Note that if you intend to use the GATK for commercial purposes, you will need to purchase a license. See the licensing page for an overview of the commercial licensing conditions.
Unpack the tar file using:
tar xjf GenomeAnalysisTK-3.3-0.tar.bz2
This will produce a directory called GenomeAnalysisTK-3.3-0 containing the GATK jar file, which is called GenomeAnalysisTK.jar, as well as a directory of example files called resources. GATK tools are distributed as a single pre-compiled Java executable so there is no need to compile them. Just like we discussed for Picard, it's not possible to add the GATK to your path, but you can set up a shortcut to the jar file using environment variables as described above.
This completes the installation process.
Open a shell and run:
java -jar GenomeAnalysisTK.jar -h
This should print out some version and usage information, as well as a list of the tools included in the GATK. As the Usage line states, to use GATK you will always build your command lines like this:
java -jar GenomeAnalysisTK.jar -T <ToolName> [arguments]
This means that just like for Picard, you first make the call to Java itself as the main program, then specify the GenomeAnalysisTK.jar file, then specify which tool you want, and finally you pass whatever other arguments (input files, parameters etc.) are needed for the analysis.
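A quick check that goes beyond the help screen is to run a simple tool on the example files shipped in the resources directory. The sketch below only assembles and prints such a command (a dry run). The GATK_JAR path and the function name gatk_cmd are assumptions, while CountReads, exampleFASTA.fasta and exampleBAM.bam are the tool and files suggested in the discussion below this article.

```shell
# Assumed location of the unpacked GATK jar; adjust to your system.
GATK_JAR="${GATK_JAR:-$HOME/my_tools/GenomeAnalysisTK-3.3-0/GenomeAnalysisTK.jar}"

# Hypothetical helper: print the command line for a given tool and arguments,
# following the pattern java -jar GenomeAnalysisTK.jar -T <ToolName> [arguments].
gatk_cmd() {
    local tool="$1"
    shift
    echo "java -jar $GATK_JAR -T $tool $*"
}

# Dry run: shows the command without executing it.
gatk_cmd CountReads -R exampleFASTA.fasta -I exampleBAM.bam
```

Printing the command first makes it easy to confirm the jar path and argument order before running the real thing from the directory that holds the example files.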
The Integrated Genomics Viewer is a genome browser that allows you to view BAM, VCF and other genomic file information in context. It has a graphical user interface that is very easy to use, and can be downloaded for free (though registration is required) from this website. We encourage you to read through IGV's very helpful user guide, which includes many detailed tutorials that will help you use the program most effectively.
Download the latest version of RStudio IDE. The webpage should automatically detect what platform you are running on and recommend the version most suitable for your system.
Follow the installation instructions provided. Binaries are provided for all major platforms; typically they just need to be placed in your Applications (or Programs) directory. Open RStudio and type the following command in the console window:
install.packages("ggplot2")
This will download and install the ggplot2 library as well as any other library packages that ggplot2 depends on for its operation. Note that some users have reported having to install two additional packages themselves, called reshape and gplots, which you can do as follows:
install.packages("reshape")
install.packages("gplots")
Finally, do the same thing to install the gsalib library:
install.packages("gsalib")
This will download and install the gsalib library.
Important note
If you are using a recent version of ggplot2 and a version of GATK older than 3.2, you may encounter an error when trying to generate the BQSR or VQSR recalibration plots. This is because until recently our scripts were still using an older version of certain ggplot2 functions. This has been fixed in GATK 3.2, so you should either upgrade your version of GATK (recommended) or downgrade your version of ggplot2. If you experience further issues generating the BQSR recalibration plots, please see this tutorial.
Updated on 2020-01-31
From haseley on 2013-07-08
Hello,
I’m having an issue getting picard tools configured to work in any directory. I’ve downloaded and unpacked the picard zip file and added the picard-tools-1.94 directory to my path, however when I run:
java -jar AddOrReplaceReadGroups.jar -h
I get the following error: Error: Unable to access jarfile AddOrReplaceReadGroups.jar
The command works if I am in the picard-tools-1.94 directory, making me think that something is wrong with my path variable but when I echo my path variable and copy the relevant path directly into a cd command I move to the correct directory (so there are no typos) and the command works (so I should be adding the correct directory). Any suggestions? Here is the value of my PATH variable:
bash:tin:~ 53 $ echo $PATH
/idi/hunglabusers/GenomeAnalysisTK-2.6-4-g3e5ff60/:/idi/hunglabusers/SalmonellaRNAseq/picard/picard-tools-1.94/:/idi/hunglabusers/GATK_workshop/htslib-master/:/broad/software/free/Linux/redhat_5_x86_64/pkgs/oracle-java-jdk_1.7.0-17_x86_64/bin:/broad/software/free/Linux/redhat_5_x86_64/pkgs/bwa_0.7.4:/broad/software/free/Linux/redhat_5_x86_64/pkgs/samtools/samtools_0.1.19/bin:/home/unix/haseley/bin:/home/unix/haseley/bin:/broad/tools/NoArch/pkgs/local:/usr/lib64/qt-3.3/bin:/usr/kerberos/bin:/usr/local/bin:/bin:/usr/bin:/usr/X11R6/bin
The relevant path is the second one listed. Below is the version information for java:
bash:node1379:~ 53 $ java -version
java version "1.7.0_17"
Java™ SE Runtime Environment (build 1.7.0_17-b02)
Java HotSpot™ 64-Bit Server VM (build 23.7-b01, mixed mode)
Thanks!
Nathan
From Geraldine_VdAuwera on 2013-07-08
What I do is set up an environment variable that points to the directory where the jars live, so I can do something like $picardDir/AddOrReplaceReadGroups.jar
From briankweiner on 2013-07-08
I’m having some trouble with the very last bit when trying to install gsalib2 in R. If you are using the most recent version of R (3.0.1) for Mac OS X then you’ll receive the following error:
> install.packages(“gsalib2”)
Warning in install.packages : package ‘gsalib2’ is not available (for R version 3.0.1)
You can however install “gsalib” if that will work.
From briankweiner on 2013-07-08
Also, for any other Mac users who are frustrated with the inability of the command line interface to recognize the most recent Java install, you can correct this problem by going to this webpage: http://stackoverflow.com/questions/12757558/installed-java-7-on-mac-os-x-but-terminal-is-still-using-version-6
From Geraldine_VdAuwera on 2013-07-08
@briankweiner, thanks for pointing out the gsalib2 typo — I’ve corrected the name of the gsalib library in the article.
And thanks for linking to that article! It is sure to be helpful for others.
From sourav8888 on 2013-07-19
Hi.
I am new to GATK. While installing everything went well except installation of ggplot2 and gsalib. I am getting error msg as :
Warning in install.packages : unable to connect to ‘cran.rstudio.com’ on port 80.
Is it a problem of network only or something else I have to do.
Thanks in advance.
From Geraldine_VdAuwera on 2013-07-19
Hi @sourav8888, that is a network error that has nothing to do with GATK, so we can’t help you with that. You should ask for help from a colleague or your IT department.
From sourav8888 on 2013-07-20
Thanks a lot Geraldine_VdAuwera. Yes I will contact them.
From JPC on 2013-10-25
Hello, when I make htslib I don't get a htscmd binary in /htslib-master. Looking back through the install I see the following;
and at the end;
clang: error: linker command failed with exit code 1 (use -v to see invocation)
make: *** [htscmd] Error 1
I don't understand the error sorry
JPC
From Geraldine_VdAuwera on 2013-10-25
I’m sorry, neither do I — I recommend you contact the makers of htslib; they will be better able to help you.
From psanchez820 on 2013-10-25
JPC:
I also get the same error, did you manage to get an answer for this?
> @JPC said:
> Hello, when I make htslib I don’t get a htscmd binary in /htslib-master. Looking back through the install I see the following;
> JPC
Thanks!
From alastair_kerr on 2013-10-28
R-dependencies:
I found that all the following were needed to run the later AnalyzeCovariates example:
ggplot2, gplots, reshape, grid, tools, gsalib
It would be useful if these were noted here
From Geraldine_VdAuwera on 2013-10-28
Hi Alastair, we only list ggplot2 and gsalib because the rest are dependencies of ggplot2 and should get installed automatically when you install ggplot2.
From alastair_kerr on 2013-11-04
Hi Geraldine, this was not the case in my install, perhaps because ggplot2 had been installed on my system for a few years. It took me a while to figure out the problem and I would save others such inconvenience if the full list were included.
From Geraldine_VdAuwera on 2013-11-04
The problem is that we would then have to update the dependencies every time the developer of another library changes their package, and that’s just too much burden on us. As it is now, it is your responsibility to keep your software up to date. If you have some software that has been installed for several years, one of the first things you should think of if you run into problems is to update everything.
From alastair_kerr on 2013-11-05
Sorry I was not clear. The libraries were completely up to date, they just did not have the additional packages as dependencies.
From mkasiedu on 2014-01-03
> @Geraldine_VdAuwera said:
> What I do is set up an environment variable that points to the directory where the jars live, so I can do something like $picardDir/AddOrReplaceReadGroups.jar
Geraldine,
I am having the same problem as haseley above but I am not sure how to set up the environment variable using the information you provided above. I am new to Linux. Can you send me a command line to run? Do I have to run “$picardDir/AddOrReplaceReadGroups.jar” before running “java -jar AddOrReplaceReadGroups.jar -h”?
this is the java version
[michael@asl158 ~]$ java -version
java version “1.7.0_45“
OpenJDK Runtime Environment (fedora-2.4.3.0.fc19-x86_64 u45-b15)
OpenJDK 64-Bit Server VM (build 24.45-b08, mixed mode)
and the PATH is
[michael@asl158 ~]$ echo $PATH
/usr/lib64/qt-3.3/bin:/usr/lib64/ccache:/usr/local/bin:/usr/bin:/bin:/usr/local/sbin:/usr/sbin:/home/michael/Downloads/htslib-master:/home/michael/Downloads/vcftools_0.1.11:/home/michael/Downloads/samtools-0.1.19:/home/michael/Downloads/bwa-0.7.5a:/home/michael/Downloads/picard-tools-1.105:/home/michael/Downloads/GenomeAnalysisTK-2.8-1-g932cd3a:/home/michael/.local/bin:/home/michael/bin:/home/michael/Downloads/htslib-master:/home/michael/Downloads/vcftools_0.1.11:/home/michael/Downloads/samtools-0.1.19:/home/michael/Downloads/bwa-0.7.5a:/home/michael/Downloads/picard-tools-1.105:/home/michael/Downloads/GenomeAnalysisTK-2.8-1-g932cd3a
Thanks,
From Geraldine_VdAuwera on 2014-01-06
Hi @mkasiedu,
I recommend you look for an online tutorial that covers working with environment variables. There are many fine tutorials for Linux beginners, and I think this will be more useful to you in the long run than giving you a set of commands to run. Good luck!
From adaywill on 2014-01-07
Hi,
The most current branch of htslib is the develop branch. Is there a reason to install the master branch, which looks like it is no longer being developed and has been merged into the develop branch?
Thanks,
Aaron
From Geraldine_VdAuwera on 2014-01-07
Hi @adaywill,
That’s a fair point, but we’ve only tested the “master” package (in keeping with the usual Earth-logic software naming convention; not sure what the htslib devs are doing merging master into develop…), so proceed with “develop” at your own risk.
From mkasiedu on 2014-01-10
> @haseley said:
> Hello,
>
> I’m having an issue getting picard tools configured to work in any directory. I’ve downloaded and unpacked the picard zip file and added the picard-tools-1.94 directory to my path, however when I run:
>
> java -jar AddOrReplaceReadGroups.jar -h
>
> I get the following error: Error: Unable to access jarfile AddOrReplaceReadGroups.jar
>
> The command works if I am in the picard-tools-1.94 directory, making me think that something is wrong with my path variable but when I echo my path variable and copy the relevant path directly into a cd command I move to the correct directory (so there are no typos) and the command works (so I should be adding the correct directory). Any suggestions? Here is the value of my PATH variable:
>
> bash:tin:~ 53 $ echo $PATH
> /idi/hunglabusers/GenomeAnalysisTK-2.6-4-g3e5ff60/:/idi/hunglabusers/SalmonellaRNAseq/picard/picard-tools-1.94/:/idi/hunglabusers/GATK_workshop/htslib-master/:/broad/software/free/Linux/redhat_5_x86_64/pkgs/oracle-java-jdk_1.7.0-17_x86_64/bin:/broad/software/free/Linux/redhat_5_x86_64/pkgs/bwa_0.7.4:/broad/software/free/Linux/redhat_5_x86_64/pkgs/samtools/samtools_0.1.19/bin:/home/unix/haseley/bin:/home/unix/haseley/bin:/broad/tools/NoArch/pkgs/local:/usr/lib64/qt-3.3/bin:/usr/kerberos/bin:/usr/local/bin:/bin:/usr/bin:/usr/X11R6/bin
>
> The relevant path is the second one listed. Below is the version information for java:
>
> bash:node1379:~ 53 $ java -version
> java version “1.7.0_17“
> Java™ SE Runtime Environment (build 1.7.0_17-b02)
> Java HotSpot™ 64-Bit Server VM (build 23.7-b01, mixed mode)
>
>
> Thanks!
>
> Nathan
Hi Nathan,
Were you able to resolve the java issue with picard? I am having the same problem and not making any progress resolving it. Will appreciate the help.
Thanks,
Michael
From virshu on 2014-01-13
Hi,
I am sysadmin helping our scientists set up GATK environment. I am following the instructions in this post, and some of them cause doubts – at least on Linux (don’t know much about Mac).
Step 4 (Picard installations) tells to “add the Picard directory to your path to make the tools available on the command line”. As somebody else already mentioned, this doesn’t make sense – jar invocation doesn’t use PATH to find jar file. So, java -jar AddOrReplaceReadGroups.jar -h doesn’t work, unless AddOrReplaceReadGroups.jar is in the current directory. I assume, GATK itself relies on Picard archives; the typical way is to add jars to the CLASSPATH. Should I add all of them? That doesn’t seem right? Could you please correct the instructions.
Step 5. The trivial invocation (with -h flag) works; however, any ToolName throws an error that the tool is not found. I don’t know if it’s related to Picard jars missing, or there is some other reason – but some verification that goes beyond just -h would be very helpful.
Step 6. Most of our Linux servers don’t have GUI installed. Scientists use R Studio Server for all their R development and modelling needs. However, you recommend installing R Studio IDE, which is a client-based software. Do you want me to install it on the server? It won’t work without X. Or you want the scientists to install it on their workstations (many of them have it already). Then I would need some instructions as to how to integrate such client installation with GATK. Or I misunderstood this whole step?
Thanks.
From pdexheimer on 2014-01-13
Hi @virshu -
I can try to help with a couple of these.
For Picard, we define an environment variable PICARD_HOME, and then invoke with java -jar $PICARD_HOME/AddOrReplaceReadGroups.jar. Actually, we do the same thing for GATK.
For GATK, I assume that you tried tool names that should exist (like PrintReads or UnifiedGenotyper)? If you build the jars yourself, it’s possible to mess things up and not compile in the tools, but the downloadable jar for distribution (at least v2.8-1) doesn’t have this problem.
You don’t need an R IDE, I suspect that the recommendation was made just for ease of installation. Just make sure that Rscript is on the path and that the ggplot2 package (and all dependencies) is installed
From Geraldine_VdAuwera on 2014-01-13
Hi there,
Step 4: Apologies for the confusion; what it means is what @pdexheimer outlines: create an environment variable to use as shortcut to the directory where you store the jars. We typically have several versions on the same machine so it’s easier to control what we’re using that way rather than using classpaths. We’ll try to clarify the doc.
Step 5: Not sure what you mean — could you please post the command line you tried that didn’t work, and what result or error message you got?
Step 6: The IDE is not required, it’s just a recommendation for people who don’t already work with R, as it can help make installing the libraries easier for them.
From virshu on 2014-01-13
Wow, thanks for such speedy reply!
Step 4: I assume that GATK needs to somehow know where Picard jar files are located, right? So, whether I use directory name or environment variable – it doesn’t let GATK know about it. Or there is no dependency, and the scientist is supposed to invoke Picard jars independently of GATK? Then I don’t have any questions.
Step 5: As I said, I am sysadmin (although hanging around the scientists for a long time). And our lawyers asked scientists to stay out of the system while they are finalizing contract. In short, I don’t have the tools “that should exist”. That’s exactly my question – can somebody suggest a command (beyond just help screen) that should work.
I didn’t build jars myself; as Step 5 instructs, the jars are pre-built in the download… The goal is to package Amazon AMIs for the scientists, and I want to make sure that all the pieces work correctly before I start packaging.
Thanks again…
From Geraldine_VdAuwera on 2014-01-14
We’re here to help :)
Step 4: there is no direct dependency; GATK does not make calls to Picard, if that’s what you mean. We just ask users to get Picard because there is some data preprocessing that needs to be done with Picard before the data can be input to GATK.
Step 5: Oh I see. Well you could run one of the simple analysis tools on the example data that is provided with the download (if I remember correctly, in the resources subdirectory). E.g. you would do:
java -jar GenomeAnalysisTK.jar -T CountReads -R exampleFASTA.fasta -I exampleBAM.bam
Let me know if you have any trouble with that.
From virshu on 2014-01-15
YES! Thank you so much (on both steps)! Step 4: that certainly clarifies. and Step 5: The results are much more comforting than just help screen! I got “CountReads – CountReads counted 33 reads in the traversal” and “0 reads were filtered out during the traversal” which looks really great!
Thanks a lot
From Geraldine_VdAuwera on 2014-01-15
You’re welcome! We’ll look into providing some more helpful quick-start examples along those lines.
From adaywill on 2014-01-27
> @Geraldine_VdAuwera said:
> Hi adaywill,
>
> That’s a fair point, but we’ve only tested the “master” package (in keeping with the usual Earth-logic software naming convention; not sure what the htslib devs are doing merging master into develop…), so proceed with “develop” at your own risk.
Hi Geraldine,
Thanks. With the newer version of htslib you can just recompile samtools with the new htslib library and all the functionality is available from samtools.
From Geraldine_VdAuwera on 2014-01-27
Oh, that is very cool. Thanks for reporting back on this, @adaywill
From FabriceBesnard on 2014-03-27
Dear Geraldine,
I’m getting some troubles with setting the path of the picard tools…
I have the same issue as Nathan and Mickaël: it works if I move in the “picard” dir, but not from somewhere else.
I read forums explaining what an environment variable is. As you suggested, I modified my .bashrc as follows:
-I created an environment variable "picardDir": export picardDir="$HOME/picard-tools-1.110"
-I added it to my path: export PATH="$PATH:$picardDir"
By typing “env” I could verify that both environmental variable “picardDir” and “PATH” were modified correctly.
then I run:
java -jar $picardDir/AddOrRpeplaceReadGroups.jar -h (tested also with quotes: java -jar “$picardDir/AddOrRpeplaceReadGroups.jar” -h)
However, I still get the same error message:
Error: Unable to access jarfile /home/fabrice/picard-tools-1.110/AddOrRpeplaceReadGroups.jar
And if I try to type the absolute path in the command rather that my “picardDir” environment variable:
java -jar /home/fabrice/picard-tools-1.110/AddOrRpeplaceReadGroups.jar -h
I get the same error message.
So would you know what I am doing wrong ?
Thanks a lot for your help,
Fabrice
From Geraldine_VdAuwera on 2014-03-27
Hi Fabrice,
There’s a typo in the name of the program you’re trying to call… an extra ‘p’ after the ‘R’
From FabriceBesnard on 2014-05-19
Hey,
I am trying to install GATK and all required packages on MacOS 10.8.5. I installed Xtools (+ command lines) 5.1.
I successfully installed bwa and samtools, but I failed compiling htslib-master. I get this error message:
5 warnings generated.
gcc -c -g -Wall -Wc++-compat -O2 -Ihtslib vcfnorm.c -o vcfnorm.o
gcc -c -g -Wall -Wc++-compat -O2 -Ihtslib vcfgtcheck.c -o vcfgtcheck.o
gcc -g -Wall -Wc++-compat -O2 -o htscmd main.o samview.o vcfview.o bamidx.o bcfidx.o bamshuf.o bam2fq.o tabix.o abreak.o bam2bed.o vcfcheck.o vcfisec.o vcfmerge.o vcfquery.o vcffilter.o vcfnorm.o vcfgtcheck.o -Lhtslib -lhts -lpthread -lz -lm
Undefined symbols for architecture x86_64:
"_bcf_gt_type", referenced from:
_main_vcfcheck in vcfcheck.o
ld: symbol(s) not found for architecture x86_64
clang: error: linker command failed with exit code 1 (use -v to see invocation)
make: *** [htscmd] Error 1
would you have any idea how to help me fixing this issue ?
Thanks a lot, Fabrice
From Geraldine_VdAuwera on 2014-05-19
Hi Fabrice,
I’m not sure — htslib is under active development and is maintained by others (not us) so I can’t really comment on compilation issues. You’d have to ask the developers of htslib for support.
However I can tell you that htslib is not really required in order to run GATK, so you can skip it unless you need the ability to revert a bam file to FastQ (which is what we use it for in the tutorial).
From FabriceBesnard on 2014-05-26
Ok thanks Geraldine for your answer,
I haven’t fixed the issue so I am running GATK without htslib and that’s fine for what I want to do.
However, I will let you know if ever I find a fix to this issue…
From mayaab on 2014-06-23
Hello,
I’m running a virtual machine, and get the following error while compiling bwa:
fatal error: stdio.h: no such file or directory
the command:
g++ -v
returns an error
I guess I should install g++, but can’t find something on the web to help me with this. do you know how can I install it?
Maya
From Sheila on 2014-06-24
@mayaab
Hi Maya,
Unfortunately, this is not something we can help with.
-Sheila
From shangzhong0619 on 2014-09-02
I just have a comment on the part 7. R packages. It seems that ggplot2 and gsalib are not enough. I installed another package called ‘reshape’, then it worked for generating the figures in the BQSR step.
From engrasif09 on 2014-09-23
Absolutely amazing. Going great till now. Will seek your help if required!
From yasminf on 2014-11-17
Dear Geraldine,
I have downloaded GenomeAnalysisTK.jar but when i try
java -jar GenomeAnalysisTK.jar -h
i get :
Unable to access jarfile GenomeAnalysisTK.jar
help!
many thanks
Yasmin
From Sheila on 2014-11-17
@yasminf
Hi Yasmin,
The issue is that you are not specifying where the GenomeAnalysisTK.jar file exists. You must specify the path to the file before the file name. For example, my GATK is stored in my Applications folder, so my command would be:
java -jar /Applications/GenomeAnalysisTK.jar -h
I hope this helps.
-Sheila
From yasminf on 2014-11-17
It does! many thanks. now i need to sort my java version…
From yasminf on 2014-11-17
That, too, is sorted thanks to the clue: install the latest JDK from Oracle!!
From carsweshau on 2015-09-20
`tar xvzf samtools-0.1.2.tar.bz2 `
This should read as: tar xv**j**f samtools-0.1.2.tar.bz2, otherwise it will throw the error:
gzip: stdin: not in gzip format
tar: Child returned status 1
tar: Error is not recoverable: exiting now
From Sheila on 2015-09-21
@carsweshau
Hi,
Thanks for the catch. I will fix it now.
-Sheila
From mmterpstra on 2016-06-01
Similar problem as @alastair_kerr
Missing R dependencies for AnalyzeCovariates:
I haven't had any problems installing the ```reshape gplots gsalib``` libraries but …
`tools` is not present in CRAN?! But might work in [R-3.0.2](http://www.inside-r.org/r-doc/tools/format.compactPDF). Looking at the github code showed me that these are imported in [`BQSR.R`(clickme)](https://github.com/broadgsa/gatk/blob/3.5/public/gatk-engine/src/main/resources/org/broadinstitute/gatk/engine/recalibration/BQSR.R). This also shows that this is only triggered in AnalyzeCovariates when `-plots` is specified (with `-csv` it runs correctly).
#OOOPS missed it!
I cannot remove this post. `library(“tools”)` works. In the `library()` listing I failed to scroll down; package `tools` is present in the default installation of R.
From Sheila on 2016-06-01
@mmterpstra
Hi,
I’m glad you solved the issue, and thanks for posting your solution here :smile:
-Sheila
From scw on 2016-11-11
I’m new for Linux&GATK
```
tar xvzf samtools-0.1.2.tar.bz2
```
carsweshau has pointed out the error, it should be like this below:
```
tar xvjf samtools-0.1.2.tar.bz2
```
Until now it has not been fixed.
Whatever, the article helps me a lot, thanks~
From Geraldine_VdAuwera on 2016-11-12
Fixed now, thanks for pointing it out.
From tphillip on 2017-06-14
Hi guys.. Thanks so much for the great support. I am also a little new to Linux and I believed everything was working fine but now I am stuck and have been fighting with this for some time. I have installed java with what seems to be a working version and run:
java -jar GenomeAnalysisTK.jar --help
with the appropriate path to GenomeAnalysisTK.jar. I get what seems like an appropriately long readout with a lot of options (not sure if this is what everyone else sees) but I think I am having a problem because I don’t see a list of tools.. When I run the analysis with the example data and CountReads I get the error
MESSAGE: Invalid command line: Malformed walker argument: Could not find walker with name: CountReads
I actually get this walker error with any tool. I have tried redownloading a few times but I guess I could try again.
Maybe MD5 Sum Check??
From Geraldine_VdAuwera on 2017-06-14
@tphillip Can you clarify which version you downloaded, your command line, and post the full output that is produced?
From tphillip on 2017-06-14
Sorry. I downloaded the current binary 3.7 from here. I am attempting to run on a docker container which is based in linux with openjdk
From tphillip on 2017-06-14
Maybe @Geraldine_VdAuwera could confirm. I got it to work but could not be used with OpenJDK 9 only 8. Had to run OpenJDK 8 instead. Yet I was attracted to 9 because I believed there was some base line memory allocation when java is running in containers running on servers. This just will not work on 9?
From Geraldine_VdAuwera on 2017-06-14
Ah no, Java 9 is not supported yet. There are usually enough changes between Java versions to break compatibility.
From Adnan_Yousaf on 2018-07-30
hi
i downloaded current version (gatk-4.0.6.0.zip). i can’t see tar file in it. if i unzip it it extract two jar files. i am confused how to install gatk.
thanks
From Sheila on 2018-08-05
@Adnan_Yousaf
Hi,
Perhaps [the Quick Start Guide](https://software.broadinstitute.org/gatk/documentation/quickstart) will help.
-Sheila
vcfcheck.c:703:39: warning: format specifies type 'long' but the argument has type 'uint64_t' (aka 'unsigned long long') [-Wformat]
printf("\t%ld\t%f\n", stats->dp.vals[i], stats->dp.vals[i]*100./sum);
~~~ ^~~~~~~~~~~~~~~~~
%llu
1 warning generated.