Integrated transcript-specific SNV annotation

WGSA also provides integrated transcript-specific SNV annotation that combines ANNOVAR, SnpEff and VEP, each applied to both Ensembl and RefSeq transcripts. The resources are provided as both an AWS AMI and a downloadable version.

Use downloadable version

· The link to the downloadable version can be found at https://sites.google.com/site/jpopgen/wgsa.

· The setup is similar to that described in Setting up WGSA on a local Linux machine.

· Download the directory “resources2” and its contents (keeping its structure) to a folder, such as “/WGSA”. Download the Java class “search_integrated_output5.class” to the same folder.

· Upload your standard variant input file, such as the “.snp” file output by WGSA during the PROCEDURE in Using WGSA via Amazon Web Service (AWS). If it is compressed, decompress it.

· Run the Java class as “java search_integrated_output5 [input_file] [searchEnsembl(true or false)] [searchRefseq(true or false)] <sourcedir>”. The first argument is the name of the SNV standard variant file. The second argument is either true or false and controls whether the integrated annotation with Ensembl is searched. The third argument is either true or false and controls whether the integrated annotation with RefSeq is searched. The fourth argument is optional: if the path to the resources2 directory is “/WGSA/resources2”, it can be omitted; otherwise, provide the path to the resources2 directory. For large variant files, give Java enough memory with the “-Xmx” option. The following is an example:

o java -Xmx29g search_integrated_output5 input.snp true true /WGSA/resources2

· If “searchEnsembl” is set to “true”, an “.IntegratedEnsembl” file will be written. The columns of this file are described in Column description of Ensembl transcript-specific SNV annotation.

· If “searchRefseq” is set to “true”, an “.IntegratedRefseq” file will be written. The columns of this file are described in Column description of RefSeq transcript-specific SNV annotation.

· (Optional) Compress the output file (with gzip).

· Download the output files.
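
The command line described in the steps above can be assembled by a small shell helper. This is only a sketch under stated assumptions: the function name “wgsa_integrated_cmd” is hypothetical, the default heap size of 29g is simply the value used in the example above, and the helper prints the command rather than running it, so it can be checked before launching a long job.

```shell
#!/bin/sh
# Hypothetical helper (not part of WGSA): print the search_integrated_output5
# command line without running it.
# Arguments: input file, searchEnsembl (true|false), searchRefseq (true|false),
# optional resources2 path, optional Java heap size (assumed default: 29g).
wgsa_integrated_cmd() {
    input=$1
    ensembl=$2
    refseq=$3
    resources=${4:-}
    heap=${5:-29g}

    # Both switches must be the literal strings "true" or "false".
    for flag in "$ensembl" "$refseq"; do
        case $flag in
            true|false) ;;
            *) echo "error: expected true or false, got '$flag'" >&2
               return 1 ;;
        esac
    done

    cmd="java -Xmx$heap search_integrated_output5 $input $ensembl $refseq"

    # The fourth argument is only needed when resources2 is not at the
    # default location /WGSA/resources2.
    if [ -n "$resources" ] && [ "$resources" != "/WGSA/resources2" ]; then
        cmd="$cmd $resources"
    fi
    echo "$cmd"
}
```

For example, “wgsa_integrated_cmd input.snp true false /data/resources2” prints “java -Xmx29g search_integrated_output5 input.snp true false /data/resources2”; the path “/data/resources2” is a placeholder.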

Use AWS AMI

· The setup is similar to that described in Using WGSA via Amazon Web Service (AWS).

· Launch an instance from an AMI of WGSA resources2. A list of available AMIs can be found at https://sites.google.com/site/jpopgen/wgsa.

· Upload your standard variant input file, such as the “.snp” file output by WGSA during the PROCEDURE in Using WGSA via Amazon Web Service (AWS). We recommend compressing the input file with gzip before uploading and decompressing it on the instance afterwards.

· Run the Java class as “java search_integrated_output5 [input_file] [searchEnsembl(true or false)] [searchRefseq(true or false)]”. The first argument is the name of the SNV standard variant file. The second argument is either true or false and controls whether the integrated annotation with Ensembl is searched. The third argument is either true or false and controls whether the integrated annotation with RefSeq is searched. For large variant files, give Java enough memory with the “-Xmx” option. The following is an example:

o java -Xmx29g search_integrated_output5 input.snp true true

· If “searchEnsembl” is set to “true”, an “.IntegratedEnsembl” file will be written. The columns of this file are described in Column description of Ensembl transcript-specific SNV annotation.

· If “searchRefseq” is set to “true”, an “.IntegratedRefseq” file will be written. The columns of this file are described in Column description of RefSeq transcript-specific SNV annotation.
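
The recommended compress, upload and decompress sequence for the input file might look like the following sketch. The function name “wgsa_upload_plan” is hypothetical, and the key file, user name and host are placeholders that depend on your own instance; none of them come from the WGSA documentation. The helper only prints the gzip, scp and ssh commands so they can be reviewed first, and it assumes the input is a plain file name in the current directory.

```shell
#!/bin/sh
# Hypothetical helper (not part of WGSA): print the commands that compress the
# input locally, copy it to the instance, and decompress it there.
# Nothing is executed; the key and user@host values are supplied by the caller.
wgsa_upload_plan() {
    input=$1   # local, uncompressed variant file name, e.g. input.snp
    key=$2     # path to the SSH private key for the instance
    remote=$3  # user@host of the running instance

    # -k keeps the original file next to the compressed copy.
    echo "gzip -k $input"
    # Copy the compressed file to the remote home directory.
    echo "scp -i $key $input.gz $remote:"
    # Decompress on the instance before running the search.
    echo "ssh -i $key $remote gunzip $input.gz"
}
```

Calling “wgsa_upload_plan input.snp mykey.pem user@host” prints the three commands in order; run them one at a time once the placeholders have been replaced with real values.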