VAMP on cetus
VAMP Installation instructions for cetus cluster at Princeton
The Virus AsseMbly Pipeline (VAMP) (https://bitbucket.org/lance_parsons/vamp) is a system designed to assemble viral genomes from paired-end Illumina sequence data. (Paper in progress, Moriah Szpara and Lance Parsons)
Set PATH in .bash_profile
Add to .bash_profile:
export PATH=$HOME/bin:$HOME/.local/bin:$PATH
Log out and log back in so that the new PATH takes effect
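After logging back in, it can help to confirm that both directories are really on the PATH before installing anything into them. The following is a minimal sketch; the helper name path_contains is hypothetical and not part of VAMP.

```shell
# Succeed if the directory $1 is an entry in the colon-separated list $2.
# (path_contains is a hypothetical helper name used only for this check.)
path_contains() {
    case ":$2:" in
        *":$1:"*) return 0 ;;
        *) return 1 ;;
    esac
}

# Check the two directories added in .bash_profile.
for dir in "$HOME/bin" "$HOME/.local/bin"; do
    if path_contains "$dir" "$PATH"; then
        echo "ok: $dir is on PATH"
    else
        echo "missing: $dir is not on PATH"
    fi
done
```

If either directory is reported as missing, the export line was probably not read; check that it is in .bash_profile and that a new login shell was started.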
Python Dependencies
Install distribute
curl -O http://python-distribute.org/distribute_setup.py
python distribute_setup.py --user
easy_install --user pip
Install BioPython, cutadapt, pybedtools, and paired_sequence_utils
pip install BioPython --user
pip install cutadapt --user
pip install pybedtools --user
pip install paired_sequence_utils --user
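A quick import test can confirm that the packages were installed where the default python can find them. This is a sketch under assumptions: the module names Bio, cutadapt, and pybedtools are the usual import names for these packages, and paired_sequence_utils is left out because its import name is not given here.

```shell
# Report whether the default "python" can import the module named in $1.
# (check_module is a hypothetical helper name.)
check_module() {
    if python -c "import $1" >/dev/null 2>&1; then
        echo "ok: $1"
    else
        echo "missing: $1"
    fi
}

# Assumed import names: Bio (BioPython), cutadapt, pybedtools.
for module in Bio cutadapt pybedtools; do
    check_module "$module"
done
```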
Other Dependencies
Bedtools
cd ~
wget http://bedtools.googlecode.com/files/BEDTools.v2.16.2.tar.gz
tar xzvf BEDTools.v2.16.2.tar.gz
cd BEDTools-Version-2.16.2
make
cp bin/* ~/bin
Install libgtextutils
cd ~
wget http://hannonlab.cshl.edu/fastx_toolkit/libgtextutils-0.6.1.tar.bz2
tar xjvf libgtextutils-0.6.1.tar.bz2
cd libgtextutils-0.6.1
./configure --prefix=$HOME
make
make install
cd ..
Install Fastx_toolkit
wget http://hannonlab.cshl.edu/fastx_toolkit/fastx_toolkit-0.0.13.2.tar.bz2
tar xjvf fastx_toolkit-0.0.13.2.tar.bz2
cd fastx_toolkit-0.0.13.2
export PKG_CONFIG_PATH=$HOME/lib/pkgconfig
./configure --prefix=$HOME
make
make install
cd ..
Install FastQC
cd ~
wget http://www.bioinformatics.babraham.ac.uk/projects/fastqc/fastqc_v0.10.1.zip
unzip fastqc_v0.10.1.zip
chmod 755 FastQC/fastqc
ln -s ~/FastQC/fastqc ~/bin
Bowtie is already installed in /usr/local/bin/
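Before moving on to VAMP itself, the dependencies installed above can be checked in one pass. The sketch below assumes the executables are named bedtools, fastx_trimmer, fastqc, and bowtie; adjust the list if the installed versions use different names.

```shell
# Print every command name in the arguments that is not found on PATH.
# Returns non-zero if at least one is missing.
# (missing_tools is a hypothetical helper name.)
missing_tools() {
    rc=0
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "$tool"
            rc=1
        fi
    done
    return $rc
}

# Executable names below are assumptions about what each package installs.
missing_tools bedtools fastx_trimmer fastqc bowtie \
    || echo "Some dependencies were not found on PATH" >&2
```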
Install and setup VAMP
cd ~
tar xzvf /Genomics/grid/users/lparsons/lance_parsons-vamp-6ee1c60a3cc9.tar.gz
cp lance_parsons-vamp-6ee1c60a3cc9/makefiles/config.mk.template lance_parsons-vamp-6ee1c60a3cc9/makefiles/config.mk
Mugsy Instructions
Mugsy (http://mugsy.sourceforge.net/) is multiple alignment software that can be used to align similar genomes to one another. The output of this or a similar alignment program is used after assembly to compare genomes.
Installing Mugsy
cd ~
wget "http://sourceforge.net/projects/mugsy/files/mugsy_x86-64-v1r2.3.tgz/download" -O "mugsy_x86-64-v1r2.3.tgz"
tar xzvf mugsy_x86-64-v1r2.3.tgz
cd mugsy_x86-64-v1r2.3
Edit mugsyenv.sh and set MUGSY_INSTALL to the path of the installation directory
export MUGSY_INSTALL=$HOME/mugsy_x86-64-v1r2.3
Running Mugsy
Before running Mugsy, you must source mugsyenv.sh and run qsub with the -V parameter,
OR add the three lines from mugsyenv.sh to your .bash_profile
source ~/mugsy_x86-64-v1r2.3/mugsyenv.sh
Go to the directory with the fasta files and use qsub to execute Mugsy. The -V
parameter ensures that the proper environment variables are available to Mugsy, and the -cwd
parameter ensures that the job runs from the current directory and not your home directory.
cd path/to/genomes
qsub -V -cwd mugsy --directory . --prefix mugsy_alignment NC_001806_1.fasta GU734771_1.fasta GU734772_1.fasta
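As a possible alternative to passing the environment with -V, the same command can be wrapped in a small job script that sources mugsyenv.sh itself. This is only a sketch: the file name run_mugsy.sh is hypothetical, and it assumes the scheduler is Grid Engine, which reads the "#$" directive lines.

```shell
# Write a hypothetical job script, run_mugsy.sh, into the current directory.
# The quoted 'EOF' keeps the contents literal (nothing is expanded here).
cat > run_mugsy.sh <<'EOF'
#!/bin/bash
#$ -S /bin/bash
#$ -cwd
# Load the Mugsy environment inside the job instead of relying on -V.
source ~/mugsy_x86-64-v1r2.3/mugsyenv.sh
mugsy --directory . --prefix mugsy_alignment NC_001806_1.fasta GU734771_1.fasta GU734772_1.fasta
EOF

# Submit it from the directory that holds the fasta files:
#   qsub run_mugsy.sh
```

Keeping the environment setup inside the script means the job does not depend on what was sourced in the submitting shell.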
SNPEffector
SNPEffector (http://snpeff.sourceforge.net) is used to analyze the differences found during the alignment and summarized by compare_genomes.py
Download and install core program (http://snpeff.sourceforge.net/download.html)
cd ~
wget "http://sourceforge.net/projects/snpeff/files/snpEff_v2_1b_core.zip/download" -O "snpEff_v2_1b_core.zip"
unzip snpEff_v2_1b_core.zip
ln -s snpEff_2_1b snpEff
Setup New Custom Genome (http://snpeff.sourceforge.net/supportNewGenome.html)
Copy genome files to snpEff data directory
mkdir -p ~/snpEff/data/NC_001806_1
cp /path/to/genomes/NC_001806_1.fasta ~/snpEff/data/NC_001806_1/sequences.fa
cp /path/to/genomes/NC_001806_1.gb.gtf ~/snpEff/data/NC_001806_1/genes.gtf
Add the genome to the config file (http://snpeff.sourceforge.net/supportNewGenome.html#conf)
Add the following lines to ~/snpEff/snpEff.config
# HSV1 Strain 17 genome, RefSeq NC001806.1
NC_001806_1.genome : HSV1_strain_17
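The two config lines can also be appended from the shell. The sketch below adds them only if the genome is not already listed, so running it twice does not duplicate the entry; the function name add_genome_entry is hypothetical.

```shell
# Append a genome entry to a snpEff config file unless it is already present.
#   $1 = config file   $2 = genome ID   $3 = genome name   $4 = comment text
# (add_genome_entry is a hypothetical helper name.)
add_genome_entry() {
    if grep -q "^$2\.genome[[:space:]]*:" "$1" 2>/dev/null; then
        echo "$2 is already listed in $1"
    else
        printf '# %s\n%s.genome : %s\n' "$4" "$2" "$3" >> "$1"
    fi
}

# Only touch the config if snpEff has been unpacked as described above.
if [ -f ~/snpEff/snpEff.config ]; then
    add_genome_entry ~/snpEff/snpEff.config NC_001806_1 HSV1_strain_17 \
        "HSV1 Strain 17 genome, RefSeq NC001806.1"
fi
```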
Create the database
java -jar ~/snpEff/snpEff.jar build -gtf22 -v NC_001806_1 -c ~/snpEff/snpEff.config
Running SnpEff on output
java -jar ~/snpEff/snpEff.jar -c ~/snpEff/snpEff.config NC_001806_1 GU734771_1.vcf
Output from SnpEff
- Table of variants and predictions to STDOUT
- Summary in snpEff_summary.html
- Gene summary in snpEff_genes.txt
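Because the table of variants and predictions goes to STDOUT, it is usually worth redirecting it to a file. A minimal sketch follows; the .snpeff.txt suffix and the helper name snpeff_output_name are assumptions chosen for illustration.

```shell
# Derive an output file name from a VCF file name,
# e.g. sample.vcf -> sample.snpeff.txt (the suffix is an arbitrary choice).
snpeff_output_name() {
    echo "${1%.vcf}.snpeff.txt"
}

vcf=GU734771_1.vcf
# Run only if the VCF is present in the current directory.
if [ -f "$vcf" ]; then
    java -jar ~/snpEff/snpEff.jar -c ~/snpEff/snpEff.config NC_001806_1 "$vcf" \
        > "$(snpeff_output_name "$vcf")"
fi
```

The summary and gene files are still written under their fixed names in the current directory, so a second run in the same directory would be expected to replace them.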