Velvet


Velvet is a de novo genomic assembler designed for short-read sequencing technologies such as Solexa or 454. Velvet takes in short read sequences, removes errors, then produces high-quality unique contigs. It then uses paired-end read and long read information, when available, to retrieve the repeated areas between contigs. Velvet consists of two programs: velveth and velvetg.

For more detailed information about Velvet refer to:

Velvet Manual
Paper (cite when publishing results)
Website

To run Velvet on the Coeus cluster, load the module with the following command (refer to the Example SLURM sbatch Scripts section below for complete job scripts). The example scripts in that section read their test data from $VELVET_DATA once the module is loaded:

> module load Biosciences/Velvet/1.2.10

Running velveth

Velveth takes in a number of sequence files, builds a hash table, then writes two files, Sequences and Roadmaps, to an output directory (creating it if necessary). Both files are required by velvetg. The syntax is as follows:

> velveth /scratch/$USER/output_directory hash_length [-file_format][-read_type] filename

The hash length, also known as k-mer length, is the length, in base pairs, of the words being hashed. Refer to the Velvet website for more information on choosing a hash length.
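One quantity that helps when comparing hash lengths is the k-mer coverage, which the Velvet manual gives as Ck = C * (L - k + 1) / L, where C is the nucleotide coverage, L the read length and k the hash length. The following is a minimal sketch of that arithmetic; the function name kmer_coverage and the sample numbers are illustrative assumptions, not part of Velvet.

```shell
# Hypothetical helper: k-mer coverage Ck = C * (L - k + 1) / L
# $1 = nucleotide coverage C, $2 = read length L, $3 = hash length k
kmer_coverage() {
    awk -v c="$1" -v l="$2" -v k="$3" \
        'BEGIN { printf "%.2f\n", c * (l - k + 1) / l }'
}

# Illustrative values: 50x coverage, 36 bp reads, hash length 21
kmer_coverage 50 36 21    # prints 22.22
```

Calling the function for several candidate hash lengths shows how quickly the k-mer coverage drops as k approaches the read length.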

Supported file formats:

(refer to manual for full list)

fasta (default)
fastq
fasta.gz
fastq.gz
sam

Read categories:

(refer to manual for full list)

short (default)
shortPaired
long (for Sanger, 454 or even reference sequences)
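As an illustration of how the file format and read category flags combine, the sketch below builds a velveth command line and picks the format flag from the file extension. The helper name velveth_cmd and the extension mapping are assumptions made for this example; the function only prints the command, so it can be checked before anything is run.

```shell
# Hypothetical helper: print a velveth command, guessing -file_format
# from the input file's extension (mapping is an assumption).
# $1 = output directory, $2 = hash length, $3 = read category, $4 = input file
velveth_cmd() (
    outdir=$1; k=$2; read_type=$3; file=$4
    case "$file" in
        *.fastq.gz|*.fq.gz) fmt=fastq.gz ;;
        *.fasta.gz|*.fa.gz) fmt=fasta.gz ;;
        *.fastq|*.fq)       fmt=fastq ;;
        *.sam)              fmt=sam ;;
        *)                  fmt=fasta ;;   # fasta is velveth's default format
    esac
    echo "velveth $outdir $k -$fmt -$read_type $file"
)

# Example with a hypothetical input file name
velveth_cmd "/scratch/$USER/output_directory" 21 shortPaired reads.fastq.gz
```

The printed line can be pasted into a job script once it looks right.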

Run the following for command-line help:

> velveth | less

Running velvetg

Velvetg is the core of Velvet, where the de Bruijn graph is built and then manipulated. Note that although velvetg saves some files during the process to avoid unnecessary recalculation, the parameters are not saved from one run to the next, so every option must be given again on each run.

The syntax for running velvetg is as follows:

> velvetg /scratch/$USER/output_directory [options]

Velvetg flags:

(refer to manual for comprehensive list)

-cov_cutoff <floating-point|auto> : removal of low coverage nodes AFTER tour bus; use auto to let the system infer the cutoff

-ins_length <integer> : expected distance between two paired end reads (default: no read pairing)

-read_trkg <yes|no> : tracking of short read positions in assembly (default: no tracking)

-min_contig_lgth <integer> : minimum contig length exported to contigs.fa file (default: hash length * 2)

-amos_file <yes|no> : export assembly to AMOS file (default: no export)

-exp_cov <floating-point|auto> : expected coverage of unique regions; use auto to let the system infer it

-long_cov_cutoff <floating-point> : removal of nodes with low long-read coverage AFTER tour bus
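Because velvetg does not remember parameters between runs, one convenient pattern is to keep the shared options in a single variable and repeat them on every call. The sketch below assumes a small dry-run helper named run (hypothetical) that prints commands unless DRY_RUN=0 is set; the option values are illustrative only.

```shell
# Hypothetical dry-run helper: print the command by default,
# execute it only when DRY_RUN=0 is set in the environment.
run() {
    if [ "${DRY_RUN:-1}" = 1 ]; then echo "$*"; else "$@"; fi
}

ASM_DIR=/scratch/$USER/output_directory
# Illustrative values only; choose them for your own data.
VELVETG_OPTS="-exp_cov auto -cov_cutoff auto -ins_length 300"

# First pass, then a second pass that also sets a minimum contig length.
# $VELVETG_OPTS is left unquoted on purpose so it splits into separate flags.
run velvetg "$ASM_DIR" $VELVETG_OPTS
run velvetg "$ASM_DIR" $VELVETG_OPTS -min_contig_lgth 200
```

Running the snippet as is only prints the two velvetg commands; exporting DRY_RUN=0 after loading the module would execute them.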

Run the following for command-line help:

> velvetg | less

Example SLURM sbatch Scripts

To use Velvet on the Coeus cluster you must submit a job through the SLURM job scheduler. To do so, create an sbatch script. The scripts below are based on an example from the Velvet Manual.

sub_velvet_ex1.sh:

#!/bin/bash

#SBATCH --job-name Velvet_example

#SBATCH --partition medium

#SBATCH --output=velvet_%j

module purge

module load Biosciences/Velvet/1.2.10

srun velveth /scratch/$USER/velvet_example 21 -shortPaired $VELVET_DATA/test_reads.fa

srun velvetg /scratch/$USER/velvet_example

srun velvetg /scratch/$USER/velvet_example -cov_cutoff 5 -read_trkg yes -amos_file yes

Then submit the sbatch script:

> sbatch sub_velvet_ex1.sh
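Once the job has finished, velvetg leaves the contigs.fa file (named in the -min_contig_lgth description above) in the output directory. A minimal sketch for a quick sanity check of that file follows; the function name contig_summary is an assumption for this example, and the path matches the example script above.

```shell
# Hypothetical helper: count the contigs and total bases in a FASTA file.
contig_summary() {
    awk '
        /^>/ { n++; next }             # header line: one more contig
             { total += length($0) }   # sequence line: add its bases
        END  { printf "contigs=%d total_bp=%d\n", n, total }
    ' "$1"
}

CONTIGS=/scratch/$USER/velvet_example/contigs.fa
if [ -f "$CONTIGS" ]; then
    contig_summary "$CONTIGS"
fi
```

The summary is printed only if the file exists, so the snippet is safe to run before the job completes.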

Utilizing multi-threading

Velvet has been compiled with OpenMP support. Note that OpenMP allows the use of multiple CPUs on a single node, not the use of multiple nodes. Refer to the example script below:

#!/bin/bash

#SBATCH --job-name Velvet_OPENMP_example

#SBATCH --nodes 1

#SBATCH --partition medium

#SBATCH --output=velvet_openmp_%j

module purge

module load Biosciences/Velvet/1.2.10

export OMP_THREAD_LIMIT=7

export OMP_NUM_THREADS=6

srun velveth /scratch/$USER/velvet_example 21 -shortPaired $VELVET_DATA/test_reads.fa

srun velvetg /scratch/$USER/velvet_example

srun velvetg /scratch/$USER/velvet_example -cov_cutoff 5 -read_trkg yes -amos_file yes
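The script above hard-codes a thread limit of 7 and a thread count of 6. As a sketch, the same pair can be derived from the CPUs that SLURM allocates. This assumes the job requests CPUs with --cpus-per-task (which the script above does not do), so that SLURM sets SLURM_CPUS_PER_TASK, and it assumes the "limit is one more than the thread count" relationship seen in the script. The function name set_velvet_threads is hypothetical.

```shell
# Hypothetical helper: derive the OpenMP settings from the SLURM allocation.
# Falls back to a single CPU when SLURM_CPUS_PER_TASK is not set.
set_velvet_threads() {
    cpus=${SLURM_CPUS_PER_TASK:-1}
    export OMP_THREAD_LIMIT=$cpus
    if [ "$cpus" -gt 1 ]; then
        # Keep the thread count one below the limit, as in the script above.
        export OMP_NUM_THREADS=$((cpus - 1))
    else
        export OMP_NUM_THREADS=1
    fi
}

set_velvet_threads
echo "OMP_THREAD_LIMIT=$OMP_THREAD_LIMIT OMP_NUM_THREADS=$OMP_NUM_THREADS"
```

With --cpus-per-task 7 in the sbatch header this reproduces the values 7 and 6 used above.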
