Command-line Options

Usage:

rnnotator.pl <library1> [library2] [...] [OPTIONS]

Library format:

-strS fqs : strand-specific single-end library

-strC fqs : strand-specific composite read library

-strP ins_len fqs : strand-specific paired-end library

-nonS fqs : non strand-specific single-end library

-nonC fqs : non strand-specific composite read library

-nonP ins_len fqs : non strand-specific paired-end library

-nonSL seqs : non strand-specific single-end LONG library (FASTA). Must be used together with one of the short-read options.

Note: bzip2-compressed FASTQ files (with the *.bz2 extension) are accepted.
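
The library options above follow two rules: the paired-end options take an insert length before the files, and -nonSL is only valid alongside a short-read library. The following is a minimal sketch of how a wrapper script might assemble and check these arguments. It is not part of rnnotator: the helper names `library_args` and `check_libraries` and the file names are hypothetical, and `fqs` is passed through as an opaque string because the usage text does not say how multiple FASTQ files are listed.

```python
SHORT_READ = {"-strS", "-strC", "-strP", "-nonS", "-nonC", "-nonP"}
PAIRED = {"-strP", "-nonP"}
LONG_READ = "-nonSL"


def library_args(option, files, ins_len=None):
    """Return the argv fragment for one library (hypothetical helper)."""
    if option in PAIRED:
        # Paired-end options are documented as: option, ins_len, fqs.
        if ins_len is None:
            raise ValueError(f"{option} requires ins_len")
        return [option, str(ins_len), files]
    if option in SHORT_READ or option == LONG_READ:
        if ins_len is not None:
            raise ValueError(f"{option} does not take ins_len")
        return [option, files]
    raise ValueError(f"unknown library option: {option}")


def check_libraries(options):
    """Enforce the rule that -nonSL needs a short-read library alongside it."""
    if not options:
        raise ValueError("at least one library is required")
    if LONG_READ in options and not SHORT_READ.intersection(options):
        raise ValueError("-nonSL must be used with a short-read option")


# Hypothetical inputs: one paired-end library plus one long-read library.
libs = [("-nonP", "reads.fq", 200), ("-nonSL", "long_reads.fa", None)]
check_libraries([opt for opt, _, _ in libs])

argv = ["rnnotator.pl"]
for opt, files, ins_len in libs:
    argv += library_args(opt, files, ins_len)
print(" ".join(argv))
```

The sketch only builds and prints the command line; passing `argv` to `subprocess.run` would be the natural next step once the file arguments are known to match what rnnotator.pl expects.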

GENERAL OPTIONS:

-n INT : number of processors to use (default: 2)

-o STR : output directory (default: current directory)

-min_contig_length INT : minimum final contig length (default: 100)

-l STR : name to give the log file (default: assembly.log)

-sge STR : run on SGE; STR is the path to the SGE cluster settings to source

--keep_rundir : keep alignments and other temporary files in the running directory

--version : print the rnnotator version number

ADVANCED PREPROCESSING OPTIONS:

-rRNA on/off : remove rRNA reads (default: off)

-rRNA_fa STR : rRNA FASTA, when rRNA is on

-rRNA_gs STR : rRNA genus and species, used when no rRNA FASTA is given (e.g. "Zea mays")

-low_qual on/off : remove low quality reads (default: on)

-low_comp on/off : remove low complexity reads (default: on)

-adapter on/off : remove adapter-containing reads (default: on)

-adapter_file STR : FASTA file containing adapter sequences (default: none)

-derep on/off : remove duplicate reads (default: on)

-trim on/off : trim reads (default: on)

-trim_len INT : length to trim reads to, when trim is on (default: auto)

-kfilter on/off : remove reads containing rare kmers (default: off)

-min_kmer_occur INT : minimum number of kmer occurrences for rare kmer filtering (default: 3)

-kmer_length INT : kmer length for rare kmer filtering (default: 24)

-norm_depth INT : normalization depth (e.g. 200, default: 0 - no normalization)
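
To illustrate how -kfilter, -min_kmer_occur and -kmer_length relate, here is a minimal sketch of rare kmer filtering. It is an illustration of the general technique, not rnnotator's implementation: the function names are hypothetical, and it assumes that a kmer is "rare" when it occurs fewer than min_occur times across all reads and that reverse complements are counted separately.

```python
from collections import Counter


def kmers(read, k):
    """Yield every k-length substring of a read."""
    for i in range(len(read) - k + 1):
        yield read[i:i + k]


def filter_rare_kmer_reads(reads, k=24, min_occur=3):
    """Keep reads whose kmers all occur at least min_occur times overall.

    Reads shorter than k contain no kmers and are kept unchanged.
    """
    counts = Counter(km for read in reads for km in kmers(read, k))
    return [
        read for read in reads
        if all(counts[km] >= min_occur for km in kmers(read, k))
    ]


# Toy data with a short k so the effect is visible by eye.
reads = ["ACGTAC", "ACGTAC", "ACGTAC", "ACGTTT"]
kept = filter_rare_kmer_reads(reads, k=4, min_occur=3)
print(kept)  # the last read is dropped: CGTT and GTTT each occur once
```

The defaults in the sketch mirror the documented defaults (kmer length 24, minimum occurrence 3); the toy call overrides them only to keep the example readable.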

ADVANCED ASSEMBLY OPTIONS:

-r RANGE : range of hash lengths to use, in the format start-end:increment (default: auto)

-a STR : assembler to use (velvet, oases, Ray, idba, minia) (default: velvet)

-m STR : merger to use (minimus2, oases) (default: minimus2)

--vmatch_off : turn the vmatch step off (default: on)

-s STR : scaffolder to use (sopra, bambus, meraculous, velvet) (default: none)

-scaffold on/off : whether or not to scaffold contigs during velvetg (default: off)

-min_overlap_merge INT : minimum overlap for merging contigs (default: 40)
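
The start-end:increment format accepted by -r can be expanded into an explicit list of hash lengths. The following is a minimal sketch under stated assumptions: the function name `parse_hash_range` is hypothetical, and the end value is assumed to be inclusive, which the usage text does not confirm.

```python
import re


def parse_hash_range(spec):
    """Expand 'start-end:increment' into a list of hash lengths."""
    match = re.fullmatch(r"(\d+)-(\d+):(\d+)", spec)
    if match is None:
        raise ValueError(f"expected start-end:increment, got {spec!r}")
    start, end, step = (int(group) for group in match.groups())
    if step == 0 or start > end:
        raise ValueError(f"invalid range: {spec!r}")
    # Assumption: the end value is included when the increment lands on it.
    return list(range(start, end + 1, step))


print(parse_hash_range("21-31:4"))  # [21, 25, 29]
```

A check like this can catch a malformed range before a long assembly run is launched.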

ADVANCED POLISHING OPTIONS:

-sa STR : aligner to use when aligning stranded reads to assembled contigs (blat, bwa) (default: bwa)

-split_min_cnt INT : minimum depth for transcribed segments when splitting contigs (default: 3)

-precursor_thresh FLT : threshold used to label a transcript as precursor for a particular locus (default: 0.1)

-min_overlap_extend INT : minimum overlap for extending contigs (default: 200)

-min_overlap_loci INT : minimum overlap for clustering into loci (default: 200)

ACCURACY ASSESSMENT OPTIONS:

-max_intron INT : maximum intron length, used when checking completeness and contiguity (default: 75000)

-g STR : genome in FASTA or 2bit format, used for reference-based joining and accuracy assessment

-t STR : transcripts in FASTA format, for checking completeness and contiguity

-ga STR : gene annotation in tabular format (name, chrom, strand, start, end, exonStarts, exonEnds)

--group STR : group assignment of each column of the counting table, for statistical analysis (0: Control, 1: Treated)