Rnnotator for Transcriptome Assembly

Rnnotator is a de novo transcriptome assembly pipeline for RNA-Seq data. It was designed to assemble Illumina single-end or paired-end reads, and it can also incorporate strand-specific RNA-Seq reads to further improve the assembly.
Rnnotator consists of three major components: read preprocessing, assembly, and contig post-processing. The preprocessing step can optionally remove low-quality reads, low-complexity reads, adapter-containing reads, duplicate reads, reads containing rare k-mers, and rRNA-containing reads, and it can also trim reads. After preprocessing, Rnnotator performs eight assemblies using the assembler of your choice (Velvet, Oases, etc.), each with a different hash length for the de Bruijn graph. The assemblies run either sequentially or in parallel, depending on the -n parameter setting. Rnnotator then removes redundant contigs and further assembles contigs wherever significant overlaps are found.
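To make the "different hash length per assembly" idea concrete: in the example run later in this guide, the log shows hash lengths 95, 89, 83, 77, 71, 65, 59 and 53 for 100 bp reads. How Rnnotator derives these in auto mode is not described here, so the sketch below is only an illustration that reproduces that series. The function name hash_lengths and its defaults (count=8, step=6) are assumptions chosen to match the log, not Rnnotator's actual code.

```python
def hash_lengths(largest, count=8, step=6):
    """Return up to `count` odd hash lengths, largest first, `step` apart."""
    if step % 2:
        raise ValueError("step must be even so every hash length stays odd")
    if largest % 2 == 0:
        largest -= 1  # Velvet only accepts odd hash lengths
    series = [largest - step * i for i in range(count)]
    # Drop any non-positive values that appear when `largest` is small.
    return [k for k in series if k > 0]


print(hash_lengths(95))  # prints [95, 89, 83, 77, 71, 65, 59, 53]
```

Each value corresponds to one velveth/velvetg run; with -n 8 the eight runs can proceed in parallel.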

Please note: Rnnotator and Vmatch are licensed software and should be used only within Brown for academic projects. If you use the software, please cite the original publication: http://www.biomedcentral.com/1471-2164/11/663

This short guide is largely based on the Rnnotator user manual. The original user manual, in PDF format, can be found at:
/gpfs/runtime/opt/rnnotator/2.4.12/src/Rnnotator-2.4.12/Rnnotator2.4.12Manual.pdf

Here I use 1 million 100 bp paired-end and 40K 100 bp single-end Illumina reads generated by CASAVA 1.8 to show you how to run the software.

# First, log in to oscar; we will work in scratch
ssh oscar

# Request a high-memory interactive compute node
interact -n 8 -m 64g -t 12:00:00
Cores:    8
Walltime: 12:00:00
Memory:   64g
Queue:    timeshare
qsub -I -q timeshare -l walltime=12:00:00,nodes=1:ppn=8,mem=64g
qsub: waiting for job 1372816.mgt to start
qsub: job 1372816.mgt ready

----------------------------------------
Begin PBS Prologue Mon Dec  5 12:24:16 EST 2011 1323105856
Job ID:         1372816.mgt
Username:       ldong
Group:          illumina
Nodes:          node244
End PBS Prologue Mon Dec  5 12:24:17 EST 2011 1323105857
----------------------------------------
# Create a folder to work in
mkdir scratch/rnnotator-test && cd scratch/rnnotator-test

# Load the rnnotator module
module load rnnotator

# Copy the example Illumina data into the current folder
cp /gpfs/data/shared/biomed/example_fastq/rna_seq/casava1.8/* .

# Take a look at the example data. There are three files: read1 and read2 for the paired-end run, and one single-read file
ls -l 
total 509952
-rw-r----- 1 ldong ccvstaff 260028550 Dec 27 10:42 1M_read1.fastq
-rw-r----- 1 ldong ccvstaff 260028550 Dec 27 10:42 1M_read2.fastq
-rw-r----- 1 ldong ccvstaff   2600272 Dec 27 10:42 40K_single_read.fastq

# For paired-end data, the software expects read 1 and read 2 to be in the same file, with each read 1 immediately followed by its read 2. An example is shown below. The quality scores must also be 64-based (Phred+64).
@1044:5:1:1071:20262/1
GGTCAATCTCACGATTTGATGGAANAGCTCGCCACCGGGGCAGAGTTCGAGGATGATATAGTAGTATTGACGTGCC
+
bbbbbbbbbbb_a`bbbbbbbbb^BZY[U[bbbbbbbbbab_bab_a`b]b^_bb`]aa`]XT`Z_\__]^_K_BB
@1044:5:1:1071:20262/2
GCAACCAGCGTGCCAACATCCTGAAAGAAGTGCAGATCATGCGCAATCTCGATCACCCCAATATCGTCAAGATGAT
+
bbbabbb_bba^^`bbbbb`bbbc_b`\\b^ab_aaaaa`bcacac_c``aa``ac^`a`aZ^^a^BBBBBBBBBB
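Before starting a long run, it can be worth checking that a file really follows this layout. The sketch below is a minimal check written for this guide, not part of Rnnotator: the names read_fastq and check_interleaved are hypothetical. It assumes plain 4-line FASTQ records, read names ending in /1 and /2, and treats any quality character below ASCII 64 as a sign that the file is not Phred+64.

```python
import io


def read_fastq(handle):
    """Yield (header, sequence, quality) from a 4-line-per-record FASTQ stream."""
    while True:
        header = handle.readline().rstrip("\n")
        if not header:
            return
        seq = handle.readline().rstrip("\n")
        handle.readline()  # skip the '+' separator line
        qual = handle.readline().rstrip("\n")
        yield header, seq, qual


def check_interleaved(handle):
    """Return the number of read pairs; raise ValueError on a layout problem."""
    records = list(read_fastq(handle))  # fine for a sample; too big for full files
    if len(records) % 2:
        raise ValueError("odd number of records, so some read has no mate")
    for (h1, _, q1), (h2, _, q2) in zip(records[0::2], records[1::2]):
        if not (h1.endswith("/1") and h2.endswith("/2") and h1[:-2] == h2[:-2]):
            raise ValueError(f"records are not a /1, /2 pair: {h1} {h2}")
        if any(ord(c) < 64 for c in q1 + q2):
            raise ValueError(f"quality below ASCII 64, not Phred+64: {h1}")
    return len(records) // 2


sample = io.StringIO("@r1/1\nACGT\n+\nbbbb\n@r1/2\nTTGA\n+\nbbBB\n")
print(check_interleaved(sample))  # prints 1
```

For a real file, passing the first few thousand lines (for example via head) is enough to catch a wrong encoding or an unmerged file.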

# By default, the Illumina software CASAVA 1.8 generates two files for a paired-end run, and the quality scores are 33-based (Phred+33), so we need to convert them. The read ID fields also need to be reformatted.
# This command merges the read1 and read2 files into the file "1m-pair" and changes the quality score encoding to 64-based.
merge_fastq_score64_for_rnnotator.pl 1M_read1.fastq 1M_read2.fastq 1m-pair
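For readers curious about what such a conversion involves, here is a rough Python sketch of the same idea. It is not the Perl script's source, and the exact header format that script writes is not documented here. The sketch assumes CASAVA 1.8 headers of the form "@name 1:N:0:INDEX", rewrites them to "@name/1" and "@name/2", interleaves the mates, and shifts every quality character by 31 (64 minus 33). The function names are hypothetical.

```python
import io
from itertools import zip_longest

PHRED_SHIFT = 64 - 33  # difference between the two ASCII offsets


def fastq_records(handle):
    """Yield (header, sequence, quality) from a 4-line-per-record FASTQ stream."""
    while True:
        header = handle.readline().rstrip("\n")
        if not header:
            return
        seq = handle.readline().rstrip("\n")
        handle.readline()  # skip the '+' separator line
        qual = handle.readline().rstrip("\n")
        yield header, seq, qual


def to_phred64(qual):
    """Shift a Phred+33 quality string to Phred+64."""
    return "".join(chr(ord(c) + PHRED_SHIFT) for c in qual)


def merge_pairs(read1, read2, out):
    """Interleave two FASTQ streams into `out`; return the number of pairs."""
    pairs = 0
    for rec1, rec2 in zip_longest(fastq_records(read1), fastq_records(read2)):
        if rec1 is None or rec2 is None:
            raise ValueError("read 1 and read 2 files differ in record count")
        # Keep only the part of the header before the first space.
        name1, name2 = rec1[0].split()[0], rec2[0].split()[0]
        if name1 != name2:
            raise ValueError(f"mate names differ: {name1} vs {name2}")
        for mate, (_, seq, qual) in enumerate((rec1, rec2), start=1):
            out.write(f"{name1}/{mate}\n{seq}\n+\n{to_phred64(qual)}\n")
        pairs += 1
    return pairs


r1 = io.StringIO("@M:1:FC:1:1:10:20 1:N:0:ACGT\nACGT\n+\nII#5\n")
r2 = io.StringIO("@M:1:FC:1:1:10:20 2:N:0:ACGT\nTTGA\n+\n5#II\n")
merged = io.StringIO()
merge_pairs(r1, r2, merged)
print(merged.getvalue(), end="")
```

With real data, the three arguments would be file objects opened on the read1 file, the read2 file, and the output file. The provided Perl scripts remain the supported route.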

# Also reformat the single-read file. This command generates the file "40K_single_read.fastq_score64.fastq"
convert_fastq_score64_for_rnnotator.pl 40K_single_read.fastq

# Now start rnnotator with both the non-strand-specific paired-end file and the non-strand-specific single-read file as input, using 8 threads and "out-1mpair-and-40ksingle" as the output folder.
# For all the available options, run rnnotator.pl on the command line.
rnnotator.pl -nonP 400 1m-pair -nonS 40K_single_read.fastq_score64.fastq -n 8 -o out-1mpair-and-40ksingle
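If you script your runs, it can help to build the command from named values so that the insert length (the 400 after -nonP), thread count, and output folder are explicit. The sketch below is only a convenience wrapper that uses the flags shown in the command above; the function name rnnotator_command is hypothetical, and it does not cover any other rnnotator.pl options.

```python
import shlex


def rnnotator_command(insert_length, paired_fastq, single_fastq, threads, outdir):
    """Build the argument list for one paired-end plus one single-end library."""
    return [
        "rnnotator.pl",
        "-nonP", str(insert_length), paired_fastq,  # non-stranded paired-end library
        "-nonS", single_fastq,                      # non-stranded single-end library
        "-n", str(threads),
        "-o", outdir,
    ]


argv = rnnotator_command(
    400, "1m-pair", "40K_single_read.fastq_score64.fastq", 8, "out-1mpair-and-40ksingle"
)
print(shlex.join(argv))
```

The resulting list can be passed to subprocess.run(argv, check=True) on a node where the rnnotator module is loaded.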

# The command should finish in about one hour. All output will be in the folder "out-1mpair-and-40ksingle". Now take a look at the output files:
ls -lh out-1mpair-and-40ksingle
total 1.3G
-rw-r----- 1 ldong ccvstaff 474M Dec 26 22:26 1m-pair.artifact_filtered.fq
-rw-r----- 1 ldong ccvstaff 1.7M Dec 26 22:26 1m-pair.artifact_filtered_bad.fq
-rw-r----- 1 ldong ccvstaff 428M Dec 26 22:30 1m-pair_.dereplicated.fq
-rw-r----- 1 ldong ccvstaff 2.0K Dec 26 22:30 1m-pair_.derep_bad.fq
-rw-r----- 1 ldong ccvstaff 2.5M Dec 26 22:30 40K_single_read.fastq_score64.fastq.artifact_filtered.fq
-rw-r----- 1 ldong ccvstaff 6.8K Dec 26 22:30 40K_single_read.fastq_score64.fastq.artifact_filtered_bad.fq
-rw-r----- 1 ldong ccvstaff 2.2M Dec 26 22:30 40K_single_read.fastq_score64.fastq_.dereplicated.fq
-rw-r----- 1 ldong ccvstaff    0 Dec 26 22:30 40K_single_read.fastq_score64.fastq_.derep_bad.fq
-rw-r----- 1 ldong ccvstaff 1.1K Dec 26 22:30 derep_log.txt
-rw-r----- 1 ldong ccvstaff  18M Dec 26 23:11 velvet_i1_contigs.fa
-rw-r----- 1 ldong ccvstaff  18M Dec 26 23:11 velvet_contigs.fa
-rw-r----- 1 ldong ccvstaff  14M Dec 26 23:12 vmatch_contigs.fa
-rw-r----- 1 ldong ccvstaff 7.8M Dec 26 23:17 merged_contigs.fa
-rw-r----- 1 ldong ccvstaff 7.7M Dec 26 23:23 extended_contigs.fa
-rw-r----- 1 ldong ccvstaff 7.2M Dec 26 23:23 loci.fa
-rw-r----- 1 ldong ccvstaff 1.7M Dec 26 23:25 loci.fa.pac
-rw-r----- 1 ldong ccvstaff 777K Dec 26 23:25 loci.fa.ann
-rw-r----- 1 ldong ccvstaff   16 Dec 26 23:25 loci.fa.amb
-rw-r----- 1 ldong ccvstaff 1.7M Dec 26 23:25 loci.fa.rpac
-rw-r----- 1 ldong ccvstaff 2.6M Dec 26 23:25 loci.fa.bwt
-rw-r----- 1 ldong ccvstaff 2.6M Dec 26 23:25 loci.fa.rbwt
-rw-r----- 1 ldong ccvstaff 869K Dec 26 23:25 loci.fa.sa
-rw-r----- 1 ldong ccvstaff 869K Dec 26 23:25 loci.fa.rsa
-rw-r----- 1 ldong ccvstaff 245M Dec 26 23:27 1m-pair.trim36.all.loci.fa.sam
-rw-r----- 1 ldong ccvstaff 1.2M Dec 26 23:28 40K_single_read.fastq_score64.fastq.trim36.all.loci.fa.sam
-rw-r----- 1 ldong ccvstaff 460K Dec 26 23:28 counts.txt
-rw-r----- 1 ldong ccvstaff 668K Dec 26 23:28 loci.fa.fai
-rw-r----- 1 ldong ccvstaff 619K Dec 26 23:33 norm_counts.txt
-rw-r----- 1 ldong ccvstaff 7.4M Dec 26 23:34 labeled.fa
-rw-r----- 1 ldong ccvstaff 7.4M Dec 26 23:34 final_contigs.fa
-rw-r----- 1 ldong ccvstaff  15K Dec 26 23:34 assembly.log

# Let's take a look at the assembly log:
less out-1mpair-and-40ksingle/assembly.log
26-Dec-2011 22:18 Rnnotator version: 2.4.12
26-Dec-2011 22:18 Hostname: smp007
26-Dec-2011 22:18 Username: ldong
26-Dec-2011 22:18 ******************************************************************************
26-Dec-2011 22:18 Assembly Parameters: 
26-Dec-2011 22:18 Script: /gpfs/runtime/opt/rnnotator/2.4.12/bin/rnnotator.pl
26-Dec-2011 22:18 Current directory: /gpfs/scratch/ldong/rnnotator-test
26-Dec-2011 22:18 Output directory: /gpfs/scratch/ldong/rnnotator-test/out-1mpair-and-40ksingle
26-Dec-2011 22:18 Number of threads: 8
26-Dec-2011 22:18 Assembler: velvet
26-Dec-2011 22:18 hash range: auto
26-Dec-2011 22:18 Scaffolder: none
26-Dec-2011 22:18 Aligner to use when aligning stranded reads to contigs: bwa
26-Dec-2011 22:18 Trim length: auto
26-Dec-2011 22:18 Read filters: low-quality low-complexity adapter duplicates trimming 
26-Dec-2011 22:18 Coverage cuttoff when splitting contigs: 3
26-Dec-2011 22:18 Minimum contig length: 100
26-Dec-2011 22:18 Precursor theshold: 0.1
26-Dec-2011 22:18 Overlap length for clustering: 200
26-Dec-2011 22:18 Overlap length for extension: 200
26-Dec-2011 22:18 ******************************************************************************
26-Dec-2011 22:18 Rnnotator started
26-Dec-2011 22:18 Reading fqs from each library...
26-Dec-2011 22:18 nonP1:
26-Dec-2011 22:18 ins_len: 400
26-Dec-2011 22:18 fqs: 1m-pair
26-Dec-2011 22:18 nonS1:
26-Dec-2011 22:18 fqs: 40K_single_read.fastq_score64.fastq
26-Dec-2011 22:18 Number of libraries: 2
26-Dec-2011 22:18 Preprocessing reads for each library...
26-Dec-2011 22:18 --------------------- Preprocessing nonP 1 ---------------------
26-Dec-2011 22:18 Started read pre-processing...
26-Dec-2011 22:18 Steps to run:
26-Dec-2011 22:18 Remove low complexity
26-Dec-2011 22:18 Remove low quality
26-Dec-2011 22:18 Remove adapter
26-Dec-2011 22:18 Remove duplicates
26-Dec-2011 22:18 Trim
26-Dec-2011 22:18 Estimated number of reads in input: 2000228
26-Dec-2011 22:18 Checking lengths of input sequences...
26-Dec-2011 22:18 randomly sampling reads to determine median read length from: 1m-pair
26-Dec-2011 22:18 median read length for read 1: 100
26-Dec-2011 22:18 median read length for read 2: 100
26-Dec-2011 22:18 Determinging Auto trimming lengths...
26-Dec-2011 22:18 determining trimming length for 1m-pair...
26-Dec-2011 22:18 Using quality scores to find the best length to trim reads to
26-Dec-2011 22:18 randomly selecting sequences....
26-Dec-2011 22:18 separating into read 1 and read 2....
26-Dec-2011 22:18 finding trim length....
26-Dec-2011 22:18 the best trimming length for read 1 was 100
26-Dec-2011 22:18 the best trimming length for read 2 was 100
26-Dec-2011 22:18 reads will be trimmed to: 100
26-Dec-2011 22:18 Starting artifact filtering of 1m-pair...
26-Dec-2011 22:18 splitting input file...
26-Dec-2011 22:19 running in parallel using 8 cores...
26-Dec-2011 22:26 combining output...
26-Dec-2011 22:26 fraction of bad reads 6852/2000228 = 0.34 %
26-Dec-2011 22:26 fraction of reads with adapter 6/2000228 = 0.00 %
26-Dec-2011 22:26 fraction of low quality reads 0/2000228 = 0.00 %
26-Dec-2011 22:26 fraction of low complexity reads 4335/2000228 = 0.22 %
26-Dec-2011 22:26 Removing duplicate reads...
26-Dec-2011 22:30 fraction of reads removed which were duplicates 201042/1993604 = 10.08 %
26-Dec-2011 22:30 Fraction of reads removed from complete set 207666/2000228 = 10.38 %
26-Dec-2011 22:30 Finished read pre-processing.
26-Dec-2011 22:30 ----------------------- Finished nonP 1 ------------------------
26-Dec-2011 22:30 --------------------- Preprocessing nonS 1 ---------------------
26-Dec-2011 22:30 Started read pre-processing...
26-Dec-2011 22:30 Steps to run:
26-Dec-2011 22:30 Remove low complexity
26-Dec-2011 22:30 Remove low quality
26-Dec-2011 22:30 Remove adapter
26-Dec-2011 22:30 Remove duplicates
26-Dec-2011 22:30 Trim
26-Dec-2011 22:30 Estimated number of reads in input: 9980
26-Dec-2011 22:30 Checking lengths of input sequences...
26-Dec-2011 22:30 randomly sampling reads to determine median read length from: 40K_single_read.fastq_score64.fastq
26-Dec-2011 22:30 median read length for read 1: 100
26-Dec-2011 22:30 Starting artifact filtering of 40K_single_read.fastq_score64.fastq...
26-Dec-2011 22:30 splitting input file...
26-Dec-2011 22:30 running in parallel using 8 cores...
26-Dec-2011 22:30 combining output...
26-Dec-2011 22:30 fraction of bad reads 6/9980 = 0.06 %
26-Dec-2011 22:30 fraction of reads with adapter 0/9980 = 0.00 %
26-Dec-2011 22:30 fraction of low quality reads 0/9980 = 0.00 %
26-Dec-2011 22:30 fraction of low complexity reads 26/9980 = 0.26 %
26-Dec-2011 22:30 Removing duplicate reads...
26-Dec-2011 22:30 fraction of reads removed which were duplicates 490/9954 = 4.92 %
26-Dec-2011 22:30 Fraction of reads removed from complete set 516/9980 = 5.17 %
26-Dec-2011 22:30 Finished read pre-processing.
26-Dec-2011 22:30 ----------------------- Finished nonS 1 ------------------------
26-Dec-2011 22:30 Assembly started...
26-Dec-2011 22:30 1 assembly(ies) will be performed.
26-Dec-2011 22:30 Setting velvet parameters...
26-Dec-2011 22:30 median read length for nonP1: 100
26-Dec-2011 22:31 median read length for nonS1: 100
26-Dec-2011 22:31 Running velveth v: 1.1.04
26-Dec-2011 22:31 Running velvetg v: 1.1.04
26-Dec-2011 22:31 starting velveth for hash length 95.
26-Dec-2011 22:31 starting velveth for hash length 89.
26-Dec-2011 22:31 starting velveth for hash length 83.
26-Dec-2011 22:31 starting velveth for hash length 77.
26-Dec-2011 22:31 starting velveth for hash length 71.
26-Dec-2011 22:31 starting velveth for hash length 65.
26-Dec-2011 22:31 starting velveth for hash length 59.
26-Dec-2011 22:31 starting velveth for hash length 53.
26-Dec-2011 22:40 finished velveth for hash length 95.
26-Dec-2011 22:40 starting velvetg for hash length 95.
26-Dec-2011 22:42 finished velveth for hash length 89.
26-Dec-2011 22:42 starting velvetg for hash length 89.
26-Dec-2011 22:44 finished velvetg for hash length 95.
26-Dec-2011 22:45 finished velveth for hash length 83.
26-Dec-2011 22:45 starting velvetg for hash length 83.
26-Dec-2011 22:49 finished velvetg for hash length 83.
26-Dec-2011 22:49 finished velvetg for hash length 89.
26-Dec-2011 22:55 finished velveth for hash length 77.
26-Dec-2011 22:55 starting velvetg for hash length 77.
26-Dec-2011 22:56 finished velveth for hash length 65.
26-Dec-2011 22:56 starting velvetg for hash length 65.
26-Dec-2011 22:57 finished velvetg for hash length 77.
26-Dec-2011 22:59 finished velvetg for hash length 65.
26-Dec-2011 23:03 finished velveth for hash length 53.
26-Dec-2011 23:03 starting velvetg for hash length 53.
26-Dec-2011 23:05 finished velveth for hash length 71.
26-Dec-2011 23:05 starting velvetg for hash length 71.
26-Dec-2011 23:06 finished velveth for hash length 59.
26-Dec-2011 23:06 starting velvetg for hash length 59.
26-Dec-2011 23:08 finished velvetg for hash length 71.
26-Dec-2011 23:10 finished velvetg for hash length 53.
26-Dec-2011 23:10 finished velvetg for hash length 59.
26-Dec-2011 23:10 Combining velvet contigs into one file.
26-Dec-2011 23:11 Removing assembly directory...
26-Dec-2011 23:11 Contig statistics (minimum contig length: 100)
26-Dec-2011 23:11 file: /gpfs/scratch/ldong/rnnotator-test/out-1mpair-and-40ksingle/velvet_i1_contigs.fa
26-Dec-2011 23:11 number of sequences: 53921
26-Dec-2011 23:11 number of total bases: 15760885
26-Dec-2011 23:11 N50: 307
26-Dec-2011 23:11 median length: 220
26-Dec-2011 23:11 shortest contig: 105
26-Dec-2011 23:11 longest contig: 6225
26-Dec-2011 23:11 # of long contigs (>= 1000 bp): 1038
26-Dec-2011 23:11 # of medium contigs (>= 500 bp and < 1000 bp): 4236
26-Dec-2011 23:11 # of short contigs (>= 100 bp and < 500 bp): 48647
26-Dec-2011 23:11 Contig statistics (minimum contig length: 100)
26-Dec-2011 23:11 file: /gpfs/scratch/ldong/rnnotator-test/out-1mpair-and-40ksingle/velvet_contigs.fa
26-Dec-2011 23:11 number of sequences: 53921
26-Dec-2011 23:11 number of total bases: 15760885
26-Dec-2011 23:11 N50: 307
26-Dec-2011 23:11 median length: 220
26-Dec-2011 23:11 shortest contig: 105
26-Dec-2011 23:11 longest contig: 6225
26-Dec-2011 23:11 # of long contigs (>= 1000 bp): 1038
26-Dec-2011 23:11 # of medium contigs (>= 500 bp and < 1000 bp): 4236
26-Dec-2011 23:11 # of short contigs (>= 100 bp and < 500 bp): 48647
26-Dec-2011 23:11 Running vmatch version: 2.1.6
26-Dec-2011 23:11 Removing redundant, or overlapping contigs (vmatch 2.1.3 or later).
26-Dec-2011 23:12 Removing temporary vmatch files...
26-Dec-2011 23:12 Contig statistics (minimum contig length: 100)
26-Dec-2011 23:12 file: /gpfs/scratch/ldong/rnnotator-test/out-1mpair-and-40ksingle/vmatch_contigs.fa
26-Dec-2011 23:12 number of sequences: 42875
26-Dec-2011 23:12 number of total bases: 12659855
26-Dec-2011 23:12 N50: 307
26-Dec-2011 23:12 median length: 221
26-Dec-2011 23:12 shortest contig: 105
26-Dec-2011 23:12 longest contig: 6225
26-Dec-2011 23:12 # of long contigs (>= 1000 bp): 828
26-Dec-2011 23:12 # of medium contigs (>= 500 bp and < 1000 bp): 3425
26-Dec-2011 23:12 # of short contigs (>= 100 bp and < 500 bp): 38622
26-Dec-2011 23:12 Merging contigs using minimum overlap: 40
26-Dec-2011 23:12 Merging... removing exact duplicates...
26-Dec-2011 23:12 Merging... running minimus2...
26-Dec-2011 23:17 Finished merging contigs.
26-Dec-2011 23:17 Contig statistics (minimum contig length: 100)
26-Dec-2011 23:17 file: /gpfs/scratch/ldong/rnnotator-test/out-1mpair-and-40ksingle/merged_contigs.fa
26-Dec-2011 23:17 number of sequences: 22520
26-Dec-2011 23:17 number of total bases: 7130044
26-Dec-2011 23:17 N50: 343
26-Dec-2011 23:17 median length: 205
26-Dec-2011 23:17 shortest contig: 105
26-Dec-2011 23:17 longest contig: 10702
26-Dec-2011 23:17 # of long contigs (>= 1000 bp): 752
26-Dec-2011 23:17 # of medium contigs (>= 500 bp and < 1000 bp): 1857
26-Dec-2011 23:17 # of short contigs (>= 100 bp and < 500 bp): 19911
26-Dec-2011 23:17 Contig extension started
26-Dec-2011 23:17 Starting contig extension with min overlap: 200 and stranded? 0
26-Dec-2011 23:17 running vmatch...
26-Dec-2011 23:17 appending singletons...
26-Dec-2011 23:17 extending individual clusters...
26-Dec-2011 23:23 Finished contig extension
26-Dec-2011 23:23 number of clusters extended 95 clusters...
26-Dec-2011 23:23 Contig statistics (minimum contig length: 100)
26-Dec-2011 23:23 file: /gpfs/scratch/ldong/rnnotator-test/out-1mpair-and-40ksingle/extended_contigs.fa
26-Dec-2011 23:23 number of sequences: 22471
26-Dec-2011 23:23 number of total bases: 7111135
26-Dec-2011 23:23 N50: 342
26-Dec-2011 23:23 median length: 205
26-Dec-2011 23:23 shortest contig: 105
26-Dec-2011 23:23 longest contig: 10702
26-Dec-2011 23:23 # of long contigs (>= 1000 bp): 754
26-Dec-2011 23:23 # of medium contigs (>= 500 bp and < 1000 bp): 1839
26-Dec-2011 23:23 # of short contigs (>= 100 bp and < 500 bp): 19878
26-Dec-2011 23:23 Clustering by locus...
26-Dec-2011 23:24 Contig statistics (minimum contig length: 100)
26-Dec-2011 23:24 file: /gpfs/scratch/ldong/rnnotator-test/out-1mpair-and-40ksingle/loci.fa
26-Dec-2011 23:24 number of sequences: 22471
26-Dec-2011 23:24 number of total bases: 7111135
26-Dec-2011 23:24 N50: 342
26-Dec-2011 23:24 median length: 205
26-Dec-2011 23:24 shortest contig: 105
26-Dec-2011 23:24 longest contig: 10702
26-Dec-2011 23:24 # of long contigs (>= 1000 bp): 754
26-Dec-2011 23:24 # of medium contigs (>= 500 bp and < 1000 bp): 1839
26-Dec-2011 23:24 # of short contigs (>= 100 bp and < 500 bp): 19878
26-Dec-2011 23:24 Counting...
26-Dec-2011 23:24 Starting counting...
26-Dec-2011 23:24 Library SE PE_total PE_agree PE_conflict
26-Dec-2011 23:24 all 5681 1108172 912424 195748
26-Dec-2011 23:28 Generating RAF from sam...
26-Dec-2011 23:28 converting SAM->BAM /gpfs/scratch/ldong/rnnotator-test/out-1mpair-and-40ksingle/1m-pair.trim36.all.loci.fa.sam...
26-Dec-2011 23:28 converting SAM->BAM /gpfs/scratch/ldong/rnnotator-test/out-1mpair-and-40ksingle/40K_single_read.fastq_score64.fastq.trim36.all.loci.fa.sam...
26-Dec-2011 23:28 converting BAM->RAF /gpfs/scratch/ldong/rnnotator-test/out-1mpair-and-40ksingle/40K_single_read.fastq_score64.fastq.trim36.all.loci.fa.sam.bam.raf...
26-Dec-2011 23:28 converting BAM->RAF /gpfs/scratch/ldong/rnnotator-test/out-1mpair-and-40ksingle/1m-pair.trim36.all.loci.fa.sam.bam.raf...
26-Dec-2011 23:31 Determining mappable gene lengths...
26-Dec-2011 23:33 Normalizing counts by gene length...
26-Dec-2011 23:33 Normalizing counts by # of reads in the sample...
26-Dec-2011 23:33 Labeling contigs with expression information and identifying potential precursors...
26-Dec-2011 23:34 total transcripts: 22471
26-Dec-2011 23:34 precursor transcripts: 58
26-Dec-2011 23:34 non-precursor transcripts: 22413
26-Dec-2011 23:34 Filtering by min contig length (100)...
26-Dec-2011 23:34 Contig statistics (minimum contig length: 100)
26-Dec-2011 23:34 file: /gpfs/scratch/ldong/rnnotator-test/out-1mpair-and-40ksingle/final_contigs.fa
26-Dec-2011 23:34 number of sequences: 22471
26-Dec-2011 23:34 number of total bases: 7111135
26-Dec-2011 23:34 N50: 342
26-Dec-2011 23:34 median length: 205
26-Dec-2011 23:34 shortest contig: 105
26-Dec-2011 23:34 longest contig: 10702
26-Dec-2011 23:34 # of long contigs (>= 1000 bp): 754
26-Dec-2011 23:34 # of medium contigs (>= 500 bp and < 1000 bp): 1839
26-Dec-2011 23:34 # of short contigs (>= 100 bp and < 500 bp): 19878
26-Dec-2011 23:34 Assembly finished!
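The "Contig statistics" blocks in the log can be recomputed for any of the FASTA files in the output folder, which is handy when comparing runs. The sketch below is an independent illustration, not Rnnotator's code. It uses the common N50 definition (the contig length at which the running total of lengths, sorted longest first, reaches half of all bases) and Python's statistics.median; whether Rnnotator computes its median the same way is an assumption. The function names are hypothetical.

```python
import io
from statistics import median


def fasta_lengths(handle):
    """Return the sequence length of every record in a FASTA stream."""
    lengths, current = [], None
    for line in handle:
        line = line.strip()
        if line.startswith(">"):
            if current is not None:
                lengths.append(current)
            current = 0
        elif current is not None:
            current += len(line)
    if current is not None:
        lengths.append(current)
    return lengths


def n50(lengths):
    """Length at which the longest-first running total reaches half of all bases."""
    half, running = sum(lengths) / 2, 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running >= half:
            return length
    return 0


def contig_stats(lengths, min_len=100):
    """Summarize contigs of at least `min_len` bp, using the log's size classes."""
    kept = [n for n in lengths if n >= min_len]
    return {
        "sequences": len(kept),
        "total_bases": sum(kept),
        "N50": n50(kept),
        "median": median(kept) if kept else 0,
        "long": sum(n >= 1000 for n in kept),
        "medium": sum(500 <= n < 1000 for n in kept),
        "short": sum(n < 500 for n in kept),
    }


demo = io.StringIO(">a\nACGT\nAC\n>b\nGG\n")
print(fasta_lengths(demo))  # prints [6, 2]
```

To summarize a real file, open final_contigs.fa from the output folder and pass the resulting lengths to contig_stats.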

