    Rnnotator for Transcriptome Assembly

    Rnnotator is a de novo assembly pipeline for RNA-Seq data. It was designed to assemble Illumina single-end or paired-end reads, and it can also incorporate strand-specific RNA-Seq reads to further improve the assembly.
    Rnnotator consists of three major components: read preprocessing, assembly, and post-processing of contigs. The read preprocessing step can optionally remove low-quality reads, low-complexity reads, adapter-containing reads, duplicate reads, reads containing rare k-mers, and rRNA-containing reads, and can trim reads. After read preprocessing, Rnnotator performs eight assemblies using the assembler of your choice (Velvet, Oases, etc.). Each assembly uses a different hash length for the De Bruijn graph. The assemblies run either sequentially or in parallel, depending on the -n parameter setting. After performing the assemblies, Rnnotator removes redundant contigs and further assembles the contigs where significant overlaps are found.
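
To make the role of the hash length concrete: a De Bruijn graph assembler breaks each read into overlapping words of length k (the hash length), so changing k changes the set of words the graph is built from. The sketch below only illustrates that idea with shell and awk. The function name `kmers` is hypothetical and is not part of Rnnotator or Velvet.

```shell
# kmers K SEQ: print every overlapping substring of length K in SEQ, one per line.
# Illustration only; not part of Rnnotator or Velvet.
kmers() {
    printf '%s\n' "$2" | awk -v k="$1" '{
        for (i = 1; i + k - 1 <= length($0); i++)
            print substr($0, i, k)
    }'
}

# A 10-base toy read yields 6 words at k=5 but only 4 at k=7.
kmers 5 ACGTACGTAC
kmers 7 ACGTACGTAC
```

Shorter words give more overlaps between reads, while longer words are more specific, which is one reason to combine assemblies made at several hash lengths.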

    Please note: Rnnotator and Vmatch are licensed software and should only be used within Brown for academic projects. If you use the software, please cite the original publication: http://www.biomedcentral.com/1471-2164/11/663

    This short guide is largely based on the Rnnotator user manual. The original manual, in PDF format, can be found at:
    /gpfs/runtime/opt/rnnotator/2.4.12/src/Rnnotator-2.4.12/Rnnotator2.4.12Manual.pdf

    Here I use 1 million 100 bp paired-end and 40K 100 bp single-end Illumina reads, generated by CASAVA 1.8, to show how to run the software.

    # First, log in to Oscar
    ssh oscar

    # Request a high-memory interactive compute node
    interact -n 8 -m 64g -t 12:00:00
    Cores:    8
    Walltime: 12:00:00
    Memory:   64g
    Queue:    timeshare
    qsub -I -q timeshare -l walltime=12:00:00,nodes=1:ppn=8,mem=64g
    qsub: waiting for job 1372816.mgt to start
    qsub: job 1372816.mgt ready
    
    ----------------------------------------
    Begin PBS Prologue Mon Dec  5 12:24:16 EST 2011 1323105856
    Job ID:         1372816.mgt
    Username:       ldong
    Group:          illumina
    Nodes:          node244
    End PBS Prologue Mon Dec  5 12:24:17 EST 2011 1323105857
    ----------------------------------------
    
    # Create a folder to work with
    mkdir scratch/rnnotator-test && cd scratch/rnnotator-test

    # Load the rnnotator module
    module load rnnotator


    # Copy the example Illumina data into the current folder
    cp /gpfs/data/shared/biomed/example_fastq/rna_seq/casava1.8/* .

    # Take a look at the example data. There are three files: read1 and read2 for the paired-end run, and one single-read file.
    ls -l 
    total 509952
    -rw-r----- 1 ldong ccvstaff 260028550 Dec 27 10:42 1M_read1.fastq
    -rw-r----- 1 ldong ccvstaff 260028550 Dec 27 10:42 1M_read2.fastq
    -rw-r----- 1 ldong ccvstaff   2600272 Dec 27 10:42 40K_single_read.fastq

    # For paired-end data, the software expects read 1 and read 2 to be in the same file, with the two reads of each pair following one another. An example is shown below. The quality scores must also use the 64-based (Phred+64) encoding.
    @1044:5:1:1071:20262/1
    GGTCAATCTCACGATTTGATGGAANAGCTCGCCACCGGGGCAGAGTTCGAGGATGATATAGTAGTATTGACGTGCC
    +
    bbbbbbbbbbb_a`bbbbbbbbb^BZY[U[bbbbbbbbbab_bab_a`b]b^_bb`]aa`]XT`Z_\__]^_K_BB
    @1044:5:1:1071:20262/2
    GCAACCAGCGTGCCAACATCCTGAAAGAAGTGCAGATCATGCGCAATCTCGATCACCCCAATATCGTCAAGATGAT
    +
    bbbabbb_bba^^`bbbbb`bbbc_b`\\b^ab_aaaaa`bcacac_c``aa``ac^`a`aZ^^a^BBBBBBBBBB
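
A quick way to check which quality encoding a FASTQ file uses is to look at the lowest character code on its quality lines. In the example record above the lowest quality character is `B` (ASCII 66), which is consistent with the 64-based encoding. The sketch below assumes the standard four-line FASTQ record layout, and the function name `min_qual_ascii` is made up for this guide.

```shell
# min_qual_ascii [FILE]: print the lowest ASCII code found on the quality lines.
# Assumes four-line FASTQ records; reads standard input when no file is given.
min_qual_ascii() {
    awk 'BEGIN { for (i = 33; i <= 126; i++) ord[sprintf("%c", i)] = i; min = 127 }
         NR % 4 == 0 {
             for (j = 1; j <= length($0); j++) {
                 ch = substr($0, j, 1)
                 if ((ch in ord) && ord[ch] < min) min = ord[ch]
             }
         }
         END { print min }' "$@"
}

# Toy record whose lowest quality character is '#' (ASCII 35), so this prints 35.
printf '@toy/1\nACGT\n+\nII#I\n' | min_qual_ascii
```

A minimum well below 64 (for example 35) points to the 33-based encoding, so the file would need converting before it is given to Rnnotator.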

    # By default, the Illumina CASAVA 1.8 software generates two files for a paired-end run, and the quality scores are 33-based (Phred+33). Both need to change, and the read ID fields also need to be reformatted.
    # This command merges the read1 and read2 files into the file "1m-pair" and converts the quality scores to the 64-based encoding.
    merge_fastq_score64_for_rnnotator.pl 1M_read1.fastq 1M_read2.fastq 1m-pair
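
The merge script is specific to this installation, so its internals are not shown here. As a rough sketch of the interleaving step alone (it does not touch quality scores or read IDs), two FASTQ files whose records are in the same order can be combined with awk. The function name `interleave_fastq` and the toy file names are hypothetical.

```shell
# interleave_fastq R1 R2: print record 1 of R1, record 1 of R2, record 2 of R1, ...
# Assumes four-line records and the same read order in both files.
interleave_fastq() {
    awk -v mate="$2" '{
        print
        if (NR % 4 == 0)
            for (i = 0; i < 4; i++)
                if ((getline line < mate) > 0) print line
    }' "$1"
}

# Toy demonstration with one record per file.
tmp=$(mktemp -d)
printf '@toy/1\nACGT\n+\nIIII\n' > "$tmp/r1.fastq"
printf '@toy/2\nTTGA\n+\nIIII\n' > "$tmp/r2.fastq"
interleave_fastq "$tmp/r1.fastq" "$tmp/r2.fastq"
rm -rf "$tmp"
```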

    # Also reformat the single-read file. This command generates the file "40K_single_read.fastq_score64.fastq"
    convert_fastq_score64_for_rnnotator.pl 40K_single_read.fastq
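
Converting 33-based quality strings to 64-based ones amounts to adding 31 to each quality character's ASCII code. The sketch below shows that arithmetic on the quality lines only. It is an illustration, not the source of `convert_fastq_score64_for_rnnotator.pl`, and it does not reproduce that script's read-ID handling. It assumes four-line records and input qualities in the range `!` to `J`; the name `phred33_to_64` is made up.

```shell
# phred33_to_64 [FILE]: shift every quality character up by 31 ASCII positions.
# Characters outside '!'..'J' are passed through unchanged.
phred33_to_64() {
    awk 'BEGIN {
             for (i = 33; i <= 74; i++)
                 plus31[sprintf("%c", i)] = sprintf("%c", i + 31)
         }
         NR % 4 == 0 {
             out = ""
             for (j = 1; j <= length($0); j++) {
                 c = substr($0, j, 1)
                 out = out ((c in plus31) ? plus31[c] : c)
             }
             print out
             next
         }
         { print }' "$@"
}

# Toy record: the quality line "!5IJ" becomes "@Thi".
printf '@toy/1\nACGT\n+\n!5IJ\n' | phred33_to_64
```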

    # Now start Rnnotator with both the non-stranded paired-end library (400 is its insert length) and the non-stranded single-read library as input, using 8 threads and "out-1mpair-and-40ksingle" as the output folder.
    # To see all available options, run rnnotator.pl by itself on the command line.
    rnnotator.pl -nonP 400 1m-pair -nonS 40K_single_read.fastq_score64.fastq -n 8 -o out-1mpair-and-40ksingle

    # The command should finish in about one hour. All output will be in the folder "out-1mpair-and-40ksingle". Now take a look at the output files:
    ls -l out-1mpair-and-40ksingle
    total 1.3G
    -rw-r----- 1 ldong ccvstaff 474M Dec 26 22:26 1m-pair.artifact_filtered.fq
    -rw-r----- 1 ldong ccvstaff 1.7M Dec 26 22:26 1m-pair.artifact_filtered_bad.fq
    -rw-r----- 1 ldong ccvstaff 428M Dec 26 22:30 1m-pair_.dereplicated.fq
    -rw-r----- 1 ldong ccvstaff 2.0K Dec 26 22:30 1m-pair_.derep_bad.fq
    -rw-r----- 1 ldong ccvstaff 2.5M Dec 26 22:30 40K_single_read.fastq_score64.fastq.artifact_filtered.fq
    -rw-r----- 1 ldong ccvstaff 6.8K Dec 26 22:30 40K_single_read.fastq_score64.fastq.artifact_filtered_bad.fq
    -rw-r----- 1 ldong ccvstaff 2.2M Dec 26 22:30 40K_single_read.fastq_score64.fastq_.dereplicated.fq
    -rw-r----- 1 ldong ccvstaff    0 Dec 26 22:30 40K_single_read.fastq_score64.fastq_.derep_bad.fq
    -rw-r----- 1 ldong ccvstaff 1.1K Dec 26 22:30 derep_log.txt
    -rw-r----- 1 ldong ccvstaff  18M Dec 26 23:11 velvet_i1_contigs.fa
    -rw-r----- 1 ldong ccvstaff  18M Dec 26 23:11 velvet_contigs.fa
    -rw-r----- 1 ldong ccvstaff  14M Dec 26 23:12 vmatch_contigs.fa
    -rw-r----- 1 ldong ccvstaff 7.8M Dec 26 23:17 merged_contigs.fa
    -rw-r----- 1 ldong ccvstaff 7.7M Dec 26 23:23 extended_contigs.fa
    -rw-r----- 1 ldong ccvstaff 7.2M Dec 26 23:23 loci.fa
    -rw-r----- 1 ldong ccvstaff 1.7M Dec 26 23:25 loci.fa.pac
    -rw-r----- 1 ldong ccvstaff 777K Dec 26 23:25 loci.fa.ann
    -rw-r----- 1 ldong ccvstaff   16 Dec 26 23:25 loci.fa.amb
    -rw-r----- 1 ldong ccvstaff 1.7M Dec 26 23:25 loci.fa.rpac
    -rw-r----- 1 ldong ccvstaff 2.6M Dec 26 23:25 loci.fa.bwt
    -rw-r----- 1 ldong ccvstaff 2.6M Dec 26 23:25 loci.fa.rbwt
    -rw-r----- 1 ldong ccvstaff 869K Dec 26 23:25 loci.fa.sa
    -rw-r----- 1 ldong ccvstaff 869K Dec 26 23:25 loci.fa.rsa
    -rw-r----- 1 ldong ccvstaff 245M Dec 26 23:27 1m-pair.trim36.all.loci.fa.sam
    -rw-r----- 1 ldong ccvstaff 1.2M Dec 26 23:28 40K_single_read.fastq_score64.fastq.trim36.all.loci.fa.sam
    -rw-r----- 1 ldong ccvstaff 460K Dec 26 23:28 counts.txt
    -rw-r----- 1 ldong ccvstaff 668K Dec 26 23:28 loci.fa.fai
    -rw-r----- 1 ldong ccvstaff 619K Dec 26 23:33 norm_counts.txt
    -rw-r----- 1 ldong ccvstaff 7.4M Dec 26 23:34 labeled.fa
    -rw-r----- 1 ldong ccvstaff 7.4M Dec 26 23:34 final_contigs.fa
    -rw-r----- 1 ldong ccvstaff  15K Dec 26 23:34 assembly.log

    # Let's take a look at the assembly log:
    less out-1mpair-and-40ksingle/assembly.log
    26-Dec-2011 22:18 Rnnotator version: 2.4.12
    26-Dec-2011 22:18 Hostname: smp007
    26-Dec-2011 22:18 Username: ldong
    26-Dec-2011 22:18 ******************************************************************************
    26-Dec-2011 22:18 Assembly Parameters: 
    26-Dec-2011 22:18 Script: /gpfs/runtime/opt/rnnotator/2.4.12/bin/rnnotator.pl
    26-Dec-2011 22:18 Current directory: /gpfs/scratch/ldong/rnnotator-test
    26-Dec-2011 22:18 Output directory: /gpfs/scratch/ldong/rnnotator-test/out-1mpair-and-40ksingle
    26-Dec-2011 22:18 Number of threads: 8
    26-Dec-2011 22:18 Assembler: velvet
    26-Dec-2011 22:18 hash range: auto
    26-Dec-2011 22:18 Scaffolder: none
    26-Dec-2011 22:18 Aligner to use when aligning stranded reads to contigs: bwa
    26-Dec-2011 22:18 Trim length: auto
    26-Dec-2011 22:18 Read filters: low-quality low-complexity adapter duplicates trimming 
    26-Dec-2011 22:18 Coverage cuttoff when splitting contigs: 3
    26-Dec-2011 22:18 Minimum contig length: 100
    26-Dec-2011 22:18 Precursor theshold: 0.1
    26-Dec-2011 22:18 Overlap length for clustering: 200
    26-Dec-2011 22:18 Overlap length for extension: 200
    26-Dec-2011 22:18 ******************************************************************************
    26-Dec-2011 22:18 Rnnotator started
    26-Dec-2011 22:18 Reading fqs from each library...
    26-Dec-2011 22:18 nonP1:
    26-Dec-2011 22:18 ins_len: 400
    26-Dec-2011 22:18 fqs: 1m-pair
    26-Dec-2011 22:18 nonS1:
    26-Dec-2011 22:18 fqs: 40K_single_read.fastq_score64.fastq
    26-Dec-2011 22:18 Number of libraries: 2
    26-Dec-2011 22:18 Preprocessing reads for each library...
    26-Dec-2011 22:18 --------------------- Preprocessing nonP 1 ---------------------
    26-Dec-2011 22:18 Started read pre-processing...
    26-Dec-2011 22:18 Steps to run:
    26-Dec-2011 22:18 Remove low complexity
    26-Dec-2011 22:18 Remove low quality
    26-Dec-2011 22:18 Remove adapter
    26-Dec-2011 22:18 Remove duplicates
    26-Dec-2011 22:18 Trim
    26-Dec-2011 22:18 Estimated number of reads in input: 2000228
    26-Dec-2011 22:18 Checking lengths of input sequences...
    26-Dec-2011 22:18 randomly sampling reads to determine median read length from: 1m-pair
    26-Dec-2011 22:18 median read length for read 1: 100
    26-Dec-2011 22:18 median read length for read 2: 100
    26-Dec-2011 22:18 Determinging Auto trimming lengths...
    26-Dec-2011 22:18 determining trimming length for 1m-pair...
    26-Dec-2011 22:18 Using quality scores to find the best length to trim reads to
    26-Dec-2011 22:18 randomly selecting sequences....
    26-Dec-2011 22:18 separating into read 1 and read 2....
    26-Dec-2011 22:18 finding trim length....
    26-Dec-2011 22:18 the best trimming length for read 1 was 100
    26-Dec-2011 22:18 the best trimming length for read 2 was 100
    26-Dec-2011 22:18 reads will be trimmed to: 100
    26-Dec-2011 22:18 Starting artifact filtering of 1m-pair...
    26-Dec-2011 22:18 splitting input file...
    26-Dec-2011 22:19 running in parallel using 8 cores...
    26-Dec-2011 22:26 combining output...
    26-Dec-2011 22:26 fraction of bad reads 6852/2000228 = 0.34 %
    26-Dec-2011 22:26 fraction of reads with adapter 6/2000228 = 0.00 %
    26-Dec-2011 22:26 fraction of low quality reads 0/2000228 = 0.00 %
    26-Dec-2011 22:26 fraction of low complexity reads 4335/2000228 = 0.22 %
    26-Dec-2011 22:26 Removing duplicate reads...
    26-Dec-2011 22:30 fraction of reads removed which were duplicates 201042/1993604 = 10.08 %
    26-Dec-2011 22:30 Fraction of reads removed from complete set 207666/2000228 = 10.38 %
    26-Dec-2011 22:30 Finished read pre-processing.
    26-Dec-2011 22:30 ----------------------- Finished nonP 1 ------------------------
    26-Dec-2011 22:30 --------------------- Preprocessing nonS 1 ---------------------
    26-Dec-2011 22:30 Started read pre-processing...
    26-Dec-2011 22:30 Steps to run:
    26-Dec-2011 22:30 Remove low complexity
    26-Dec-2011 22:30 Remove low quality
    26-Dec-2011 22:30 Remove adapter
    26-Dec-2011 22:30 Remove duplicates
    26-Dec-2011 22:30 Trim
    26-Dec-2011 22:30 Estimated number of reads in input: 9980
    26-Dec-2011 22:30 Checking lengths of input sequences...
    26-Dec-2011 22:30 randomly sampling reads to determine median read length from: 40K_single_read.fastq_score64.fastq
    26-Dec-2011 22:30 median read length for read 1: 100
    26-Dec-2011 22:30 Starting artifact filtering of 40K_single_read.fastq_score64.fastq...
    26-Dec-2011 22:30 splitting input file...
    26-Dec-2011 22:30 running in parallel using 8 cores...
    26-Dec-2011 22:30 combining output...
    26-Dec-2011 22:30 fraction of bad reads 6/9980 = 0.06 %
    26-Dec-2011 22:30 fraction of reads with adapter 0/9980 = 0.00 %
    26-Dec-2011 22:30 fraction of low quality reads 0/9980 = 0.00 %
    26-Dec-2011 22:30 fraction of low complexity reads 26/9980 = 0.26 %
    26-Dec-2011 22:30 Removing duplicate reads...
    26-Dec-2011 22:30 fraction of reads removed which were duplicates 490/9954 = 4.92 %
    26-Dec-2011 22:30 Fraction of reads removed from complete set 516/9980 = 5.17 %
    26-Dec-2011 22:30 Finished read pre-processing.
    26-Dec-2011 22:30 ----------------------- Finished nonS 1 ------------------------
    26-Dec-2011 22:30 Assembly started...
    26-Dec-2011 22:30 1 assembly(ies) will be performed.
    26-Dec-2011 22:30 Setting velvet parameters...
    26-Dec-2011 22:30 median read length for nonP1: 100
    26-Dec-2011 22:31 median read length for nonS1: 100
    26-Dec-2011 22:31 Running velveth v: 1.1.04
    26-Dec-2011 22:31 Running velvetg v: 1.1.04
    26-Dec-2011 22:31 starting velveth for hash length 95.
    26-Dec-2011 22:31 starting velveth for hash length 89.
    26-Dec-2011 22:31 starting velveth for hash length 83.
    26-Dec-2011 22:31 starting velveth for hash length 77.
    26-Dec-2011 22:31 starting velveth for hash length 71.
    26-Dec-2011 22:31 starting velveth for hash length 65.
    26-Dec-2011 22:31 starting velveth for hash length 59.
    26-Dec-2011 22:31 starting velveth for hash length 53.
    26-Dec-2011 22:40 finished velveth for hash length 95.
    26-Dec-2011 22:40 starting velvetg for hash length 95.
    26-Dec-2011 22:42 finished velveth for hash length 89.
    26-Dec-2011 22:42 starting velvetg for hash length 89.
    26-Dec-2011 22:44 finished velvetg for hash length 95.
    26-Dec-2011 22:45 finished velveth for hash length 83.
    26-Dec-2011 22:45 starting velvetg for hash length 83.
    26-Dec-2011 22:49 finished velvetg for hash length 83.
    26-Dec-2011 22:49 finished velvetg for hash length 89.
    26-Dec-2011 22:55 finished velveth for hash length 77.
    26-Dec-2011 22:55 starting velvetg for hash length 77.
    26-Dec-2011 22:56 finished velveth for hash length 65.
    26-Dec-2011 22:56 starting velvetg for hash length 65.
    26-Dec-2011 22:57 finished velvetg for hash length 77.
    26-Dec-2011 22:59 finished velvetg for hash length 65.
    26-Dec-2011 23:03 finished velveth for hash length 53.
    26-Dec-2011 23:03 starting velvetg for hash length 53.
    26-Dec-2011 23:05 finished velveth for hash length 71.
    26-Dec-2011 23:05 starting velvetg for hash length 71.
    26-Dec-2011 23:06 finished velveth for hash length 59.
    26-Dec-2011 23:06 starting velvetg for hash length 59.
    26-Dec-2011 23:08 finished velvetg for hash length 71.
    26-Dec-2011 23:10 finished velvetg for hash length 53.
    26-Dec-2011 23:10 finished velvetg for hash length 59.
    26-Dec-2011 23:10 Combining velvet contigs into one file.
    26-Dec-2011 23:11 Removing assembly directory...
    26-Dec-2011 23:11 Contig statistics (minimum contig length: 100)
    26-Dec-2011 23:11 file: /gpfs/scratch/ldong/rnnotator-test/out-1mpair-and-40ksingle/velvet_i1_contigs.fa
    26-Dec-2011 23:11 number of sequences: 53921
    26-Dec-2011 23:11 number of total bases: 15760885
    26-Dec-2011 23:11 N50: 307
    26-Dec-2011 23:11 median length: 220
    26-Dec-2011 23:11 shortest contig: 105
    26-Dec-2011 23:11 longest contig: 6225
    26-Dec-2011 23:11 # of long contigs (>= 1000 bp): 1038
    26-Dec-2011 23:11 # of medium contigs (>= 500 bp and < 1000 bp): 4236
    26-Dec-2011 23:11 # of short contigs (>= 100 bp and < 500 bp): 48647
    26-Dec-2011 23:11 Contig statistics (minimum contig length: 100)
    26-Dec-2011 23:11 file: /gpfs/scratch/ldong/rnnotator-test/out-1mpair-and-40ksingle/velvet_contigs.fa
    26-Dec-2011 23:11 number of sequences: 53921
    26-Dec-2011 23:11 number of total bases: 15760885
    26-Dec-2011 23:11 N50: 307
    26-Dec-2011 23:11 median length: 220
    26-Dec-2011 23:11 shortest contig: 105
    26-Dec-2011 23:11 longest contig: 6225
    26-Dec-2011 23:11 # of long contigs (>= 1000 bp): 1038
    26-Dec-2011 23:11 # of medium contigs (>= 500 bp and < 1000 bp): 4236
    26-Dec-2011 23:11 # of short contigs (>= 100 bp and < 500 bp): 48647
    26-Dec-2011 23:11 Running vmatch version: 2.1.6
    26-Dec-2011 23:11 Removing redundant, or overlapping contigs (vmatch 2.1.3 or later).
    26-Dec-2011 23:12 Removing temporary vmatch files...
    26-Dec-2011 23:12 Contig statistics (minimum contig length: 100)
    26-Dec-2011 23:12 file: /gpfs/scratch/ldong/rnnotator-test/out-1mpair-and-40ksingle/vmatch_contigs.fa
    26-Dec-2011 23:12 number of sequences: 42875
    26-Dec-2011 23:12 number of total bases: 12659855
    26-Dec-2011 23:12 N50: 307
    26-Dec-2011 23:12 median length: 221
    26-Dec-2011 23:12 shortest contig: 105
    26-Dec-2011 23:12 longest contig: 6225
    26-Dec-2011 23:12 # of long contigs (>= 1000 bp): 828
    26-Dec-2011 23:12 # of medium contigs (>= 500 bp and < 1000 bp): 3425
    26-Dec-2011 23:12 # of short contigs (>= 100 bp and < 500 bp): 38622
    26-Dec-2011 23:12 Merging contigs using minimum overlap: 40
    26-Dec-2011 23:12 Merging... removing exact duplicates...
    26-Dec-2011 23:12 Merging... running minimus2...
    26-Dec-2011 23:17 Finished merging contigs.
    26-Dec-2011 23:17 Contig statistics (minimum contig length: 100)
    26-Dec-2011 23:17 file: /gpfs/scratch/ldong/rnnotator-test/out-1mpair-and-40ksingle/merged_contigs.fa
    26-Dec-2011 23:17 number of sequences: 22520
    26-Dec-2011 23:17 number of total bases: 7130044
    26-Dec-2011 23:17 N50: 343
    26-Dec-2011 23:17 median length: 205
    26-Dec-2011 23:17 shortest contig: 105
    26-Dec-2011 23:17 longest contig: 10702
    26-Dec-2011 23:17 # of long contigs (>= 1000 bp): 752
    26-Dec-2011 23:17 # of medium contigs (>= 500 bp and < 1000 bp): 1857
    26-Dec-2011 23:17 # of short contigs (>= 100 bp and < 500 bp): 19911
    26-Dec-2011 23:17 Contig extension started
    26-Dec-2011 23:17 Starting contig extension with min overlap: 200 and stranded? 0
    26-Dec-2011 23:17 running vmatch...
    26-Dec-2011 23:17 appending singletons...
    26-Dec-2011 23:17 extending individual clusters...
    26-Dec-2011 23:23 Finished contig extension
    26-Dec-2011 23:23 number of clusters extended 95 clusters...
    26-Dec-2011 23:23 Contig statistics (minimum contig length: 100)
    26-Dec-2011 23:23 file: /gpfs/scratch/ldong/rnnotator-test/out-1mpair-and-40ksingle/extended_contigs.fa
    26-Dec-2011 23:23 number of sequences: 22471
    26-Dec-2011 23:23 number of total bases: 7111135
    26-Dec-2011 23:23 N50: 342
    26-Dec-2011 23:23 median length: 205
    26-Dec-2011 23:23 shortest contig: 105
    26-Dec-2011 23:23 longest contig: 10702
    26-Dec-2011 23:23 # of long contigs (>= 1000 bp): 754
    26-Dec-2011 23:23 # of medium contigs (>= 500 bp and < 1000 bp): 1839
    26-Dec-2011 23:23 # of short contigs (>= 100 bp and < 500 bp): 19878
    26-Dec-2011 23:23 Clustering by locus...
    26-Dec-2011 23:24 Contig statistics (minimum contig length: 100)
    26-Dec-2011 23:24 file: /gpfs/scratch/ldong/rnnotator-test/out-1mpair-and-40ksingle/loci.fa
    26-Dec-2011 23:24 number of sequences: 22471
    26-Dec-2011 23:24 number of total bases: 7111135
    26-Dec-2011 23:24 N50: 342
    26-Dec-2011 23:24 median length: 205
    26-Dec-2011 23:24 shortest contig: 105
    26-Dec-2011 23:24 longest contig: 10702
    26-Dec-2011 23:24 # of long contigs (>= 1000 bp): 754
    26-Dec-2011 23:24 # of medium contigs (>= 500 bp and < 1000 bp): 1839
    26-Dec-2011 23:24 # of short contigs (>= 100 bp and < 500 bp): 19878
    26-Dec-2011 23:24 Counting...
    26-Dec-2011 23:24 Starting counting...
    26-Dec-2011 23:24 Library SE PE_total PE_agree PE_conflict
    26-Dec-2011 23:24 all 5681 1108172 912424 195748
    26-Dec-2011 23:28 Generating RAF from sam...
    26-Dec-2011 23:28 converting SAM->BAM /gpfs/scratch/ldong/rnnotator-test/out-1mpair-and-40ksingle/1m-pair.trim36.all.loci.fa.sam...
    26-Dec-2011 23:28 converting SAM->BAM /gpfs/scratch/ldong/rnnotator-test/out-1mpair-and-40ksingle/40K_single_read.fastq_score64.fastq.trim36.all.loci.fa.sam...
    26-Dec-2011 23:28 converting BAM->RAF /gpfs/scratch/ldong/rnnotator-test/out-1mpair-and-40ksingle/40K_single_read.fastq_score64.fastq.trim36.all.loci.fa.sam.bam.raf...
    26-Dec-2011 23:28 converting BAM->RAF /gpfs/scratch/ldong/rnnotator-test/out-1mpair-and-40ksingle/1m-pair.trim36.all.loci.fa.sam.bam.raf...
    26-Dec-2011 23:31 Determining mappable gene lengths...
    26-Dec-2011 23:33 Normalizing counts by gene length...
    26-Dec-2011 23:33 Normalizing counts by # of reads in the sample...
    26-Dec-2011 23:33 Labeling contigs with expression information and identifying potential precursors...
    26-Dec-2011 23:34 total transcripts: 22471
    26-Dec-2011 23:34 precursor transcripts: 58
    26-Dec-2011 23:34 non-precursor transcripts: 22413
    26-Dec-2011 23:34 Filtering by min contig length (100)...
    26-Dec-2011 23:34 Contig statistics (minimum contig length: 100)
    26-Dec-2011 23:34 file: /gpfs/scratch/ldong/rnnotator-test/out-1mpair-and-40ksingle/final_contigs.fa
    26-Dec-2011 23:34 number of sequences: 22471
    26-Dec-2011 23:34 number of total bases: 7111135
    26-Dec-2011 23:34 N50: 342
    26-Dec-2011 23:34 median length: 205
    26-Dec-2011 23:34 shortest contig: 105
    26-Dec-2011 23:34 longest contig: 10702
    26-Dec-2011 23:34 # of long contigs (>= 1000 bp): 754
    26-Dec-2011 23:34 # of medium contigs (>= 500 bp and < 1000 bp): 1839
    26-Dec-2011 23:34 # of short contigs (>= 100 bp and < 500 bp): 19878
    26-Dec-2011 23:34 Assembly finished!
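
The contig statistics in the log can be cross-checked on any of the FASTA outputs. As a minimal sketch (not Rnnotator's own code), N50 is taken here as the length of the contig at which the running total of lengths, counted from longest to shortest, first reaches half of all assembled bases. The function name `fasta_n50` is hypothetical.

```shell
# fasta_n50 [FILE]: print the N50 of the sequences in a FASTA file.
# Handles sequences wrapped over several lines; reads stdin when no file is given.
fasta_n50() {
    awk '/^>/ { if (len > 0) print len; len = 0; next }
         { len += length($0) }
         END { if (len > 0) print len }' "$@" |
    sort -rn |
    awk '{ lens[NR] = $1; total += $1 }
         END {
             for (i = 1; i <= NR; i++) {
                 run += lens[i]
                 if (run * 2 >= total) { print lens[i]; exit }
             }
         }'
}

# Toy contigs of length 2, 3, 4 and 5 (14 bases in total): 5 + 4 = 9 >= 7, so this prints 4.
printf '>a\nAC\n>b\nACG\n>c\nACGT\n>d\nACGTA\n' | fasta_n50
```

Run against `out-1mpair-and-40ksingle/final_contigs.fa`, the result can be compared with the N50 line reported in `assembly.log`; small differences are possible if Rnnotator defines the statistic differently.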

