Post date: Jan 04, 2014 11:14:14 PM
The structural variant reads are in /home/A01963476/data/timema/timema_sv. I will use bwa mem (version 0.7.5a-r405) to align them to the Timema draft genome v0.2 (the first version with linkage groups). bwa mem should infer the orientation of the read pairs on its own; if the alignments fail, orientation is one possible cause. The job numbers are 75490-75509. Here is the command I am using for one library, followed by the default values from the bwa mem usage text for comparison:
cd /home/A01963476/data/timema/timema_sv/
bwa mem -t 8 -k 20 -r 1.3 -U 20 -T 30 -R '@RG\tID:timema-R23A-ART005' lg_timemaGenome.fasta R23A-ART005_1.fastq R23A-ART005_2.fastq > R23A-ART005.sam
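The same command can be generated for each library in the directory. The sketch below is a dry run only: it prints the commands without running bwa. It assumes every library follows the <ID>_1.fastq / <ID>_2.fastq naming of R23A-ART005, which may not hold for all files; the helper name bwa_cmd is made up for illustration. The read group is passed with -R and a \t separator, as in the usage text's example.

```shell
# Hypothetical helper: print the bwa mem command for one library ID.
# Assumes the <ID>_1.fastq / <ID>_2.fastq naming used for R23A-ART005.
bwa_cmd() {
    lib="$1"
    # "\\t" inside double quotes prints a literal \t, which bwa expands to a tab.
    printf '%s\n' "bwa mem -t 8 -k 20 -r 1.3 -U 20 -T 30 -R '@RG\\tID:timema-${lib}' lg_timemaGenome.fasta ${lib}_1.fastq ${lib}_2.fastq > ${lib}.sam"
}

# Dry run: print one command per read-1 file in the current directory.
for f in *_1.fastq; do
    [ -e "$f" ] || continue    # skip when the glob matches nothing
    bwa_cmd "${f%_1.fastq}"    # strip the _1.fastq suffix to get the library ID
done
```

Printing first makes it easy to check the file names and read group IDs before anything is submitted to the queue.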
## defaults from the usage instructions
Usage: bwa mem [options] <idxbase> <in1.fq> [in2.fq]
Algorithm options:
-t INT number of threads [1]
-k INT minimum seed length [19]
-w INT band width for banded alignment [100]
-d INT off-diagonal X-dropoff [100]
-r FLOAT look for internal seeds inside a seed longer than {-k} * FLOAT [1.5]
-c INT skip seeds with more than INT occurrences [10000]
-S skip mate rescue
-P skip pairing; mate rescue performed unless -S also in use
-A INT score for a sequence match [1]
-B INT penalty for a mismatch [4]
-O INT gap open penalty [6]
-E INT gap extension penalty; a gap of size k cost {-O} + {-E}*k [1]
-L INT penalty for clipping [5]
-U INT penalty for an unpaired read pair [17]
Input/output options:
-p first query file consists of interleaved paired-end sequences
-R STR read group header line such as '@RG\tID:foo\tSM:bar' [null]
-v INT verbose level: 1=error, 2=warning, 3=message, 4+=debugging [3]
-T INT minimum score to output [30]
-a output all alignments for SE or unpaired PE
-C append FASTA/FASTQ comment to SAM output
-M mark shorter split hits as secondary (for Picard/GATK compatibility)