Post date: Aug 20, 2014 7:6:14 PM
I am working with one of Patrik's post-docs (Moe) on a project to look at genome-wide divergence between Timema populations and species that are expected to differ in their degree of divergence. As part of this project Patrik had genomes sequenced for 384 individuals (each run over four lanes). The sequence data are in /home/A01963476/data/timema/moe_radiation_wgrs, and there are two *fastq.gz files per individual x lane because these are paired-end reads. The directory also includes an indIds.txt file that provides the sample name for each file. I am using bwa mem to align the reads to version 0.3 of the Timema genome. This might be a bit tricky, as some taxa are more closely related to T. cristinae than others. I am using a local version of the wrap_qsub_rc_runbwa.pl script to run all of the alignments. Here is an example of the bwa command for a single set of sequences.
bwa mem -t 20 -k 20 -w 100 -r 1.3 -T 30 -R '@RG\tID:WTCHG_135840_296\tPL:ILLUMINA\tLB:WTCHG_135840_296\tSM:moe09C2' /home/A01963476/data/timema/draft_genome/draft0.3/mod_lg_timemaGenome.fasta /home/A01963476/data/timema/moe_radiation_wgrs/WTCHG_135840_296_1.fastq.gz /home/A01963476/data/timema/moe_radiation_wgrs/WTCHG_135840_296_2.fastq.gz > /home/A01963476/data/timema/moe_radiation_wgrs/alignments/aln_WTCHG_135840_296.sam 2> /home/A01963476/data/timema/moe_radiation_wgrs/alignments/error_WTCHG_135840_296.log
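For reference, here is a rough shell sketch of how the same command could be generated for every read pair. This is not the wrap_qsub_rc_runbwa.pl script itself, just a dry run that prints the commands without running bwa. The function names build_bwa_cmd and emit_all are made up for illustration, and the sketch assumes that indIds.txt has two whitespace-separated columns (file prefix, then sample name), which may not match the real file.

```shell
#!/bin/sh
# Dry-run sketch: print one bwa mem command per read pair; nothing is executed.

# Paths as given in the post.
GENOME=/home/A01963476/data/timema/draft_genome/draft0.3/mod_lg_timemaGenome.fasta
READS=/home/A01963476/data/timema/moe_radiation_wgrs

# build_bwa_cmd PREFIX SAMPLE  (hypothetical helper)
# Prints the bwa mem command for one pair of fastq.gz files.
build_bwa_cmd() {
    id=$1
    sample=$2
    # Keep \t as a literal backslash-t; bwa converts it to a tab in the @RG header.
    rg='@RG\tID:'"$id"'\tPL:ILLUMINA\tLB:'"$id"'\tSM:'"$sample"
    printf "bwa mem -t 20 -k 20 -w 100 -r 1.3 -T 30 -R '%s' %s %s %s > %s 2> %s\n" \
        "$rg" "$GENOME" \
        "$READS/${id}_1.fastq.gz" "$READS/${id}_2.fastq.gz" \
        "$READS/alignments/aln_${id}.sam" "$READS/alignments/error_${id}.log"
}

# emit_all  (hypothetical helper)
# Reads "PREFIX SAMPLE" lines from stdin (ASSUMED layout of indIds.txt)
# and prints one command per line.
emit_all() {
    while read -r id sample; do
        build_bwa_cmd "$id" "$sample"
    done
}

# Example: the single pair shown above.
build_bwa_cmd WTCHG_135840_296 moe09C2
```

If the assumed layout holds, something like emit_all < "$READS/indIds.txt" > bwa_commands.txt would write the full list of commands to a file, which could then be checked by eye before anything is submitted to the queue.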