alignments

Here are the steps I followed to redo the alignments using bwa mem and then finally doing variant calling.

STEP 1 bwa alignment

In this step we are aligning the fastqfiles to the fasta genome and the end result is sam files. I used the bwa mem command to do align these files to the dovetail whole genome of timema. To do this I used the perl wrapper script wrap_qsub_slurm_bwa_mem.pl saved in the same folder.

I did these alignments for the fastq files I downloaded from the cluster.

The fastq files are here : /uufs/chpc.utah.edu/common/home/u6007910/projects/timema_adaptation/alignments/fastqfiles

The samfiles are here : /uufs/chpc.utah.edu/common/home/u6007910/projects/timema_adaptation/alignments/samsaifiles

The whole genome alignment is here :

/uufs/chpc.utah.edu/common/home/u6000989/data/timema/tcrDovetail/version3/re_mod_map_timema_06Jun2016_RvNkF702.fasta

I modified Zach's wrapper perl script to run the bwa alignments (the mains cript uses bwa aln to do alignments).

This script is saved as: /uufs/chpc.utah.edu/common/home/u6007910/projects/timema_adaptation/scripts/wrap_qsub_slurm_bwa_mem.pl.

Usage: perl wrap_qsub_slurm_bwa_mem.pl *.fastq

This script will create sam files from the fastq files. These samfiles will be used for conversion to bam files and then variant calling.

Here are commands and parameters I used for alignment:

bwa mem -t 12 -k 15 -r 1.3 -T 30 $genome $file > mem"."$file".".sam \n

OPTIONS:

-t = Number of threads [1]

-k = Minimum seed length. Matches shorter than INT will be missed. The alignment speed is usually insensitive to this value unless it significantly deviates 20. [19]

-r = Trigger re-seeding for a MEM longer than minSeedLen*FLOAT. This is a key heuristic parameter for tuning the performance. Larger value yields fewer seeds, which leads to faster alignment speed but lower accuracy. [1.5]

-T = Don’t output alignment with score lower than INT. This option only affects output. [30]

$ genome = genome file

$file = fastq input file

> = output file.sam

STEP 2 index and sort

In this step I am using the samfiles generated in previous steps and converting them to bam format (basically a simpler format).

I have samfiles for each species saved in the folder: /uufs/chpc.utah.edu/common/home/u6007910/projects/timema_adaptation/alignments/samsaifiles/mem/. I used samtools to do this and here is the samtools command I used:

samtools view -b -S -o $out"."bam $file

samtools sort $out"."bam -o $out"."sorted.bam

samtools index $out"."sorted.bam

To do this I used the perl wrapper script wrap_qsub_slurm_sam2bam.pl saved in the same folder.

Page updated

Google Sites

Report abuse