Post date: May 01, 2017 3:34:32 AM
30iv17. ZG and I assembled all individuals from this lane to the genome. We will need to run more assemblies: all individuals that have been GBSed but have not yet been aligned to the genome (except the recently sequenced individuals for the Career grant from our long-term project).
----
The parsed reads from the lane specific to the GWA analysis are in /uufs/chpc.utah.edu/common/home/u6000989/data/lycaeides/lycaeides_gbs/Parsed_Gwas/. This includes 255 individuals. Counts (sequences per individual) are in readCounts.txt. These generally look good (I haven't computed summary statistics on them yet), except GNP08-10M.fastq (480 reads).
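A quick way to screen the counts for other low-coverage individuals might look like the sketch below. It assumes each line of readCounts.txt is a fastq name followed by a read count, separated by whitespace (the real layout may differ); the function name flag_low_counts and the 1000-read cutoff are placeholders, not values we have settled on.

```shell
# List individuals whose read count falls below a cutoff.
# Assumption: each line of the counts file is "<fastq name> <read count>".
# flag_low_counts and the default cutoff of 1000 are illustrative only.
flag_low_counts() {
    counts_file=$1
    min_reads=${2:-1000}
    # Keep lines with a numeric second column smaller than the cutoff.
    awk -v min="$min_reads" \
        'NF >= 2 && $2 ~ /^[0-9]+$/ && $2 + 0 < min { print $1 "\t" $2 }' \
        "$counts_file"
}

# Example: flag_low_counts readCounts.txt 1000
```

With the layout assumed here, GNP08-10M.fastq (480 reads) would be reported by this filter.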
We are using bwa (version 0.7.10-r789) to align the data for the 255 individuals to the L. melissa reference genome. This involves two bwa commands: bwa aln, which finds the alignment coordinates for each read, and bwa samse, which writes the single-end alignments in sam format. These are the same commands we have used in the past for GBS data with Lycaeides.
The alignments will be in /uufs/chpc.utah.edu/common/home/u6000989/data/lycaeides/lycaeides_gbs/AssembliesGwa/.
The submission script is:
/uufs/chpc.utah.edu/common/home/u6000989/data/lycaeides/lycaeides_gbs/Scripts/wrap_qsub_slurm_bwa.pl
I executed this from
/uufs/chpc.utah.edu/common/home/u6000989/data/lycaeides/lycaeides_gbs/AssembliesGwa/:
perl ../Scripts/wrap_qsub_slurm_bwa.pl
It runs over the 255 files in gwabams (a list of the relevant fastq files).
Here is the set of commands for one file/individual:
cd /uufs/chpc.utah.edu/common/home/u6000989/data/lycaeides/lycaeides_gbs/AssembliesGwa
bwa aln -n 4 -l 20 -k 2 -t 8 -q 10 -f alnYG12-85M.sai /uufs/chpc.utah.edu/common/home/u6000989/data/lycaeides/melissa_genome/final.assembly.fasta YG12-85M.fastq
bwa samse -n 1 -r '@RG\tID:lyc-YG12-85M\tPL:ILLUMINA\tLB:lyc-YG12-85M\tSM:lyc-YG12-85M' -f alnYG12-85M.sam /uufs/chpc.utah.edu/common/home/u6000989/data/lycaeides/melissa_genome/final.assembly.fasta alnYG12-85M.sai YG12-85M.fastq
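For reference, the per-individual pattern above can be generated for a whole list of fastq files with something like the following sketch. This is not the contents of wrap_qsub_slurm_bwa.pl (which also handles job submission); it only prints the two bwa commands per file so they can be checked by eye. It assumes the list has one fastq name per line, and print_bwa_cmds is a hypothetical helper name.

```shell
# Print (not run) the per-individual bwa commands for every fastq in a list.
# Assumptions: the list file has one fastq name per line; print_bwa_cmds is a
# hypothetical helper and is not taken from wrap_qsub_slurm_bwa.pl.
GENOME=/uufs/chpc.utah.edu/common/home/u6000989/data/lycaeides/melissa_genome/final.assembly.fasta

print_bwa_cmds() {
    while IFS= read -r fq; do
        [ -n "$fq" ] || continue                  # skip blank lines
        id=$(basename "$fq" .fastq)               # e.g. YG12-85M
        # Read-group string; "\\t" keeps a literal \t for bwa to interpret.
        rg="@RG\\tID:lyc-${id}\\tPL:ILLUMINA\\tLB:lyc-${id}\\tSM:lyc-${id}"
        printf 'bwa aln -n 4 -l 20 -k 2 -t 8 -q 10 -f aln%s.sai %s %s\n' \
            "$id" "$GENOME" "$fq"
        printf "bwa samse -n 1 -r '%s' -f aln%s.sam %s aln%s.sai %s\n" \
            "$rg" "$id" "$GENOME" "$id" "$fq"
    done < "$1"
}

# Example: print_bwa_cmds gwabams
```

Printing rather than executing keeps the sketch safe to try on a login node; the output could be redirected into a job script if it looks right.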
The bwa aln and bwa samse settings shown for YG12-85M match what we have used for the other recent alignments (everything in the current Assemblies* directories).
Running these commands generates the sam files. Next, we will need to compress them (convert to bam) and run variant calling once we have all of the alignment files (individuals) we want.
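The sam-to-bam step could be sketched as below. This assumes samtools 1.3 or later (older releases use a different sort syntax), and the function name sam_to_bam, the RUN switch, and the .sorted.bam naming are placeholders rather than what we will necessarily use. By default it only prints the commands.

```shell
# Convert sam files to sorted, indexed bam files.
# Assumptions: samtools >= 1.3 syntax; sam_to_bam, RUN, and the output names
# are illustrative. RUN defaults to "echo", so commands are only printed;
# set RUN to an empty string (RUN= ) before sourcing to execute them.
RUN=${RUN-echo}

sam_to_bam() {
    for sam in "$@"; do
        base=${sam%.sam}                                      # alnYG12-85M.sam -> alnYG12-85M
        $RUN samtools view -b -o "$base.bam" "$sam"           # compress sam to bam
        $RUN samtools sort -o "$base.sorted.bam" "$base.bam"  # coordinate sort
        $RUN samtools index "$base.sorted.bam"                # write the .bai index
    done
}

# Example: sam_to_bam aln*.sam
```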
Here are the numbers of individuals aligned at present:
20 BHP
20 BIG
44 BKM
24 CAV
40 CLH
18 CPE
8 CRP
50 CSP
20 DCR
18 DOP
40 EGP
20 FCR
20 GLA
20 GNP
18 GVL
18 KHL
33 LAE
20 LCA
20 LKS
12 MBM
52 MTR
19 OCY
9 REF
20 REW
16 SCC
20 SDC
21 SHC
18 SLA
20 SMR
14 SOF
44 SOP
20 STB
13 STM
20 SUV
4 SWC
23 SWM
20 SYC
38 TIC
20 VCP
8 WAL
20 YBG