Post date: May 30, 2014 10:34:48 PM
I am using samtools/bcftools to generate a simple estimate of the site allele frequencies in experimental populations (combined survivors from the two plant treatments, separate for SLA and GLA). I first generated a bcf file using on the experimental bugs and only the loci that I called as variable with the full data set (these are in locuslist.txt),
samtools mpileup -C 50 -d 50 -f /labs/evolution/data/lycaeides/melissa_genome/final.assembly.fasta -l locuslist.txt -q 10 -Q 15 -D -g -I /labs/evolution/data/lycaeides/lycaeides_gbs/AssembliesHost/melissaGbs2genome/alnSELEXP-*sorted.bam > varExp.bcf
I then split the bcf file into two bcf files, one for each experimental population (not treatment), note glalist.txt and slalist.txt contain the experimental bugs from GLA and SLA, respectively.
bcftools view -b -s glalist.txt varExp.bcf > glaVar.bcf
bcftools view -b -s slalist.txt varExp.bcf > slaVar.bcf
I then used bcftools to convert these to vcf files with all loci of the focal loci (even if they weren't variable in the sample), with one vcf file per population.
bcftools view -c -p 0.01 -P full -t 0.001 -l locuslist.txt glaVar.bcf > glaVar.vcf
bcftools view -c -p 0.01 -P full -t 0.001 -l locuslist.txt slaVar.bcf > slaVar.vcf
I then used a simple perl one-liner to extract the ML allele frequency estimates from the vcf files
perl -ne 'print "$1\n" if /AF1=([^,;]+)/' slaVar.vcf > slaAfs.txt
perl -ne 'print "$1\n" if /AF1=([^,;]+)/' glaVar.vcf > glaAfs.txt
Next I need to plot the joint frequency spectrum.