Post date: Aug 17, 2017 2:57:44 PM
Using a revised version of bgc (1.04b), which has a simpler scheme for updating hybrid index (assuming alpha and beta = 0) and now uses genotype likelihood format data, I am analyzing Maria's frog hybrid zone data.
The genotype likelihood files came from Maria, and are in /uufs/chpc.utah.edu/common/home/u6000989/projects/frogs/clines/. I wanted to cut the original SNP set down to ancestry informative markers, in this case using a minimum difference of 0.2. So, I first estimated allele frequencies with estpEM (p_parent1.txt and p_parent2.txt), and then generated the list of SNPs to keep (in getLociToKeep.R):
## identify a subset of loci to keep based on differences in parenatl allele frequencies, which were inferred from estpEM
p1<-read.table("p_parent1.txt",header=F)
p2<-read.table("p_parent2.txt",header=F)
dp<-abs(p1[,3]-p2[,3])
sum(dp>0.2)
##[1] 8002
## 8002 snps with parental allele freq dif > 0.2
hidif<-dp>.2
write.table(as.numeric(hidif),"difSnps20pct.txt",row.names=FALSE,col.names=FALSE,quote=FALSE)
I then ran subSetSnps.pl on the three genotype files to generate the sub* files, e.g.,
perl subSetSnps.pl difSnps20pct.txt parent1.gl
generates sub_parent1.gl.
The final SNP set had 8002 SNPs.
I then used bgc to fit the genomic clines, via the wrapper script wrap_qsub_slurm_bgc.pl. I ran 5 chains, here is one example command (note that hi-index samples are initially generated with a 1000 step burnin and 3000 additional steps with no thinning, and then re-sampled):
cd /scratch/general/lustre/bgc/
bgc -a /uufs/chpc.utah.edu/common/home/u6000989/projects/frogs/clines/sub_parent1.gl -b /uufs/chpc.utah.edu/common/home/u6000989/projects/frogs/clines/sub_parent2.gl -h /uufs/chpc.utah.edu/common/home/u6000989/projects/frogs/clines/sub_hybrids.gl -F bgcout_frog4 -O 0 -x 25000 -n 5000 -t 5 -p 1 -q 1 -N 1 -m 0 -d 1 -s 1 -I 1 -T 2
mv ./bgcout_frog4.hdf5 /uufs/chpc.utah.edu/common/home/u6000989/projects/frogs/clines/