Post date: Jun 01, 2016 9:10:37 PM
I am using estpEM to estimate population allele frequencies as a precursor to estimating genotypes and ancestry frequencies. For now, I am working only with the 22 autosomes. Here is what I did.
1. Split nosex_lychybanc_notvryrare.gl by population (results are in /uufs/chpc.utah.edu/common/home/u6000989/projects/lyc_hybanc/variants/bypop/).
perl splitPops.pl ../nosex_lychybanc_notvryrare.gl
2. Run estpEM for all populations using a perl wrapper.
perl runEstpEm.pl lychybAutos_*gl
Here is an example of the command the wrapper runs for one population:
estpEM -i lychybAutos_WAL.gl -o p_lychybAutos_WAL.txt -e 0.001 -m 20 -h 2
I then moved the allele frequency files to /uufs/chpc.utah.edu/common/home/u6000989/projects/lyc_hybanc/popfreqs/
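For reference, here is a minimal Python sketch of the per-population command the wrapper issues. It is not the actual runEstpEm.pl; the function name estpem_command is made up for illustration, and the output naming (p_ prefix, .txt suffix) and the flag values are copied from the WAL example above.

```python
import os


def estpem_command(gl_file):
    """Build the estpEM argument list for one per-population .gl file.

    Hypothetical helper: mirrors the example command in this post,
    not the real perl wrapper.
    """
    # lychybAutos_WAL.gl -> lychybAutos_WAL
    stem = os.path.splitext(os.path.basename(gl_file))[0]
    out_file = "p_" + stem + ".txt"
    # Flags and values are taken verbatim from the example command.
    return ["estpEM", "-i", gl_file, "-o", out_file,
            "-e", "0.001", "-m", "20", "-h", "2"]


if __name__ == "__main__":
    print(" ".join(estpem_command("lychybAutos_WAL.gl")))
```

Each returned list could be passed to subprocess.run, one call per lychybAutos_*gl file.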
3. Genotype point estimates (posterior means) were then inferred from the gl files, using Hardy-Weinberg equilibrium (HWE) priors based on the estimated allele frequencies. I wrote a perl wrapper script around gl2genest.pl for this. Note that this version of gl2genest.pl also does some formatting for popanc.
perl runGenest.pl lychybAutos_*gl
For each population, the wrapper runs a command like this one:
perl gl2genest.pl lychybAutos_SYC.gl p_lychybAutos_SYC.txt
The pntest* files are here: /uufs/chpc.utah.edu/common/home/u6000989/projects/lyc_hybanc/popfreqs.
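To make the calculation in step 3 concrete, here is a minimal Python sketch of a posterior mean genotype under an HWE prior. It is not the code in gl2genest.pl. It assumes (this is not stated above) that the three genotype likelihoods per individual and locus are phred-scaled and ordered by the count (0, 1, 2) of the allele whose frequency is p. The function name is made up for illustration.

```python
def genotype_posterior_mean(phred_gl, p):
    """Posterior mean genotype (0 to 2) for one individual at one locus.

    phred_gl: three phred-scaled genotype likelihoods for genotypes
              carrying 0, 1 and 2 copies of the allele (assumed format).
    p:        population frequency of that allele.
    """
    # Convert phred-scaled values back to relative likelihoods.
    lik = [10 ** (-x / 10.0) for x in phred_gl]
    # HWE prior probabilities for 0, 1 and 2 copies of the allele.
    prior = [(1 - p) ** 2, 2 * p * (1 - p), p ** 2]
    # Unnormalized posterior weights: likelihood times prior.
    weights = [l * q for l, q in zip(lik, prior)]
    total = sum(weights)
    # Posterior mean = sum over genotypes g of g * P(g | data).
    return (weights[1] + 2 * weights[2]) / total


if __name__ == "__main__":
    # Uninformative likelihoods: the estimate falls back to the prior mean, 2p.
    print(genotype_posterior_mean([0, 0, 0], 0.5))
    # Likelihoods strongly favoring genotype 0 pull the estimate toward 0.
    print(genotype_posterior_mean([0, 30, 60], 0.5))
```

With uninformative likelihoods the estimate equals 2p, which is why low-coverage individuals get genotype estimates near the population mean.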