Post date: Aug 20, 2014 3:43:29 PM
I wrote a script to extract the genotype likelihoods from the filtered variant (VCF) files and write them to a genotype likelihood file. The script is /home/A01963476/projects/timema_wgexperiment/variants/subvcf2gl.pl. As I ran it this time, the script also discards any variant with a minor allele frequency < 1%. The output file with the genotype likelihoods (three per individual per locus) is timema500g.gl, in the same directory, and a second file (af_timema500g.txt) holds the alternative allele frequencies. One more note: there are 491 individuals in these files, so it appears that a few individuals did not make it through the alignment and filtering steps. I don't know why, but I am not going to worry about this for now.
We ended up retaining 8,519,957 SNVs for the 491 individuals.
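For reference, here is a minimal Python sketch of the kind of extraction described above. It is not the code in subvcf2gl.pl. It assumes biallelic variants, that the alternative allele frequency is in the INFO column as AF=, and that each sample carries a Phred-scaled PL field with three values. The function names and the one-row-per-locus output layout are hypothetical, and missing genotypes are written as 0 0 0 as an assumption.

```python
import re

# Matches the AF entry in the VCF INFO column (assumed to be present).
AF_PATTERN = re.compile(r"(?:^|;)AF=([0-9.eE+-]+)")


def parse_vcf_line(line, min_maf=0.01):
    """Return (locus_id, alt_af, likelihoods) or None if the variant is dropped.

    likelihoods is a list with one [PL_ref/ref, PL_ref/alt, PL_alt/alt]
    triple per individual.
    """
    fields = line.rstrip("\n").split("\t")
    chrom, pos, info, fmt = fields[0], fields[1], fields[7], fields[8]
    samples = fields[9:]

    match = AF_PATTERN.search(info)
    if match is None:
        return None
    alt_af = float(match.group(1))

    # Minor allele frequency filter: drop variants with MAF below the cutoff.
    if min(alt_af, 1.0 - alt_af) < min_maf:
        return None

    keys = fmt.split(":")
    if "PL" not in keys:
        return None
    pl_index = keys.index("PL")

    likelihoods = []
    for sample in samples:
        parts = sample.split(":")
        pl = parts[pl_index] if pl_index < len(parts) else "."
        values = pl.split(",")
        # Missing or malformed PL: treat all three genotypes as equally likely.
        if len(values) != 3 or "." in values:
            values = ["0", "0", "0"]
        likelihoods.append([int(v) for v in values])

    return f"{chrom}:{pos}", alt_af, likelihoods


def format_gl_row(locus_id, likelihoods):
    """Format one locus as: id followed by three values per individual."""
    flat = [str(v) for triple in likelihoods for v in triple]
    return locus_id + " " + " ".join(flat)


# Made-up example record with three individuals, the last one missing.
example = "scaf1\t100\t.\tA\tG\t50\tPASS\tAF=0.25\tGT:PL\t0/0:0,30,255\t0/1:40,0,60\t./.:."
parsed = parse_vcf_line(example)
if parsed is not None:
    locus_id, alt_af, likelihoods = parsed
    print(format_gl_row(locus_id, likelihoods))
```

Header lines (those starting with #) would need to be skipped before calling parse_vcf_line, and the alternative allele frequencies returned for retained loci could be written to a separate file in the same pass.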