data for Maria's frog hybrid zone

Post date: Apr 21, 2017 2:31:11 PM

I am running some analyses for Maria on a hybrid zone involving two frogs from Central America. The project is here: /uufs/chpc.utah.edu/common/home/u6000989/projects/frogs/

1. She sent me the variants.gl file in the variants sub-directory. I used splitPops.pl to split this into 15 population files.

2. I ran estpEM via runEstp.pl from the variants sub-directory to obtain ML allele frequency estimates for each population and for the full data set. I used the default options with a tolerance of 0.001 and a maximum of 20 iterations. The estimates for all 103,825 SNPs are in the freqs sub-directory, in files whose names start with p_*.
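For reference, the kind of EM update behind ML allele frequency estimates from genotype likelihoods can be sketched in R as below. This is a generic illustration, not estpEM's code: the function name em_allele_freq is hypothetical, it assumes Hardy-Weinberg genotype proportions, and it assumes the likelihoods are already on the natural scale (not log or phred-scaled). The tol and max_iter defaults simply mirror the settings noted above.

```r
# Hypothetical helper (not part of estpEM): EM estimate of the allele
# frequency at a single SNP from genotype likelihoods.
# gl: n x 3 matrix, one row per individual, columns are the likelihoods of
#     carrying 0, 1, or 2 copies of the focal allele (natural scale, assumed).
em_allele_freq <- function(gl, tol = 0.001, max_iter = 20, p_init = 0.5) {
  p <- p_init
  for (iter in seq_len(max_iter)) {
    # Hardy-Weinberg genotype prior given the current estimate (assumption)
    prior <- c((1 - p)^2, 2 * p * (1 - p), p^2)
    # E-step: posterior genotype probabilities for each individual
    post <- sweep(gl, 2, prior, `*`)
    post <- post / rowSums(post)
    # M-step: expected number of focal alleles divided by 2n
    p_new <- sum(post %*% c(0, 1, 2)) / (2 * nrow(gl))
    converged <- abs(p_new - p) < tol
    p <- p_new
    if (converged) break
  }
  p
}

# Toy data: four individuals with known genotypes 0, 1, 2, 2
gl_toy <- rbind(c(1, 0, 0), c(0, 1, 0), c(0, 0, 1), c(0, 0, 1))
p_toy <- em_allele_freq(gl_toy)
```

With fully informative likelihoods, as in the toy data, the estimate reduces to the simple allele count (5 of 8 alleles here).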

3. I used the orderFreqs.R script to generate a single combined allele frequency file, frog_pop_frequencies.txt. There is one row per SNP, in their original order, and one column per population (ordered by the Geo column, see sortPops.txt). I gave this file to Maria.

4. I generated a set of common variants for entropy by running getCommon.pl on variants.gl which yielded common_variants.gl. This includes 39,018 common (MAF > 5% based on all samples) SNPs.
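The MAF filter itself is simple to express in R. The sketch below is only an illustration of the criterion (MAF > 5%), not what getCommon.pl does internally; common_snps is a hypothetical name, and p is assumed to be a vector of allele frequency estimates from all samples.

```r
# Hypothetical helper: indices of SNPs whose minor allele frequency
# exceeds maf_min, given a vector p of allele frequency estimates.
common_snps <- function(p, maf_min = 0.05) {
  maf <- pmin(p, 1 - p)  # frequency of the rarer allele
  which(maf > maf_min)
}

# Toy example with five SNPs
idx <- common_snps(c(0.01, 0.30, 0.97, 0.50, 0.94))
```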

5. I then sub-set this to retain a random SNP per contig (i.e., "unlinked" SNPs). This was done by first selecting a random subset of SNPs in R:

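x<-as.matrix(read.table("commonSnps.txt",header=F))
## contig IDs are in the first column
uni<-unique(x[,1])
nc<-length(uni)
## one retained SNP (row index) per contig
keep<-rep(NA,nc)
for(i in 1:nc){
    a<-which(x[,1]==uni[i])
    if(length(a) > 1){
        ## sample() on a single integer would draw from 1:a, hence the check
        keep[i]<-sample(a,1)
    }
    else{
        keep[i]<-a
    }
}
write.table(keep,"commonSubToKeep.txt",row.names=F,col.names=F,quote=F)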

And then by running getKeep.pl on common_variants.gl to generate sub_common_variants.gl, which contains 15,444 SNPs. This is the file I will use for entropy.
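A quick sanity check on the sub-setting, sketched in R: every contig should be represented exactly once among the retained rows. The function name one_per_contig is hypothetical; contig stands for the first column of commonSnps.txt (x[,1] above) and keep for the retained row indices.

```r
# Hypothetical check: TRUE if the kept row indices include exactly one
# SNP from every contig.
one_per_contig <- function(contig, keep) {
  kept <- contig[keep]
  anyDuplicated(kept) == 0 && setequal(kept, unique(contig))
}

# Toy example: six SNPs on three contigs
contig_toy <- c("c1", "c1", "c2", "c3", "c3", "c3")
ok <- one_per_contig(contig_toy, c(2, 3, 5))   # one SNP per contig
bad <- one_per_contig(contig_toy, c(1, 2, 3))  # two SNPs from c1, none from c3
```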
