Now that K=2, K=3 runs for entropy are done, I have to do downstream stuff to calculate admixture proportions and then plot these values. Here is what I di:
Folder: /uufs/chpc.utah.edu/common/home/u6007910/projects/lyc_dubois/entropy/mcmc
Step 1: Run estpop_entropy to convert the chains for each source K (2,3,4,5) run to get mean, median and CIs for the admixture proportions ("q files")
Usage: /uufs/chpc.utah.edu/common/home/u6000989/bin/estpost_entropy -o q_K2.txt -p q -s 0 -w 0 ento_lyc_hybridsCh1K3.hdf5 ento_lyc_hybridsCh2K3.hdf5 ento_lyc_hybridsCh3K3.hdf5
This should be repeated for each K (3, 4, 5...). This will create the q_K2.txt file to which I manually added the header mean, median, CI_LB, CI_UB. This file will contain the summary of admixture proportions across the 3 chains I ran for entropy. There will be admixture proportions for each population source (K) and these will be repeated for the 835 individuals used in the analysis. Now, I wanted to generate some kind of summary of these admixture proportions so I plotted them in R.
Step 2: Summary of admixture proportions, initial pass of plots in R
Here is the code I used to do this:
#plotting the admixture proportion means for K=2, here the means are repeated twice (2 X 835 individuals = 1670)
q2 <- read.csv("q_K2.txt", header=TRUE)
head(q2) #sanity check
pops <- read.table("temp_pop.txt, header=FALSE) #order of 835 populations
pops2 <- rbind(pops, pops) #create a column of pops repeated twice for each individual in the q file
q2pops <- cbind(pops2, q2)
head(q2pops)
#plot
col.rainbow <- rainbow(23) #get 23 colors for 23 populations in the data
plot(q2pops$mean[1:835], col=col.rainbow[q2pops$V1[1:835]])
dev.copy(pdf, "qplots_k2.pdf")
dev.off()
#repeat for admixture proportion means for K=3, here the means are repeated thrice (3 X 835 individuals = 2505)
q3 <- read.csv("q_K3.txt", header=TRUE)
head(q3) #sanity check
pops <- read.table("temp_pop.txt, header=FALSE) #order of 835 populations
pops3 <- rbind(pops, pops, pops) #create a column of pops repeated thrice for each individual in the q file
q3pops <- cbind(pops3, q3)
head(q3pops)
#plot
col.rainbow <- rainbow(23) #get 23 colors for 23 populations in the data
plot(q3pops$mean[1:835], col=col.rainbow[q3pops$V1[1:835]])
dev.copy(pdf, "qplots_k3.pdf")
dev.off()
#plot admixture barplots for K=2
q2mat<-cbind(q2pops$mean[1:835], q2pops$mean[836:1670])
q2mat_t<-t(q2mat)
#plot
barplot(q2mat_t, col=c("lightgreen","hotpink"), space=0, xlim=c(1,835),border=NA,width=1,aex=F, names.arg=pops$V1, ylab="K=2-admixture proportions",xlab="Individuals")
dev.copy(pdf, "admixpropplot_K2.pdf")
dev.off()