Post date: Dec 03, 2013 3:31:46 PM
The entropy MCMC analyses finished with similar DIC across chains (with the slight exception of k = 5). I estimated genotypes by averaging the posterior distribution across the three chains for each k from 2 to 5. The mean genotype, mngen_k2to5.txt (as well as genotypes and admixture proportions for each k), is in projects/lycaeides_hostplant/entropy/results/. I then calculated genetic covariances among individuals and used these as the basis for PCA. The first PC separated melissa in the far west (western Nevada and California) from those in most of the Great Basin and Rockies, whereas the second PC delineated the southern melissa populations (BHP and TPT). Alfalfa feeding populations do not cluster together. One other cool thing, it looks like a single SLA individual (astragalus feeding) clusters more with eastern populations, perhaps indicative of recent migration. The PC figure is in entropy/figs/pcplotPlnts.pdf, and the R workspace is in entropy/results/entpca.rwsp. Here is the R code I used,
## read data or load work space
nind<-525
nloc<-14051
g<-matrix(scan("mngen_k2to5.txt",n=nind*nloc,sep=" "),nrow=nloc,ncol=nind,byrow=T)
## calculate N x N genotype covariance matrix
gmn<-apply(g,1,mean)
gmnmat<-matrix(gmn,nrow=nloc,ncol=nind)
gprime<-g-gmnmat
gcovarmat<-matrix(NA,nrow=nind,ncol=nind)
for(i in 1:nind){
for(j in i:nind){
if (i==j){
gcovarmat[i,j]<-cov(gprime[,i],gprime[,j])
}
else{
gcovarmat[i,j]<-cov(gprime[,i],gprime[,j])
gcovarmat[j,i]<-gcovarmat[i,j]
}
}
}
## ## pca on the genotype covariance matrix
pcgcov<-prcomp(x=gcovarmat,center=TRUE,scale=FALSE)
ids<-read.table("../data/melIds.txt",header=F)
ids<-as.character(ids[,1])
ids[which(ids=='SV0')]<-'SVY'
popmns1<-tapply(X=pcgcov$x[,1],INDEX=as.character(ids[,1]),mean)
popmns2<-tapply(X=pcgcov$x[,2],INDEX=as.character(ids[,1]),mean)
plnt<-read.table("../data/popplant.txt",header=FALSE)
out<-summary(pcgcov)
pc1<-round(100*out[[6]][2,1],1)
pc2<-round(100*out[[6]][2,2],1)
pdf("../figs/pcplot.pdf",width=8,heigh=8)
par(pty='s')
par(mar=c(5,5,0.5,0.5))
plot(pcgcov$x[,1],pcgcov$x[,2],col="gray",xlab=paste("PC1 (",pc1,"%)"),ylab=paste("PC2 (",pc2,"%)"),cex.lab=1.3,cex.axis=1.6,type='n')
for(j in 1:dim(plnt)[1]){
X<-which(ids==as.character(plnt[j,1]))
if (as.character(plnt[j,2])=='M'){
myc<-"lightblue"
}
else{
myc<-"orange"
}
points(pcgcov$x[X,1],pcgcov$x[X,2],col=myc)
}
text(popmns1,popmns2,unique(ids))
dev.off()
pdf("../figs/pcplotPlnts.pdf",width=8,heigh=8)
par(pty='s')
par(mar=c(5,5,0.5,0.5))
plot(pcgcov$x[,1],pcgcov$x[,2],col="gray",xlab=paste("PC1 (",pc1,"%)"),ylab=paste("PC2 (",pc2,"%)"),cex.lab=1.3,cex.axis=1.6,type='n')
for(j in 1:dim(plnt)[1]){
X<-which(ids==as.character(plnt[j,1]))
if (is.na(plnt[j,3])==TRUE){
myc<-"gray"
}
else{
if (as.character(plnt[j,3])=='M'){
myc<-"lightblue"
}
else if (as.character(plnt[j,3])=='A'){
myc<-"orange"
}
else if (as.character(plnt[j,3])=='L'){
myc<-"pink"
}
else if (as.character(plnt[j,3])=='G'){
myc<-"forestgreen"
}
}
points(pcgcov$x[X,1],pcgcov$x[X,2],col=myc)
}
text(popmns1,popmns2,unique(ids))
legend(-0.5,-0.25,legend=c("medicago","astragalus","lupinus","glycyrrhiza","native sp."),fill=c("lightblue","orange","pink","forestgreen","gray"),bty='n')
dev.off()