sparse factor analysis-3ix13

Post date: Sep 03, 2013 8:45:55 PM

Engelhardt and Stephens (2010; doi:10.1371/journal.pgen.1001117) propose sparse factor analysis as an alternative to admixture proportions or PCA for summarizing population structure. It is another way to decompose and summarize a genotype matrix (here they work with the genotype matrix itself, not the genotype covariance matrix). They provide software for this analysis (sfa_linux, in /usr/local/bin/). There are a few options I don't really understand, so I simply ran the program with k = 3 factors on the common genotype data, as follows:

sfa_linux -gen gmat.txt -g 1521 -n 15076 -k 3 -iter 50 -rand 433 -o out
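Before a run like this, it may be worth confirming that the genotype matrix has the shape passed via -g and -n. The following is only a sketch: check_dims is a hypothetical helper (not part of sfa), and the toy matrix stands in for gmat.txt, which I assume is whitespace-delimited with no header and with -g counting rows and -n counting columns.

```r
# Hypothetical helper (not part of sfa): does a matrix match the -g/-n values?
check_dims <- function(mat, g, n) {
  stopifnot(is.matrix(mat) || is.data.frame(mat))
  ok <- nrow(mat) == g && ncol(mat) == n
  if (!ok) {
    warning(sprintf("expected %d x %d, found %d x %d",
                    g, n, nrow(mat), ncol(mat)))
  }
  ok
}

# Toy stand-in for gmat.txt (genotypes coded 0/1/2); the real file is much larger
set.seed(433)
toy <- matrix(rbinom(5 * 8, size = 2, prob = 0.3), nrow = 5, ncol = 8)
dims_ok <- check_dims(toy, g = 5, n = 8)
```

With the real data, and under the same assumptions about the file layout, the call would be something like check_dims(read.table("gmat.txt"), 1521, 15076).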

I then used R to plot what I think is the loading matrix (out_lambda.out), as follows:

## sfa
nloci<-15076
nind<-1521
## read the sfa output (one row per individual, one column per factor)
sfa<-read.table("out_lambda.out",header=FALSE)

## plot sfa, common genotype data
a<-2
b<-1
mycolors<-c("orange","orangered","forestgreen","rosybrown","gold","darkblue","lightblue","salmon","black","gray","brown","violet","darkred")
## legend file: column 2 = group index (1-13), column 3 = text label
leginfo<-read.table("../admixprops/results/legend.txt",header=FALSE)
pdf("sfaplot.pdf",width=16,height=8)
par(mfrow=c(1,2))

## left panel: factor 2 vs. factor 1, labels colored by group
plot(sfa[,a],sfa[,b],pch=20,cex=0.5,type='n',xlab="factor 2",ylab="factor 1",cex.lab=1.3)
for(i in 1:13){
    A<-which(leginfo[,2]==i)
    text(sfa[A,a],sfa[A,b],leginfo[A,3],cex=0.6,col=mycolors[i])
}

## right panel: factor 3 vs. factor 1
a<-3
plot(sfa[,a],sfa[,b],pch=20,cex=0.5,type='n',xlab="factor 3",ylab="factor 1",cex.lab=1.3)
for(i in 1:13){
    A<-which(leginfo[,2]==i)
    text(sfa[A,a],sfa[A,b],leginfo[A,3],cex=0.6,col=mycolors[i])
}
legend(1.7,0.15,legend=c("anna","ricei","idas","long","sublv","mel-gb","mel-rm","mel-an","sr-nev","white","warner","jhole","dubs"),fill=mycolors,cex=0.7)
dev.off()
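The two panels above repeat the same plot-and-label loop, so a small helper could make it easier to look at other factor pairs. This is a sketch only: plot_factor_pair is a hypothetical name, and the toy scores, groups, and labels below are simulated stand-ins for sfa and leginfo.

```r
# Hypothetical helper: empty scatter of factor a vs. factor b, then text labels
# colored by group (same logic as the two loops in the script above)
plot_factor_pair <- function(scores, groups, labels, a, b, cols) {
  plot(scores[, a], scores[, b], type = "n",
       xlab = paste("factor", a), ylab = paste("factor", b), cex.lab = 1.3)
  for (i in sort(unique(groups))) {
    idx <- which(groups == i)
    text(scores[idx, a], scores[idx, b], labels[idx], cex = 0.6, col = cols[i])
  }
  invisible(length(groups))  # number of labelled points
}

# Toy data: 30 individuals in 3 groups, 3 factors
set.seed(1)
toy_scores <- matrix(rnorm(30 * 3), ncol = 3)
toy_groups <- rep(1:3, each = 10)
toy_labels <- paste0("p", toy_groups)
toy_cols <- c("orange", "forestgreen", "darkblue")

toy_file <- tempfile(fileext = ".pdf")
pdf(toy_file, width = 16, height = 8)
par(mfrow = c(1, 2))
n1 <- plot_factor_pair(toy_scores, toy_groups, toy_labels, 2, 1, toy_cols)
n2 <- plot_factor_pair(toy_scores, toy_groups, toy_labels, 3, 1, toy_cols)
dev.off()
```

With the real objects, the equivalent call would be plot_factor_pair(sfa, leginfo[,2], leginfo[,3], 2, 1, mycolors).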

The sparse factors do not quite correspond to PCs (PC 1 is similar to factor 2, and factors 2 and 3 are highly correlated), but the overall pattern is very similar (sfa plot). I am not sure whether this is worth pursuing further with the current data set (given the similarity to PCA, we don't really learn anything new), but if so I need to read more.
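The comparison with PCs could be made more quantitative with a correlation matrix. The sketch below uses simulated genotypes and a made-up stand-in for the sfa output (toy_lambda), so the numbers mean nothing for the real data; it only shows the calculation. The names geno, pcs, and toy_lambda are my own for illustration.

```r
set.seed(42)
n_ind <- 60
n_loci <- 200

# Toy genotypes (0/1/2): two groups of individuals with different allele frequencies
freq <- rep(c(0.2, 0.6), each = n_ind / 2)
geno <- matrix(rbinom(n_ind * n_loci, size = 2, prob = rep(freq, times = n_loci)),
               nrow = n_ind)

# PC scores for the first three axes (centered, unscaled genotypes)
pcs <- prcomp(geno, center = TRUE, scale. = FALSE)$x[, 1:3]

# Made-up stand-in for the sfa output: reordered PCs plus a little noise
toy_lambda <- pcs[, c(2, 1, 3)] + matrix(rnorm(n_ind * 3, sd = 0.1), ncol = 3)
colnames(toy_lambda) <- paste0("factor", 1:3)

# Rows = PCs, columns = factors; large |r| marks a PC/factor pair that agree
cross_cor <- cor(pcs, toy_lambda)

# Correlations among the factors themselves
factor_cor <- cor(toy_lambda)
```

For the real analysis, toy_lambda would be replaced by sfa and pcs by the PC scores from the earlier PCA of the same individuals, assuming the rows are in the same order in both.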
