Post date: Sep 20, 2019 8:3:39 PM
We wanted to think visually and quantitatively about how well the color space used by Timema matches/fills that provided by the hosts. For this, I am working in /uufs/chpc.utah.edu/common/home/u6000989/projects/timema_genarch/color_overlap/
The data comprise 1205 plants and 1528 Timea; files = 2015host_colour.data.csv and 2015Timema_phenotype_gwas_radiation.csv, respectively.
I have an initial comparison of composite (across timema species and across all host pictures) Timema and host color spaces.
I first fit kernel densities to the host and Timema data sets. These give relative probabilities/densities across color space for each data set. Note that I used the same bounds for each when fitting these, which basically means I used the bounds defined by the most extreme plant pictures. I then simply calculated the coefficient of determination (r2) for the proportion of the Timema color space relative densities explained by the host color space color densities:
Answer = 51.2%
Note that this is probably sensitive to how big the total color space is. If we focused in on the denser parts it could go up or down some.
This doesn't account for uncertainty in any way and doing so is tricky because of the dependency across the space and on the individuals that went into the initial density estimation. To account for this, I did a two-step bootstrap where I re-sampled individuals (plants or bugs) to estimate the densities (relative probabilities) and then also re-sampled squares within the densities to estimate r2. This gives the following point estimate and bootstrap CIs:
43.5% (31.9%-56.4%)
The R code is in colorSpace.R and pasted below:
library(MASS)
library(scales)
timema <- read.csv("2015Timema_phenotype_gwas_radiation.csv", header=TRUE)
host <- read.csv("2015host_colour.data.csv", header=TRUE)
xlb<--0.11;xub<-0.42
ylb<-0;yub<-0.91
kdTimema<-kde2d(timema$latRG,timema$latGB,lims=c(xlb,xub,ylb,yub))
kdHost<-kde2d(host$RG,host$GB,lims=c(xlb,xub,ylb,yub))
o<-lm(as.vector(kdTimema$z) ~ as.vector(kdHost$z))
oo<-summary(o)
oo$r.squared
#[1] 0.5122362
## bootstrap, individuals and grid squares
N<-1000
r2<-rep(NA,N)
for(i in 1:N){
tx<-sample(timema$latRG,length(timema$latRG),replace=TRUE)
ty<-sample(timema$latGB,length(timema$latGB),replace=TRUE)
hx<-sample(host$RG,length(host$RG),replace=TRUE)
hy<-sample(host$GB,length(host$GB),replace=TRUE)
kdt<-kde2d(tx,ty,lims=c(xlb,xub,ylb,yub))
kdh<-kde2d(hx,hy,lims=c(xlb,xub,ylb,yub))
tz<-as.vector(kdt$z)
hz<-as.vector(kdh$z)
bs<-sample(1:length(tz),length(tz),replace=TRUE)
btz<-tz[bs]
bhz<-hz[bs]
o<-lm(btz~bhz)
oo<-summary(o)
r2[i]<-oo$r.squared
}
pdf("colorSpace.pdf",width=10,height=10)
par(mfrow=c(2,2))
par(mar=c(4,5,3,2))
cl<-1.4
cm<-1.5
plot(host$RG,host$GB,xlim=c(xlb,xub),ylim=c(ylb,yub),col=as.character(host$hex),pch=19,xlab="RG",ylab="GB",cex.lab=cl)
title(main="(A) Host plant colors",cex.main=cm)
plot(timema$latRG,timema$latGB,xlim=c(xlb,xub),ylim=c(ylb,yub),col=as.character(timema$hex),pch=19,xlab="RG",ylab="GB",cex.lab=1.4)
title(main=expression(paste("(B) ",italic("Timema"),"colors")),cex.main=cm)
plot(host$RG,host$GB,xlim=c(xlb,xub),ylim=c(ylb,yub),col="lightgray",pch=19,xlab="RG",ylab="GB",cex.lab=cl)
points(timema$latRG,timema$latGB,col=as.character(timema$hex),pch=19)
title(main=expression(paste("(C) Color space overlap")),cex.main=cm)
plot(host$RG,host$GB,xlim=c(xlb,xub),ylim=c(ylb,yub),col="lightgray",pch=19,xlab="RG",ylab="GB",cex.lab=cl)
points(timema$latRG,timema$latGB,col=alpha("red",.4),pch=19)
title(main=expression(paste("(D) Color space overlap")),cex.main=cm)
dev.off()