I am using random forest with breeding values for caterpillar and plant traits.
Some major questions:
Can I use random forest to predict caterpillar (cat) breeding values from plant genotype breeding values?
If I cbind the IR and plant traits together and run random forest, does the prediction strength for caterpillar traits increase with the IR data?
Can we predict plant traits from IR traits?
The relevant output is the ranking of top predictors and the out-of-bag (OOB) error values.
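As a reminder of where these two outputs live in a fitted forest, here is a minimal sketch on simulated data. The data, the column names (plant_1 to plant_9) and the response are made up for illustration and are not the real breeding values; it assumes the randomForest package is installed.

```r
library(randomForest)

set.seed(1)
# Simulated stand-in data (NOT the real breeding values): 94 plants, 9 predictors
n <- 94
X <- as.data.frame(matrix(rnorm(n * 9), nrow = n))
names(X) <- paste0("plant_", 1:9)  # hypothetical trait names
# Simulated response that depends on plant_1 and plant_2 plus noise
y <- 2 * X$plant_1 - X$plant_2 + rnorm(n, sd = 0.5)
dat <- data.frame(y = y, X)

# Regression forest; importance = TRUE stores permutation importance
rf <- randomForest(y ~ ., data = dat, ntree = 500, importance = TRUE)

# Ranking of predictors by permutation importance (%IncMSE), largest first
imp <- importance(rf, type = 1)
ranking <- imp[order(imp[, 1], decreasing = TRUE), , drop = FALSE]
print(ranking)

# OOB mean squared error and OOB pseudo R-squared after the last tree
oob_mse <- rf$mse[rf$ntree]
oob_rsq <- rf$rsq[rf$ntree]
print(c(oob_mse = oob_mse, oob_rsq = oob_rsq))
```

For a regression forest, rf$mse and rf$rsq hold one value per tree, so the last element is the OOB estimate for the full forest. varImpPlot(rf) gives the same ranking as a plot.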
Trait breeding values were calculated and are located at:
/uufs/chpc.utah.edu/common/home/u6000989/projects/mtrunc_mel/gemma/output/prediction
These were transferred into my mtrun directory with
scp -r /uufs/chpc.utah.edu/common/home/u6000989/projects/mtrunc_mel/gemma/output/prediction/bv_pred_cat.txt /uufs/chpc.utah.edu/common/home/u6015714/mtrun/data
And onto my computer with
scp -r u6015714@kingspeak.chpc.utah.edu:/uufs/chpc.utah.edu/common/home/u6015714/mtrun/data/pntest_mtrunc_pruned.txt ~/Documents/mtrun
The relevant files are bv_pred_cat.txt, bv_pred_plant.txt and bv_pred_ir.txt.
Each of these files contains 94 rows, one per individual plant, in the same order as in the genome file. The traits (columns) are also in the same order as in the other files.
bv_cat has 4 columns (cat traits).
bv_plant has 9 columns (plant traits).
bv_ir has 19 columns of unidentified IR spectral frequencies; I will refer to these as IR 1-19 for now.
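Since everything downstream relies on the three tables lining up row for row, a quick shape check before binding them seems worthwhile. The sketch below uses simulated tables with the dimensions listed above, and check_bv is a hypothetical helper written for this note, not part of any package.

```r
# Simulated stand-ins with the dimensions described above (NOT the real data)
set.seed(1)
bv_cat_sim   <- as.data.frame(matrix(rnorm(94 * 4),  nrow = 94))
bv_plant_sim <- as.data.frame(matrix(rnorm(94 * 9),  nrow = 94))
bv_ir_sim    <- as.data.frame(matrix(rnorm(94 * 19), nrow = 94))

# Hypothetical helper: TRUE if a table has the expected shape and no missing values
check_bv <- function(x, n_rows, n_cols) {
  identical(dim(x), c(as.integer(n_rows), as.integer(n_cols))) && !anyNA(x)
}

checks <- c(
  cat   = check_bv(bv_cat_sim,   94, 4),
  plant = check_bv(bv_plant_sim, 94, 9),
  ir    = check_bv(bv_ir_sim,    94, 19)
)
print(checks)

# Stop early if any table does not match the expected layout
stopifnot(all(checks))
```

This only confirms the shape of each table. It cannot confirm that the rows are in the same plant order, which would need an ID column in each file.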
Code for getting it all set up:
setwd("/home/tsaley/Documents/mtrun")
# load breeding value (bv) tables
bv_plant <- read.table("bv_pred_plant_headers.txt", header = TRUE)
bv_cat <- read.table("bv_pred_cat_with_headers.txt", header = TRUE)
bv_ir <- read.table("bv_pred_ir_headers.txt", header = TRUE)
# check dimensions (expect 94 x 9, 94 x 4 and 94 x 19)
dim(bv_plant)
dim(bv_cat)
dim(bv_ir)
# create data frame with the cat trait (response) and the plant traits (predictors)
cat8_plant <- cbind(cat_wgt_8 = bv_cat$p_cat_wgt_8, bv_plant)
# check dimensions (expect 94 x 10)
dim(cat8_plant)
# load random forest (install once, then leave the install line commented out)
# install.packages("randomForest")
library(randomForest)
# run random forest; the response is named so that "." expands to the plant traits only,
# and the fit gets its own name so the data frame is not overwritten
rf_cat8_plant <- randomForest(cat_wgt_8 ~ ., data = cat8_plant, importance = TRUE, proximity = TRUE)
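For the second question (does adding IR data help?), one possible approach is to fit the same response against two predictor sets and compare the OOB pseudo R-squared. This is a minimal sketch on simulated data, not the real breeding values; the column names and the oob_rsq helper are hypothetical, and it assumes the randomForest package is installed.

```r
library(randomForest)

set.seed(42)
n <- 94
# Simulated stand-ins (NOT the real data): 9 plant traits and 19 IR traits
plant <- as.data.frame(matrix(rnorm(n * 9), nrow = n))
names(plant) <- paste0("plant_", 1:9)
ir <- as.data.frame(matrix(rnorm(n * 19), nrow = n))
names(ir) <- paste0("IR_", 1:19)
# Simulated cat trait that depends on one plant trait and one IR trait
cat_wgt <- plant$plant_1 + ir$IR_1 + rnorm(n, sd = 0.5)

# Hypothetical helper: fit a regression forest and return the OOB pseudo R-squared
oob_rsq <- function(y, X, ntree = 500) {
  rf <- randomForest(x = X, y = y, ntree = ntree)
  rf$rsq[ntree]
}

rsq_plant <- oob_rsq(cat_wgt, plant)             # plant traits only
rsq_both  <- oob_rsq(cat_wgt, cbind(plant, ir))  # plant + IR traits
print(c(plant_only = rsq_plant, plant_plus_ir = rsq_both))
```

OOB estimates vary from run to run, so a small difference between the two values should be checked across several seeds before reading anything into it. The same helper could be used for the third question by passing one plant trait as y and the IR table as X.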