BSLMM analyses for color and survival in T. chumash 2019 experiment-melanistic reference genome

Post date: Sep 23, 2019 8:32:34 PM

I am re-mapping color (RG and GB) across all of the individuals and survival within treatment. I also threw in sex within treatment for fun/as a control. Note that some individuals had screwy IDs, so I will want to update/rerun this later (or just focus on the alignment to the melanic genome, which I will start next).

1. Drop low-coverage individuals and match phenotype/genotype data and write subsets. All done in R from the Variants_mel subdirectory.Code is in formatPhenoGeno.R:

## format genotype data and phenotype data for gemma

## read in genotype point estimates

L<-11990

N<-454

g<-matrix(scan("pntest_filtered2x_tchum_mel_gbs_2019.txt",n=N*L,sep=" "),nrow=L,ncol=N,byrow=TRUE)

keep<-scan("../Variants/hiCov2x.txt") ## single column, denotes whether 1 or not 0 mean cov for an individual > 2

## true for 437, going to drop the others

gids<-read.table("../Variants/IdsGen.txt",header=FALSE)

## read phenotype data

phtab<-read.table("../Variants/2019_Tchumash_transplant_table.csv",sep=",",header=TRUE)

ph<-matrix(NA,nrow=N,ncol=8)

ph<-as.data.frame(ph)

for(i in 1:N){

a<-which(as.character(phtab[,1])==as.character(gids[i,1]))

if(length(a)==1){

ph[i,]<-phtab[a,c(1,3:6,11,15:16)]

}

colnames(ph)<-colnames(phtab)[c(1,3:6,11,15:16)]

#number codes

# Geneder: M = 2, F = 1

# Color: G = 1, GB = 2, I = 3, M = 4

# Treatment: AC = 1, MM = 2

## subset those that have good genetic data

g_hc<-g[,keep==1]

ph_hc<-ph[keep==1,]

dim(g_hc)

#[1] 11990 437

dim(ph_hc)

#[1] 437 8

## write output for color mapping

## no need to control for gender, explains 1.8% of RG and <1% for GB

write.table(file="pheno_color.txt",ph_hc[,7:8],row.names=FALSE,col.names=FALSE,quote=FALSE)

write.table(file="g_tchum.txt",g_hc,row.names=FALSE,col.names=FALSE,quote=FALSE)

## now by treatment, gender and survival (former is kind of for fun)

AC<-which(ph_hc$Treatment==1)

MM<-which(ph_hc$Treatment==2)

write.table(file="pheno_AC.txt",ph_hc[AC,c(3,6)],row.names=FALSE,col.names=FALSE,quote=FALSE)

write.table(file="g_tchum_AC.txt",g_hc[,AC],row.names=FALSE,col.names=FALSE,quote=FALSE)

write.table(file="pheno_MM.txt",ph_hc[MM,c(3,6)],row.names=FALSE,col.names=FALSE,quote=FALSE)

write.table(file="g_tchum_MM.txt",g_hc[,MM],row.names=FALSE,col.names=FALSE,quote=FALSE)

## note, survival is reasonably balanced

apply(ph_hc[MM,c(3,6)],2,mean)

# Gender Survival

#1.4037559 0.1971831

apply(ph_hc[AC,c(3,6)],2,mean)

# Gender Survival

#1.4398148 0.2453704

2. Gemma BSLMM analyses (Version 0.95alpha, 03/04/2016). From here I am in the Gemma_mel subdirectory.

First, I generated the proper genotype input files with mkGenoFile.pl, which makes the mod* versions (not allele IDs are arbitrary). Then I ran the analysis with RunGemma.sh:

#!/bin/sh

#SBATCH --time=96:00:00

#SBATCH --nodes=1

#SBATCH --ntasks=24

#SBATCH --account=gompert-kp

#SBATCH --partition=gompert-kp

#SBATCH --job-name=gemma

#SBATCH --mail-type=FAIL

#SBATCH --mail-user=zach.gompert@usu.edu

module load gsl

module load gemma

cd /uufs/chpc.utah.edu/common/home/u6000989/projects/timema_genarch/color_exp_2019_gbs/Gemma_mel

perl RunGemmaFork.pl mod_g*

This runs RunGemmaFork.pl (10 chains, color, sex and survival),

#!/usr/bin/perl

# run gemma jobs

use Parallel::ForkManager;

my $max = 20;

my $pm = Parallel::ForkManager->new($max);

my $ph;

foreach $g (@ARGV){

if($g =~ m/AC/){

$ph = "pheno_AC.txt";

}

elsif($g =~ m/MM/){

$ph = "pheno_MM.txt";

}

else{

$ph = "pheno_color.txt";

}

$base = $g;

$base =~ s/mod_geno_//;

$base =~ s/\.txt//;

foreach $ch (0..9){

system "sleep 2\n";

foreach $trait (1..2){

$pm->start and next; ## fork

$out = "o_$base"."_ph$trait"."_ch$ch";

system "gemma -g $g -p $ph -bslmm 1 -n $trait -maf 0.0 -w 200000 -s 1000000 -o $out\n";

$pm->finish;

}

$pm->wait_all_children;

Page updated

Google Sites

Report abuse