Post date: Aug 20, 2019 3:42:58 PM
I am using gemma to map performance (survival and weight) in T. knulli (and T. petita, but I am mostly dealing with T. knulli here). Scripts and files are in /uufs/chpc.utah.edu/common/home/u6000989/projects/timema_confiers/redwood_gwa.
Note that for interpretation, my old LG11 appears to mostly correspond with Sclu3Hs_12033 in the new HiC stripe genome.
1. Genotype estimates were taken from entropy over 3 chains and k = 2,3.
estpost_entropy -o g_tknul.txt -p gprob -s 0 -w 0 o_tknul_k*hdf5
file = o_tknul_k2_ch1.hdf5
file = o_tknul_k2_ch2.hdf5
file = o_tknul_k2_ch3.hdf5
file = o_tknul_k3_ch1.hdf5
file = o_tknul_k3_ch2.hdf5
file = o_tknul_k3_ch3.hdf5
parameter dimensions for gprob: loci = 21909, ind = 138, genotypes = 3, chains = 6
estpost_entropy -o g_tpet.txt -p gprob -s 0 -w 0 o_tpet_k*hdf5
file = o_tpet_k2_ch1.hdf5
file = o_tpet_k2_ch2.hdf5
file = o_tpet_k2_ch3.hdf5
file = o_tpet_k3_ch1.hdf5
file = o_tpet_k3_ch2.hdf5
file = o_tpet_k3_ch3.hdf5
parameter dimensions for gprob: loci = 28690, ind = 69, genotypes = 3, chains = 6
2. I then massaged the phenotype and genotype data in R. Note that I removed the effects of sex and stage/age on weight and split the files by treatment, and with or without including BCTURN (a more distant population). This is all in formatPhenoGeno.R.
Run:
formatPhenoGeno.R
A bit more formatting was done with:
perl mkGenoFile.pl geno_*
3. I then ran 10 chains for each of the four data sets (C or RW with and without BCTURN) in gemma (Version 0.95alpha, 03/04/2016)
sbatch RunGemma.sh
which runs:
#!/usr/bin/perl
#
# run gemma jobs
#
use Parallel::ForkManager;
my $max = 20;
my $pm = Parallel::ForkManager->new($max);
foreach $g (@ARGV){
$ph = $g;
$ph =~ s/mod_g/ph/;
$base = $g;
$base =~ s/mod_geno_//;
$base =~ s/\.txt//;
foreach $ch (0..9){
system "sleep 2\n";
foreach $trait (1..3){
$pm->start and next; ## fork
$out = "o_$base"."_ph$trait"."_ch$ch";
system "gemma -g $g -p $ph -bslmm 1 -n $trait -maf 0.0 -w 200000 -s 1000000 -o $out\n";
$pm->finish;
}
}
}