Step 1: Running entropy for DBS and its parent populations
Directory: /uufs/chpc.utah.edu/common/home/u6007910/projects/lyc_dubois/entropy/complete/dubois_parents
I copied the subset_filtered_variantsLycaeides_modids.gl file to the infiles directory here to create the new input file for entropy for just DBS and the parental pops (see below)
pops = dbs, ckv, frc, sin, bld, lan
#individuals = 335
Use splitpops.pl to split the subset_filtered_variantsLycaeides_modids.gl file by population. Keep only the DBS and parental population files, which we need to rerun this analysis.
Open each of the Lmpop files in vim and remove the first line and the 1111 line.
#leave BLD as it is to keep the scaf position
sed -E 's/^[0-9]+:[0-9]+//' Lmpop_CKV.gl | cut -d" " -f2- > ckv.gl
sed -E 's/^[0-9]+:[0-9]+//' Lmpop_FRC.gl | cut -d" " -f2- > frc.gl
sed -E 's/^[0-9]+:[0-9]+//' Lmpop_LAN.gl | cut -d" " -f2- > lan.gl
sed -E 's/^[0-9]+:[0-9]+//' Lmpop_SIN.gl | cut -d" " -f2- > sin.gl
#dbs.gl, used in the paste step, is not created by any command recorded here
paste -d' ' Lmpop_BLD.gl ckv.gl dbs.gl frc.gl lan.gl sin.gl > dbs_parents.gl
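The strip-and-paste step can be tried on toy data first. This is only a sketch: the file names Lmpop_AAA.gl and Lmpop_BBB.gl and the numbers in them are made up, and it assumes each line is a scaffold:position ID followed by space-separated values. It also checks that the files have the same number of lines before pasting, because paste joins line by line and a mismatch would silently misalign loci.

```shell
# Toy demonstration of the strip-and-paste step (hypothetical file names and data).
set -eu
workdir=$(mktemp -d)
cd "$workdir"

# Two tiny stand-ins for per-population .gl files: locus ID, then values.
printf '1:100 0 5 9\n1:200 3 0 7\n' > Lmpop_AAA.gl
printf '1:100 2 0 4\n1:200 0 6 8\n' > Lmpop_BBB.gl

# Drop the leading scaffold:position field from the second file only;
# the first file keeps it, as BLD does in the notes above.
sed -E 's/^[0-9]+:[0-9]+//' Lmpop_BBB.gl | cut -d" " -f2- > bbb.gl

# paste joins line by line, so both files must have the same number of loci.
n_ref=$(wc -l < Lmpop_AAA.gl | tr -d ' ')
n_new=$(wc -l < bbb.gl | tr -d ' ')
if [ "$n_ref" -ne "$n_new" ]; then
    echo "locus counts differ: $n_ref vs $n_new" >&2
    exit 1
fi

paste -d' ' Lmpop_AAA.gl bbb.gl > combined.gl
cat combined.gl
```

The same line-count check could be run on the real Lmpop files before building dbs_parents.gl.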
To run ENTROPY
1. Run gl2genest.pl on the combined file to get the input file for entropy.
perl gl2genest.pl dbs_parents.gl
2. I transferred the output files to the startingvals folder and then in this folder I ran:
#make sure to edit the input file name and no. of SNPs and individuals in initq.R script
mkdir startingvals
mv pntest_dbs_parents.txt startingvals
R CMD BATCH initq.R
This created ldak files for each K (2-8).
3. Next, I ran entropy. I am using the Perl Parallel::ForkManager module to run multiple jobs in parallel.
To run,
sbatch runentropyGompKP.sh
This runs the perl fork script forkRunEntropy.pl. The file contains this:
Usage: perl forkRunEntropy.pl 10
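For reference, the idea of running many jobs with a cap on how many run at once can be sketched in plain shell with xargs -P. This is not the contents of forkRunEntropy.pl, which uses Parallel::ForkManager. It assumes the 10 in the usage line is the maximum number of simultaneous jobs, it takes K = 2-8 and 3 chains from the notes here, and the job itself is a placeholder echo rather than a real entropy call.

```shell
# Bounded-parallelism sketch with xargs -P (a stand-in for the Perl fork script).
set -eu
joblog=$(mktemp)

# Emit one "K chain" pair per line, then let xargs run at most 10 jobs at a time.
# The echo is a placeholder for the real entropy command, which is not shown here.
for k in 2 3 4 5 6 7 8; do
    for chain in 1 2 3; do
        echo "$k $chain"
    done
done | xargs -P 10 -n 2 sh -c 'echo "would run K=$1 chain=$2"' _ > "$joblog"

# Jobs finish in no fixed order, so sort before reading the log.
sort "$joblog"
```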
Step 2: getting g and q files
### generating q files (Got the point estimates and quantiles using the following command):
/uufs/chpc.utah.edu/common/home/u6000989/bin/estpost_entropy -o q_K2.txt -p q -s 0 -w 0 ento_lyc_hybridsCh*K2.hdf5
#parameter dimensions for q: ind = 335, populations = 2, samples = 1000, chains = 3
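To get q files for every K, the same command can be repeated in a loop. The sketch below is a dry run that only prints the commands. It assumes K runs from 2 to 8 (as in the starting values step) and that the hdf5 files for the other K values follow the same ento_lyc_hybridsCh*K<k>.hdf5 naming as the K2 files; neither is confirmed here.

```shell
# Dry run: print one estpost_entropy command per K instead of executing it.
set -eu
estpost=/uufs/chpc.utah.edu/common/home/u6000989/bin/estpost_entropy
outfile=$(mktemp)

for k in 2 3 4 5 6 7 8; do
    # The glob sits inside double quotes, so it is printed literally, not expanded.
    printf '%s\n' "$estpost -o q_K${k}.txt -p q -s 0 -w 0 ento_lyc_hybridsCh*K${k}.hdf5"
done > "$outfile"

cat "$outfile"
```

Once the printed commands look right, they can be run from the mcmc output folder with sh "$outfile".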
Create pop order files:
First I needed to create a file which has the population order. I created this file in the /uufs/chpc.utah.edu/common/home/u6007910/projects/lyc_dubois/entropy/complete/dubois_parents/dbs_jhl_mel/infile folder as follows:
1. In bash: cat Lmpop_M.gl | head -2 | tail -1 > temp_pops.txt
2. Go to R:
x <- read.table("temp_pops.txt", sep = " ", header=F)
newx <- t(x)
write.table(newx, file="temp_popt.txt", quote=F, row.names=F,col.names=F)
3. In bash: cat temp_popt.txt | grep -E '[A-Z]' | cut -c 1-3 > poporder.txt
4. Copy the poporder file to the mcmc folder for further plotting.
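The pop order steps above can also be done without the trip through R, by using tr to put each ID on its own line in place of the t() transpose. This is a sketch on toy data: the file toy.gl and the IDs in it are made up, and it assumes line 2 of the file holds space-separated individual IDs whose first three characters are the population code.

```shell
# Shell-only version of the pop order steps, using tr instead of R's t().
set -eu
workdir=$(mktemp -d)

# Hypothetical stand-in for Lmpop_M.gl: line 1 = counts, line 2 = individual IDs.
printf '4 2\nDBS_01 DBS_02 CKV_01 FRC_01\n' > "$workdir/toy.gl"

# Take line 2, put one ID per line, keep lines with an uppercase letter,
# and keep the first three characters as the population code.
sed -n 2p "$workdir/toy.gl" | tr ' ' '\n' | grep -E '[A-Z]' | cut -c 1-3 > "$workdir/poporder.txt"

cat "$workdir/poporder.txt"
```

The number of lines in poporder.txt should match the number of individuals reported by estpost_entropy.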
Step 3: Doing hybrid index plots