Post date: Jun 29, 2016 10:42:37 PM
Aaron's GBS data were used to try to find sex-linked scaffolds as I did for Lycaeides. The data are in /uufs/chpc.utah.edu/common/home/u6000989/data/timema/aaronGBS-GWA/. Samtools was used to infer depth per individual and SNP:
#!/bin/sh -f
#SBATCH --time=8:00:00
#SBATCH --nodes=1
#SBATCH --ntasks=8
#SBATCH --account=gompert
#SBATCH --partition=kingspeak
#SBATCH --job-name=samtools
#SBATCH --mail-type=FAIL
#SBATCH --mail-user=zach.gompert@usu.edu
echo ------------------------------------------------------
echo -n 'Job is running on node '; cat $SLURM_JOB_NODELIST
echo ------------------------------------------------------
echo SLURM: job identifier is $SLURM_JOBID
echo SLURM: job name is $SLURM_JOB_NAME
echo ------------------------------------------------------
module load gcc
module load gsl
module load hdf5
module load samtools
cd /uufs/chpc.utah.edu/common/home/u6000989/data/timema/aaronGBS-GWA/alignments/
samtools depth -f ../femaleBams -Q 20 > depthFemales.txt
samtools depth -f ../maleBams -Q 20 > depthMales.txt
I then used the perl script calScafSexDepth.pl to calculate total depth (across reads, positions and individuals) for each scaffold. The outfile from this is fmdepth.txt (scaffold, female count, male count). This was done in /uufs/chpc.utah.edu/common/home/u6000989/data/timema/aaronGBS-GWA/depth/:
perl calScafSexDepth.pl depthFemales.txt depthMales.txt
The following (akin to what I did for Lycaeides) was then done to identify and order the X-chromosome. Two key points should be noted: (1) I don't have sex data on the offspring, which means I had to use them all, (2) nothing joined when I used what I thought were just the female informative markers, but I got reasonable results when using both (could be related to using all offspring).
1. Identify putative sex-linked scaffolds
Get list of autosomes:
cut -f 1,2 mod_tcrLgs.txt -d " " | sort | uniq | grep ^[0-9] > ~/data/timema/aaronGBS-GWA/autosomes.txt
Find scaffolds with high female to male coverage ratio (note data exist for 395 males and 197 females, so base expectation for autosomes is ~2:1 and X is ~1:1)
x<-as.matrix(read.table("fmdepth.txt",header=F))
## subset of "large" scaffolds
rds<-(x[,2]+x[,3])
a<-which(rds > 100000)
scafs<-x[a,1][which(x[a,3]/x[a,2] < 1.5)]
## cross-check with existing autosomes
auto<-read.table("../autosomes.txt",header=F)
sclg<-matrix(-9,nrow=length(scafs),ncol=2)
sclg[,1]<-scafs
for(i in 1:length(scafs)){
z<-which(auto[,2] == sclg[i,1])
if(length(z) == 1){
sclg[i,2]<-auto[z,1]
}
}
sexsc<-sclg[which(sclg[,2] == 0),1]
write.table(sexsc,"putSexScaf.txt",row.names=FALSE,col.names=FALSE,quote=FALSE)
29 putative X scaffolds
2. New input file with just sex-scaffolds (both sexes b/c don't know)
perl ../scripts/makeSexMultiFamLinkage.pl offspringGenotypesfam*txt
380 SNPs in combinedSnpListSex
3. Filter on missing, not seg distortion (p-value = 0)
java -cp ~/source/lepmap2/bin/ Filtering data=lm_tcristSex.txt epsilon=0.01 dataTolerance=0 missingLimit=10 keepAlleles=1 > data_tcristlmSex.txt
***
Filtering markers based on segregation distortion
Removed 0 markers from family famA
Removed 0 markers from family famB
Removed 0 markers from family famC
Maternally informative markers = 176 (of 380)
Paternally informative markers = 227
Maternally or paternally informative markers = 380
***
4. Create LG, only maternally informative markers are meaningful (real)
java -cp ~/source/lepmap2/bin/ SeparateChromosomes data=data_tcristlmSex.txt lodLimit=1.5 sizeLimit=50 > sexLgs_tcrist.txt
join singles does nothing
5. Assign or unassign scaffolds to LGs
map<-read.table("sexLgs_tcrist.txt",header=FALSE)
snps<-read.table("combinedSnpListSex.txt",header=FALSE)
o<-cbind(map,snps)
colnames(o)<-c("sxlg","scaf","pos")
write.table(o,"sexLgs.txt",row.names=FALSE,col.names=TRUE,quote=FALSE)
perl assignScafFilter.pl sexLgs.txt
Assigned LGs for 17 scaffolds, failed to assign LGs for 12 scaffolds
this includes 93.4% of snps (355 assigned, 25 no)
6. Order X markers
tail -n +2 mod_sexLgs.txt | cut -f 1 -d " " | perl -p -i -e 's/1/13/' > sex_map.txt
java -cp ~/source/lepmap2/bin/ OrderMarkers map=sex_map.txt data=data_tcristlmSex.txt minError=0.01 numThreads=4 initRecombination=0.05 0.05 learnRecombinationParameters=1 1 > order_tcrX.txt