Post date: Nov 06, 2016 3:40:55 AM
The GBS data for this project are in /uufs/chpc.utah.edu/common/home/u6000989/data/callosobruchus/gbs_L14/. Sequencing information can be found here. The data set comprises 672 individuals sequenced across four lanes. Sample sizes per line and generation are as follows:
Sub-lines A and B denote a split of L14 following generation F4 (the P's can be ignored; they just reflect how information was coded on the jars). L14 is the main line of interest; a single sample from a line that almost took off (L11) is included as a potential comparison.
The sequences are in the fastq sub-directory and the barcode files are in the barcodes sub-directory.
I split each fastq file into a series of files with 150,000,000 lines (37,500,000 sequence records) to facilitate parsing.
split -l 150000000 gomp016_S5_L005_R1_001.fastq subgomp016
split -l 150000000 gomp017_S6_L006_R1_001.fastq subgomp017
split -l 150000000 gomp018_S7_L007_R1_001.fastq subgomp018
split -l 150000000 gomp019_S8_L008_R1_001.fastq subgomp019
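Because a fastq record spans four lines, any chunk size that is a multiple of four (such as 150,000,000) keeps every record whole. The following is a minimal sketch of that check on a toy file; the file names toy.fastq and subtoy, and the chunk size of 8 lines, are made up for illustration and are not part of the actual run.

```shell
# Toy check: splitting on a multiple of 4 lines never cuts a fastq record.
# All file names here are hypothetical; the real run used -l 150000000.
tmpdir=$(mktemp -d)

# Build a 5-record fastq (20 lines).
for i in 1 2 3 4 5; do
    printf '@read%s\nACGT\n+\nIIII\n' "$i"
done > "$tmpdir/toy.fastq"

# 8 lines per chunk = 2 records per chunk.
split -l 8 "$tmpdir/toy.fastq" "$tmpdir/subtoy"

# Count chunks, and count chunks whose line total is not divisible by 4.
nchunks=0
bad=0
for f in "$tmpdir"/subtoy*; do
    nchunks=$((nchunks + 1))
    n=$(wc -l < "$f")
    if [ $((n % 4)) -ne 0 ]; then
        bad=$((bad + 1))
    fi
done
echo "chunks: $nchunks, chunks with partial records: $bad"
```

The same loop pointed at the real subgomp0* files would flag any chunk that ends mid-record, which should only happen if the original fastq itself is truncated.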
I then used a wrapper script to run parse_barcodes768.pl on each subfile (subgomp0*a*). The wrapper is run separately for each original library/fastq file. Here is the example for gomp016. Note that each job writes to /scratch/general/lustre/ and then copies the parsed files back (hopefully this will speed things up).
perl ../scripts/wrap_qsub_slurm_parse.pl /uufs/chpc.utah.edu/common/home/gompert-group1/data/callosobruchus/gbs_L14/barcodes/bc_gomp016.csv subgomp016a*
cd /scratch/general/lustre/parseg/
perl /uufs/chpc.utah.edu/common/home/u6000989/data/callosobruchus/gbs_L14/scripts/parse_barcodes768.pl /uufs/chpc.utah.edu/common/home/gompert-group1/data/callosobruchus/gbs_L14/barcodes/bc_gomp016.csv /uufs/chpc.utah.edu/common/home/u6000989/data/callosobruchus/gbs_L14/fastq/subgomp016ag
rsync -avz parsed_subgomp016ag /uufs/chpc.utah.edu/common/home/u6000989/data/callosobruchus/gbs_L14/parsed/
cd /scratch/general/lustre/parseg/
perl /uufs/chpc.utah.edu/common/home/u6000989/data/callosobruchus/gbs_L14/scripts/parse_barcodes768.pl /uufs/chpc.utah.edu/common/home/gompert-group1/data/callosobruchus/gbs_L14/barcodes/bc_gomp016.csv /uufs/chpc.utah.edu/common/home/u6000989/data/callosobruchus/gbs_L14/fastq/subgomp016ah
rsync -avz parsed_subgomp016ah /uufs/chpc.utah.edu/common/home/u6000989/data/callosobruchus/gbs_L14/parsed/
cd /scratch/general/lustre/parseg/
perl /uufs/chpc.utah.edu/common/home/u6000989/data/callosobruchus/gbs_L14/scripts/parse_barcodes768.pl /uufs/chpc.utah.edu/common/home/gompert-group1/data/callosobruchus/gbs_L14/barcodes/bc_gomp016.csv /uufs/chpc.utah.edu/common/home/u6000989/data/callosobruchus/gbs_L14/fastq/subgomp016ai
rsync -avz parsed_subgomp016ai /uufs/chpc.utah.edu/common/home/u6000989/data/callosobruchus/gbs_L14/parsed/
cd /scratch/general/lustre/parseg/
perl /uufs/chpc.utah.edu/common/home/u6000989/data/callosobruchus/gbs_L14/scripts/parse_barcodes768.pl /uufs/chpc.utah.edu/common/home/gompert-group1/data/callosobruchus/gbs_L14/barcodes/bc_gomp016.csv /uufs/chpc.utah.edu/common/home/u6000989/data/callosobruchus/gbs_L14/fastq/subgomp016aj
rsync -avz parsed_subgomp016aj /uufs/chpc.utah.edu/common/home/u6000989/data/callosobruchus/gbs_L14/parsed/
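The per-subfile commands above all follow one pattern: change to scratch, parse, then rsync the result back. The sketch below prints that pattern for a list of subfiles as a dry run. It is not the contents of wrap_qsub_slurm_parse.pl, and it does not submit anything to SLURM; the function name print_parse_job and the variable names are hypothetical, while the paths are the ones used above.

```shell
# Hypothetical dry run: print the three commands for each subfile.
# This only echoes text; it does not run the parser or submit jobs.
base=/uufs/chpc.utah.edu/common/home/u6000989/data/callosobruchus/gbs_L14
barcodes=/uufs/chpc.utah.edu/common/home/gompert-group1/data/callosobruchus/gbs_L14/barcodes/bc_gomp016.csv
scratch=/scratch/general/lustre/parseg/

print_parse_job() {
    # $1 = barcode file, $2 = subfile name (e.g. subgomp016ag)
    printf 'cd %s\n' "$scratch"
    printf 'perl %s/scripts/parse_barcodes768.pl %s %s/fastq/%s\n' "$base" "$1" "$base" "$2"
    printf 'rsync -avz parsed_%s %s/parsed/\n' "$2" "$base"
}

# Collect the commands for two example subfiles and show them.
jobs_text=$(for sub in subgomp016ag subgomp016ah; do
    print_parse_job "$barcodes" "$sub"
done)
printf '%s\n' "$jobs_text"
```

Printing the commands first makes it easy to confirm that the barcode file matches the library (bc_gomp016.csv with subgomp016*) before anything is submitted.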