Post date: Apr 12, 2016 10:06:23 PM
The data from the first round(s) of sequencing for the mapping families were collected from sunflower (/data/local/dec10_ncgr/zg_parsed_reads/lyc_melcross_all.fastq) and are now in king:/uufs/chpc.utah.edu/common/home/u6000989/data/lycaeides/melissa_mappingfams/. Based on my lab notebook entry from 17xi11, this file includes the data for the crosses from UW2-3, UW2-4, and UW4 (crosses only) and should contain 56,044,392 sequences. I also obtained the raw sequence data from the final re-run, UW5-3. I split s_3_UW5-3.txt into xa* files and am now parsing the barcodes for each chunk:
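A quick way to confirm the expected count of 56,044,392 sequences is to count the FASTQ records. The sketch below assumes every record occupies exactly four lines (no wrapped sequence or quality lines); the function name count_fastq_records is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: count the records in a FASTQ file,
# assuming exactly four lines per record.
count_fastq_records() {
    lines=$(wc -l < "$1")
    echo $(( lines / 4 ))
}

# Example, assuming the file kept its name after being moved to king:
# count_fastq_records lyc_melcross_all.fastq
```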
cd /uufs/chpc.utah.edu/common/home/u6000989/data/lycaeides/melissa_mappingfams/
perl scripts/parse_barcodes768.pl bcodes_UW5-3.txt xaa
cd /uufs/chpc.utah.edu/common/home/u6000989/data/lycaeides/melissa_mappingfams/
perl scripts/parse_barcodes768.pl bcodes_UW5-3.txt xab
cd /uufs/chpc.utah.edu/common/home/u6000989/data/lycaeides/melissa_mappingfams/
perl scripts/parse_barcodes768.pl bcodes_UW5-3.txt xac
cd /uufs/chpc.utah.edu/common/home/u6000989/data/lycaeides/melissa_mappingfams/
perl scripts/parse_barcodes768.pl bcodes_UW5-3.txt xad
cd /uufs/chpc.utah.edu/common/home/u6000989/data/lycaeides/melissa_mappingfams/
perl scripts/parse_barcodes768.pl bcodes_UW5-3.txt xae
cd /uufs/chpc.utah.edu/common/home/u6000989/data/lycaeides/melissa_mappingfams/
perl scripts/parse_barcodes768.pl bcodes_UW5-3.txt xaf
cd /uufs/chpc.utah.edu/common/home/u6000989/data/lycaeides/melissa_mappingfams/
perl scripts/parse_barcodes768.pl bcodes_UW5-3.txt xag
cd /uufs/chpc.utah.edu/common/home/u6000989/data/lycaeides/melissa_mappingfams/
perl scripts/parse_barcodes768.pl bcodes_UW5-3.txt xah
cd /uufs/chpc.utah.edu/common/home/u6000989/data/lycaeides/melissa_mappingfams/
perl scripts/parse_barcodes768.pl bcodes_UW5-3.txt xai
cd /uufs/chpc.utah.edu/common/home/u6000989/data/lycaeides/melissa_mappingfams/
perl scripts/parse_barcodes768.pl bcodes_UW5-3.txt xaj
cd /uufs/chpc.utah.edu/common/home/u6000989/data/lycaeides/melissa_mappingfams/
perl scripts/parse_barcodes768.pl bcodes_UW5-3.txt xak
cd /uufs/chpc.utah.edu/common/home/u6000989/data/lycaeides/melissa_mappingfams/
perl scripts/parse_barcodes768.pl bcodes_UW5-3.txt xal
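The twelve commands above differ only in the chunk name, so they could also be generated by a loop. The sketch below assumes the chunks are independent and that parse_barcodes768.pl takes exactly the two arguments shown. The function name and the RUN switch are hypothetical, and by default the commands are only printed for review.

```shell
#!/bin/sh
# Hypothetical wrapper: emit one parse command per chunk.
# RUN is an assumed switch, not part of the original workflow:
#   unset -> commands are only printed (dry run)
#   RUN=  -> commands are executed, one after another
parse_all_chunks() {
    barcodes=$1
    shift
    for chunk in "$@"; do
        ${RUN-echo} perl scripts/parse_barcodes768.pl "$barcodes" "$chunk"
    done
}

# Dry run for the first three chunks.
parse_all_chunks bcodes_UW5-3.txt xaa xab xac
```

To execute rather than print, something like `RUN= parse_all_chunks bcodes_UW5-3.txt xa?` could be run from the melissa_mappingfams directory, assuming no files other than the chunks match xa?.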
Note that the barcode sequence data (from the UW barcode set) is for the final round (5-3) of sequencing. Also, I will likely want to convert the quality scores to 1.8+ for consistency. Specifically, based on quality score ranges (see the nifty chart here), the sequences in lyc_melcross_all.fastq are in Sanger format (the same Phred+33 encoding as Illumina 1.8+, and thus fine), whereas those in s_3_UW5-3.txt are in Illumina 1.3+ or 1.5+ format (it doesn't matter which, since both use a Phred+64 offset) and need to be converted to 1.8+. Here is a command that can be used for conversion:
sed -e '4~4y/@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\\]^_`abcdefghi/!"#$%&'\''()*+,-.\/0123456789:;<=>?@ABCDEFGHIJ/' myfile.fastq # prints to stdout; add -i to edit the input file in place (the 4~4 address requires GNU sed)
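To check which offset a file uses, and that the conversion behaves as intended, one can look at the range of ASCII codes on the quality lines. The following is a sketch on a made-up two-read file, assuming four-line records and GNU sed; the helper names qual_range and to_phred33 and the toy file names are illustrative only. A minimum code below 59 indicates Phred+33 (Sanger / Illumina 1.8+), whereas a minimum of 64 or higher suggests Phred+64 (Illumina 1.3+ / 1.5+).

```shell
#!/bin/sh
# Hypothetical helper: print the minimum and maximum ASCII code found on the
# quality lines (every 4th line), assuming four-line FASTQ records.
qual_range() {
    awk 'NR % 4 == 0' "$1" | tr -d '\n' | od -An -v -tu1 |
        awk '{ for (i = 1; i <= NF; i++) {
                   v = $i + 0
                   if (min == "" || v < min) min = v
                   if (v > max) max = v
               } }
             END { print min, max }'
}

# Hypothetical wrapper around the conversion command above (GNU sed only).
to_phred33() {
    sed -e '4~4y/@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\\]^_`abcdefghi/!"#$%&'\''()*+,-.\/0123456789:;<=>?@ABCDEFGHIJ/' "$1"
}

# Toy Phred+64 input: quality characters 'h' (code 104) and 'B' (code 66).
tmpdir=$(mktemp -d)
printf '@r1\nACGT\n+\nhhhh\n@r2\nACGT\n+\nBBBB\n' > "$tmpdir/toy.fastq"
qual_range "$tmpdir/toy.fastq"            # prints: 66 104
to_phred33 "$tmpdir/toy.fastq" > "$tmpdir/toy_phred33.fastq"
qual_range "$tmpdir/toy_phred33.fastq"    # prints: 35 73 (codes shifted down by 31)
rm -r "$tmpdir"
```

On the real data, running qual_range on the first few thousand lines (for example, the output of head) rather than the whole file should be enough to tell the two encodings apart.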