Post date: Oct 22, 2013 6:02:57 PM
In this project I will quantify the genetic architecture of performance variation on different host plants and relate it to genetic variation in natural populations on alfalfa and other host plants. The project therefore includes Illumina sequence data from butterflies reared in a lab experiment and from butterflies collected in natural populations (the latter include all L. melissa that were part of the admixture project). There are three new lanes of sequence data (really six, as each library was sequenced twice). The new sequence data are in data/lycaeides/lycaeides_gbs/, in the subdirectory Sequences (lane[234]_Undetermined_R1_cat.fastq) or SA13102/Project_JA13216 (TXState1[123]*). The associated barcode files are in the Barcodes directory (each is used twice). I am running the parse_barcodes768.pl script from the Scripts directory to replace barcode IDs with individual IDs in each of the six fastq files. The parsed sequences will be written to the Parsed_Melissa subdirectory (the jobs are currently running on the dorc cluster).
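For reference, here is a minimal Python sketch of the relabeling step: replacing the barcode at the start of each read with an individual ID. It is not the actual parse_barcodes768.pl code. The barcode file layout ("barcode,individual_id" per line), the function names, and the choice to handle only exact matches are all assumptions made for illustration.

```python
import io


def load_barcodes(handle):
    """Read 'barcode,individual_id' lines into a dict (hypothetical layout)."""
    barcodes = {}
    for line in handle:
        line = line.strip()
        if not line:
            continue
        barcode, ind_id = line.split(",")
        barcodes[barcode.upper()] = ind_id
    return barcodes


def parse_fastq(fastq, barcodes, out):
    """Relabel each read with its individual ID and trim the barcode.

    Only exact barcode matches are handled here. Returns the counts
    (assigned, unassigned).
    """
    # Try longer barcodes first so a short barcode that is a prefix of a
    # longer one does not win by accident.
    lengths = sorted({len(b) for b in barcodes}, reverse=True)
    assigned = unassigned = 0
    while True:
        header = fastq.readline()
        if not header:
            break
        seq = fastq.readline().strip()
        fastq.readline()  # '+' separator line
        qual = fastq.readline().strip()
        for n in lengths:
            ind_id = barcodes.get(seq[:n])
            if ind_id is not None:
                out.write(f"@{ind_id}\n{seq[n:]}\n+\n{qual[n:]}\n")
                assigned += 1
                break
        else:
            unassigned += 1
    return assigned, unassigned


# Tiny in-memory example with made-up barcodes and reads.
demo_barcodes = load_barcodes(io.StringIO("ACGT,ind1\nTTGCA,ind2\n"))
demo_reads = io.StringIO(
    "@r1\nACGTGGGG\n+\nIIIIIIII\n"
    "@r2\nTTGCACCC\n+\nIIIIIIII\n"
    "@r3\nNNNNAAAA\n+\nIIIIIIII\n"
)
demo_out = io.StringIO()
demo_counts = parse_fastq(demo_reads, demo_barcodes, demo_out)
```

Reads whose first bases match no barcode are only counted here. A real run would presumably either try to correct them or write them to a separate file.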
edit-30x13: The jobs ran out of wall time, so I had to run them again. I had requested 72 hours, which I thought would be sufficient. I have now increased the request to 96 hours, sent the jobs to navier, and set them to write to a local directory in case the problem was an I/O bottleneck. The other possibility is that too many barcodes had to be corrected.
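To illustrate why barcode correction could dominate the run time, here is a minimal Python sketch of one common correction scheme. It assumes correction means finding the unique known barcode within one mismatch of the observed one. I have not confirmed that parse_barcodes768.pl works this way, and the function names and the one-mismatch threshold are hypothetical.

```python
def hamming(a, b):
    """Number of mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def correct_barcode(observed, barcodes, max_mismatches=1):
    """Return the known barcode matching `observed`, or None.

    An exact match is a single set lookup. Otherwise every known barcode
    of the same length is compared, and a correction is accepted only if
    exactly one candidate is within `max_mismatches`.
    """
    if observed in barcodes:
        return observed
    best = None
    for candidate in barcodes:
        if len(candidate) != len(observed):
            continue
        if hamming(observed, candidate) <= max_mismatches:
            if best is not None:
                return None  # ambiguous: two barcodes are equally close
            best = candidate
    return best
```

Under this scheme an exact match costs one lookup, while each read that needs correction is compared against every known barcode, so a high fraction of imperfect barcodes would slow parsing considerably.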