Post date: May 31, 2014 9:40:56 PM
The file lycaeides_SNP_annotation.txt contains the annotation information for the SNPs in the larval performance experiment. This includes whether each SNP is in a gene (exons, introns and UTRs), a coding sequence (CDS) or a UTR, and the distance from each SNP to the nearest of each of these. I created this from the scaffold*gff files using a slightly modified script from Victor:
~/labs/evolution/data/lycaeides/melissa_genome/Annotation$ ./retrieve_gene_coding_TE_SNPs.pl -i locuslist.txt -o lycaeides_SNP_annotation.txt -n 6
Based on this we have 29.3% of SNPs in genes, 0.1% in UTRs, and 11.9% in CDS. The mean annotation edit distance is 0.22 (median = 0.17, 2.5th quantile = 0.01, 97.5th quantile = 0.67), and 139,484 of the annotation edit distances are less than 0.5 (this is better than for Timema).
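For reference, summaries of this kind could be computed along the following lines. This is only a sketch: the layout of lycaeides_SNP_annotation.txt is not described here, so the field names (in_gene, in_utr, in_cds, aed) and the toy records are hypothetical stand-ins, not the real data or the method actually used.

```python
import statistics


def quantile(values, q):
    """Linear-interpolation quantile (q in [0, 1]) of a non-empty list."""
    s = sorted(values)
    pos = q * (len(s) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(s) - 1)
    return s[lo] + (s[hi] - s[lo]) * (pos - lo)


def summarize(records):
    """Percent of SNPs per category plus annotation edit distance summaries.

    Each record is a dict with hypothetical keys: in_gene, in_utr, in_cds
    (0 or 1) and aed (float, or None when no value is available).
    """
    n = len(records)
    aed = [r["aed"] for r in records if r["aed"] is not None]
    return {
        "pct_gene": 100.0 * sum(r["in_gene"] for r in records) / n,
        "pct_utr": 100.0 * sum(r["in_utr"] for r in records) / n,
        "pct_cds": 100.0 * sum(r["in_cds"] for r in records) / n,
        "aed_mean": statistics.mean(aed),
        "aed_median": statistics.median(aed),
        "aed_q025": quantile(aed, 0.025),
        "aed_q975": quantile(aed, 0.975),
        "n_aed_below_half": sum(a < 0.5 for a in aed),
    }


# Toy records for illustration only (not values from the real file).
toy = [
    {"in_gene": 1, "in_utr": 0, "in_cds": 1, "aed": 0.1},
    {"in_gene": 1, "in_utr": 0, "in_cds": 0, "aed": 0.3},
    {"in_gene": 0, "in_utr": 0, "in_cds": 0, "aed": None},
    {"in_gene": 0, "in_utr": 0, "in_cds": 0, "aed": 0.7},
]
summary = summarize(toy)
for key, value in summary.items():
    print(key, value)
```

To apply this to the real file, the records would need to be parsed from whatever columns the Perl script writes, which would have to be checked against the actual output.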