genome annotations and tests for enrichment

Post date: Jun 10, 2014 2:55:34 PM

Here are the genome-level annotation summaries:

N (missing data) = 56.06%

Non-N = 43.94%

genic = 16.12% (13,158 genes)

coding = 3.53% (62,064 CDS)

non-coding = 12.58%

2,094 UTRs

62,157 exons

intergenic = 27.82%

Note that these percentages do not sum to 100% because the categories are nested: Non-N = genic + intergenic, and genic = coding + non-coding (up to rounding).

Median scaffold size: 9,651 bp (2,474 bp excluding N's).

718 scaffolds > 100 kb, 2 > 1 Mb.

Total genome length = 360,254,725 bp (158,297,813 without N's).
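
As a quick consistency check on the numbers above, here is a minimal Python sketch. It only re-adds the reported percentages and recomputes the non-N fraction from the two genome lengths. The variable names and the rounding tolerance are my own choices for illustration, not part of the annotation file.

```python
# Percentages exactly as reported in the summary above.
pct = {
    "N": 56.06,
    "non_N": 43.94,
    "genic": 16.12,
    "coding": 3.53,
    "non_coding": 12.58,
    "intergenic": 27.82,
}
total_bp = 360_254_725   # total genome length
non_n_bp = 158_297_813   # genome length without N's

# Assumed tolerance: values are rounded to two decimals, so allow small drift.
tol = 0.015

# Top level: N + genic + intergenic should cover the whole genome.
top_level = pct["N"] + pct["genic"] + pct["intergenic"]

# Nested levels: Non-N = genic + intergenic, genic = coding + non-coding.
non_n_parts = pct["genic"] + pct["intergenic"]
genic_parts = pct["coding"] + pct["non_coding"]

# Non-N percentage recomputed from the base-pair counts.
non_n_from_bp = 100.0 * non_n_bp / total_bp

print(f"N + genic + intergenic = {top_level:.2f}%")
print(f"genic + intergenic     = {non_n_parts:.2f}% (reported Non-N: {pct['non_N']}%)")
print(f"coding + non-coding    = {genic_parts:.2f}% (reported genic: {pct['genic']}%)")
print(f"Non-N from bp counts   = {non_n_from_bp:.2f}%")
```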

This information comes from /labs/evolution/data/lycaeides/melissa_genome/Annotation/genomeAnnotation.txt.

Enrichment tests

We have four key structural annotations: genic, coding sequence, UTR, and repeat region. I asked whether the SNPs with the largest absolute model-averaged effects were over-represented in any of these categories. I considered two upper quantiles, the 99.9th (about 80 loci) and the 99.95th (about 40 loci), as these give roughly the number of SNPs that could have non-zero effects on the traits (based, at least roughly, on the estimates of the number of SNPs). To test for enrichment, I used binomial probability distributions, with p = the proportion of SNPs in a category.
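
For reference, a binomial enrichment test of this kind can be sketched in Python as follows. This is an illustration under stated assumptions, not the script used for the results below: the function names are hypothetical, the input values are made up, and the note above does not say whether the tail probability includes the observed count, so the sketch reports both conventions.

```python
from math import comb


def binom_pmf(k: int, n: int, p: float) -> float:
    """Probability of exactly k successes for X ~ Binomial(n, p)."""
    return comb(n, k) * p**k * (1.0 - p) ** (n - k)


def binom_upper_tail(obs: int, n: int, p: float, inclusive: bool = True) -> float:
    """Upper-tail probability: P(X >= obs) if inclusive, else P(X > obs)."""
    start = obs if inclusive else obs + 1
    return sum(binom_pmf(k, n, p) for k in range(start, n + 1))


# Hypothetical inputs for illustration only (not values from the analysis):
n_top = 80          # number of SNPs in the top quantile
p_category = 0.02   # proportion of SNPs falling in the category
obs = 4             # top SNPs observed in the category

expected = n_top * p_category  # expected count under no enrichment
p_inclusive = binom_upper_tail(obs, n_top, p_category)                 # P(X >= obs)
p_strict = binom_upper_tail(obs, n_top, p_category, inclusive=False)   # P(X > obs)

print(f"expected = {expected:.2f}")
print(f"P(X >= {obs}) = {p_inclusive:.4f}")
print(f"P(X >  {obs}) = {p_strict:.4f}")
```

The two conventions can differ noticeably when the observed counts are this small, so it is worth recording which one a given p-value refers to.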

Here are the instances with significant enrichment:

Survival, plant x population treatments: None

Wgt, plant x population treatments:

q 99.9, repeat, GLA x Ac, Obs = 4, Expected = 1.67, p = 0.0265

q 99.95, repeat, GLA x Ac, Obs = 4, Expected = 0.85, p = 0.0015

q 99.95, repeat, SLA x Ms, Obs = 2, Expected = 0.80, p = 0.0452

Survival, combined:

q 99.9, repeat, Ac, Obs = 5, Expected = 1.72, p = 0.008

q 99.9, repeat, all, Obs = 4, Expected = 1.71, p = 0.029

q 99.9, repeat, SLA, Obs = 5, Expected = 1.65, p = 0.006

q 99.95, repeat, Ac, Obs = 4, Expected = 0.86, p = 0.002

Wgt, combined:

q 99.9, repeat, Ms, Obs = 4, Expected = 1.69, p = 0.027

q 99.95, repeat, Ms, Obs = 3, Expected = 0.844, p = 0.0099

q 99.95, repeat, SLA, Obs = 2, Expected = 0.829, p = 0.048

q 99.95, repeat, all, Obs = 2, Expected = 0.82, p = 0.048

Thus, all of the significant enrichments are for repeat elements!
