
Day 12 - Promethease revisited

posted Jul 14, 2011, 1:00 PM by Colm O'Dushlaine   [ updated Jul 19, 2011, 7:21 AM ]
I mentioned Promethease on day 5. It is pretty good for Windows users, but slow. I wrote a script that emulates the process and runs quite quickly on UNIX. 23andme genotypes are all reported on the forward strand, so the script checks the strand of each SNPedia genotype and, where SNPedia uses the minus strand, reverse complements the alleles so that the two match. Note: Promethease does this too. So, in summary:

- loop over all 23andme SNPs
- check whether SNPedia has any genotype information for the SNP
- [if yes] ensure genotypes are matched for strand and reverse complement if necessary
- [if yes] parse out disease/association annotation from wiki data
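
The steps above can be sketched roughly as follows in Python. This is not the actual script, just an illustration: the `snpedia` dictionary is a hypothetical stand-in for whatever SNPedia lookup the script performs, the function names are made up, and sorting the alleles alphabetically to build the genotype name is an assumption. The 23andme raw file is assumed to be tab-separated (rsid, chromosome, position, genotype) with `#` comment lines.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq):
    """Reverse complement a DNA string, e.g. 'AC' -> 'GT'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def snpedia_key(rsid, genotype, strand):
    """Build a genotype name such as 'rs1000113(C;C)'.

    Alleles are reverse complemented when the SNPedia entry is on the
    minus strand. Sorting the alleles is an assumption for this sketch.
    """
    alleles = list(genotype)
    if strand == "minus":
        alleles = [reverse_complement(a) for a in alleles]
    alleles.sort()
    return f"{rsid}({';'.join(alleles)})"


def parse_23andme(lines):
    """Yield (rsid, genotype) pairs from 23andme raw data lines."""
    for line in lines:
        if line.startswith("#") or not line.strip():
            continue
        fields = line.rstrip("\r\n").split("\t")
        if len(fields) < 4:
            continue
        yield fields[0], fields[3]


def annotate(lines, snpedia):
    """Yield (genotype name, strand, annotation) for SNPs found in `snpedia`.

    `snpedia` is a hypothetical dict of the form
    {rsid: {"strand": "plus" or "minus", "genotypes": {name: annotation}}}.
    """
    for rsid, genotype in parse_23andme(lines):
        entry = snpedia.get(rsid)
        # Skip SNPs without SNPedia data, no-calls ("--") and indels.
        if entry is None or not set(genotype) <= set("ACGT"):
            continue
        key = snpedia_key(rsid, genotype, entry["strand"])
        note = entry["genotypes"].get(key)
        if note is not None:
            yield key, entry["strand"], note
```

Keeping the strand handling in one small function makes it easy to check by hand against a few SNPs known to be on the minus strand.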

I've set the script up to only give me output when there is something interesting for my genotype. Example output:

 snp_and_genotype strand magnitude disease_annotations
 rs1000113(C;C) plus na [ normal ]
 rs10008492(C;C) plus na [ normal ]
 rs10033464(G;G) plus 0 }} [ norm ]
 rs1004819(C;T) minus na [ 1.5x_risk ]
 rs10050860(C;C) plus 0  [ normal_risk ]
 rs10086908(T;T) plus 0  [ normal_risk ]
 rs10090154(C;C) plus 0  [ normal ]
 rs1010(A;G) minus na [ 1.75x_risk_of_MI ]
 rs1012729(A;A) plus na [ normal ]
 rs10134944(C;C) plus 0  [ normal ]
 rs1015362(A;G) minus na [ 2-4x_higher_risk_of_sun_sensitivity_if_part_of_risk_haplotype ]
 ...

This is crude, but for a quick look at the more interesting results:

 grep -vP "common|normal|norm|None" genome_Colm_O_Dushlaine_Full_20110124134637.txt.out | gawk '$2 != "?" { print }' | more
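
For anyone without grep -P or gawk to hand, the same filter can be sketched in Python. This is only an equivalent of the one-liner above, assuming whitespace-separated output lines in the format shown earlier; the name `interesting` is made up for illustration.

```python
import re

# Same pattern as the grep -vP call: drop lines mentioning any of these words.
BORING = re.compile(r"common|normal|norm|None")


def interesting(lines):
    """Yield lines that pass both the grep -v and the gawk '$2 != "?"' steps."""
    for line in lines:
        if BORING.search(line):
            continue
        fields = line.split()
        # gawk treats a missing second field as empty, so short lines pass.
        if len(fields) >= 2 and fields[1] == "?":
            continue
        yield line
```

Calling `interesting(open(path))` on the script's output file would then give the same lines as the pipeline, minus the paging.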


Most interesting here are:
  • rs11983225(T;T): http://www.ncbi.nlm.nih.gov/projects/SNP/snp_ref.cgi?rs=rs11983225 shows that T/T is at a 0% frequency in CEU. It seems to be associated with a 7x reduced likelihood of responding to certain antidepressants. So that's not good if I ever need them! See also http://www.snpedia.com/index.php/Rs7787082(G;G)
  • rs1570360(A;A): 3x increased risk of sudden infant death syndrome. I'm 30, so I think I'm OK
  • Confirmation of many phenotypes (taste bitter, blue eyes etc.) reported by 23andme
  • Some cardiac and stroke-related SNPs. These are mostly common and I would expect this given the incidence in the Irish population
  • rs2270641(G;G), which is pretty rare in CEU and increases schizophrenia risk 3.7-fold. That study was small, however, and we now know that a lot of SNPs are implicated
  • rs2834167(A;A): 2.67x_increased_risk_for_systemic_sclerosis. Might have something to do with my Raynaud's-like symptoms
  • rs4606(C;C): complex association with anxiety disorders. I definitely have anxiety, but this association is not a simple one
  • Some variants associated with neuroblastoma. This is a condition generally restricted to children (~2% of cases in adults (>18yrs))
  • rs6596075(C;C): 2x risk of Crohn's disease
  • rs7442295(A;A): increased risk of hyperuricemia, but this is common in Europeans, see http://www.snpedia.com/index.php/Rs7442295
  • rs762551(A;A): CYP1A2*1F_homozygote;_Fast_metabolizer (of caffeine). Definitely true!
Overall, many of these variants are well covered by 23andme, but it is certainly nice to do something like this, as I think it makes the sources more transparent. You can go straight to SNPedia and look at the papers referenced. SNPedia also gives a nice, simple digest of the implicated variants.

So I think this approach works well, though it is a very much stripped-down version of Promethease. Apologies to Mike Cariaso for pestering him so much when I was working out how to query SNPedia properly. Get in touch if you want the script, or download it here. You are free to use it in a non-commercial context.
