LESSON 04: Exploring the Y Part 2

SNPs and Y Haplogroups

Thanks to Angie Bush for her contributions to this lesson.

Y-SNPs are different from Y-STRs, and they serve a very different purpose for genetic genealogists. Y again stands for the Y chromosome, and SNP stands for Single-Nucleotide Polymorphism (pronounced "snip"). Y-SNPs are used to designate the Y haplogroup. Haplogroups are the big branches of the ancient human family tree. They have letter names from A to T. Unfortunately the naming does not always reflect the age. The most recent branch of the Y tree is "R." Each Y-SNP mutation, a change in the Y's DNA sequence at a specific location, happened just once in the history of mankind (with very few exceptions), so all men bearing that SNP are distantly related. If we start with the proverbial "ADAM," the first Y-SNP mutation happened maybe 60,000 years ago and separated Haplogroup A into A and B. Each time a mutation happens, it separates the tree into finer and finer branches. By following the tree we can trace any man from "ADAM" to his most recent, or "terminal," SNP. A terminal SNP is simply the SNP furthest down the tree branch for which a man gets a positive result.
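For readers who like to see the logic spelled out, here is a minimal Python sketch of how a terminal SNP could be picked out of a set of test results. It is only an illustration: the tiny tree uses SNP names that appear later in this lesson (P312, U106, L21, U152), "ROOT" is a placeholder for everything further up the tree, and the function names and sample results are made up for the example. It is not how any testing company actually does it.

```python
# Toy child -> parent map of a small piece of the Y tree.
# "ROOT" is a hypothetical placeholder for all upstream branches.
PARENT = {
    "P312": "ROOT",
    "U106": "ROOT",
    "L21": "P312",
    "U152": "P312",
}


def path_from_root(snp):
    """Return the SNPs on the branch leading down to `snp`, oldest first."""
    path = []
    while snp != "ROOT":
        path.append(snp)
        snp = PARENT[snp]
    return list(reversed(path))


def terminal_snp(results):
    """Return the positive SNP that sits furthest down the tree.

    `results` maps an SNP name to True (positive) or False (negative).
    Returns None when no SNP tested positive.
    """
    positives = [snp for snp, positive in results.items() if positive]
    if not positives:
        return None
    return max(positives, key=lambda snp: len(path_from_root(snp)))


# Hypothetical results for one man.
sample = {"P312": True, "U106": False, "L21": False, "U152": True}
print(terminal_snp(sample), path_from_root(terminal_snp(sample)))
```

The man in this made-up example is positive for both P312 and U152, and because U152 sits below P312 in the toy tree, U152 is reported as his terminal SNP.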

WHY HAPLOGROUPS MATTER

Even if two men have the same values on 67 of 67 markers, they cannot be related in the past 1,000 years if they belong to different haplogroups. Here's an easy way to think of it: the oldest branch of the human tree went along until there was a mutation in a Y-SNP, and then every single male descended from that man carries that mutation. Then every 3,000 years or so another mutation creeps in, so you can sort all the men in the world into groups that identify their ancient origins. As we move further and further down the tree (there are more and more mutations), you can follow the branch you are on, which breaks men into finer and finer groups.
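The rule above (check the haplogroup before trusting an STR match) can be sketched in a few lines of Python. Everything here is an assumption made for illustration: the marker values are invented, only four markers are shown instead of 67, and the function names and the `min_matches` threshold are hypothetical. Haplogroups are compared at the major-letter level ("R" versus "I") to keep the example simple.

```python
def count_matching_markers(strs_a, strs_b):
    """Count the positions where two men have the same STR value."""
    return sum(1 for a, b in zip(strs_a, strs_b) if a == b)


def plausible_recent_relatives(man_a, man_b, min_matches):
    """Return True only if the haplogroups agree AND enough markers match.

    A haplogroup mismatch rules the pair out no matter how many
    STR markers happen to be identical.
    """
    if man_a["haplogroup"] != man_b["haplogroup"]:
        return False
    matches = count_matching_markers(man_a["strs"], man_b["strs"])
    return matches >= min_matches


# Invented example data: identical STR values, different haplogroups.
man_1 = {"haplogroup": "R", "strs": [13, 24, 14, 11]}
man_2 = {"haplogroup": "I", "strs": [13, 24, 14, 11]}
man_3 = {"haplogroup": "R", "strs": [13, 24, 14, 11]}

print(plausible_recent_relatives(man_1, man_2, min_matches=4))
print(plausible_recent_relatives(man_1, man_3, min_matches=4))
```

The first pair is rejected even though every marker matches, because the two men sit on different major branches of the tree.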

Here is where a picture is worth a thousand words:

Journey of Man Interactive Map

Unfortunately for all of us, the various companies and organizations use older and newer naming protocols to designate the ever finer subdivisions of the Y-DNA haplogroup. So a Y haplogroup name at 23andMe may look different from the name at FTDNA and still refer to the same haplogroup. Recently the genetic genealogy community, in an effort to simplify the labels, has moved to a protocol where the haplogroup letter followed by the terminal SNP is the recommended way to report a haplogroup.

R1b1a2a1a1b3c at FTDNA

R1b1b2a1a2d3* at 23andMe 

R-L2 ISOGG recommendation

All of the above describe the same haplogroup. This has been very confusing for everyone, and especially for newbies.
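To make the point concrete, here is a small Python sketch that treats the three labels above as different spellings of one haplogroup. The lookup table contains only the two longhand names quoted in this lesson, and the function and table names are hypothetical. A real converter would need a full, up-to-date tree from each company.

```python
# Longhand names quoted in this lesson, keyed by (company, name),
# mapped to the short "haplogroup-terminal SNP" form.
LONGHAND_TO_SHORT = {
    ("FTDNA", "R1b1a2a1a1b3c"): "R-L2",
    ("23andMe", "R1b1b2a1a2d3*"): "R-L2",
}


def short_name(major_haplogroup, terminal_snp):
    """Build the short label: major haplogroup letter, hyphen, terminal SNP."""
    return f"{major_haplogroup}-{terminal_snp}"


def same_haplogroup(label_a, label_b):
    """Compare two (company, longhand name) labels via their short form."""
    return LONGHAND_TO_SHORT[label_a] == LONGHAND_TO_SHORT[label_b]


print(short_name("R", "L2"))
print(same_haplogroup(("FTDNA", "R1b1a2a1a1b3c"), ("23andMe", "R1b1b2a1a2d3*")))
```

The two longhand strings look nothing alike past the first few characters, yet both resolve to the same short label, which is the reason the short form is easier to compare.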



Looking again at a screenshot from the Wheaton project, let's look at the 4th column, for haplogroups. Group "B" and group "C" are both haplogroup R1b1a21a1b; however, it is there that they part ways. The first part, R1b, is the same, meaning that about 5,000 years ago they shared a common ancestor. The haplogroups printed in red are predicted and those in green have been tested. However, as new SNPs are discovered, reading that jumble of letters gets harder and harder, so it is easier to identify them by their terminal SNP, as in R-L2.

The graphic below may be a little hard to follow, but it shows the groups in the "Wheaton Surname Project" and how they were related more than 7,500 years ago.



This shows the ancient origins of the different Wheaton progenitors. Wheaton groups A through D share the haplogroup R1b1a as their common ancestral origin (orange box at top). Two separate mutations then defined those in the green and blue boxes (P312 and U106, respectively). The approximate date of each mutation is shown. So Groups A, B and D in the green box shared a common ancestor about 3,300 years ago, after which further mutations defined the light green (L21) and green (U152) branches. Each SNP refers to a mutation at which one group breaks off from the other, dividing men into those who carry the mutation and those who do not. In the old way of naming, each mutation was denoted by a number or letter. With a longhand name such as "R1b1a21a1b" you could trace the actual path from the earliest SNP to the terminal one. Here is an example from ISOGG for the R Haplogroup.
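Because each number or letter in a longhand name stands for one step down the tree, two longhand names written under the same naming protocol share a branch for as long as their names agree from the left. The Python sketch below shows that idea. The first name is the FTDNA example from earlier in this lesson; the second name and the function names are hypothetical, chosen only to illustrate the comparison. Names from different protocols (for example FTDNA versus 23andMe) cannot be compared this way.

```python
import re


def split_levels(longhand):
    """Split a longhand name into its steps: 'R1b1' -> ['R', '1', 'b', '1']."""
    return re.findall(r"[A-Z]|\d+|[a-z]", longhand)


def shared_branch(name_a, name_b):
    """Return the part of the path that two longhand names have in common."""
    shared = []
    for step_a, step_b in zip(split_levels(name_a), split_levels(name_b)):
        if step_a != step_b:
            break
        shared.append(step_a)
    return "".join(shared)


# First name is from this lesson; the second is a hypothetical sibling branch.
print(shared_branch("R1b1a2a1a1b3c", "R1b1a2a1a1b4"))
```

For these two names the shared part is "R1b1a2a1a1b", meaning the two men would follow the same path down to that point and then part ways.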

To date, the Wheaton project has groups representing the most frequent haplogroups found in Great Britain: "I" and "R." I highly suggest working with your surname project administrator, and/or the administrator of one of the major haplogroup projects, for advice on SNP testing. Note: you may belong to multiple projects at FTDNA. So you could belong to the "R1b1 and Subclades Project," the "Rehoboth Massachusetts Project" and the "Wheaton Surname Project."

In case you are wondering, subclades are just subgroups of haplogroups; "subclade" is easier to say than "a subgroup of a haplogroup." It helps to remember that if a man is "R," this is his clade or haplogroup, and L2 would be his subclade. Another way to think of it is clan and subclan, since these represent groups of men tied together by a common ancestor (a man), as in "x" number of great-grandfathers back in time. So your subclade may represent a mutation that occurred with your 20th great-grandfather, but your clade or major haplogroup might be the mutation of your 300th great-grandfather.

Many Y-SNPs are included in 23andMe's test. National Geographic's Geno 2.0 test includes extensive Y-SNPs. FTDNA has a great many SNPs available for individual testing. Again, I suggest you work with a project administrator to determine the appropriate and most cost-effective way to find your terminal SNP. Note: all project administrators are volunteers; they are not employed by FTDNA and do not receive any compensation.

More depth in Y-DNA is covered in Lesson 14 and Lesson 15. In the meantime I recommend What is a Haplogroup by Roberta Estes. I have now added another Lesson on the Y which can be found here.

Additional resources:

Terminal SNPs versus haplogroup subclade names by CeCe Moore

Is the Y Pool Too Shallow by Roberta Estes

Predicting Y-DNA Haplogroups in One-step by Steve P. Morse

Y Chromosome Haplotypes

Eupedia Distribution of Y Haplogroups Some of the very best flow charts, maps and graphics. Just scroll through the side bar and click.

Britains DNA Chromo 2 Demo with interesting graphics



Content copyright 2013. All rights reserved.
