SNPs and Y Haplogroups
Thanks to Angie Bush for her contributions to this lesson.
Y-SNPs are different from Y-STRs, and they serve a very different purpose for genetic genealogists. Y again stands for the Y-chromosome, and SNP stands for Single-Nucleotide Polymorphism (pronounced "snip"). Y-SNPs are used to designate the Y-haplogroup, and in some cases they are now being used to refine branches within a surname group.
Haplogroups are the big branches of the ancient human family tree. They have letter names from A-R. Unfortunately, the naming does not always reflect the age. The most recent branch of the Y-tree is "R." Each Y-SNP is a mutation, or change in the Y-chromosome's DNA sequence at a specific location, that happened just once in the history of mankind (with very few exceptions), so all men bearing that SNP are distantly related. If we start with the proverbial "ADAM," the first Y-SNP mutation happened maybe 60,000 years ago and separated Haplogroup A into A and B. Each time a mutation happens, it separates the tree into finer and finer branches. By following the tree, we can trace any man from "ADAM" to his most recent or "terminal" SNP. A terminal SNP is simply the furthest point down the tree at which a man gets a positive test result.
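For readers who like to see the idea spelled out, here is a minimal Python sketch of how a terminal SNP could be found from a set of test results. The tiny tree uses branch names that come up later in this lesson (P312, U106, L21, U152, L2); the placement of L2 under U152, the `PARENT` table, the function names, and the sample results are all assumptions made up for illustration. Real Y-trees, such as the one ISOGG maintains, have far more levels.

```python
# Toy tree: each branch points to the branch directly above it.
# Illustrative only; not a complete or official Y-tree.
PARENT = {
    "R1b1a": None,   # treated as the top of this toy tree
    "P312": "R1b1a",
    "U106": "R1b1a",
    "L21": "P312",
    "U152": "P312",
    "L2": "U152",
}


def path_from_root(snp):
    """Return the branches from the top of the toy tree down to `snp`."""
    path = []
    while snp is not None:
        path.append(snp)
        snp = PARENT[snp]
    return path[::-1]


def terminal_snp(results):
    """Return the deepest SNP with a positive result, or None if there is none.

    `results` maps an SNP name to True (positive) or False (negative).
    """
    positives = [snp for snp, is_positive in results.items() if is_positive]
    if not positives:
        return None
    return max(positives, key=lambda snp: len(path_from_root(snp)))


# Hypothetical results for one man (not real data).
results = {"P312": True, "U106": False, "L21": False, "U152": True, "L2": True}
terminal = terminal_snp(results)
print(terminal)                              # L2
print(" > ".join(path_from_root(terminal)))  # R1b1a > P312 > U152 > L2
```

The sketch simply picks the positive result that sits deepest in the tree, which is what "terminal" means here: it depends on how far a man has tested, not on any fixed end of the tree.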
WHY HAPLOGROUPS MATTER
Even if two men have the same values on 67 of 67 markers, they cannot be related within the past 1,000 years if they belong to different haplogroups. Here's an easy way to think of it: the oldest branch of the human tree went along until there was a mutation in a Y-SNP, and then every single male descended from that man carried that mutation. Then, every 3,000 years or so, another mutation crept in, so you can sort all the men in the world into groups that identify their ancient origins. As we move further and further along the tree, more and more mutations accumulate, and following the branch you are on breaks men into finer and finer groups.
Here is where a picture is worth a thousand words:
Unfortunately for all of us, the various companies and organizations use older and newer naming protocols to designate the ever finer subdivisions of the Y-DNA haplogroup. So the name for a Y-haplogroup at 23andMe may look different from the name at FTDNA and still describe the same haplogroup. Recently the genetic genealogy community, in an effort to simplify the labels, has moved to a protocol in which the haplogroup letter followed by the terminal SNP is the recommended way to report a haplogroup.
R1b1a2a1a1b3c at FTDNA
R1b1b2a1a2d3* at 23andMe
R-L2 ISOGG recommendation
All of the above are describing the same haplogroup. This has been very confusing for all and especially for Newbies.
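To show how the shorthand ties the three labels together, here is a small Python sketch. The lookup table holds only the two longhand names quoted above, and the names `LONGHAND_TO_TERMINAL`, `shorthand`, and `split_shorthand` are invented for this illustration; no company offers such a table in this form as far as this lesson is concerned.

```python
# Lookup built only from the labels quoted in this lesson.
# Keys are (company, longhand name); values are the terminal SNP.
LONGHAND_TO_TERMINAL = {
    ("FTDNA", "R1b1a2a1a1b3c"): "L2",
    ("23andMe", "R1b1b2a1a2d3*"): "L2",
}


def shorthand(company, longhand):
    """Build the 'haplogroup letter + terminal SNP' label, e.g. R-L2."""
    terminal = LONGHAND_TO_TERMINAL[(company, longhand)]
    major = longhand[0]  # the first letter is the major haplogroup
    return f"{major}-{terminal}"


def split_shorthand(label):
    """Split a label such as 'R-L2' into (haplogroup letter, terminal SNP)."""
    major, _, terminal = label.partition("-")
    return major, terminal


print(shorthand("FTDNA", "R1b1a2a1a1b3c"))    # R-L2
print(shorthand("23andMe", "R1b1b2a1a2d3*"))  # R-L2
print(split_shorthand("R-L2"))                # ('R', 'L2')
```

Both longhand names collapse to the same short label, which is the whole point of the newer protocol: the label no longer changes every time a new SNP is inserted higher up the tree.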
Looking again at the screenshot from the Wheaton project, let's look at the 4th column, Haplogroups. Group "B" and group "C" are both haplogroup R1b1a21a1b; however, that is where they part ways. The first part, R1b, is the same, meaning that about 5,000 years ago they shared a common ancestor. The haplogroups printed in red are predicted, and those in green have been tested. However, as new SNPs are discovered, reading that jumble of letters gets harder and harder, so it is easier to identify them by their terminal SNP, as in R-L2.
The graphic below may be a little hard to follow, but it shows the groups in the "Wheaton Surname Project" and how they were related more than 7,500 years ago.
This shows the ancient origins of the different Wheaton progenitors. Wheaton groups A-D share the haplogroup R1b1a as their common ancestral origin (orange box at top). Two separate mutations, P312 and U106, then defined those in the green and blue boxes. The approximate date of each mutation is shown. So Groups A, B and D in the green box shared a common ancestor about 3,300 years ago, and then men on that branch had further mutations, shown in light green (L21) and green (U152). Each SNP refers to a mutation at which one group breaks off from the other, dividing the men into those who carry the mutation and those who do not.
In the old way of naming, each mutation was denoted by a number or letter. With a longhand name such as "R1b1a21a1b" you could trace the actual path from the earliest SNP to the terminal one. Here is an example from ISOGG for the R Haplogroup.
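Separately from the ISOGG chart, the branching described for the Wheaton groups can be put into a few lines of Python. This is only a sketch: the tree contains just the branches named in this lesson, the placement of L2 under U152 is an assumption, and the names `PARENT`, `path_from_root`, and `most_recent_shared_branch` are made up for the example.

```python
# Toy tree limited to the branches named in this lesson.
PARENT = {
    "R1b1a": None,   # orange box: common origin of Wheaton groups A-D
    "P312": "R1b1a",
    "U106": "R1b1a",
    "L21": "P312",
    "U152": "P312",
    "L2": "U152",
}


def path_from_root(snp):
    """Return the branches from the top of the toy tree down to `snp`."""
    path = []
    while snp is not None:
        path.append(snp)
        snp = PARENT[snp]
    return path[::-1]


def most_recent_shared_branch(snp_a, snp_b):
    """Return the last branch two lines have in common before they split."""
    shared = None
    for a, b in zip(path_from_root(snp_a), path_from_root(snp_b)):
        if a != b:
            break
        shared = a
    return shared


print(most_recent_shared_branch("L21", "U152"))   # P312
print(most_recent_shared_branch("U152", "U106"))  # R1b1a
```

Two men who are L21 and U152 last shared a branch at P312, while a U152 man and a U106 man have to go all the way back to R1b1a. That is the same reading you get by following the colored boxes in the graphic.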
To date, the Wheaton project has groups representing the most frequent haplogroups found in Great Britain: "I" and "R." I highly suggest working with your surname project administrator, or with the administrator of one of the major haplogroup projects, for advice on SNP testing. Note: you may belong to multiple projects at FTDNA. So you could belong to the "R1b1 and Subclades Project," the "Rehoboth Massachusetts Project," and the "Wheaton Surname Project."
In case you are wondering, subclades are just sub-groups of haplogroups; "subclade" is easier to say than "sub-group of a haplogroup." It helps to remember that if a man is R-L2, "R" is his clade or haplogroup and L2 is his subclade. Another way to think of it is clan and subclan, since these represent groups of men tied together by a common male ancestor some number of great-grandfathers back in time. So your subclade may represent a mutation that occurred with your 20th great-grandfather, but your clade or major haplogroup might be the mutation of your 300th great-grandfather.
Many Y-SNPs are included in 23andMe's test. The National Genographic 2.0 test includes extensive Y-SNPs. FTDNA has a great many SNPs available for individual testing. Again, I suggest you work with a project administrator to determine the most appropriate and cost-effective way to find your terminal SNP. Note: all project administrators are volunteers; they are not employed by FTDNA and do not receive any compensation.
Terminal SNPs versus haplogroup subclade names by CeCe Moore