Y-DNA Ancestral Lines of Ashkenazi Jews

Based upon the methodology described below and the sample set described here (compiled in December 2018), the current Ashkenazi Jewish population includes at least 47 ancestral Y-DNA lines that are identifiable to date through Family Tree DNA's Big Y testing.

Those ancestral lines are as follows (pages with SNP trees for each cluster are linked below):
As shown on the linked pages, many of these Ashkenazi Y-DNA ancestral lines follow a common pattern.  First, the lines typically date back to about the second half of the first millennium CE, i.e., each line typically split into two or more branches about 1,000 to 1,500 years ago.  For the most part, each of the branches from these ancestral lines includes men who, based upon their close autosomal matches to Ashkenazi Jews, are likely to be descended from Ashkenazi Jews on their direct male lines.  This strongly suggests that the shared direct male ancestors of such branches belonged to the Ashkenazi or proto-Ashkenazi population in the second half of the first millennium CE.

Second, there is typically substantial branching within those subbranches that presumably dates back to the time when the Ashkenazi population began its massive expansion out of a small bottlenecked population, about 700 to 1,000 years ago.   

Because most of the 47 ancestral lines identified above had branched at least once prior to the post-bottleneck expansion -- and some of the lines had branched several times by that point -- it appears that most of the current Ashkenazi population is descended on their direct male lines from perhaps 120 to 150 Ashkenazi Y-DNA lines that existed as of the time of the bottleneck.  

Y-DNA is, of course, passed down only on the direct male line.  Accordingly, transmission of Y-DNA over a 700- to 1,000-year period would, at an average of 25 years per generation, require 28 to 40 consecutive male generations.  As a result, it is very likely that a significant number of Ashkenazi Y-DNA lines that existed as of the time of the bottleneck no longer exist.  Similarly, it is likely that there are some Ashkenazi Y-DNA lines that survive today only as single branches, and that there are other Ashkenazi Y-DNA lines that are not found in the sample set used for this study.  Further testing and analysis will likely identify some such branches.    
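The generation arithmetic above can be checked with a short sketch.  The 25-year generation length comes from the text; the function name is illustrative only.

```python
def generations_spanned(years: int, years_per_generation: int = 25) -> int:
    """Number of consecutive father-to-son generations needed to span `years`."""
    # Ceiling division, so a partial generation still counts as a full one.
    return -(-years // years_per_generation)


low = generations_spanned(700)
high = generations_spanned(1000)
print(low, high)  # 28 40
```

Every one of those generations must include at least one son who himself had a son, which is why the loss of some lines over such a span is expected.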

In some instances, the Ashkenazi Y-DNA ancestral lines include branches that share a direct male ancestor who lived perhaps 3,000 to 4,000 years ago.  To the extent that both branches are composed primarily of Ashkenazi men, this strongly suggests that their shared direct male ancestor belonged to a Jewish or proto-Jewish population in the Near East.

More frequently, however, the Ashkenazi ancestral lines start in the second half of the first millennium CE, without brother branches that date back to Biblical times.  Where such lines include multiple branches that are each composed predominantly of Ashkenazi men, one can infer that the shared direct male ancestor of those branches likely belonged to the Ashkenazi (or proto-Ashkenazi) population.

The analysis is more complicated where such lines include multiple branches, not all of which are composed predominantly of Ashkenazi men.  If there is one Ashkenazi branch and one non-Ashkenazi branch, as a logical matter it would be equally likely that the shared direct male ancestor of the two branches was Ashkenazi or non-Ashkenazi.  As the proportion of non-Ashkenazi branches to Ashkenazi branches increases, the likelihood that the shared direct male ancestor was not Ashkenazi increases.
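One naive way to picture this reasoning is to treat each downstream branch as an equal, independent vote on the ancestor's origin.  That equal-weight model is an assumption made here purely for illustration; the text states only the qualitative trend, and the function name is hypothetical.

```python
def naive_ashkenazi_ancestor_share(ashkenazi_branches: int, other_branches: int) -> float:
    """Fraction of branches that are Ashkenazi, used as a rough stand-in for
    the chance that the shared ancestor was Ashkenazi (equal-weight assumption)."""
    total = ashkenazi_branches + other_branches
    if total == 0:
        raise ValueError("at least one branch is required")
    return ashkenazi_branches / total


print(naive_ashkenazi_ancestor_share(1, 1))  # 0.5
print(naive_ashkenazi_ancestor_share(1, 3))  # 0.25
```

A sketch like this ignores branch size and upstream matches, so it should be read as a picture of the trend rather than as an estimate.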

At one point, it was commonly assumed that Y-DNA clusters believed to have originated in the Near or Middle East -- such as haplogroups E, J1, and J2 -- were of Near Eastern (and Jewish) origins, while Y-DNA clusters commonly found among Europeans -- such as R1a and R1b -- likely entered the Ashkenazi population in Europe.

Over the past 15 years or so, studies based on higher-level Y-DNA testing have shown these assumptions as to the origins of Ashkenazi R1a and R1b Y-DNA clusters to be overly simplistic, and often incorrect.  

For example, the R1a-Y2619 Ashkenazi Levite cluster (to which this website is devoted) was once hypothesized to have European origins because R1a is very common in Europe.  However, the Eupedia distribution map for R1a-Z93, the upstream cluster to which R1a-Y2619 Ashkenazi Levites belong, shows that R1a-Z93 is most common from the Near East through India.  The 2017 Behar study discussed here confirmed that the R1a-Y2619 Ashkenazi Levite cluster has Near Eastern origins by finding that the R-M582 Y-DNA cluster, to which R1a Ashkenazi Levites and their closest matches belong, includes several Iranian branches and that its members share a direct male ancestor who lived about 3,000 years ago.

Similarly, because R1b is predominant in Western Europe and the British Isles, it was often assumed that R1b lines entered the Ashkenazi population in Europe in the past millennium or so.  However, there is one branch of R1b -- R1b-Z2103 -- that, as shown by a Eupedia distribution map, is very common in the Near East.  There are two R1b Ashkenazi clusters -- R1b-Y19852 and R1b-FGC14600 -- that belong to the R1b-Z2103 branch and therefore are likely to be of Near Eastern origins.  As discussed below, three of the main R1b clusters in this analysis -- R1b-FGC20759, R1b-FGC8580, and R1b-L4 -- share the same ages and branching characteristics as other Ashkenazi clusters, and therefore are either of Near Eastern origins or reflect a European contribution to Ashkenazi Y-DNA dating back to within centuries of the formation of the Ashkenazi/proto-Ashkenazi population.

As more testing is done and analyzed, we will have a better idea as to which Ashkenazi Y-DNA lines likely originate in the Near East, and which lines more likely reflect Y-DNA lines that entered the Ashkenazi population in Europe.

Based upon the current sample set, the following Ashkenazi ancestral Y-DNA lines share the characteristic -- common to most of the major Ashkenazi Y-DNA lines -- of initial branching in the second half of the first millennium CE, followed by considerable branching beginning about 700 years ago:

(1) E-Y14891; 
(2) E-Y6923; 
(3) the E-BY932 subcluster of E-PF1975; 
(4) E-BY7450; 
(5) E-Z17697; 
(6) E-BY11082; 
(7) the G-FGC35913 subcluster of G-BY764; 
(8) the G-FGC31712 subcluster of G-FGC249; 
(9) the G-L201 subcluster of G-L1324; 
(10) the I-Y11261 subcluster of I-S23612; 
(11) I-Y23115; 
(12) J1-L816; 
(13)-(14) the J1-BY101 and J1-ZS4763 subclusters of J1-L823; 
(15) J1-S12192; 
(16)-(17) the J1-CTS4459 and J1-ZS4307 subclusters of J1-Z18297; 
(18) the J1-ZS10589 subcluster of J1-PF7263; 
(19) the J1-FGC5215 subcluster of J1-FGC5230; 
(20) J1-BY67; 
(21) the J1-ZS1682 subcluster of J1-F450; 
(22) J2-L556; 
(23) J2-Z30930; 
(24)-(25) the J2-L254 and J2-FGC30508 subclusters of J2-FGC4992; 
(26) J2-BY268; 
(27) J2-Z43500; 
(28) Q-Y2198; 
(29) Q-YP1003; 
(30) R1a-Y2619; 
(31) R1b-FGC20759; 
(32) R1b-A11711; 
(33) R1b-FGC8580; and 
(34) R1b-L4.  
(T-Y31472 has two Ashkenazi branches that date back to the first millennium CE.)

Such Ashkenazi Y-DNA ancestral lines therefore appear likely to be Near Eastern in origin, although further information concerning upstream branching would in many instances be necessary to confirm the likely geographic origins of such lines.

Based upon the current sample set, the following Ashkenazi ancestral Y-DNA lines do not evidence the typical pattern of initial branching in the second half of the first millennium CE:
 
(1) the G-L201 subcluster of G-L1324; 
(2) I-BY424; 
(3) L-PAGES00116; 
(4) R1a-Y2632; 
(5) R1a-Y1013; 
(6) R1b-Y19862; 
(7) R1b-Z18106; 
(8) R1b-FGC21047; 
(9) R1b-FGC14600; 
(10) R1b-L408; 
(11) the R2-FGC13201 subcluster of R2-F1092; 
(12) T-BY11520; and 
(13) the T-PAGES00113 subcluster of T-CTS8862. 

Accordingly, these Ashkenazi ancestral Y-DNA lines are more likely than the Ashkenazi Y-DNA lines identified in the first list above to have entered the Ashkenazi population in Europe.  However, there is a significant likelihood that further test results for some of these lines will show (1) earlier Ashkenazi branching, (2) upstream matches of Near Eastern origins, or both.

Observations

1.  The Ashkenazi ancestral lines identified in this analysis through a SNP-based analysis of a sample set compiled based upon autosomal matches coincide to a great extent with the ancestral lines identified by Wim Penninx on his website jewishdna.net by using STR marker values.  Accordingly, the analysis set forth herein is duplicative to a considerable extent of Wim's painstaking analyses, but it also provides independent confirmation of the validity of his STR-based clustering.  (I thank Wim for his thoughtful comments and suggestions.) 

2.  The term "ancestral lines" is used here to refer to those Y-DNA lines that include two or more branches that are each composed in significant part of men who are likely of Ashkenazi descent on their direct male lines, as evidenced by their substantial autosomal matches to people known to have four Ashkenazi grandparents.  Most of the ancestral lines discussed in this analysis first branched about 1,000 to 1,500 years ago; several other ancestral lines first branched about 700 years ago, at the time that the Ashkenazi population expanded out of the severe bottleneck.

Reference to Y-DNA ancestral lines, rather than to Y-DNA clusters, eliminates the need to make a judgment call as to which clusters within an ancestral line should be treated as separate clusters.  
  
3.  The rate of Big Y testing varies considerably among Ashkenazi Y-DNA clusters.  This likely results from the facts that, inter alia: (1) men with relatively close matches on Y-DNA STR testing are more likely to do upgraded testing, including Big Y testing; and (2) men in certain clusters (such as the R1a-Y2619 Ashkenazi Levite cluster to which this website is devoted) have made concerted efforts to encourage Big Y testing among men in such clusters.

As a result, certain Ashkenazi Y-DNA clusters continue to be underrepresented in Big Y testing, making it more difficult to identify and date those clusters.  Further testing is likely to flesh out information concerning these clusters.  

Based upon the results of additional Big Y testing, the number of Ashkenazi Y-DNA ancestral lines is likely to increase (as men from less common or undertested branches do testing).

4.  Because the initial sample set was gathered by using autosomal matches to Ashkenazi Jews to identify men who were likely of Ashkenazi descent on their direct male lines, the sample set will exclude those men who are of Sephardic or Mizrahi descent on their direct male lines unless those men also have significant Ashkenazi ancestry.  

Because (1) the Y-DNA clusters studied herein go back far longer than the autosomal matches used to compile the sample set and (2) FTDNA's Y-DNA Haplotree provides information concerning the number of men in each cluster, the Y-DNA clusters identified herein are highly likely to include a substantial proportion of men who are Sephardic on their direct male lines (and some men who are Mizrahi on their direct male lines).  This is true, however, only for Y-DNA clusters that are found in both the Ashkenazi population, on the one hand, and the Sephardic and/or Mizrahi population, on the other hand.  

5.  FTDNA's Y-DNA Haplotree includes the results for a substantial number of Ashkenazi men who have not submitted their data to YFull for inclusion in YFull's YTree.  As a result, (1) the FTDNA Y-DNA Haplotree identifies considerable branching not included in the YFull tree, and (2) YFull's estimates of the time to a shared direct male ancestor for a branch are sometimes based on a small sample size (which increases the margin of error for age estimations).

Methodology

I used the following methodology to prepare the trees and supporting information posted on the pages linked hereto:

1.  I used the methodology described here to identify the FTDNA-reported Y-DNA SNPs for the cohorts of Family Finder-tested men who share 100 cM, 80 cM, or 50 cM of autosomal DNA, respectively, with at least one of 25 people with four Ashkenazi grandparents.
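The cohort assignment in this step can be sketched as follows.  The sketch assumes that the 100 cM, 80 cM, and 50 cM figures are minimum thresholds (so the cohorts are nested) and that a man's best match against any single proband is what counts; the proband labels and the function name are hypothetical.

```python
THRESHOLDS_CM = (100, 80, 50)


def cohorts_for_match(shared_cm_by_proband: dict[str, float]) -> list[int]:
    """Return the cM cohorts (100, 80, 50) that a tested man falls into,
    based on his largest shared total with any one proband."""
    best = max(shared_cm_by_proband.values(), default=0.0)
    return [t for t in THRESHOLDS_CM if best >= t]


# A man sharing 85.2 cM with one proband belongs to the 80 and 50 cM cohorts.
print(cohorts_for_match({"proband_03": 85.2, "proband_17": 41.0}))  # [80, 50]
```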

2.  SNPs reported by FTDNA based upon STR testing date back tens of thousands of years and are therefore of limited use in clustering an Ashkenazi (or proto-Ashkenazi) population which dates back 1,500 to 2,000 years, or a Jewish (or proto-Jewish) population that dates back perhaps 3,000 to 3,500 years.  Accordingly, I disregarded these far-upstream SNPs when making my initial efforts to identify Ashkenazi Y-DNA clusters.  

FTDNA uses one to four upstream SNPs to identify the haplogroups of men who have done only STR testing.  (The men whose haplogroups are identified in red text on FTDNA's project pages have done only STR testing.)  

The haplogroups reported by FTDNA based on STR testing in the Y-DNA haplogroups that contain a significant number of Ashkenazi Jews are as follows:

E: E-L117, E-M2, E-M35, and E-M96
G: G-M201
I: I-M170, I-M223, I-M253, and I-P37
J1: J-M267
J2: J-M172
Q: Q-M242
R1a: R-M512, R-M198, and R-Z283
R1b: R-M269
R2: R-M124
T: T-M70

3.  I then used FTDNA's Y-DNA Haplotree to determine, for each Y-DNA SNP included in the sample set used for this analysis, the total number of men in FTDNA's database who were, as of early January 2019, reported as "Branch Participants" for that SNP (i.e., the number of men who have that SNP as their terminal SNP).

If dozens or hundreds of men are reported as Branch Participants for a SNP, that indicates that the SNP has been reported based on STR test results, Geno 2.0 results, SNP packs, or Walk Through the Y testing, rather than on Big Y testing.  Except in rare circumstances, SNPs shared by a large number of men are likely to date back several thousand to tens of thousands of years, making those SNPs generally irrelevant to the analysis performed herein.
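This screening step might be sketched as a simple count check.  The cutoff of 24 is an assumed stand-in for "dozens" and is not a figure from the analysis; the SNP names and counts are placeholders, not real data.

```python
def likely_upstream_snp(branch_participants: int, cutoff: int = 24) -> bool:
    """True when a SNP has so many Branch Participants that it was probably
    assigned from STR, Geno 2.0, SNP pack, or Walk Through the Y results."""
    return branch_participants >= cutoff


counts = {"SNP-A": 412, "SNP-B": 7}  # placeholder names and counts
kept = [snp for snp, n in counts.items() if not likely_upstream_snp(n)]
print(kept)  # ['SNP-B']
```

A fixed cutoff cannot capture the "rare circumstances" the text mentions, so flagged SNPs would still need a manual look.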

4.  I next compared the number of men reported as Branch Participants for each SNP with the number of men in the sample set reported as having that SNP.  For the reasons discussed here, as a general matter the men in the sample set will constitute only a proportion -- perhaps 25% to 50% -- of the Ashkenazi men who have done Big Y testing.  

Thus, if about 25% to 50% of the total number of men reported by FTDNA as having a particular terminal SNP are included in the sample set (especially at 100 cM), there is a substantial likelihood that men reported with that SNP are of Ashkenazi ancestry on their direct male lines.  Reference to FTDNA's Y-DNA Haplotree demonstrated that the vast majority of such men belonged to Y-DNA clusters that included multiple branches with other men in the sample set, confirming that those SNPs are commonly found in the Ashkenazi population.  
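The comparison described in this step reduces to a ratio.  In the sketch below, the 25% lower bound is taken from the text, while treating it as a hard cutoff is an assumption; the function names and example counts are hypothetical.

```python
def sample_share(in_sample: int, branch_participants: int) -> float:
    """Fraction of FTDNA Branch Participants for a SNP who appear in the sample set."""
    if branch_participants <= 0:
        raise ValueError("branch_participants must be positive")
    return in_sample / branch_participants


def suggests_ashkenazi_line(in_sample: int, branch_participants: int, low: float = 0.25) -> bool:
    """True when the sample-set share reaches the lower end of the expected range."""
    return sample_share(in_sample, branch_participants) >= low


print(suggests_ashkenazi_line(4, 12))  # True  (about 33% of Branch Participants)
print(suggests_ashkenazi_line(1, 40))  # False (2.5% of Branch Participants)
```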

Conversely, there are a substantial number of men in the sample set (especially in haplogroups I and R1b) who were only 50 cM matches for at least one of the 25 probands and who are not in the same Y-DNA clusters as other men in the sample set.  There is a significant likelihood that many -- but by no means all -- of those men are not Ashkenazi (or Jewish) on their direct male lines.  

Because the sample set at 50 cM is demonstrably overinclusive in this regard, I disregarded 50 cM matches if they did not belong to Ashkenazi Y-DNA clusters or were only a small percentage of the Branch Participants reported by FTDNA.  As a consequence, however, some rare Ashkenazi Y-DNA lines have likely been excluded from the results reported herein.
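The exclusion rule for 50 cM matches can be written as a small predicate.  This is a sketch under stated assumptions: a "50 cM match" is taken to mean a best match of at least 50 cM but below 80 cM, and the 25% figure used for "a small percentage" is a placeholder, since the text does not give a number.

```python
def keep_match(best_cm: float, in_ashkenazi_cluster: bool,
               share_of_branch: float, min_share: float = 0.25) -> bool:
    """Decide whether a man stays in the analysis under the 50 cM rule."""
    if best_cm < 50:
        return False  # below every cohort threshold, so never in the sample set
    if best_cm >= 80:
        return True   # the exclusion rule applies only to 50 cM-tier matches
    # 50 cM tier: keep only if in an Ashkenazi cluster and not a small share.
    return in_ashkenazi_cluster and share_of_branch >= min_share


print(keep_match(55.0, True, 0.30))   # True
print(keep_match(55.0, False, 0.30))  # False
```

Because the rule is deliberately conservative, a man from a rare but genuine Ashkenazi line would be dropped by a filter like this one.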

5.  Using the FTDNA Y-DNA Haplotree as of early January 2019 to identify branching, I prepared trees (linked above and from the left-hand column on this page) for each separate ancestral line identified through this analysis.  

In most instances, the identified ancestral lines include more than one branch with a substantial number of Ashkenazi men.  However, some such ancestral lines appear to include non-Ashkenazi -- and non-Jewish -- branches, in addition to Ashkenazi branches.  In a few instances, the identified ancestral lines include only a single branch with Ashkenazi men in the sample set.   

6.  To the right of each posted tree on the linked pages, I have included the data that I used in identifying likely Ashkenazi branches -- tables showing, for each SNP: (a) the SNP name; (b) the number of Branch Participants reported by FTDNA; and (c) the number of men in the sample set who match at least one of the 25 probands at (i) 100 cM, (ii) 80 cM, and (iii) 50 cM.        

7.  Below each posted tree, I have set forth the SNPs that are upstream from the top SNP of the tree, as taken from the FTDNA Y-DNA Haplotree.  I have included the ISOGG designation for each cluster.

For certain SNPs that are included in YFull's Y-Tree, I have set forth: (1) YFull's estimate of the time to a most recent common ancestor ("TMRCA") in years before present ("ybp"); and (2) YFull's 95% confidence interval for that TMRCA.

For certain SNPs that are included in Wim Penninx's analyses at JewishDNA.net, I have included: (1) his designation for the cluster (a branch designation (e.g., AB-067) and an abbreviated SNP chain (e.g., R1a-Z93-M582)); and (2) his estimated TMRCA range, stated in years at 95% confidence.

8.  At the bottom of each page, I have included a table identifying every Y-DNA SNP in the sample set that does not appear on one of the posted trees.

For each SNP that is on the Big Y tree but not on one of the posted trees, I have stated whether such SNP is upstream from SNPs found in one or more Ashkenazi Y-DNA clusters.  If men who are reported based upon STR or Geno 2.0 testing as having these SNPs are of Ashkenazi descent on their direct male line, there is a high probability that such men belong to one of those Ashkenazi Y-DNA clusters.  However, a large proportion of the men reported as having these terminal SNPs are not of Ashkenazi descent on their direct male lines.

I have also identified SNPs that are not on one of the posted trees but that may reflect an Ashkenazi Y-DNA line.  (Such SNPs are those that are found disproportionately in the sample set, thereby indicating likely Ashkenazi origins, but are not part of an Ashkenazi Y-DNA cluster identified in the study.)  Further testing is necessary to confirm whether these SNPs define Ashkenazi branches.

Finally, those SNPs that I have not identified as being upstream from an Ashkenazi Y-DNA cluster or as possibly reflecting an Ashkenazi Y-DNA line are likely not to be Ashkenazi in origin.  In many instances, those Y-DNA lines appear in the sample set because of overinclusiveness in the methodology used to identify likely Ashkenazi Jews.  In other instances, those Y-DNA lines may be Sephardic or Mizrahi in origin, or have entered the Ashkenazi population in the past few centuries.  It is also likely that further test results will show that a few of these Y-DNA lines are minor, undertested Ashkenazi lines.