Y-DNA and mtDNA Haplogroup Frequencies for Ashkenazi Jews

This page sets forth the approximate frequencies of Y-DNA and mtDNA haplogroups among the Ashkenazi Jewish population, based upon compilation and analysis of a subset of Y-DNA and mtDNA test results reported by Family Tree DNA ("FTDNA") as of December 20, 2018.

Background

FTDNA customers may test three types of DNA for purposes of genetic genealogy: (1) Y-DNA, which is passed down on the direct male line; (2) mtDNA, which is passed down on the direct female line; and (3) autosomal DNA, which is passed down from ancestors on all lines.

Y-DNA and mtDNA are inherited from generation to generation with no or very few mutations between parent and child; thus, Y-DNA and mtDNA haplogroups are tens of thousands of years old.

By contrast, each child receives 50% of his or her autosomal DNA from each parent and, on average, 25% of his or her autosomal DNA from each grandparent, 12.5% of his or her autosomal DNA from each great grandparent, and so on.  

Because cousins inherit different segments of autosomal DNA from a shared ancestor, the amount of autosomal DNA that two cousins have both inherited from the shared ancestor will be significantly less than the amount of autosomal DNA that each of the cousins has inherited from that ancestor.

On average, second and third cousins who share only a single ancestor will share about 3.125% and about 0.78% of their autosomal DNA, respectively, inherited from their shared great grandparent or great great grandparent.  (Because the percentage of autosomal DNA inherited by a person from each ancestor before the parents' generation can vary considerably, these percentages are subject to wide variation.)

The amount of shared autosomal DNA is reported in centiMorgans (cM), a unit that measures how likely it is that a particular segment of autosomal DNA will recombine in a generation.  Generally speaking, but subject to the limitations discussed below, the more total cM shared by two autosomal matches, the more closely the two matches are related.

The figures of 3.125% and 0.78% discussed above, the average percentage of autosomal DNA shared by second or third cousins who share only a single ancestor, are equivalent to 224 cM and 56 cM, respectively.
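
As a quick check of the arithmetic above, the following Python sketch converts the two percentages into centiMorgans.  The genome-wide total of 7,168 cM is an assumption: it is not stated on this page and is simply the value implied by the figures above (224 cM / 3.125%).

```python
# Assumed total, back-calculated from the figures in the text (224 cM / 3.125%).
TOTAL_CM = 224 / 0.03125


def expected_shared_cm(fraction):
    """Convert an expected shared fraction of autosomal DNA into cM."""
    return fraction * TOTAL_CM


second_cousins_cm = expected_shared_cm(1 / 32)    # 3.125%
third_cousins_cm = expected_shared_cm(1 / 128)    # about 0.78%
print(second_cousins_cm, third_cousins_cm)
```

These are averages only; as noted above, actual shared amounts vary widely around them.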

These theoretical amounts of shared autosomal DNA generally understate the amount of autosomal DNA that will be shared by two people who are fully Ashkenazi (defined, for purposes of this analysis, as people with four known Ashkenazi Jewish grandparents).  This is because the Ashkenazi population is endogamous, i.e., every Ashkenazi Jew shares multiple ancestors with every other Ashkenazi Jew.  

Because of endogamy, people who are fully Ashkenazi are highly likely to share autosomal DNA inherited from multiple ancestors; as a result, Ashkenazi Jews will typically share a considerable amount of autosomal DNA with each other.  

Thus, it is common for any two Ashkenazi Jews to share as much autosomal DNA as would be shared, on average, by third cousins without multiple shared ancestors.  For Ashkenazi Jews, however, this shared autosomal DNA will often have been inherited not primarily from a single shared ancestor who lived four generations ago (as would be the case, generally speaking, for third cousins from non-endogamous populations) but from many shared ancestors who may have lived, perhaps, eight or 10 generations ago.

Issues with Using Family Finder Match Lists to Determine Ashkenazi Haplogroup Frequencies

FTDNA provides its customers with lists identifying matches for each type of testing that they have performed -- Y-DNA, mtDNA, and/or autosomal (Family Finder) testing.  

Y-DNA and mtDNA match lists will show only a tested person's closest matches within the same Y-DNA or mtDNA cluster.  Accordingly, a person's FTDNA Y-DNA and mtDNA match lists do not provide information allowing analyses of Y-DNA or mtDNA clusters other than those to which the tested person belongs.

However, FTDNA's Family Finder match lists provide information concerning, inter alia, the Y-DNA and mtDNA haplogroups for all of the tested person's autosomal matches who have performed Y-DNA or mtDNA testing, regardless of haplogroup.  Family Finder match lists for people who are fully Ashkenazi include far more people than their Y-DNA and mtDNA match lists.  Accordingly, Family Finder match lists are a resource that can be used for compiling information concerning the frequency of Y-DNA and mtDNA haplogroups in the Ashkenazi population.

There are, however, several issues with using Family Finder match lists to analyze the frequency of Ashkenazi Y-DNA and mtDNA haplogroups.  

1.  Family Finder match lists are underinclusive, in several respects.  As of December 2018, people with four Ashkenazi grandparents have in the neighborhood of 20,000 Family Finder matches, but the number of people of Ashkenazi ancestry who have done Family Finder testing is considerably greater.  Beyond that, there are many people of Ashkenazi ancestry on their direct male and/or female lines who have done Y-DNA testing and/or mtDNA testing through FTDNA but have not done Family Finder testing.  

2.  The Family Finder match lists are overinclusive.  The match lists of even those people with four Ashkenazi grandparents will invariably include many people whose Y-DNA and/or mtDNA lines are not Ashkenazi.  In some instances, those lines may be of non-Jewish origins; in other instances, those lines may be Jewish but non-Ashkenazi (e.g., Sephardic or Mizrahi).  

3.  The Y-DNA and mtDNA haplogroups for Ashkenazi Jews in the Family Finder database may not be fully representative of those in the Ashkenazi population as a whole.  

There are many people of Ashkenazi ancestry who have not done Y-DNA and/or mtDNA testing at all; the absence of Y-DNA and mtDNA results from such people may skew haplogroup frequencies.  

There is also an issue of representativeness even as to those persons who have done Y-DNA and/or mtDNA testing through FTDNA.  On the one hand, people who do DNA testing on themselves are more likely to order DNA testing for family members, who may belong to the same Y-DNA or mtDNA haplogroups as other tested relatives.  On the other hand, people who do DNA testing will generally be aware which of their relatives share their Y-DNA or mtDNA and therefore will not order Y-DNA or mtDNA testing for those relatives (absent a desire to confirm a relationship or to identify recent mutations in their Y-DNA or mtDNA lines).  

4.  FTDNA reports Y-DNA and mtDNA results differently depending on the type/level of testing performed.  

With regard to Y-DNA, haplogroups will be reported: (1) at a very high level of generality for men who have done STR testing (i.e., 12-, 25-, 37-, 67-, and 111-marker testing); (2) at a high or intermediate level of generality for men who have done Geno 2.0 testing or a la carte SNP testing; and (3) at a very high degree of specificity for men who have done Big Y testing (full Y-DNA sequencing).  As a result, men within the same Y-DNA cluster might be reported on the FTDNA match lists as, for example, (1) R-M512 or R-M198 (if the man has done only Y-DNA STR testing), (2) R-F1345 (if the man has done Geno 2.0 testing), (3) R-CTS6 (if the man has tested CTS6 on an a la carte basis), (4) R-Y2630 (if Y2630 is identified as the man's terminal SNP through Big Y testing), or (5) any one of the 30 terminal SNPs below the R-Y2630 level that have been identified as of December 2018.  These discrepancies cause issues with regard to those Y-DNA haplogroups that are generally divided into clades (i.e., haplogroup J includes J1 and J2, and haplogroup R includes R1a, R1b, and R2), and STR-based haplogroups often cannot be used to place men into clusters below the haplogroup level.

With regard to mtDNA, haplogroups will be reported with more specificity depending on whether the tested man or woman has done HVR1 testing, HVR1 & HVR2 testing, or full maternal sequence ("FMS") testing.  As a result, tested persons within the same mtDNA cluster might be reported on the FTDNA match lists as, for example, (1) H (if he or she has done only HVR1 testing), (2) H5 (if he or she has done only HVR1 & HVR2 testing), or (3) H5-T16311! (if he or she has done FMS testing).

Methodology to Address Issues with Using Family Finder Match Lists to Determine Ashkenazi Haplogroup Frequencies

While the issues identified above create limitations on the ability to use Family Finder match lists to determine Ashkenazi Y-DNA and mtDNA haplogroup frequencies, those issues can be mitigated in large part through use of the methodology described below.

1.  To address the fact that a single person's Family Finder match list will fail to identify many fully Ashkenazi people (i.e., people with four Ashkenazi grandparents) who have done Y-DNA or mtDNA testing, one may aggregate Family Finder lists.  

For use in this analysis, I have combined Family Finder match lists from 25 people with four Ashkenazi grandparents (the probands) to create a spreadsheet that includes all of their matches (for a total of 56,162 discrete persons).  Because of the endogamous nature of the Ashkenazi population,  this list is likely to include almost all of the people with four Ashkenazi grandparents who have done Family Finder testing (along with many other people who do not have four Ashkenazi grandparents).
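
The aggregation step can be sketched in Python as follows.  The record layout (the field names match_id, shared_cm, ydna, and mtdna) and the sample data are hypothetical; FTDNA's actual export format is not described here.  The sketch keeps one record per discrete person and remembers the largest amount shared with any one proband.

```python
def combine_match_lists(match_lists):
    """Merge per-proband match lists into one record per discrete person.

    match_lists maps a proband label to a list of match records, each a dict
    with the (hypothetical) keys 'match_id', 'shared_cm', 'ydna', 'mtdna'.
    """
    combined = {}
    for matches in match_lists.values():
        for match in matches:
            record = combined.setdefault(
                match["match_id"],
                {"max_cm": 0.0, "ydna": match.get("ydna"), "mtdna": match.get("mtdna")},
            )
            # Track the most DNA this person shares with any single proband.
            record["max_cm"] = max(record["max_cm"], match["shared_cm"])
    return combined


# Made-up data for illustration only.
example_lists = {
    "proband_1": [
        {"match_id": "A", "shared_cm": 95.0, "ydna": "E-M35", "mtdna": "K"},
        {"match_id": "B", "shared_cm": 40.0, "ydna": None, "mtdna": "H"},
    ],
    "proband_2": [
        {"match_id": "A", "shared_cm": 120.0, "ydna": "E-M35", "mtdna": "K"},
    ],
}
combined = combine_match_lists(example_lists)
```

Person A appears on both lists but is counted once, with the larger of the two shared amounts retained.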

2.  To address the fact that many of the people on the combined match list are likely not Ashkenazi on their direct male and/or direct female lines, one may take advantage of the fact that endogamy results in relatively high levels of shared autosomal DNA in the Ashkenazi population by considering only those matches who share a high amount of autosomal DNA with a person who has four Ashkenazi grandparents.

For use in this analysis, I have excluded from the Y-DNA and mtDNA frequency analysis any person on the match list who does not share a substantial amount of autosomal DNA with at least one of the 25 probands.  For analytical purposes, I initially considered three cohorts of Y-DNA-tested men and mtDNA-tested people who share at least 50 cM, 80 cM, or 100 cM, respectively, with at least one of the 25 probands:

                    50 cM      80 cM     100 cM
Total Tested:      37,381     26,938     21,849
Y-DNA Tested:       8,196      6,231      5,229
mtDNA Tested:       8,345      6,276      5,187

As discussed below, there is some variation in the frequencies of Y-DNA and mtDNA haplogroups depending upon whether such frequencies are considered at 50 cM, 80 cM, or 100 cM.  Because people who share the most autosomal DNA with fully Ashkenazi matches are the most likely to be Ashkenazi on all of their lines (including their direct male or female lines), the percentages reported on this website are those among the 100-cM matches.  
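
A minimal sketch of the cohort selection, assuming each person has already been reduced to the largest amount of autosomal DNA shared with any one proband.  The names and sample values are made up for illustration.

```python
THRESHOLDS_CM = (50, 80, 100)


def build_cohorts(max_shared_cm, thresholds=THRESHOLDS_CM):
    """Return, for each threshold, the set of people at or above it.

    max_shared_cm maps a person to the most cM shared with any one proband.
    """
    return {
        t: {person for person, cm in max_shared_cm.items() if cm >= t}
        for t in thresholds
    }


# Made-up values for illustration only.
sample = {"A": 120.0, "B": 85.0, "C": 55.0, "D": 30.0}
cohorts = build_cohorts(sample)
```

Each cohort is a subset of the one before it, so the 100 cM cohort is the smallest and the most restrictive.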

The correspondence between 100-cM matches and Ashkenazi direct male or female lines is by no means perfect -- it's highly likely that (1) some people with four Ashkenazi grandparents will have some close relatives who are not Ashkenazi (or not Jewish) on their direct male or direct female lines, and (2) some people with high proportions of Ashkenazi autosomal DNA will be descended on their direct male and/or female lines from ancestors who were not Ashkenazi -- so it's highly likely that the percentages reported herein are somewhat skewed.  (To deal with the former issue, one could use a cut-off of autosomal DNA to remove matches who are very closely related to a proband.  After completing this analysis, I removed any matches of greater than 200 cM; because virtually all such persons who had done Y-DNA and/or mtDNA testing shared at least 100 cM with another proband, using the 200 cM cutoff had no appreciable effect on the haplogroup percentages reported herein.)  

As discussed below, however, the frequencies calculated through this analysis are in line with those set forth in prior papers, which tends to support the reliability of this methodology in broad strokes, notwithstanding the strong likelihood that the 25 probands have a few 100-cM matches who are not Ashkenazi on their direct male or direct female lines.   

3.  A practical issue -- not a methodological one -- is presented by FTDNA's practice of reporting haplogroups with different degrees of generality or specificity depending upon the level of testing performed.

To address this issue, for purposes of this analysis I used the highest level of generality for most of the haplogroups, but broke haplogroup J into J1 and J2 and haplogroup R into R1a, R1b, and R2 to the extent that the information reported on the match lists was sufficient to allow identification of those clades.  (In a few instances, I omitted test results within haplogroups J and R because available information did not readily allow a determination of the clade to which the tested men belong.)  
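
One way to picture this grouping step is the Python sketch below.  The lookup table is illustrative and far from complete, and it is an assumption rather than the table actually used for this analysis; it relies only on widely used defining SNPs (for example, M267 for J1 and M172 for J2).  Labels within J or R that the table cannot place are returned as None, mirroring the omissions described above.

```python
# Illustrative, incomplete lookup; not the table used in the analysis.
CLADE_BY_SNP = {
    ("J", "M267"): "J1",
    ("J", "M172"): "J2",
    ("R", "M198"): "R1a",
    ("R", "M512"): "R1a",
    ("R", "M269"): "R1b",
    ("R", "M479"): "R2",
}


def analysis_group(label):
    """Map a reported label such as 'R-M512' to the group used for counting."""
    letter, _, snp = label.partition("-")
    if letter in ("J", "R"):
        # Split J and R into clades only when the SNP is recognized.
        return CLADE_BY_SNP.get((letter, snp))
    # All other haplogroups are counted at the top level.
    return letter
```

A real lookup would need many more SNPs, including the deeper SNPs reported by Geno 2.0 and Big Y testing.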

Because of variations in the manner in which mtDNA haplogroups are reported depending upon the level of testing, the analysis does not break any of the mtDNA haplogroups into clades.

4.  This analysis does not do anything to address the facts that: (1) a substantial number of the people who have done Y-DNA or mtDNA testing through FTDNA have not done Family Finder testing; and (2) most of the Ashkenazi population has not done Y-DNA or mtDNA testing through FTDNA at all.  

Because of the large sample size and the lack of any reason to believe that the people who have done both Family Finder testing and Y-DNA and/or mtDNA testing are not representative of the universe of Ashkenazi Jews who have done Y-DNA or mtDNA testing through FTDNA (or of the Ashkenazi population in general), I have assumed that the frequencies reported herein are generally representative of those of all Ashkenazi Jews who have done Y-DNA or mtDNA testing (and of the Ashkenazi population in general).  

Absent an easy way to identify those people who may have tested close relatives or may have chosen not to test because another close relative has tested, I have assumed that these two possibilities generally cancel each other out.  

Similarly, because there is no way to determine the Y-DNA and mtDNA haplogroups of Ashkenazi Jews who have not done Y-DNA or mtDNA testing, I have assumed that the very large sample size considered in this analysis makes it likely that the samples considered herein are generally representative of the Ashkenazi population as a whole.  

To the extent that, as is likely, any of these three assumptions is not fully accurate, the Y-DNA and mtDNA haplogroup frequencies reported herein will differ somewhat from the frequencies found in the Ashkenazi population as a whole.  

Consistency with Prior Studies 
 
Y-DNA

The Y-DNA frequencies set forth herein are largely consistent with those reported in: (1) a 2004 paper, D. Behar, et al., Contrasting patterns of Y Chromosome variation in Ashkenazi Jewish and host non-Jewish European populations, Hum Genet (2004) 114:354-365, which reported the following frequencies of Y-DNA haplogroups among Ashkenazi Jews based upon a sample of 442 men (results from different clusters within haplogroups in the Behar paper are combined); and (2) calculations based on the 519 Y-DNA branches, comprising 2,730 sets of STR marker values, compiled by Wim Penninx for his Catalogue of Y-DNA Jewish branches (including Ashkenazi, Sephardic, Mizrahi, and Samaritan branches) posted on his website, JewishDNA.net, based upon his painstaking compilation and analysis of Y-DNA results gathered from FTDNA project pages and YSearch.org:

                Y-DNA Haplogroups
          Current      Behar    Penninx
E:         21.89%     20.40%     18.39%
F:          0.00%      0.90%      0.00%
G:          9.94%      9.70%      9.94%
H:          0.00%      0.00%      0.07%
I:          2.67%      4.10%      2.38%
J1:        17.77%     19.00%     17.66%
J2:        17.83%     19.00%     20.48%
K:          0.00%      0.20%      0.00%
L:          0.27%      0.20%      0.33%
N:          0.12%      0.20%      0.00%
P:          0.00%      0.50%      0.00%
Q:          4.56%      5.20%      5.57%
R1a:        9.29%      7.50%      8.57%
R1b:       11.50%     10.00%     11.54%
R2:         1.27%      1.40%      1.10%
T:          2.87%        --*      2.53%
 
* As of 2003, what is now haplogroup T was part of haplogroup K

There is a high degree of correlation between the frequencies calculated from the data in the 2004 Behar paper and on JewishDNA.net and those calculated through this methodology, which tends to confirm the validity of this methodology.  
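
To put a rough number on the correlation described above, the sketch below computes a Pearson correlation coefficient between the Current and Behar columns of the table.  This calculation is an illustration added here, not part of the original analysis; haplogroup T is left out because the Behar paper does not report it separately.

```python
from math import sqrt

# (Current %, Behar %) pairs copied from the table above; T is excluded.
PAIRS = {
    "E": (21.89, 20.40), "F": (0.00, 0.90), "G": (9.94, 9.70),
    "H": (0.00, 0.00), "I": (2.67, 4.10), "J1": (17.77, 19.00),
    "J2": (17.83, 19.00), "K": (0.00, 0.20), "L": (0.27, 0.20),
    "N": (0.12, 0.20), "P": (0.00, 0.50), "Q": (4.56, 5.20),
    "R1a": (9.29, 7.50), "R1b": (11.50, 10.00), "R2": (1.27, 1.40),
}


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


current, behar = zip(*PAIRS.values())
r = pearson(current, behar)
print(round(r, 3))
```

A coefficient close to 1 indicates that the two sets of frequencies rise and fall together; it does not by itself show that individual haplogroup percentages match.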

mtDNA

The mtDNA frequencies set forth herein are largely consistent with the dataset considered in another 2004 paper, D. Behar, et al., MtDNA evidence for a genetic bottleneck in the early history of the Ashkenazi Jewish population, Eur J. Hum. Genet. 2004 May; 12(5): 355-64, which found, based upon 565 Ashkenazi samples, that "[t]he most prevalent Ashkenazi [mtDNA haplogroups] were K (32%), H (21%), N1b (10%), and J1 (7%), followed by other [haplogroups] at minor frequencies."  Using the supplementary data to that paper to calculate percentages for haplogroups as a whole yields the following:

           mtDNA Haplogroups
          Current      Behar
A:          0.06%      0.18%
B:          0.06%      0.00%
H:         25.91%     20.35%
HV:         5.01%      6.02%
I:          1.41%      0.88%
J:          7.09%      8.14%
K:         32.49%     32.40%
L:          2.26%      1.77%
M:          0.89%      1.77%
N:          6.90%     10.44%
R:*         2.24%      2.65%
T:          4.34%      4.78%
U:          5.09%      5.66%
V:          3.45%      2.83%
W:          2.18%      1.42%
X:          0.64%      1.06%

* Haplogroup R is designated as preHV1 and preHV2 in the supplementary data to the 2004 Behar paper.

Comparison of Y-DNA Results at 50 cM, 80 cM, and 100 cM

The chart below shows some variances in the frequencies of Y-DNA haplogroups depending on whether the dataset consists of people who share a total of at least 50 cM, at least 80 cM, or at least 100 cM with at least one of the 25 probands:

         50 cM   50 cM (%)   80 cM   80 cM (%)  100 cM  100 cM (%)
A            1       0.01%       1       0.02%       1       0.02%
B            0       0.00%       0       0.00%       0       0.00%
C            1       0.01%       0       0.00%       0       0.00%
D            0       0.00%       0       0.00%       0       0.00%
E         1628      20.36%    1326      21.51%    1138      21.89%
F            0       0.00%       0       0.00%       0       0.00%
G          705       8.82%     585       9.49%     517       9.94%
H            1       0.01%       1       0.02%       0       0.00%
I          380       4.75%     193       3.13%     139       2.67%
J1        1264      15.81%    1079      17.50%     924      17.77%
J2        1318      16.49%    1079      17.50%     927      17.83%
K            0       0.00%       0       0.00%       0       0.00%
L           22       0.28%      16       0.26%      14       0.27%
M            0       0.00%       0       0.00%       0       0.00%
N           24       0.30%       8       0.13%       6       0.12%
O            4       0.05%       0       0.00%       0       0.00%
P            0       0.00%       0       0.00%       0       0.00%
Q          328       4.10%     267       4.33%     237       4.56%
R1a        752       9.41%     579       9.39%     483       9.29%
R1b       1254      15.68%     778      12.62%     598      11.50%
R2         101       1.26%      76       1.23%      66       1.27%
S            0       0.00%       0       0.00%       0       0.00%
T          212       2.65%     177       2.87%     149       2.87%
Total     7995     100.00%    6165     100.00%    5199     100.00%

This chart shows that there is, for the most part, considerable consistency between the frequencies of each Y-DNA haplogroup at shared amounts of 50 cM, 80 cM, and 100 cM.  This suggests that a shared amount of 50 cM with a person with four known Ashkenazi grandparents will often (but by no means always) reflect Ashkenazi ancestry on the direct male line.

There are, however, three Y-DNA haplogroups in which there are significant downward discrepancies in the frequencies of haplogroups between the cohorts at 50 cM, 80 cM, and 100 cM.  Most significantly in numerical terms, the percentage of men in haplogroup R1b in the dataset decreases from 15.68% at 50 cM to 12.62% at 80 cM to 11.50% at 100 cM.  There is a more substantial percentage decrease in men in haplogroups I (4.75% to 3.13% to 2.67%) and N (0.30% to 0.13% to 0.12%) from 50 cM to 80 cM to 100 cM.  
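
The distinction drawn above between a decrease in percentage points and a decrease in relative terms can be checked against the counts in the chart.  The following sketch uses only the 50 cM and 100 cM counts and totals shown there; the function names are illustrative.

```python
# Counts copied from the chart above: (50 cM count, 100 cM count).
COUNTS = {"R1b": (1254, 598), "I": (380, 139), "N": (24, 6)}
TOTAL_50, TOTAL_100 = 7995, 5199


def point_drop(c50, c100):
    """Drop in frequency, in percentage points, from 50 cM to 100 cM."""
    return 100 * (c50 / TOTAL_50 - c100 / TOTAL_100)


def relative_drop(c50, c100):
    """Drop expressed as a fraction of the 50 cM frequency."""
    return 1 - (c100 / TOTAL_100) / (c50 / TOTAL_50)


for name, (c50, c100) in COUNTS.items():
    print(name, round(point_drop(c50, c100), 2), round(relative_drop(c50, c100), 2))
```

R1b shows the largest drop in percentage points, while I and N lose a larger share of their 50 cM frequency.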

The fact that there are a significant number of R1b and I men (and a handful of N men) whose proportionate Ashkenazi admixture is less than that of many men in the dataset suggests that the dataset contains a significant number of men in those haplogroups who do not have four Ashkenazi grandparents, which decreases the likelihood that such men have Ashkenazi ancestry on their direct male lines.  (This does not, however, mean that the large number of R1b men or the significant number of I men who fall within the 100 cM cohort are not Ashkenazi on their direct male lines.)  The existence of this discrepancy suggests that it is preferable to use the 100 cM threshold for calculating frequencies, even though the dataset for the 100 cM cohort is considerably smaller than the datasets for the 50 cM and 80 cM cohorts.   

Comparison of mtDNA Results at 50 cM, 80 cM, and 100 cM

The chart below shows some variances in the frequencies of mtDNA haplogroups depending on whether the dataset consists of people who share a total of at least 50 cM, at least 80 cM, or at least 100 cM with at least one of the 25 probands:

         50 cM   50 cM (%)   80 cM   80 cM (%)  100 cM  100 cM (%)
A           24       0.29%       8       0.13%       3       0.06%
B           23       0.28%       7       0.11%       3       0.06%
C           12       0.14%       5       0.08%       0       0.00%
D            6       0.07%       1       0.02%       0       0.00%
E            0       0.00%       0       0.00%       0       0.00%
F            5       0.06%       1       0.02%       0       0.00%
G            1       0.01%       0       0.00%       0       0.00%
H         2452      29.38%    1684      26.83%    1344      25.91%
HV         356       4.27%     294       4.68%     260       5.01%
I          159       1.91%     100       1.59%      73       1.41%
J          620       7.43%     457       7.28%     368       7.09%
K         2258      27.06%    1929      30.74%    1685      32.49%
L          165       1.98%     134       2.14%     117       2.26%
M           73       0.87%      58       0.92%      46       0.89%
N          471       5.64%     406       6.47%     358       6.90%
O            0       0.00%       0       0.00%       0       0.00%
P            0       0.00%       0       0.00%       0       0.00%
Q            0       0.00%       0       0.00%       0       0.00%
R          144       1.73%     128       2.04%     116       2.24%
S            0       0.00%       0       0.00%       0       0.00%
T          454       5.44%     300       4.78%     225       4.34%
U          594       7.12%     368       5.86%     264       5.09%
V          278       3.33%     207       3.30%     179       3.45%
W          178       2.13%     138       2.20%     113       2.18%
X           71       0.85%      50       0.80%      33       0.64%
Y            1       0.01%       1       0.02%       0       0.00%
Total     8345     100.00%    6276     100.00%    5187     100.00%

This chart shows that there is, for the most part, considerable consistency between the frequencies of each mtDNA haplogroup at shared amounts of 50 cM, 80 cM, and 100 cM.  This suggests that a shared amount of 50 cM with a person with four known Ashkenazi grandparents will often (but by no means always) reflect Ashkenazi ancestry on the direct female line.

There are, however, six mtDNA haplogroups in which there are significant downward discrepancies in the frequencies of haplogroups for the cohorts at 50 cM, 80 cM, and 100 cM.  Most significantly in numerical terms, the percentage of people in haplogroup H in the dataset decreases from 29.38% at 50 cM to 26.83% at 80 cM to 25.91% at 100 cM.  There are also substantial percentage decreases in people in haplogroups A (0.29% to 0.13% to 0.06%), I (1.91% to 1.59% to 1.41%), J (7.43% to 7.28% to 7.09%), T (5.44% to 4.78% to 4.34%), and U (7.12% to 5.86% to 5.09%).

The fact that there are a significant number of H people (and a handful of people in other haplogroups) whose proportionate Ashkenazi admixture is less than that of many people in the dataset suggests that the dataset contains a significant number of people in those haplogroups who do not have four Ashkenazi grandparents, which decreases the likelihood that they have Ashkenazi ancestry on their direct female line.  (This does not, however, mean that the large number of people in haplogroup H or the significant number of people in haplogroups J, T, and U are not Ashkenazi on their direct female lines.)  Once again, the existence of this discrepancy suggests that it is preferable to use the 100 cM threshold for calculating proportions, even though the dataset for the 100 cM cohort is considerably smaller than the datasets for the 50 cM and 80 cM cohorts.   

Frequency of Y-DNA Haplogroups in the Ashkenazi Population
E: 21.89%
G: 9.94%
I: 2.67%
J1: 17.77%
J2: 17.83%
L: 0.27%
N: 0.12%
Q: 4.56%
R1a: 9.29%
R1b: 11.50%
R2: 1.27%
T: 2.87%


Frequency of mtDNA Haplogroups in the Ashkenazi Population
A:   0.06%
B:   0.06%
H: 25.91%
HV:   5.01%
I:   1.41%
J:   7.09%
K: 32.49%
L:   2.26%
M:   0.89%
N:   6.90%
R:   2.24%
T:   4.34%
U:   5.09%
V:   3.45%
W:   2.18%
X:   0.64%