DNA explanation

Explanation of DNA concepts pertinent to
family and population genetics


Every human has DNA within the cells.  The DNA is passed from parents to each new
child when the human embryo is first formed.  The DNA is not found in one continuous unit.
Instead it is divided among 46 identifiable parts called chromosomes (23 pairs).  We are
interested, in particular, in the Y chromosome, whose DNA is called Y-DNA.  This chromosome
has the unique feature of being passed almost unchanged from a father to his biological sons.

Because Y-DNA is almost unchanged, we can find the nearest relatives to any male person by
locating persons whose Y-DNA looks almost the same.  The farther back in time one shares a
common ancestor with someone, the more differences will be found on the Y-DNA.


Scientists noticed that it is possible to divide up all the world's men into genetic
groups based on whether they have or do not have certain changes in the Y-DNA, called
mutations.  In particular, they were interested in a type of mutation called a SNP
(single-nucleotide polymorphism) mutation.  These mutations, once present on the Y-DNA
chromosome, are never lost if they are true SNP mutations.

Typically, a person who is initially tested at Family Tree DNA and many other DNA
labs will not have testing for SNPs.  These are ordered later.  Because a SNP mutation is a
change in the structure of the chromosome, the lab will report only whether the SNP
mutation is present (+) or is absent (-).


Scientists determined that it was possible to divide all the men in the world into
distinctive groups, which they called haplogroups.  If one belongs to a particular
haplogroup, one possesses a specific SNP mutation on the Y-DNA chromosome. 
Because all men belong to one of about twenty major haplogroups, the scientists assigned each
haplogroup a letter of the alphabet.  Some men will belong to haplogroup A; others to haplogroup
B; others to haplogroup C, and so on. 


Scientists also noticed that some men within a specific haplogroup, such as haplogroup G, share
additional SNP mutations that others within the same haplogroup lack.  So it is possible to
once again categorize the men within a haplogroup, but this time the categories can be called
sub-haplogroups.

The clumsy method used to identify these sub-haplogroups involves alternating letters and
numbers.  For example, a particular man within haplogroup G might belong to sub-haplogroup
G2a3b1a1a1.   The oldest shared mutation is found on the far left -- the G.  Everyone within
haplogroup G has this mutation.  The component next to the G (the 2) marks the next oldest
mutation, shared only by the men in that sub-haplogroup.  Some men in haplogroup G belong to
G1 and others to G2; our man belongs to G2, which narrows down the number of persons
sharing his mutations.  The final 1 on the far right marks the most recent mutation
shared by men in this particular sub-haplogroup.
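
The naming scheme described above can be sketched in a few lines of Python.  This is only
an illustration: the function names split_subhaplogroup and lineage are made up for this
example, and the pattern assumes a name made of one capital letter followed by numbers and
lowercase letters.

```python
import re


def split_subhaplogroup(name):
    """Split a name such as 'G2a3b1a1a1' into its components, oldest first.

    Assumption for this sketch: one capital letter, then any mix of
    numbers and single lowercase letters. Strict alternation is not enforced.
    """
    match = re.fullmatch(r"([A-Z])((?:\d+|[a-z])*)", name)
    if match is None:
        raise ValueError(f"unrecognized sub-haplogroup name: {name!r}")
    head, tail = match.groups()
    return [head] + re.findall(r"\d+|[a-z]", tail)


def lineage(name):
    """List every group the name belongs to, from the haplogroup down."""
    parts = split_subhaplogroup(name)
    return ["".join(parts[:i]) for i in range(1, len(parts) + 1)]


# Each step to the right adds one more recent shared mutation.
for group in lineage("G2a3b1a1a1"):
    print(group)
```

Reading the printed list from top to bottom follows the name from left to right: G first,
then G2, then G2a, and so on until the full name is reached.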

The more of these groupings that can be identified, the better.  They give a picture of the
migratory history of our particular ancestors.  It is definitely possible to identify many more of
these shared SNP mutations, providing a very specific genetic history of deep ancestry.


Because Y-DNA is enormously long, the labs have come up with shorthand designations for the
SNPs used to define haplogroups and their sub-haplogroups.  For example, instead of
saying the SNP that defines the final component of G2a3b1a1a1 is at position 10345728 on the
Y-DNA chromosome, they have provided the shortened designation of L13.  And for the first
component, instead of a position such as 8910247, they call it M201.

Because sub-haplogroup designations frequently change as new SNPs are identified, it is
necessary for you to know which SNP item, such as L13, is the most recent and specific one
tested in your situation.
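
One way to keep track of this is sketched below in Python.  It is a rough illustration, not
a lab tool: the function name most_specific_positive is invented here, and the ordered list
of SNPs is an assumed input that you would have to supply yourself.  Only M201 and L13 are
taken from the text; a real list for G2a3b1a1a1 would contain the intermediate SNPs as well.

```python
def most_specific_positive(path, results):
    """Return the most recent SNP on the path that was reported present (+).

    path    -- SNP names ordered from oldest to most recent (assumed input)
    results -- dict of SNP name -> '+' or '-' as reported by the lab
    SNPs that were never tested are simply absent from results.
    """
    latest = None
    for snp in path:
        if results.get(snp) == "+":
            latest = snp
    return latest


# Abbreviated path: intermediate SNPs are deliberately left out.
path = ["M201", "L13"]
print(most_specific_positive(path, {"M201": "+", "L13": "+"}))
print(most_specific_positive(path, {"M201": "+"}))
```

The SNP name returned stays meaningful even if the lettered sub-haplogroup name is later
revised.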

The DNA labs often sell tests that combine a number of SNP tests specific to a haplogroup.
At Family Tree DNA, for example, this is called the deep clade SNP test for a specific
haplogroup.  This field of research is so fast-moving, however, that not all the relevant SNPs are included
in the standard panel of tests yet, and some may have to be ordered separately.


The initial test for most customers of DNA labs consists of a series of what are called markers.
While SNP mutations involve permanent changes to the structure of the DNA, what is
tested at these markers instead is the number of times a particular DNA component is
repeated.  The result at a particular marker is reported as a number. 

Each marker has a short designation, such as DYS390, DYS425, YCA, etc.  The value reported
to you associated with the marker indicates how many repeats were found there.  For example,
when marker DYS390 has 22 repeats, one can say DYS390=22.   When the Y-DNA is
passed on to a son, slippage may occur at one of the marker sites.  If another repeat is
added, the value would increase to 23.  If a repeat is lost, the value drops to 21. 
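
The DYS390 example can be written out as a short Python sketch.  The function names
parse_marker and after_slippage are invented for this illustration, and the "NAME=VALUE"
text format is assumed from the way results are written above.

```python
def parse_marker(text):
    """Turn a result such as 'DYS390=22' into ('DYS390', 22)."""
    name, value = text.split("=")
    return name.strip(), int(value)


def after_slippage(repeats, change):
    """Return the repeat count passed to a son after slippage.

    change is +1 when a repeat is added and -1 when one is lost.
    """
    new_value = repeats + change
    if new_value < 0:
        raise ValueError("a marker cannot have fewer than zero repeats")
    return new_value


name, repeats = parse_marker("DYS390=22")
print(name, after_slippage(repeats, +1))  # one repeat added
print(name, after_slippage(repeats, -1))  # one repeat lost
```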

While these marker number changes are valuable in comparing persons in recent centuries, they are
not suitable for grouping persons into the major haplogroup categories because they are not 
permanent changes.

Occasionally the process of passing on Y-DNA to a son may result in more severe
slippage at a particular marker, and this results in loss or gain of multiple repeats.  If this
big change in number of repeats is found to be shared by other men in the sub-haplogroup,
then this can be the basis of yet another subgroup in addition to the categories based on
shared SNP mutations.  Categories based on marker value oddities are not perfect
choices for grouping because someone may occasionally develop another mutation in the
number of repeats at this same marker.  Marker oddities are, nevertheless, very
helpful.  There are some markers where mutations almost never occur.  At such
markers a one-value change may be the basis of a category.  The complete loss of all
repeats at a marker may also make a useful category.

A category based on a marker oddity can be designated, for example, as the DYS568=9
subgroup when a mutation in number of repeats has changed the normal finding from 12 to
9.


It is often possible to predict which sub-haplogroup you belong to based on
the values seen at the markers.  This is why the labs do marker testing first.  Having these marker
results dramatically narrows down which SNPs need to be tested.  Some sub-haplogroups
are associated with extremely distinctive marker values.  Others may have number patterns   
seen in multiple sub-haplogroups, and predicting the correct sub-haplogroup for these
latter samples may be difficult.


A final way of categorizing men, once the SNP mutation categories and marker oddity
categories are exhausted, is locating persons whose complete sets of marker values are very
similar.  Such groupings are called clusters or clades.  Within the haplogroup G project we
have found that persons who have 9 or fewer differences in marker values when comparing
67 markers can be considered part of the same cluster.  Once more than 9 differences are
found, overlapping with other subgroups can occur.  And use of fewer than 67 markers
has not proven reliable.
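
The 9-in-67 rule of thumb can be sketched in Python as follows.  Several things here are
assumptions made for the illustration: the function names, the representation of each
man's results as a dict of marker name to repeat count, and above all the way a
"difference" is counted.  By default the sketch counts the number of markers whose values
differ; setting stepwise=True instead adds up the size of each change, which may or may not
be how a given project counts.

```python
def count_differences(a, b, stepwise=False):
    """Count differences between two men's marker results.

    a, b     -- dicts mapping marker name to number of repeats
    stepwise -- False: count markers whose values differ (assumed default)
                True: sum the absolute change in repeats at each marker
    Only markers tested in both men are compared.
    """
    shared = set(a) & set(b)
    if stepwise:
        return sum(abs(a[m] - b[m]) for m in shared)
    return sum(1 for m in shared if a[m] != b[m])


def same_cluster(a, b, max_differences=9, required_markers=67):
    """Apply the rule of thumb: 9 or fewer differences over 67 markers."""
    shared = set(a) & set(b)
    if len(shared) < required_markers:
        raise ValueError("fewer than the required number of shared markers")
    return count_differences(a, b) <= max_differences
```

The check refuses to answer when fewer than 67 markers are shared, reflecting the
observation above that smaller comparisons have not proven reliable.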

Genetic overview at the Genographic Project site:
https://www3.nationalgeographic.com/genographic/overview.html  [other links available there]

Explanation of DNA testing at Relative Genetics site:

YouTube TV series titled The Journey of Man, which gives an overview of population genetics