Fig. 15 DNA and RNA
Courtesy - Sponk, CC BY-SA 3.0 via Wikimedia Commons
Over the last few decades, the science of genetics has been used to understand the migrations and evolution of H. sapiens. DNA extracted from fossils (called "ancient DNA") has been used to determine the movement of populations, inheritance and ancestry. This area of study is called paleogenetics (paleo - old/ancient, genetics - from genesis - origin/birth). While DNA analysis cannot be used to date fossils, it can be used to understand the evolution of species.
The next few sections will give an overview of cells and the role of DNA in heredity.
All living organisms are made up of cells. The cell is the smallest fully functional unit of a living organism. A cell has a membrane (cover) that encloses the various parts of a cell such as the nucleus, cytoplasm, mitochondria, ribosomes etc. These subunits inside a cell are called organelles.
Cells are of two types - prokaryotic and eukaryotic. Prokaryotic cells lack a nucleus; eukaryotic cells have a nucleus. Prokaryotes are single-celled organisms such as bacteria. Eukaryotes can be single-celled or multicellular. In one traditional taxonomy, all living things are classified at the highest level into one of 2 groups - Prokaryota or Eukaryota (the widely used three-domain system further splits the prokaryotes into Bacteria and Archaea). Humans are multicellular organisms in the Eukaryota group.
A human body has trillions of cells. These cells are of different types such as blood cells, nerve cells, muscle cells, tissue cells etc.
Note: The study of genes and their role in heredity, their mutation and their use in dating fossils is an involved subject. This chapter provides an overview of the topic. See the References at the bottom of this page for more detail.
Fig. 16 Structure of a human cell
Courtesy - LadyofHats, Public domain, via Wikimedia Commons
Fig. 17 Cell, Nucleus, Chromosome and DNA
Courtesy - Sponk, Tryphon, Magnus Manske, User:Dietzel65, LadyofHats (Mariana Ruiz), Radio89, CC BY-SA 3.0 via Wikimedia Commons
The diagram above shows a representation of the cell, nucleus, chromosome and DNA.
Almost every cell in the human body has 23 pairs of chromosomes in its nucleus. In each pair, one chromosome comes from the mother; the other chromosome comes from the father. The 22 numbered pairs of chromosomes are called autosomes. The remaining pair determines the sex of the individual: males have one X and one Y chromosome; females have two X chromosomes. Chromosomes are made of Deoxyribonucleic acid (DNA), which carries the genetic material of the organism.
In addition, humans have DNA outside the nucleus. This is called mitochondrial DNA (mtDNA).
The Y chromosome is passed from grandfather to father to son to grandson and so on. The mtDNA is passed from a mother to all her children, both sons and daughters, but only daughters pass it on to the next generation. This is an important concept because tracing the Y chromosome can show the family tree on the father's side; tracing the mtDNA can show the family tree on the mother's side. We will have more to say on this topic in a later section.
Most chromosomes get recombined and hence change across generations. The Y-chromosome is unique. It is passed on from Father to son to grandson etc. for generations with little change, except through random mutations, which are also passed on.
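The two inheritance paths described above can be sketched in a few lines of Python. This is only an illustration: the family, the names in it and the helper functions (trace_line, patriline, matriline) are hypothetical and are not taken from any study.

```python
# Hypothetical family used only for illustration.
# Each person maps to (father, mother).
FAMILY = {
    "grandson": ("son", "daughter_in_law"),
    "son": ("father", "mother"),
    "daughter": ("father", "mother"),
    "father": ("grandfather", "paternal_grandmother"),
    "mother": ("maternal_grandfather", "maternal_grandmother"),
}


def trace_line(person, parent_index):
    """Follow one parent (0 = father, 1 = mother) back as far as the records go."""
    line = [person]
    while person in FAMILY:
        person = FAMILY[person][parent_index]
        line.append(person)
    return line


def patriline(person):
    """Path along which the Y chromosome is inherited (meaningful for males only)."""
    return trace_line(person, 0)


def matriline(person):
    """Path along which the mtDNA is inherited (applies to sons and daughters)."""
    return trace_line(person, 1)


print(patriline("grandson"))
print(matriline("son"))
print(matriline("daughter"))
```

In this sketch the son and the daughter share the same maternal line, while only the son's line continues the Y chromosome of the father and grandfather.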
Fig. 18 Inheritance
If the offspring inherits an X chromosome from both the Father and the Mother, it will be a girl. If the offspring inherits an X chromosome from the Mother and the Y chromosome from the Father, it will be a boy.
For each of the 22 numbered chromosome pairs in the offspring, one chromosome of the pair is inherited from the Mother and the other from the Father. For reasons of simplicity, this is not shown in the diagram.
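The sex chromosome combinations can be enumerated with a short Python sketch. It assumes the simple model used in this section, in which the mother contributes one of her two X chromosomes and the father contributes either his X or his Y. The function name offspring_sex is made up for this example.

```python
from itertools import product

MOTHER = ("X", "X")  # the mother can only pass on an X
FATHER = ("X", "Y")  # the father passes on either his X or his Y


def offspring_sex(from_mother, from_father):
    """Return 'boy' if a Y chromosome was inherited, otherwise 'girl'."""
    return "boy" if "Y" in (from_mother, from_father) else "girl"


# Every way of picking one chromosome from each parent.
outcomes = [(m, f, offspring_sex(m, f)) for m, f in product(MOTHER, FATHER)]

for from_mother, from_father, sex in outcomes:
    print(f"{from_mother} from Mother + {from_father} from Father -> {sex}")
```

Half of the combinations contain a Y chromosome, and in every one of them the Y comes from the Father.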
Fig. 19 DNA
Courtesy - US National Library of Medicine
Deoxyribonucleic acid (DNA) consists of a long string of chemicals. Of importance are four chemicals called Adenine (A), Guanine (G), Thymine (T) and Cytosine (C). These are called bases. Human DNA consists of about 3 billion (3 followed by 9 zeros) base pairs. 99.9% of the DNA is the same for all humans. It is the remaining 0.1% that determines the differences among humans. (Recall from a previous section that 96% - 98% of the DNA is common between humans and chimpanzees.) The bases on the two strands pair with each other through hydrogen bonds. Adenine pairs with Thymine through two hydrogen bonds; Guanine pairs with Cytosine through three hydrogen bonds.
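The pairing rules can be illustrated with a small Python sketch. The sample strand and the function names are made up for this example. The last lines work out how many positions 0.1% of about 3 billion comes to.

```python
# Base pairing rules: A pairs with T, G pairs with C.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}

# Number of hydrogen bonds formed by the pair each base belongs to.
HYDROGEN_BONDS = {"A": 2, "T": 2, "G": 3, "C": 3}


def complementary_strand(strand):
    """Return the partner base for every position, in the same left-to-right order."""
    return "".join(COMPLEMENT[base] for base in strand)


def hydrogen_bond_count(strand):
    """Total hydrogen bonds holding this strand to its partner strand."""
    return sum(HYDROGEN_BONDS[base] for base in strand)


strand = "ATGC"  # made-up example sequence
print(complementary_strand(strand))  # TACG
print(hydrogen_bond_count(strand))   # 2 + 2 + 3 + 3 = 10

# 0.1% of about 3 billion positions.
GENOME_SIZE = 3_000_000_000
differing_positions = GENOME_SIZE // 1000
print(differing_positions)           # 3000000
```

So the 0.1% difference still corresponds to about 3 million positions in the DNA.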
Fig. 20 Gene
Courtesy - Thomas Shafee, CC BY 4.0, via Wikimedia Commons
A gene is the basic functional and physical unit of heredity. It consists of a sequence of bases that gives a certain characteristic to the individual. Thus a gene is a part of the DNA. Some genes determine eye color; others enable language development, and so on. Some genes have a few hundred base pairs; others have more than a million (1 followed by 6 zeros) base pairs. By some estimates, human DNA has about 40,000 genes, of which about 20,000 code for proteins that are essential to the body. The remaining genes are not used for making proteins. These are called non-coding genes.
As mentioned above, a gene is a sequence of base pairs (bp) in the DNA that gives rise to specific characteristics in an individual. In some cases, the DNA sequence of a gene can vary between individuals at a specific location ("locus"). Each such variant of the gene is called an allele. Thus the same gene can result in two (or more) different characteristics. An example is the gene that determines the blood group. This gene has three common alleles (A, B and O), and every person carries two of them. The 6 possible pairings of these alleles give rise to one of 4 blood groups - A, B, AB and O. Another example is the set of alleles that give different eye colours.
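The blood group example can be worked through in Python. This sketch assumes the classical textbook model, which is an assumption added here: three alleles (A, B and O), with A and B both expressed when present and O expressed only when it is paired with another O. The function name blood_group is made up for this example.

```python
from itertools import combinations_with_replacement

ALLELES = ("A", "B", "O")  # classical three-allele model (assumption)


def blood_group(allele_1, allele_2):
    """Blood group produced by a pair of alleles, one inherited from each parent."""
    pair = {allele_1, allele_2}
    if pair == {"A", "B"}:
        return "AB"  # A and B are both expressed
    if "A" in pair:
        return "A"   # AA or AO
    if "B" in pair:
        return "B"   # BB or BO
    return "O"       # only OO gives group O


# All unordered pairings of the three alleles.
pairings = list(combinations_with_replacement(ALLELES, 2))
for pairing in pairings:
    print(pairing, "->", blood_group(*pairing))

groups = sorted({blood_group(*pairing) for pairing in pairings})
print(len(pairings), "pairings give", len(groups), "blood groups:", groups)
```

Under this model there are six allele pairings but only four blood groups, because AO looks the same as AA and BO looks the same as BB.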
This change in the genetic sequence can be at a single position, in which case it is called a Single Nucleotide Polymorphism (SNP), or it can span several hundreds or thousands of nucleotides.
Note: The basic unit of DNA or RNA is called a Nucleotide. A nucleotide consists of a five-carbon sugar ("ribose" in RNA or "deoxyribose" in DNA), one of 4 nitrogenous bases (Adenine, Guanine, Thymine and Cytosine in DNA; in RNA, Thymine is replaced by Uracil), and a phosphate group. See the figure above. If a base is changed to another one, say G is replaced by T, it gives rise to an allele. See the figure below.
Fig. 21 Haplotype
An example of 3 individuals with different haplotypes due to SNP
Courtesy - National Human Genome Research Institute
A haplotype is a group of alleles that are inherited together from a single parent. They are inherited together because they are close to each other on the chromosome and recombination between them is unlikely.
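As a rough illustration of how SNPs define a haplotype, the Python sketch below compares three short sequences position by position. The sequences are made up and assumed to be already aligned and of equal length; the function names are hypothetical.

```python
def snp_positions(sequences):
    """Indices at which the aligned sequences do not all carry the same base."""
    return [
        index
        for index, column in enumerate(zip(*sequences))
        if len(set(column)) > 1
    ]


def haplotype(sequence, positions):
    """The bases one individual carries at the variable positions."""
    return "".join(sequence[index] for index in positions)


# Made-up, equal-length sequences for three individuals.
individuals = {
    "person_1": "AACGTTAGC",
    "person_2": "AACATTAGC",
    "person_3": "AACGTTCGC",
}

positions = snp_positions(list(individuals.values()))
print(positions)  # [3, 6]

for name, sequence in individuals.items():
    print(name, haplotype(sequence, positions))
```

Only two of the nine positions vary here, so each individual's haplotype can be summarised by the two bases at those positions (GA, AA and GC).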
A haplogroup is a group of similar haplotypes that share a common ancestor with a SNP mutation. A population group in the same haplogroup shares a common ancestor on the father's (patriline) side or the mother's (matriline) side. In population genetics, the two commonly studied haplogroups are the Y-chromosome haplogroup and the mtDNA haplogroup. The Y-DNA and mtDNA change only through chance mutation and can thus be used to trace genetic ancestry.
Y-chromosome DNA has non-recombining (NRY) regions. These are regions in the Y-DNA that do not change across generations of fathers and sons except by mutations. Looking at these specific regions can yield information on ancestry. On the Y-chromosome, SRY, MSY1, MSY2 and MSY3 are examples of non-recombining regions.
Mutations in Y-DNA are passed from father to son.
Y-DNA haplogroups are named from A to T and subdivided using numbers and lower case letters.
The CT Y-haplogroup is the ancestor of all Y-haplogroups other than the older African haplogroups A and B. CT is estimated to be about 88,000 to 100,000 years old. CT has sub haplogroups DE & CF, which have sub haplogroups D & E and C & F respectively. Haplogroup F, together with its sub haplogroups, is the most prevalent Y-haplogroup in the world today. It constitutes over 90% of all non-African paternal lineages.
The predominant Y haplogroups in India are R1a1 (North Indian upper castes and thought to be associated with Indo-European migration), H (ancient Indian populations and considered to be indigenous to India), L (prevalent among South Indians), R2, J2, C and O (primarily among Northeastern Indians).
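The branching described above can be written down as a small tree in Python. The sketch covers only the haplogroups named in this paragraph (CT, DE, CF, D, E, C and F); the variable and function names are made up for this example.

```python
# Child -> parent links for the haplogroups named in the text.
PARENT = {
    "DE": "CT",
    "CF": "CT",
    "D": "DE",
    "E": "DE",
    "C": "CF",
    "F": "CF",
}


def lineage(haplogroup):
    """Path from the top of this small tree (CT) down to the given haplogroup."""
    path = [haplogroup]
    while path[-1] in PARENT:
        path.append(PARENT[path[-1]])
    return path[::-1]


def common_ancestor(first, second):
    """Most recent haplogroup that both lineages pass through."""
    shared = [a for a, b in zip(lineage(first), lineage(second)) if a == b]
    return shared[-1]


print(lineage("F"))               # ['CT', 'CF', 'F']
print(common_ancestor("C", "F"))  # CF
print(common_ancestor("D", "F"))  # CT
```

Two men whose haplogroups are C and F share a more recent paternal ancestor (in CF) than two men whose haplogroups are D and F (who only meet at CT).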
Fig. 22 Y DNA Haplogroup phylogenetic tree
The time and place of origin of the haplogroups are not known precisely, and there are conflicting claims. The data shown here has been collected from various sources.
Fig. 23 The above figure shows the possible migration of H. sapiens based on the Y-haplogroup
Courtesy - Chakazul, CC BY-SA 3.0 via Wikimedia Commons
L is the oldest mtDNA haplogroup and all the existing mtDNA haplogroups descend from this group. Specifically, a woman with the L3 haplogroup who migrated out of Africa about 60,000 years ago is believed to be the mother of all the people living outside Africa. We will have more to say about this in a later section.
L3 has given rise to M & N. N has given rise to R. All humans outside Africa have one of these haplogroups (or their sub haplogroups).
In India, the predominant mtDNA haplogroups are M, R and U (which is descended from R). M accounts for 60% of the population, while R (which is descended from an older macrohaplogroup called N) and U account for the rest. Interestingly, Europe has the N & R mtDNA haplogroups but does not have M. There is reason to believe that migration from the Middle East and/or South Asia populated Europe.
mtDNA is also largely non-recombining. Some regions of the mtDNA accumulate mutations at a high rate. These are called hypervariable regions.
Mutations in mtDNA are passed from mother to her child. By studying these mutations and the non-recombining regions, it is possible to trace ancestry.
When the DNA profiles of two individuals must be compared, molecular biologists look for genetic markers in the DNA. Genetic markers are specific patterns in the DNA that are inherited by children from their parents. Two types of markers are SNPs and Short Tandem Repeats (STRs). An STR is a short segment of DNA sequence that is repeated over and over again, back to back, on the chromosome. The number of repeats depends on the chromosome and the particular STR, and varies between individuals. Y-chromosomes have several STRs. By comparing the repeat counts of these STRs and taking the mutation rate into account, it is possible to trace the patriline ancestry and the nearby branches of the tree created by mutations.
For example, the Y-chromosome has an STR called DYS391 with the base sequence TCTA. The sequence can be repeated 6 to 14 times. A son will have the same number of repeats as the Father, unless there is a mutation, which happens less than 1% of the time.
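Counting the repeats of an STR can be sketched in Python as follows. The flanking bases and the repeat counts of the "father" and "son" are made up for illustration; only the TCTA motif and the 6 to 14 range come from the text. The function name max_tandem_repeats is hypothetical.

```python
import re


def max_tandem_repeats(sequence, motif):
    """Length, in copies, of the longest back-to-back run of the motif."""
    runs = re.findall(f"(?:{re.escape(motif)})+", sequence)
    return max((len(run) // len(motif) for run in runs), default=0)


MOTIF = "TCTA"

# Made-up sequences: arbitrary flanking bases around the repeated motif.
father = "GGCA" + MOTIF * 10 + "TGCC"
son = "GGCA" + MOTIF * 10 + "TGCC"             # same count as the father
mutated_son = "GGCA" + MOTIF * 11 + "TGCC"     # one extra repeat from a mutation

print(max_tandem_repeats(father, MOTIF))       # 10
print(max_tandem_repeats(son, MOTIF))          # 10
print(max_tandem_repeats(mutated_son, MOTIF))  # 11
```

Matching repeat counts across many such markers suggest a shared paternal line, while a difference of one repeat at a single marker is what an occasional mutation would look like.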
Unlike the Y-chromosome, mtDNA does not have STRs. It has hypervariable regions that show a high level of mutation. Two examples are HVR1 and HVR2. The study of these regions has been used to trace maternal ancestry. For example, a study of 111 samples collected from individuals in North and South India showed that 53% had lineages from haplogroup M, 11% had lineages from haplogroup U and 9% had lineages from haplogroup R. About 10% had lineages from European haplogroups. About 6% of the sample could not be defined.
The following must be noted -
Tracing the Y-DNA or mtDNA does not show the whole picture of a person's ancestry. For example, a person's maternal uncle could have a different Y-DNA and would not be counted in the family tree based solely on Y-chromosome.
A person could be of a completely different ancestry, say Chinese, but belong to a Y haplogroup found only in India. This would happen if a man from India migrated to China centuries ago and had offspring with a Chinese woman. His male descendants would have the same haplogroup but could not be considered to be Indian based solely on the Y haplogroup.
One factor used to trace the changes in Y-DNA and mtDNA and to create the genetic tree of a species (called the phylogenetic tree) is the rate of mutation. This rate is expressed as the number of mutations per site of the DNA per generation. One study sets the human mtDNA molecular clock at ~9000 years. This means that a new sub haplogroup, with a significant number (hundreds or thousands) of mutations, is formed about every 9000 years. Estimates vary, and some researchers have come up with a larger or smaller value. Another study sets it at 7900 years. Based on this value, it is postulated that the L3 mtDNA haplogroup from which all non-Africans originated is around 70,000 years old.
The PhyloTree project describes this in detail including the mutations that resulted in a new haplogroup. The diagram below is from this website.
Note: mtMRCA stands for mitochondrial Most Recent Common Ancestor
Fig. 24 mtDNA Haplogroups
Image Courtesy - van Oven M, Kayser M. 2009. Updated comprehensive phylogenetic tree of global human mitochondrial DNA variation. Hum Mutat 30(2):E386-E394. http://www.phylotree.org.
Fig. 25 Possible migration of H. sapiens based on mtDNA
Courtesy - User:Maulucioni, CC BY-SA 3.0 via Wikimedia Commons
Tracing the Y-DNA and mtDNA haplogroups has helped trace the migration of Homo sapiens from Africa to the rest of the world. As mentioned before, every person outside Africa is believed to have descended from a woman (or women) with the L3 mtDNA haplogroup who migrated from Africa about 60,000 to 70,000 years ago. The L3 mtDNA haplogroup itself has not been found outside of Africa. Its sub haplogroups M, N & R (and their sub haplogroups) are present in all populations outside Africa. Some M & N haplogroups have been found in Africa, but they are due to reverse migration into Africa and are dated later than L3.
The paper titled, "Out-of-Africa, the peopling of continents and islands: tracing uniparental gene trees across the map" by Oppenheimer is a comprehensive survey of the evidence and literature. The paper addresses the following topics.
Routes
Four routes are possible for migrating from Africa to Asia and Europe -
Across the Straits of Gibraltar from present day Morocco to Spain
Across the Mediterranean Sea from present day Tunisia to Sicily
North from East and Central Africa through Egypt into Sinai and into the Levant (present day Palestine, Jordan, Syria, Lebanon, etc.). This is called the Northern route.
Across the Bab El Mandeb at the southern end of the Red Sea from present day Eritrea in East Africa to Yemen in the Arabian peninsula. This is called the Southern route.
There is no fossil or DNA evidence for the first two routes. While mtDNA haplogroups M, N and R are found in South Asia, Europe has only N & R. The N lineages in Europe are younger than the N lineages in South Asia. This shows higher genetic diversity in South Asia compared to Europe and an older age for H. sapiens in South Asia. This rules out the Northern route. There might have been some migration from Egypt, but the fossils of H. sapiens found in the region are younger than the expected age of 60,000 - 70,000 years. This leaves the Southern route as the most plausible route.
Toba event
About 74,000 years ago (74 kya), a supervolcanic eruption happened at the site of the present day Lake Toba in Indonesia. It is not clear if the migration of H. sapiens out of Africa happened before or after the eruption. Hence this event cannot be used to date the migration.
mtDNA and Y Haplogroup
As mentioned before, the L3 mtDNA haplogroup is found only in Africa, but all the people outside Africa are descended from the M or N mtDNA haplogroups, whose origin is outside Africa. When the Y haplogroup is considered, the sub haplogroups of CF (C & F) and DE (D & E) are the haplogroups for all people outside Africa. Both the mtDNA and the Y-DNA evidence point to a single exit from Africa.
Fig. 26 Possible migration routes from Africa
Courtesy -
Sea-level change, palaeotidal modelling and hominin dispersals: The case of the southern Red Sea
Jon Hill, Alexandros Avdis, Geoff Bailey, Kurt Lambeck
Creative Commons CC_BY license
Fig. 27 Possible migration routes of H. sapiens from Africa
from Out-of-Africa, the peopling of continents and islands: tracing uniparental gene trees across the map
Mitochondrial DNA in Human Diversity and Health: From the Golden Age to the Omics Era
The Expansion of mtDNA Haplogroup L3 within and out of Africa
Out-of-Africa, the peopling of continents and islands: tracing uniparental gene trees across the map
David Reich – How One Small Tribe Conquered the World 70,000 Years Ago - YouTube