DNA-Part1

DNA overview.

Human genome.

DNA chemistry.

Nucleotide.

Base.

Double helix structure.

Codons.

Amino acids.

Genes.

Protein structure.

Protein functionality.

DNA and proteins.


DNA overview.

What is common, to bacteria, a butterfly, a goldfish, a rose plant, a mango tree,

an ant, an elephant, a dog, a cat, a chimpanzee, and us?

We might say, that all of them are living beings.

There is another thing, that is common to all of them.

All of them have DNA.

All living organisms, have DNA inside them.

We could even say, that DNA is the fundamental unit of life.


Organisms.

There are millions of types of living organisms.

Biologists have classified them, into a hierarchy.

There are seven levels in the hierarchy.

They are:

1. Kingdom.

2. Phylum.

3. Class.

4. Order.

5. Family.

6. Genus.

7. Species.

Any living organism, can be identified according to this classification.

For example, human beings belong to the:

Kingdom of Animalia.

Phylum of Chordata.

Class of Mammalia.

Order of Primates.

Family of Hominidae.

Genus of Homo.

Species of sapiens.

We are called Homo sapiens in short.
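
The seven levels, can be written down as a small Python sketch.

This is only an illustration.

The names RANKS, HUMAN, classify and short_name, are our own assumptions.

```python
# The seven ranks, ordered from the broadest to the most specific.
RANKS = ["Kingdom", "Phylum", "Class", "Order", "Family", "Genus", "Species"]

# The classification of human beings, one entry per rank.
HUMAN = ["Animalia", "Chordata", "Mammalia", "Primates",
         "Hominidae", "Homo", "sapiens"]


def classify(names):
    """Pair each rank with its name, keeping the order of the hierarchy."""
    if len(names) != len(RANKS):
        raise ValueError("one name is needed for each of the seven ranks")
    return dict(zip(RANKS, names))


def short_name(classification):
    """Build the two part name, from the genus and the species."""
    return classification["Genus"] + " " + classification["Species"]


human = classify(HUMAN)
print(short_name(human))  # prints "Homo sapiens"
```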

This 7 step classification was done by biologists, before DNA was discovered.

After DNA was discovered, amazingly, a strong correlation, was found,

between this classification and DNA.

The species of human beings, or Homo sapiens, have a unique type of DNA.

Similarly, bacteria, a butterfly, a goldfish, a rose plant, a mango tree,

an ant, an elephant, a dog, a cat, a chimpanzee,

all have a different and unique kind of DNA.

All butterflies have a type of DNA.

All mango trees have a type of DNA.

All dogs have a type of DNA.

All cats have a type of DNA.

If we know the DNA, we can identify, what kind of living organism it is.


Evolution of life.

Even before DNA was discovered,

scientists knew that all kinds of life are inter-related.

Charles Darwin proposed, the theory of evolution.

According to this theory, all forms of life evolved, from a common source.

Over thousands and millions of years, many species of life, evolved on earth.

All of them evolved, from a common ancestor.

The earliest forms of life, were single celled organisms.

They are called prokaryotes.

Millions of types of living organisms, evolved from this common ancestor.

We Homo sapiens, human beings, are one of them.

You would be fascinated to know, that we share some parts of DNA,

with all our ancestors,

from apes to bacteria.


DNA Discovery.

James Watson, Francis Crick and Maurice Wilkins, received the Nobel Prize,

for discovering the structure of DNA.

Unfortunately another scientist, Rosalind Franklin,

who played an equally important role, in its discovery,

was not alive long enough, to receive the prize.


DNA evolution.

Scientists, using DNA, were able to prove, that we evolved, from a common ancestor.

With DNA, they were able to map the tree of life.

In this tree of life, we Homo sapiens, are an outer branch,

which made an appearance, about 200 thousand years ago.

We still share, some DNA with bacteria.

Our closest cousins, in the tree of life, are chimpanzees.

We share about 95% of our DNA, with chimpanzees.

The 5% that differentiates us, makes a real difference.

We can do an amazing amount of things, that a chimpanzee cannot do.

However, many life processes, within our cells, are common to chimpanzees.

It is a common practice, by scientists, to test drugs, invented for human beings,

on mice, and chimpanzees.

DNA has been able to prove that, all forms of life, are inter-related,

and evolved from a common ancestor.


DNA similarities and differences.

DNA is a great way, to identify a species of life.

We can know from DNA, whether an organism is a cat, or a dog, or a human being.

Does this mean, that all human beings, have exactly the same DNA?

DNA springs the next big surprise.

Human beings, inhabiting different parts of the world, have slightly different DNA.

Within the same country, people who traditionally, lived in one area,

can have DNA, different from people, who lived in a different area.

Scientists have used this principle, to trace the migration, of human beings,

from Africa, to the rest of the world.


Family members tend to have very similar DNA.

Can we say, that very close family members, have the same DNA?

Again DNA springs a surprise.

Each family member, has their own unique DNA.

Even siblings, other than identical twins, will have separate and unique DNA.

Any person in the world, living or dead,

can be uniquely identified by their DNA.

Earlier, fingerprints were used to identify a person.

It is possible, that in the future, DNA will be widely used to identify a person.

In a limited way, DNA is already used to identify persons.


DNA, can not only be used as a great differentiator,

but also as a unique identifier.

This principle is equally applicable, to other forms of life.

Dogs will have similar DNA.

But, the DNA of your neighbour’s dog,

will be different, from the DNA of your dog.


Cell and DNA.

Where does this mysterious DNA reside?

It resides in “all” the living cells.

Almost all the trillions of living cells, in our body,

have the same DNA, residing in them.

DNA is the very basis, of a living cell.


A cell comprises many organelles.

All the organelles, are enclosed within the cell membrane.

The most important component of a cell, is the nucleus.

The nucleus is a membrane enclosed organelle.

DNA resides in the nucleus of the cell.

The DNA is enveloped, by its own membrane.

The cell nucleus within the cell, contains all the primary DNA.


The entire metabolism of the cell, is controlled by the DNA.

DNA can be considered as the encyclopaedia of life.

It contains all the information, to enable the complex biochemical processes,

that take place, within the cell.

Proteins are like biochemical machines, in living organisms.

There are thousands of proteins, which perform, thousands of functions.

The information needed, to construct each protein, is encoded in the DNA.

More amazingly, all the rules for the construction, of the protein,

are also encoded in the DNA.

Protein synthesis, is one of the important functions of the cell.

The DNA encodes all the data, and all the instructions,

to enable all the complex processes, that constitute life.


Cell specialisation.

Almost every cell in our body, has identical DNA.

The DNA in the tip of your nose, is the same, as the DNA, in the tip of your toes.

The DNA in our eyes, our ears, our nose, our tongue, our heart, our brain,

our kidneys, and our liver, are all exactly the same.

How is it then, that these diverse organs, perform such diverse functions?

Certain cells, specialise in certain functions.

Only certain parts of the DNA, express themselves, in certain cells.

For example, selective parts of the DNA, express themselves,

in the retina cells, of the eye.

This enables the retina, to perform a specialised function.

If we take an analogy,

the complete encyclopaedia of life, is present in every cell,

but only a certain topic, is used in a specialised cell.

DNA, is a general purpose, all encompassing encyclopaedia.

DNA intelligently, and selectively disseminates, information and instructions,

in specialised cells.


DNA replication.

The most fascinating fact about DNA, is that it can replicate itself.

It is the only part of the cell, that can replicate itself.

Cells in our body have a limited life span.

They keep dying, and new cells are constantly being regenerated.

This requires cells to replicate themselves.

The starting point of cell replication, is DNA replication.

First the DNA replicates, and then it provides the information and instructions,

for the cell to build itself.


Life begins as a single cell.

Part of the father’s DNA, and part of the mother’s DNA, fuse together.

A new DNA is born.

This is the beginning of life.

From this single cell, trillions of other cells, are generated.

The cells specialise in different functions.

The end result, is the wonderful body, that nature has gifted us.


Human Genome.

The DNA molecule, is analogous, to the encyclopaedia of life.

The basic alphabets, of DNA are, A, T, C, and G.

The entire encyclopaedia, of life, is written with these alphabets.

These 4 alphabets, are used to write, 3 character codons.

Codons, can be considered as the words of DNA.

A gene, is a string of codons.

A gene, is like a sentence, in the encyclopaedia.

A chromosome is a string of genes.

Each chromosome, can be considered, as a volume of the encyclopaedia.

Human beings have 23 pairs of chromosomes.

The encyclopaedia, has 23 volumes.

Each volume has 2 copies.

So, human DNA has 23 pairs of chromosomes.

All the human DNA, is collectively referred to as the human genome.

The human genome, is the encyclopaedia of life.

This summarises the organisation of DNA,

the molecule of life.

DNA chemistry.

Let us imagine a small game of scrabble.

We are given 4 chemical compounds.

These 4 chemical compounds are called,

A, T, C, and G.

With these 4 compounds, A, T, C, and G,

we are asked to build, a human being.

Is it possible?

It sounds impossible.

But, for nature it is possible.

Nature has an amazing way, to build complex organisms,

from simple building blocks.

It is worth exploring, the chemistry of DNA,

to discover, how this is possible.

DNA stands for Deoxyribonucleic acid.

Deoxyribonucleic acid, is just a biochemical molecule.

DNA is a chain of sub units, called nucleotides.

Nucleotide.

DNA molecules store genetic information,

in their repeating subunit structure.

Each subunit, is called a nucleotide.

A nucleotide has three components:

A phosphate group,

A sugar called deoxyribose,

and a base.

The base is a ring of carbon and nitrogen atoms.

The combination of these, is a nucleotide.

The phosphate group, and the sugar group are exactly the same,

in all the nucleotides.

It is the base, which is different for the different nucleotides.

DNA is a chain of nucleotides.

The chain of nucleotides, is biochemically linked together, by the phosphate group.

Base.

The base is built from one or two rings, of carbon and nitrogen atoms.

It is the base, which differentiates one nucleotide, from another.

There are only four types of nucleotides.

They are:

Adenine, denoted by the alphabet “A”.

Thymine, denoted by the alphabet “T”.

Cytosine, denoted by the alphabet “C”.

Guanine, denoted by the alphabet “G”.

Cytosine and Thymine, or “C” and “T”,

contain one ring of nitrogen and carbon atoms.

Adenine and Guanine, or “A”, and “G”,

contain two rings of nitrogen and carbon atoms.

This biochemical difference, is the significant factor,

which determines the pairing properties, of these nucleotides.

C and T are called pyrimidine bases.

A and G are called purine bases.
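
The two families of bases, can be sketched in Python.

This is only an illustration, based on the ring counts given above.

The names PYRIMIDINES, PURINES, ring_count and base_family, are our own assumptions.

```python
# Bases with one ring of carbon and nitrogen atoms.
PYRIMIDINES = {"C", "T"}

# Bases with two rings of carbon and nitrogen atoms.
PURINES = {"A", "G"}


def base_family(base):
    """Return the family name of a base, given its single alphabet."""
    if base in PYRIMIDINES:
        return "pyrimidine"
    if base in PURINES:
        return "purine"
    raise ValueError("not a DNA base: " + repr(base))


def ring_count(base):
    """Return the number of rings in the base."""
    return 1 if base_family(base) == "pyrimidine" else 2


for base in "ATCG":
    print(base, base_family(base), ring_count(base))
```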

DNA is a long chain of nucleotides,

which has one of these bases.

We can think of DNA, as having the string, of these bases.

It is the base, which essentially differentiates, and uniquely identifies,

a nucleotide.

It is common practice, to identify the whole nucleotide,

with a single alphabet, representing the base.

We will also discuss the nucleotides, with their short name.

For example, we can say that DNA is a string of characters,

made up of “A”, “T”, “C”, and “G”.

Double helix structure.

Let us imagine the structure of DNA.

We will use the analogy of a ladder.

The ladder has, two long poles.

These two poles, are joined together, by multiple short steps or rungs.

In DNA, each pole corresponds to a long string of bases.

For example, the string might be,

A, T, T,. A, G, G,. T, A, C,. G, T, A,. T, G, T,. G, A, T,.

and so on.

The so on, of course, is a very long so on.

This string will have billions of characters.

Though it is a very long string, the component characters, will be only,

A, T, C, or G.


The ladder has two long poles.

DNA has two long strings, of billions of bases.

Both the strings, will consist of the same bases,

of A, T, C, or G.

The bases A, T, C and G, can be considered as the alphabets of life.


Now we need to connect these two strings, to resemble a ladder.

Nature follows a very simple rule, to make this connection.

The base A, always connects to a base T.

The base C, always connects to a base G.

The rungs of the ladder, can be the base pairs:

A, T.

T, A.

C, G.

G, C.

Make about 3 billion of these base pairs, and we have life,

in the form of DNA.
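
The pairing rule, can be sketched in a few lines of Python.

This is only an illustration, and not a model of the real chemistry.

The names PARTNER and partner_strand, are our own assumptions.

The sketch also ignores the fact, that the two real strands run in opposite directions.

```python
# The pairing rule: A connects to T, and C connects to G.
PARTNER = {"A": "T", "T": "A", "C": "G", "G": "C"}


def partner_strand(strand):
    """Build the second pole of the ladder, base by base."""
    return "".join(PARTNER[base] for base in strand)


first_pole = "ATTAGG"
second_pole = partner_strand(first_pole)

# Each rung of the ladder is one base pair.
rungs = list(zip(first_pole, second_pole))

print(second_pole)  # prints "TAATCC"
print(rungs[0])     # prints ('A', 'T')
```

Applying the rule twice, gives back the original string.

This is why, one pole is enough to rebuild the other.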


In the actual structure of the DNA, the ladder is twisted,

in the form of a helix.

So, the DNA has a double helix structure.

There is a simple and elegant beauty in this design.

The design codes the instructions, for producing the essential molecules,

of life.

For example, the codes for synthesising, all the proteins, are built into,

the code of the DNA.


There is another more fascinating aspect in this design.

The defining nature of life, is that it should be able to reproduce.

Reproduction, starts with the splitting of the DNA ladder,

into two strings.

Each string finds the partners, for its bases, A, T, C and G,

to form two DNA molecules, from the original one.

This is the very basis of life, and the continuity of life.
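
The splitting and rebuilding, can be sketched in Python.

This is a simplified illustration, under our own assumptions.

A DNA molecule is modelled, as a pair of strings.

The names build_partner and replicate, are hypothetical.

```python
# The pairing rule: A connects to T, and C connects to G.
PARTNER = {"A": "T", "T": "A", "C": "G", "G": "C"}


def build_partner(strand):
    """Find the partner base, for every base in a single string."""
    return "".join(PARTNER[base] for base in strand)


def replicate(molecule):
    """Split the ladder into two strings, and rebuild each one.

    The molecule is a pair (first_string, second_string).
    Two molecules are returned, each keeping one original string.
    """
    first, second = molecule
    copy_one = (first, build_partner(first))
    copy_two = (build_partner(second), second)
    return copy_one, copy_two


original = ("ATTAGG", "TAATCC")
copy_one, copy_two = replicate(original)
print(copy_one == original, copy_two == original)  # prints "True True"
```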

Codons.

We know that each string, in the DNA double helix structure,

is a long string of alphabets, A, T, C and G.

There is an interesting way, that these bases organise themselves.

They organise themselves into three character strings.

Each three character string, or triplet code, is called a codon.

Some examples of codons are:

A, T, T,.

A, G, G,.

T, A, C,.

G, T, A,.

T, G, T,.

G, A, T,.

There is an interesting correlation, between the 3 character codons,

and an amino acid.

Most of the 3 character codons, correspond to a specific amino acid.
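
Reading a string of bases, three characters at a time, can be sketched in Python.

This is only an illustration.

The name split_into_codons, is our own assumption.

Any left over bases at the end, which do not fill a codon, are ignored here.

```python
def split_into_codons(strand):
    """Cut a string of bases into three character codons."""
    usable = len(strand) - len(strand) % 3  # drop an incomplete last codon
    return [strand[i:i + 3] for i in range(0, usable, 3)]


print(split_into_codons("ATTAGGTACGTATGTGAT"))
# prints ['ATT', 'AGG', 'TAC', 'GTA', 'TGT', 'GAT']
```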

Amino acids.

Almost all amino acids have an amino group,

and a carboxyl group, attached to a carbon atom.

The amino group is - N H 2.

The carboxyl group is - C O O H.

The third bond of the carbon atom is attached to a hydrogen atom.

The fourth bond of the carbon atom is attached, to an amino acid side chain.

We will call this amino acid, side chain, as “R”.

The general formula for an amino acid, can be written as:

R - C H, with one branch of N H 2, with another branch of C O O H.

The side branch “R”, uniquely defines, an amino acid.

Human beings have the same set of 20 different amino acids.


Each of the 20 amino acids, that human beings use, corresponds to one or more codons.

By arranging the four alphabets A, T, C, and G, in 3 character words,

we can create 4 multiplied by 4 multiplied by 4, equal to 64,

3 character words.

We can therefore get 64 unique combinations, or codons, from these alphabets.
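
The count of 64, can be checked with a short Python sketch.

It simply lists every possible 3 character word.

The names BASES and ALL_CODONS, are our own assumptions.

```python
from itertools import product

BASES = "ATCG"

# Every ordered choice of 3 bases, with repetition allowed.
ALL_CODONS = ["".join(triplet) for triplet in product(BASES, repeat=3)]

print(len(ALL_CODONS))  # prints 64
print(ALL_CODONS[:4])   # prints ['AAA', 'AAT', 'AAC', 'AAG']
```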

Though theoretically, we can have 64 possible amino acids,

only 20 amino acids are used by the body.

Each of these 20 amino acids, can be identified by one or more codons.

An amino acid, is referred to, by a full name, like Arginine,

or a short name like A,r,g.

For example,

A, T, T,. corresponds to the amino acid :. i,l,e. : which is called Isoleucine.

A, G, G,. corresponds to the amino acid:. A,r,g. : which is called Arginine.

T, A, C,. corresponds to the amino acid:. T,y,r. : which is called Tyrosine.

G, T, A,. corresponds to the amino acid:. V,a,l. : which is called Valine.

T, G, T,. corresponds to the amino acid:. C,y,s. : which is called Cysteine.

G, A, T,. corresponds to the amino acid:. A,s,p. : which is called Aspartic acid.

So, the seemingly meaningless string of alphabets, we discussed earlier,

corresponds to easily identifiable amino acids.
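
The six examples above, can be written as a small lookup table in Python.

This is only a sketch.

The table holds just these six codons, and not the full set of 64.

The names CODON_TABLE and read_amino_acids, are our own assumptions.

```python
# Only the six codons discussed above: codon -> (short name, full name).
CODON_TABLE = {
    "ATT": ("Ile", "Isoleucine"),
    "AGG": ("Arg", "Arginine"),
    "TAC": ("Tyr", "Tyrosine"),
    "GTA": ("Val", "Valine"),
    "TGT": ("Cys", "Cysteine"),
    "GAT": ("Asp", "Aspartic acid"),
}


def read_amino_acids(strand):
    """Read a string of bases, codon by codon, and return the short names."""
    usable = len(strand) - len(strand) % 3
    codons = [strand[i:i + 3] for i in range(0, usable, 3)]
    # A codon missing from this partial table raises a KeyError.
    return [CODON_TABLE[codon][0] for codon in codons]


print(read_amino_acids("ATTAGGTACGTATGTGAT"))
# prints ['Ile', 'Arg', 'Tyr', 'Val', 'Cys', 'Asp']
```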


If, A, T, C, and G are considered as the alphabets of life,

codons can be considered as the words of life.

There are 64 words of life, most of which correspond, to one of 20 amino acids.

The alphabets and words of life, are universal.

For example, the code word for the amino acid, tryptophan,

in the DNA of a bacterium, amoeba, plant, animal, or human being,

is exactly the same.

All forms of life on Earth, share the same alphabets, and words of life.

Genes.

Is the code of life, just one long string of words?

No.

Nature groups these words together, in meaningful sentences.

Many 3 character words, constitute a sentence.

How does nature know, that a sentence has ended, and a new one started?

We need to go back to the alphabets of life.

The 4 alphabets, we know, can be combined into 64, 3 character words.

We use only 20 amino acids, or words.

Some of the combinations of the alphabets,

which do not correspond, to an amino acid,

are used by DNA, as termination codes or stop codes.

They act as terminators, or full stops, to a sentence of words.

The string of codons, in the DNA, is called a gene.

Just like a sentence has many words,

a gene has many codons.

The number of words in a sentence may vary.

Some sentences are relatively shorter.

Some sentences are relatively longer.
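
The idea of a full stop, can be sketched in Python.

This is a simplified illustration, under our own assumptions.

The three stop codons used here, T A A, T A G, and T G A, are not listed in the text above.

They are the stop codons of the standard genetic code, written with DNA alphabets.

The sketch ignores start signals, and everything else that marks a real gene.

The name split_into_sentences, is hypothetical.

```python
# Stop codons of the standard genetic code, in DNA alphabets.
STOP_CODONS = {"TAA", "TAG", "TGA"}


def split_into_sentences(strand):
    """Group codons into sentences, ending each one at a stop codon."""
    sentences = []
    current = []
    usable = len(strand) - len(strand) % 3
    for i in range(0, usable, 3):
        codon = strand[i:i + 3]
        if codon in STOP_CODONS:
            if current:  # a full stop closes the current sentence
                sentences.append(current)
                current = []
        else:
            current.append(codon)
    if current:  # keep trailing words, that have no full stop yet
        sentences.append(current)
    return sentences


print(split_into_sentences("ATTAGGTAA" + "TACGTATGTTGA"))
# prints [['ATT', 'AGG'], ['TAC', 'GTA', 'TGT']]
```

The two sentences in this example, have different lengths.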


Genes are meaningful sentences, of a living organism.

Many genes could be common to many organisms.

Some genes could be specific, to a particular type of living organism.

In the tree of life, the closest relative to human beings, is the chimpanzee.

We share 95% of our genes, with the chimpanzee.

We also share genes, with dogs, lions, birds, fish, plants, trees, insects and bacteria.

We are a member, of a very large family, called life.


In common usage, the term, DNA, and gene, are used interchangeably.

We might hear the phrase:

It is in the genes.

It is in the DNA.

In both cases, what we intend to convey, is that it is in our genetic code.

This is a convenient way of communication.

Though, we now know, that they are technically different,

we will follow the convention of using them, interchangeably.

Proteins are biochemical molecules.

There is an interesting correlation between genes, and proteins.

Proteins are the molecular machinery, of life.

Genes contain the code, to synthesise these proteins.

All the genes, that humans have, are collectively known as the human genome.

We do not know as yet, exactly how many genes, are active,

in the human genome.

We can say, that we have approximately 20000 to 25000 genes, that code for proteins.

Protein structure.

Two factors determine the biochemical formula of the protein.

- The total number of amino acids, in the chain.

- The specific type of amino acid, at each position, in the chain.


The number of combinations, in which protein molecules can be built,

is almost infinite.

Peptides are smaller strings, of amino acids.

Let us take an example, of a simple peptide molecule with just 3 amino acids.

Since we can select from 20 different amino acids, and 3 possible positions,

we can have 20 multiplied by 20 multiplied by 20, equal to 8000 types of peptides.

If we choose to have six amino acids in the chain,

the number of possible peptides is 20 to the power of 6,

equal to 64 million types of peptides.

A typical protein will have several hundred amino acids,

linked together.

If we extend the arithmetic, to calculate the number of possible proteins,

the resulting number will be so very large, that it is almost infinite.

The body thus using only 20 types of amino acids, can synthesise,

an almost unlimited number of proteins.
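
The arithmetic above, can be checked with a short Python sketch.

The name possible_chains, is our own assumption.

The chain length of 300, is chosen only for illustration.

```python
AMINO_ACID_TYPES = 20


def possible_chains(length):
    """Count the chains of a given length, with 20 choices at each position."""
    return AMINO_ACID_TYPES ** length


print(possible_chains(3))  # prints 8000
print(possible_chains(6))  # prints 64000000

# For a longer chain, we print only the number of digits in the count.
print(len(str(possible_chains(300))))
```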


Proteins are not linear biochemical molecules.

They coil up into different shapes.

The sequence of amino acids influences, the shape of the protein.

The shape of the protein molecule, also plays a significant role,

in its functionality.

The shape of the protein, exposes certain amino acids to biochemical activity.

For example, if a particular protein, is shaped in a particular way,

it will have certain exposed amino acids.

The protein will then attach itself, to a particular virus.

This protein will be used by our immune system, to attack the virus,

and protect us from it.


The structure of a protein is analogous to a long string of beads.

Each bead represents one of the 20 amino acids.

This chain of beads, might be coiled in a particular shape.

All this contributes, to the uniqueness, of the protein.

Now, we can begin to comprehend the fascinating variety and complexity,

that can be built from, 20 simple amino acids.

Protein functionality.

Proteins perform, an amazing number of functions, within the cell,

and in the human body.

Each unique protein, performs a unique function.

Within the cell, there is a large variety of proteins.

Proteins make things happen.

Proteins break down foods, make hormones,

provide structure and shape to cells, etc.


The complete set of proteins, in a particular cell type, is called a proteome.

The study of interactions between specific proteins, is the key to understanding,

the important cellular functions.

Ultimately, this is what determines the cell type.

Proteins also perform a wide variety of extra cellular functions.

This is the reason, we can say, that proteins, are the molecular machines of the body.

To get an idea of what proteins can do, we will discuss a few examples.


Most structural proteins, are fibrous proteins.

When proteins bind together, they form fibrils, which have structural functionality.

Collagen is a structural protein, found in most connective tissues in the body.

It is the most abundant protein, in our body.

Collagen is present in tendons, ligaments, and skin.

It is also present in the eye cornea, cartilage, bones and blood vessels.


The cytoskeleton is the intra cellular matrix,

that supports the cell shape, and functions.

The cytoskeleton is made of proteins.


Mechanical forces, are generated by motor proteins.

They generate the forces, in contracting muscles.


Keratin found in hard filamentous structures, such as hair and nails,

is also a structural protein.


Enzymes are used to catalyse biochemical reactions.

Each enzyme catalyses one type of biochemical reaction.

Each enzyme is a particular type of protein.

There are about 4000 known biochemical reactions,

which are catalysed by protein enzymes.


Enzymes can play a dramatic role, in biochemical reactions.

A reaction which would take years, can be accelerated by enzymes,

to take place in seconds.

Though the enzymes, are not used up in the reaction,

they can be considered, as important as the biochemicals themselves.

Enzymes are also involved,

in DNA replication, repair and transcription.


Sensor functionality, is also achieved by proteins.

Specific proteins, called colour opsins identify specific colours.

Specific odours, are detected by specific proteins.


Many signalling mechanisms are also carried out by proteins.

Hormones are signalling molecules.

They travel in the blood circulatory system, and carry messages,

to distant organs.

Insulin is an example of a protein, which acts as a signalling molecule.

It regulates the metabolism of carbohydrates and fats.

It stimulates the absorption of glucose, from the blood stream.

It also stimulates the synthesis and storage of fat.


Membrane proteins, act as receptors, in cell membranes.

They bind with signalling molecules, and induce a biochemical reaction.

Many drugs are targeted, to act on membrane proteins.


Membrane proteins can also act as selective gate keepers, in the cell membrane.

They allow specific molecules, to come in, and go out of cells.

Oxygen, carbon dioxide, glucose, sodium, potassium etc.

which travel across membranes, are regulated by membrane proteins.


Antibodies are proteins, which function in the immune system.

Antibodies bind to antigens, or foreign substances, like harmful bacteria,

and target them for destruction.


Ligand transport proteins, bind to certain molecules,

and transport them, in the body.

The protein haemoglobin, transports oxygen, in the blood stream,

from the lungs to all parts of the body.


We discussed a few examples, of protein functionality.

There are thousands of such functionalities,

which are performed by proteins, in the human body.

DNA and proteins.

Proteins are the most important molecular machines, in the body.

Thousands of proteins, in our body, enable and regulate metabolism.

DNA is directly responsible, for providing the instructions,

to synthesise these proteins.

The instructions for synthesising, all the proteins, that we require,

are built into the genes, of DNA.

Typically, one gene would be responsible,

to provide the main instruction, for synthesising one protein.

DNA resides inside the nucleus of the cell.

DNA does not directly synthesise proteins.

It sends out the instructions, from the nucleus, to the cell, to do this.

Protein synthesis takes place within the cell, using the instructions.

It uses the amino acids in the cell, for this process.

DNA uses a biochemical molecule, called RNA, to send the instructions.


The entire DNA is present in every cell,

regardless of what type of cell it is.

Cells in the human body are specialised.

Some examples are:

bone marrow cells.

kidney cells.

liver cells.

eye cells.

and so on.

Each type of differentiated specialised cell, synthesises some specialised proteins.

Only certain genes, express themselves, in a specialised cell.

By selective expression of genes, cells are able to specialise,

in the thousands of different functions, they perform in the body.

So, we are able to have the same DNA, but with specialised functions,

because of the selective expression of genes.

When a specific gene is expressed, a specific RNA is created,

and sent out to the cell.
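
The selective expression of genes, can be sketched in Python.

This is a highly simplified illustration, and not a model of a real cell.

Every cell type carries the same GENOME, but only some genes are switched on.

The names GENOME, EXPRESSED_IN and proteins_made, and the four gene labels, are our own assumptions.

```python
# The same complete set of genes is present in every cell: gene -> protein.
GENOME = {
    "opsin gene": "opsin",
    "insulin gene": "insulin",
    "keratin gene": "keratin",
    "haemoglobin gene": "haemoglobin",
}

# Which genes are switched on, in which specialised cell (illustrative only).
EXPRESSED_IN = {
    "retina cell": {"opsin gene"},
    "pancreas cell": {"insulin gene"},
    "hair forming cell": {"keratin gene"},
    "red blood cell precursor": {"haemoglobin gene"},
}


def proteins_made(cell_type):
    """Return the proteins made by a cell, from its expressed genes only."""
    active = EXPRESSED_IN[cell_type]
    return sorted(protein for gene, protein in GENOME.items() if gene in active)


for cell_type in EXPRESSED_IN:
    print(cell_type, "->", proteins_made(cell_type))
```

The encyclopaedia is the same in every cell.

Only the topic that is read, changes from cell to cell.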