The simplest unit of genetic material is a nucleotide. Nucleotides combine together to form a polymer, a polynucleotide, which is used by cells and viruses to make proteins that perform functions and form structural components. There are two types of polynucleotides: ribonucleic acid, known as RNA, and deoxyribonucleic acid, known as DNA. A nucleotide consists of three components: a pentose sugar, a phosphate group and a nitrogenous base.
Figure 1. A nucleotide consists of three components: a phosphate group, a pentose sugar and a nitrogenous base.
The three components bond covalently through two condensation reactions, each releasing one molecule of water, so two molecules of water are released in total. The phosphate group always bonds directly to one carbon on the pentose sugar and the nitrogenous base bonds directly to a different carbon on the pentose sugar. The nucleotide, consisting of the three components, can then link with other nucleotides by further condensation reactions to form a polymer.
Figure 2. The three components of the nucleotide bond by condensation reactions.
The phosphate group is derived from phosphoric acid, H3PO4. Phosphoric acid is a weak acid that is capable of undergoing ionisation, depending on the pH of the surrounding environment. The phosphoric acid molecule is triprotic, meaning that it can lose up to three protons by ionisation in order to bond to the pentose sugar and other nucleotides by a condensation reaction.
Figure 3. The structure of the phosphate group is derived from phosphoric acid by ionisation (loss of three protons, H+ ions).
The pentose sugars are five-carbon monosaccharides that form a five-membered (pentagonal) ring. One of two different sugars is used, either ribose or deoxyribose, depending on whether RNA or DNA is constructed. RNA requires ribose, while DNA requires deoxyribose. The difference between the two monosaccharides is that deoxyribose has one fewer oxygen atom – it has been 'deoxygenated' – so the chemical formulas for these two sugars differ. Ribose has the general formula expected for a pentose monosaccharide, C5H10O5, while deoxyribose has the formula C5H10O4. This difference in structure is one way the body can distinguish between RNA and DNA.
Figure 4. The pentose sugar components of DNA and RNA.
There are four different nitrogenous bases that are used for DNA – adenine, guanine, cytosine and thymine – abbreviated by their first letters, A, G, C and T. For RNA, adenine, guanine and cytosine are used, but instead of thymine, a different nitrogenous base is used: uracil, abbreviated by the letter U. The difference between thymine and uracil is another way the body can distinguish between RNA and DNA.
Figure 5. The nitrogenous bases of DNA and RNA.
Adenine and guanine are referred to as purine bases, referring to their two fused rings, one hexagonal and one pentagonal. There are slight differences in their structure: adenine has one amino functional group, while guanine has a carbonyl and an amino functional group. Thymine, uracil and cytosine are referred to as pyrimidine bases, referring to their single, hexagonal ring. The differences in the structure of these three bases are slight – thymine has two carbonyl functional groups and a methyl group, uracil has two carbonyl functional groups, while cytosine has one carbonyl and one amino functional group.
The structures for the two pentose sugars and the five nitrogenous bases are in section 34 of the data booklet. You should be able to identify the distinguishing functional groups.
Two or more nucleotides can bond together by condensation reactions, forming a polymer called a polynucleotide. The nucleotides bond together through the pentose sugar and the phosphate group, making an alternating pattern of sugar-phosphate-sugar-phosphate, referred to as a sugar-phosphate backbone, with the nitrogenous bases extending from the pentose sugars.
Figure 6. The assembly of an RNA strand.
The hydroxyl group from the pentose sugar from one nucleotide reacts with the hydroxyl group from the phosphate of another nucleotide by a condensation reaction, releasing a water molecule.
Figure 7. The formation of a polynucleotide by a condensation reaction.
When balancing equations for condensation polymers involving nucleotides, just as with other condensation reactions forming polymers, the number of water molecules evolved is always one less than the number of monomers used. For example, if six nucleotides are forming a polynucleotide, the number of water molecules evolved would be (6-1) = 5.
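The counting rule above can be checked with a minimal Python sketch. The function name `water_molecules_released` is made up for illustration, and the sketch assumes a single linear strand built from nucleotides that have already been assembled (it does not count the water released when each nucleotide is formed from its three components).

```python
def water_molecules_released(n_nucleotides: int) -> int:
    """Water molecules released when n nucleotides join into one linear strand."""
    if n_nucleotides < 1:
        raise ValueError("need at least one nucleotide")
    # Each new sugar-phosphate link releases one water molecule,
    # and a linear chain of n monomers contains n - 1 links.
    return n_nucleotides - 1


for n in (2, 6, 100):
    print(n, "nucleotides ->", water_molecules_released(n), "water molecules")
```

For six nucleotides this gives five water molecules, matching the worked example in the text.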
DNA and RNA are both polynucleotides, but they perform different functions and therefore have different structures. The body must be able to distinguish between DNA and RNA molecules in order to ensure that they are performing their different functions for the organism.
RNA molecules generally consist of a single strand of polynucleotides, although there are some viruses that contain a double strand, similar to DNA. Recall that DNA uses a deoxyribose sugar and thymine base, while RNA uses a ribose sugar and uracil base. RNA molecules are typically much shorter, containing fewer nucleotides, than DNA and only contain small portions of genetic material.
DNA molecules consist of two strands of polynucleotides that are held together by hydrogen bonds between the nitrogenous bases. The structure of a ladder is used as an analogy for the structure of DNA. The sugar-phosphate backbone makes up the two rails of the ladder, while the nitrogenous bases make up the rungs of the ladder. Although each hydrogen bond is weak on its own, the very large number of hydrogen bonds along the molecule keeps the important genetic code, spelled out by the nitrogenous bases, well protected. The two strands of DNA can be separated by overcoming the hydrogen bonds with the use of special enzymes or high temperatures above 95°C.
The two strands are aligned with a specific orientation. The sugar-phosphate backbones of the two strands run anti-parallel to each other, that is, in opposite directions. Note in Figure 1 that the pentose sugar (in blue) in one strand is upright, while in the other strand it is upside down.
Figure 1. DNA consists of two strands of polynucleotides held together by hydrogen bonds.
In addition, the nitrogenous bases have specific pairings. Recall that the nitrogenous bases are either purines (adenine and guanine), with a double fused ring structure, or pyrimidines (thymine and cytosine), with a single ring structure. The difference in sizes of the purines and the pyrimidines requires that only a purine can pair with a pyrimidine, otherwise there would not be a consistent distance between the sugar-phosphate backbones. In addition, the placement of the functional groups on the nitrogenous bases ensures that only a specific purine can pair with a specific pyrimidine through hydrogen bonding.
Figure 2. The two strands of polynucleotides run anti-parallel and with complementary base pairing.
This specificity in bonding is referred to as complementary base pairing. The base pairing is referred to as 'complementary' since A always pairs with T, while C always pairs with G. The bonding between adenine and thymine forms two hydrogen bonds, while the bonding between cytosine and guanine forms three hydrogen bonds.
Figure 3. Complementary base pairing in DNA is based on the relative sizes of the purine and pyrimidine rings and the location of functional groups to align for hydrogen bond formation. Note that the A-T base pairing involves two hydrogen bonds and the G-C base pairing involves three.
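The pairing rules can be written out as a short Python sketch. The sequence `ATGCCG` is a hypothetical example, not one taken from the text, and the helper names are made up for illustration. The complementary strand is written position by position against the original, so because the strands are anti-parallel it would be read in the opposite direction.

```python
# Complementary partners and hydrogen bonds per base pair, as stated in the text:
# A pairs with T (two hydrogen bonds), C pairs with G (three hydrogen bonds).
DNA_PARTNER = {"A": "T", "T": "A", "C": "G", "G": "C"}
H_BONDS_PER_PAIR = {"A": 2, "T": 2, "C": 3, "G": 3}


def complementary_strand(strand: str) -> str:
    """Return the base that pairs with each base, position by position."""
    return "".join(DNA_PARTNER[base] for base in strand)


def hydrogen_bond_count(strand: str) -> int:
    """Total hydrogen bonds holding this strand to its complementary strand."""
    return sum(H_BONDS_PER_PAIR[base] for base in strand)


example = "ATGCCG"  # hypothetical sequence
print(complementary_strand(example))  # TACGGC
print(hydrogen_bond_count(example))   # 16
```

A sequence rich in G-C pairs has more hydrogen bonds per base pair than one rich in A-T pairs.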
When the two anti-parallel strands are aligned and in an aqueous biological environment, such as in a cell, the DNA molecule forms a double helix, that is, two strands which are twisted into a helical structure. This twisting is the result of the properties of the components that make up the nucleotide. The sugar-phosphate backbone is hydrophilic (water-loving). The pentose sugar molecule has many polar functional groups, such as hydroxyl and carbonyl groups, that allow it to dissolve well in aqueous environments, forming additional hydrogen bonds and dipole-dipole interactions with water molecules. The phosphate group has a negative ionic charge, which allows for ion-dipole attractions with water molecules. The nitrogenous bases, however, are hydrophobic (water-fearing). The rings are nonpolar and do not interact with water, resulting in an attraction between the nitrogenous bases to one another. This makes the double-stranded structure twist and become compact as the nitrogenous bases attract each other and shield away from the aqueous environment.
Figure 4. The structure of DNA is a double helix, while RNA generally consists of a single strand.
The long strands of DNA could be subject to breakage if not protected, so when DNA is not in use, it is coiled tightly with proteins into a structure called a chromosome. The double helix is wrapped around proteins called histones, like 'beads on a string', to form a structure called a nucleosome. The nucleosomes then fold up to form chromatin fibre, which is then further compacted into the final structure, a chromosome. This packaging allows for the long strands of DNA to be protected and to fit into the nucleus of the cell. When a portion of the genetic code on the chromosome is required, that portion can lengthen and unwind, exposing what is needed, then recoiling when finished.
Figure 5. The packaging of DNA into a chromosome relies on the attraction of opposite charges – the negatively charged phosphate groups with the positively charged histone proteins.
DNA associates well with proteins, such as histones, because of the negative charge on the phosphate group in the backbone. The binding of the negatively charged DNA to proteins allows DNA to be packaged into chromosomes, both for protection and to give a compact structure that can be divided more efficiently during cell division. When the sugar-phosphate backbone is formed, the phosphoric acid precursor, H3PO4, loses two hydrogen ions to form bonds to two sugar molecules and the third hydrogen is ionised, resulting in a phosphate group with a negative charge for each nucleotide. The histone proteins are basic and have a positive charge, which is attracted to the negatively charged sugar-phosphate backbone, neutralising the charges.
It has been mentioned that DNA contains the genetic code, but what exactly does that mean? Every cellular function is controlled by the nucleus of the cell, which is where the DNA is housed. Each cell contains the complete genome, that is, a complete set of DNA for that organism, regardless of the type of cell or the types of functions it performs. When the cell is required to perform a specific function, the nucleus locates the gene or genes responsible to carry out that function and makes a copy of the DNA, so that the precious genetic material never leaves the nucleus of the cell. When a cell divides into two cells for growth, repair, replacement or reproduction, the DNA is again copied to make an exact duplicate. The copying of the DNA is done carefully to avoid mistakes and is an important safeguard to ensure no genetic material is lost by the cell.
Recall that each cell in an organism contains the complete genome, that is, a complete set of all the DNA. When the organism grows, becomes damaged or needs to reproduce, the cell divides by a process known as mitosis. Mitosis begins by making a complete copy of the genome for the new cell that is made. When a 'parent' cell divides to form two 'daughter' cells, the DNA is carefully copied before cell division occurs. This copying of the DNA is called DNA replication. As the cell continues the process of mitosis, the duplicated genome divides perfectly, making two identical daughter cells, each with the complete genome that the parent cell contained.
Figure 1. Cell division, mitosis, begins by copying the entire genome.
The DNA is replicated in a clever process that reduces the risk of mistakes. Recall that the double helix is held together by hydrogen bonds between complementary base pairs – A pairs with T, C pairs with G. DNA replication begins by separating the two strands of DNA polynucleotides using enzymes to unwind the strands and overcome the attraction of the hydrogen bonds, a process referred to as unzipping, as the two strands separate, similar to a zipper. The enzymes only open a small portion of the double helix at a time, known as the replication fork. When the two strands are unwound and separated, free nucleotides bind to the newly single-stranded portion, complementary to the bases that are exposed. The new nucleotides then undergo a condensation reaction to form the second side of the helix.
Figure 2. DNA is replicated by separating the double helix and using the principle of complementary base pairing to make two new strands of DNA.
For example, if the original double helix of DNA contained the following sequence of complementary base pairs:
When the two strands separate and the free nucleotides complete the second side of the double helix:
The DNA now has two identical copies, each double helix containing one original strand and one newly synthesised strand. This process continues down the DNA double helix until the entire molecule is copied. The process of replicating the DNA is referred to as semi-conservative, since the two daughter double helices each contain one new and one original polynucleotide strand. Because every new strand is built against an original strand as a template, this process reduces the potential for errors in the genetic code, even if each daughter cell undergoes additional rounds of mitosis.
Figure 3. The semi-conservative replication of DNA.
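As a rough illustration of semi-conservative replication, the following Python sketch models a double helix as a pair of strings. The parent sequence and the function names are hypothetical and serve only to show that each daughter helix keeps one original strand and gains one new complementary strand.

```python
DNA_PARTNER = {"A": "T", "T": "A", "C": "G", "G": "C"}


def complement(strand: str) -> str:
    """Build the new strand by complementary base pairing."""
    return "".join(DNA_PARTNER[base] for base in strand)


def replicate(helix: tuple[str, str]) -> tuple[tuple[str, str], tuple[str, str]]:
    """Separate the two strands and pair free nucleotides against each one."""
    original_1, original_2 = helix
    daughter_a = (original_1, complement(original_1))  # original strand + new strand
    daughter_b = (complement(original_2), original_2)  # new strand + original strand
    return daughter_a, daughter_b


parent = ("ATGCCG", "TACGGC")  # hypothetical pair of complementary strands
daughter_a, daughter_b = replicate(parent)
print(daughter_a)
print(daughter_b)
print(daughter_a == parent and daughter_b == parent)
```

Both daughter helices carry the same sequence as the parent, even though half of each one is newly made.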
Occasionally, mistakes are made during replication, known as mutations. Special enzymes bind to the double helix and run down the strand checking for errors in bonding between complementary base pairs. These enzymes can cut out an incorrect base pairing, allowing the correct nucleotide to take its place. Sometimes, however, the enzymes do not detect the mutation or are unable to fix it, resulting in a permanent mutation that would be passed on to new daughter cells in subsequent rounds of mitosis.
The details of DNA replication are not required.
When a cell is required to perform a function, the nucleus identifies the portion of the genome required. The nucleus then copies this portion of the genome and exports it out of the nucleus and into the rest of the cell. The process of making the copy of the DNA is called DNA transcription. Transcription allows the part of the genome to be accessed and read without removing it from the nucleus, so that the entire genome remains intact.
DNA transcription is a similar process to DNA replication, in that the genetic material is being copied, but the process is different. Firstly, only small portions of the genome, called genes, are accessed and copied, not the entire genome. One or more genes may need to be transcribed at a time in order for the cell to ultimately perform a function. A second difference is that the copy that is made from the original DNA is made of RNA, not DNA. The nuclear membrane does not allow for DNA material to leave the nucleus, as a precaution against losing the precious genetic material, so RNA is made and exported out of the nucleus.
Recall that RNA has a few differences compared to DNA. First, it is generally single-stranded, rather than a double helix. Second, it contains ribose sugars, rather than deoxyribose sugars. Lastly, it contains the nitrogenous base uracil, rather than thymine. These three differences ensure that the genetic code is copied from the original DNA molecule, but that the cell will not confuse the copy with the original DNA. There are different types of RNA and the one that is produced during transcription is called messenger RNA, or mRNA.
The portion of the DNA strand containing the necessary gene is first unwound and unzipped by enzymes. The RNA strand is then formed by complementary base pairing with the help of special enzymes to start and stop in the correct positions on the original DNA strand. The mRNA strand is released and exported out of the nuclear membrane, carrying the genetic code with it. The two strands of the DNA then close again by hydrogen bonding between base pairs and the double helix winds back up.
Figure 4. The transcription of DNA to mRNA involves unzipping a portion of the DNA and making a copy by complementary base pairing.
For example, if the original double helix of DNA contained the following sequence of complementary base pairs:
When the two strands separate, the free RNA nucleotides form mRNA by reading the bases:
The mRNA is released and the original two strands of DNA attract and twist, forming the double helix again:
The mRNA strand then travels outside of the nucleus with the help of proteins, bringing the genetic code with it, but not the original DNA from the genome. Just as with replication, there are occasionally mutations (mistakes in copying) that are made during transcription. Special proteins help to reduce or repair the incidence of mutation.
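Transcription follows the same pairing logic as replication, except that uracil takes the place of thymine in the RNA copy. A minimal Python sketch follows, using a hypothetical template strand and a made-up function name `transcribe`.

```python
# Pairing of each DNA template base with the RNA nucleotide that binds to it.
# Adenine on the DNA template pairs with uracil (U) in RNA, not thymine.
DNA_TO_MRNA = {"A": "U", "T": "A", "C": "G", "G": "C"}


def transcribe(template_strand: str) -> str:
    """Return the mRNA formed by complementary base pairing with the template."""
    return "".join(DNA_TO_MRNA[base] for base in template_strand)


template = "TACGGC"  # hypothetical DNA template strand
print(transcribe(template))  # AUGCCG
```

The resulting mRNA contains no T at all, which is one of the differences the text gives between RNA and DNA.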
The final process involving the genetic code is actually reading the content of the code. The genes that are transcribed from the original DNA double helix into mRNA that travels out of the nucleus must be read and interpreted so that the cell can perform necessary functions. DNA translation is the process where the mRNA is decoded and made into a protein. Cellular functions occur by the synthesis and action of proteins, so all genetic material codes for the primary structure of proteins, that is the sequence of amino acids.
In order for the four nitrogenous bases in mRNA to code for 20 different amino acids, translation involves a triplet code, that is, three nitrogenous bases code for one amino acid. The three-letter base sequence for amino acids is referred to as a codon. With three letters per code and four bases, there are a total of 4³ = 64 possible codons in the genetic code. In addition to the 20 amino acids, there are also special codons for 'start' and 'stop', indicating to the proteins that facilitate transcription and translation where the gene begins and ends so that the genetic code is read correctly. The amino acid methionine, codon AUG, is the 'start' code, while there are several 'stop' codons that do not code for an amino acid.
Figure 5. The genetic code.
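The codon count can be confirmed by listing every possible triplet. The Python sketch below uses the standard library function `itertools.product`; it also shows why a two-letter code would not be enough for 20 amino acids.

```python
from itertools import product

RNA_BASES = "AUCG"

# Every ordered combination of three bases is one possible codon.
codons = ["".join(triplet) for triplet in product(RNA_BASES, repeat=3)]
# For comparison: a code with only two bases per amino acid.
doublets = ["".join(pair) for pair in product(RNA_BASES, repeat=2)]

print(len(doublets))    # 16, fewer than the 20 amino acids
print(len(codons))      # 64, enough for 20 amino acids plus start and stop
print("AUG" in codons)  # True: the start codon is one of the 64
```

Since 64 is larger than 20, more than one codon can code for the same amino acid.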
During translation, special RNA molecules, called transfer RNA, or tRNA, are used to carry the amino acid with the complementary codon to the mRNA strand. tRNA molecules contain the complementary codon, known as the anti-codon, and the amino acid that the codon codes for. When DNA translation takes place, free tRNA molecules are available to bind to mRNA by complementary base pairing, reading the genetic code. Special proteins help to bind the amino acids together, forming the protein that was coded for in the original DNA sequence.
Figure 6. Translation of the mRNA into a protein sequence.
For example, if the mRNA contained the following sequence of bases:
The tRNA molecules would have the anti-codon (remember, the first must be a 'start' codon, AUG, for methionine):
The amino acid sequence would be:
You should be able to determine the sequence of DNA to mRNA to tRNA using the concept of complementary base pairing. You are not required to know how to determine the protein sequence using the three-letter codon.
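The step from mRNA to tRNA can be sketched in Python as follows. The mRNA sequence is hypothetical and the helper names are made up for illustration. Each anti-codon is written position by position against its codon, which is the level of detail needed to apply complementary base pairing here.

```python
RNA_PARTNER = {"A": "U", "U": "A", "C": "G", "G": "C"}


def split_into_codons(mrna: str) -> list[str]:
    """Read the mRNA three bases at a time, ignoring any incomplete final triplet."""
    usable_length = len(mrna) - len(mrna) % 3
    return [mrna[i:i + 3] for i in range(0, usable_length, 3)]


def anticodon(codon: str) -> str:
    """Return the tRNA bases that pair with the codon, position by position."""
    return "".join(RNA_PARTNER[base] for base in codon)


mrna = "AUGCCGGAG"  # hypothetical mRNA beginning with the start codon AUG
codons = split_into_codons(mrna)
anticodons = [anticodon(c) for c in codons]
print(codons)      # ['AUG', 'CCG', 'GAG']
print(anticodons)  # ['UAC', 'GGC', 'CUC']
```

Note that both the codons and the anti-codons are RNA, so uracil appears in both and thymine in neither.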
As mentioned previously, when mistakes in the genetic code occur during replication, transcription or translation, a mutation occurs. Special proteins are present during all of these processes to prevent or repair mutations, but some mutations are not identified or fixed and are passed on. Some mutations occur in sections of DNA that do not code for any functional proteins, so these mutations are harmless. Other mutations, however, can have minor effects, such as a freckle on your skin, or severe effects. One such example occurs when a single base mutation occurs in the gene for haemoglobin, the oxygen-carrying protein in red blood cells, resulting in a condition known as sickle cell disease, or sickle cell anaemia, where the red blood cells do not form correctly and are unable to carry oxygen efficiently in the bloodstream. People with this condition tire easily, have painful swelling of the hands and feet and are prone to infection. Sickle cell disease is the result of a single incorrect base, resulting in coding for the wrong amino acid during translation.
Figure 7. Sickle cell disease is the result of a single mutation in a nitrogenous base.
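The effect of a single base change can be illustrated with a small Python sketch. In sickle cell disease the mRNA codon GAG (glutamic acid) becomes GUG (valine). The short fragment below is for illustration only, the codon table is deliberately partial (it covers only the codons used here), and the function names are made up.

```python
# Partial codon table: only the three codons needed for this illustration.
PARTIAL_CODON_TABLE = {"CCU": "Pro", "GAG": "Glu", "GUG": "Val"}


def translate(mrna: str) -> list[str]:
    """Translate an mRNA fragment codon by codon using the partial table."""
    return [PARTIAL_CODON_TABLE[mrna[i:i + 3]] for i in range(0, len(mrna), 3)]


def substitute_base(sequence: str, index: int, new_base: str) -> str:
    """Return a copy of the sequence with one base replaced (a point mutation)."""
    return sequence[:index] + new_base + sequence[index + 1:]


normal = "CCUGAGGAG"                      # illustrative fragment
mutant = substitute_base(normal, 4, "U")  # middle base of the second codon: A -> U

print(normal, translate(normal))  # CCUGAGGAG ['Pro', 'Glu', 'Glu']
print(mutant, translate(mutant))  # CCUGUGGAG ['Pro', 'Val', 'Glu']
```

Only one of the nine bases differs between the two fragments, yet the amino acid sequence changes.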
All cells in a multicellular organism originated from a single cell, and through cell division and DNA replication, all cells contain the complete genome. When mutations occur during DNA replication, sometimes there is no impact, sometimes the result is detrimental to the organism and sometimes it is beneficial. In recent decades, scientists have begun to manipulate the genome of organisms deliberately in a process known as genetic engineering. Genetic engineering can involve adding, removing or altering genetic material in an organism's genome in order to introduce, modify or remove genes, changing the way the organism functions. Genetic engineering has been practised since the 1970s with animals, single-celled organisms and foods. Genetic engineering generally involves introducing genes from one organism into another. The reasons for doing this can vary, but there are many practical applications.
Genetically modified organisms (GMOs) are species that have had genetic material added from a different species. For example, most commercially available soybeans have been genetically engineered to contain a gene from a soil bacterium, making the soybeans resistant to herbicides. The soybean crops can be sprayed with herbicide to kill any other weeds that might grow, leaving the soybean plants unharmed.
When insulin was discovered as a treatment for diabetes in 1921, the demand for insulin exceeded the supply, which was originally obtained from the pancreas of slaughtered cows and pigs. In 1978, the first biosynthetic human insulin was produced using genetic engineering, by inserting the human insulin gene into bacteria, and it became commercially available in 1982. This process allowed insulin to be produced on a much larger scale and did not involve the use of animals.
Figure 1. Genetically modified bacteria are used to produce human insulin to treat diabetes.
Scientists have successfully made genetically modified mammals such as mice and sheep since the 1980s. The purposes for doing this vary, but originally scientists made GMO mammals to study human illness in animals for research purposes and development of treatment options. GMO animals have also been made in order to boost the nutritional qualities of the foods that they provide to humans.
For example, genes from the roundworm have been inserted into a pig in order to produce higher levels of omega-3 fatty acids, which aid cardiovascular health. Dairy cows in China and Argentina are being developed with human milk genes with the idea that dairy cows could produce milk similar to human breast milk for human babies where a mother is not capable of breastfeeding. Genetic modification can also help to preserve species that are close to extinction by inserting genes for disease resistance.
Probably the most commonly genetically modified organisms are plants, especially those grown for food. With the ever-growing population, the demand for food has exceeded the ability of farmers to produce it using conventional farming methods. Technology, specifically genetic engineering, is helping to address the problem by making crops more resistant to disease, more tolerant to weather changes and more nutritious. In addition, genetic engineering can help to make fruits and vegetables ripen more slowly and have a longer shelf life, reducing the amount of food that is wasted.
The debate about the long-term consumption of genetically modified foods is ongoing. Scientists have developed many fruits and vegetables that have been genetically modified to improve both the quality and the quantity of the crop yield, but it is unknown whether there are long-term health effects from eating these foods. There is the potential for health problems that could arise, such as allergies or digestive problems, from eating GMO foods long-term. Also, if a genetically modified crop is grown near a non-genetically modified crop, there could be cross-pollination between the two strains, resulting in an elimination of the non-genetically modified strain. When a crop is developed to be resistant to a particular insect, that insect might end up losing its only source of food, resulting in extinction and a disruption in the food chain.