Transcription is the mechanism by which mRNA is synthesized from DNA. In eukaryotic cells, DNA is found in the Nucleus, so this is the site of Transcription. The entire DNA sequence (~3 billion base pairs long in the Human Genome) is not transcribed; only the segments that represent genes are. It occurs in the following stages:
Transcription begins with RNA Polymerase binding to a specific nitrogenous base sequence called the promoter. Many eukaryotic promoters contain a sequence called the TATA box, named after its repeated Thymines and Adenines. This region occurs upstream of the gene to be transcribed. It is believed that this region is suited to beginning transcription because Thymine and Adenine are connected by 2 hydrogen bonds, whereas Guanine and Cytosine are connected by 3 hydrogen bonds. Areas rich in Thymine and Adenine would therefore require less energy to break the bonds and open up the helical structure of DNA.
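The hydrogen-bond argument above can be checked with a little arithmetic. The following is a minimal Python sketch, not a model of real DNA melting: the function name hydrogen_bonds and both example sequences are made up for illustration, and it only counts 2 bonds per A-T pair and 3 per G-C pair.

```python
# Hydrogen bonds per base pair: A-T pairs have 2, G-C pairs have 3.
H_BONDS = {"A": 2, "T": 2, "G": 3, "C": 3}

def hydrogen_bonds(strand: str) -> int:
    """Total hydrogen bonds holding this strand to its complementary strand."""
    return sum(H_BONDS[base] for base in strand.upper())

# Two hypothetical 8-base stretches (illustrative only, not real promoters).
tata_like = "TATAAAAG"
gc_rich = "GCGCGGCC"

print(hydrogen_bonds(tata_like))  # 17
print(hydrogen_bonds(gc_rich))    # 24
```

With the same number of base pairs, the AT-rich stretch is held together by fewer hydrogen bonds, which is the reasoning given for why it is easier to open.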
Once DNA is unwound and opened, RNA polymerase can begin building a complementary RNA copy, without the need for primers (which are needed in DNA Replication). The new RNA strand is built by adding one nucleotide at a time. Only one strand of DNA is used (called the Template Strand). RNA is built in the 5' to 3' direction as the polymerase reads the template 3' to 5', so there is no lagging strand. The strand of DNA that is not being copied is called the Coding Strand, and has the same sequence of bases as the new RNA strand (except it contains Thymine instead of Uracil).
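The relationship between the Template Strand, the Coding Strand, and the new RNA can be sketched in Python. This is only an illustration: the function names and the 9-base template are assumptions, and the template is written 3' to 5' so that the resulting RNA reads 5' to 3'.

```python
# Which RNA base pairs with each DNA template base (A pairs with U in RNA).
RNA_PAIRING = {"A": "U", "T": "A", "G": "C", "C": "G"}
# Which DNA base pairs with each DNA base (used to get the coding strand).
DNA_PAIRING = {"A": "T", "T": "A", "G": "C", "C": "G"}

def transcribe(template_3_to_5: str) -> str:
    """Build the RNA (5' to 3') one nucleotide at a time from the template."""
    return "".join(RNA_PAIRING[base] for base in template_3_to_5.upper())

def coding_strand(template_3_to_5: str) -> str:
    """Return the DNA strand (5' to 3') that is complementary to the template."""
    return "".join(DNA_PAIRING[base] for base in template_3_to_5.upper())

template = "TACGGATTC"  # hypothetical template strand, written 3' to 5'
rna = transcribe(template)
coding = coding_strand(template)

print(rna)     # AUGCCUAAG
print(coding)  # ATGCCTAAG
# The coding strand matches the RNA except for Thymine in place of Uracil.
print(coding.replace("T", "U") == rna)  # True
```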
As nucleotides are added, the RNA strand elongates. The RNA is momentarily bound to the DNA template, but as RNA polymerase moves on, the RNA detaches to form its own strand, and the DNA strands reform their helix.
Most structures in a complex organism use many copies of the same protein as building blocks, so many RNA molecules (thousands to millions) may be needed, and many RNA polymerase enzymes repeat this process. As soon as the DNA has reformed behind one polymerase, another RNA polymerase molecule can begin a new strand before the first strand is finished.
Transcription of a gene is terminated when RNA polymerase reaches a Termination Sequence (another, different series of bases). At this point both the RNA polymerase and the new RNA strand detach from the DNA. The RNA polymerase can re-attach and create a new RNA molecule, while the newly created RNA strand, once it has been modified, leaves the nucleus via Nuclear Pores.
The newly transcribed RNA molecule is then modified by additional enzymes before it is sent to Ribosomes to undergo Translation.
A poly-A tail (a large number of Adenines) is added to the 3' end of the RNA molecule to enable later Translation as well as to protect the RNA molecule from being broken down. A 5' cap is added to the other end, and consists of a single modified Guanine nucleotide (7-methylguanosine). The 5' cap helps ribosomes recognize and attach to the RNA.
Genes vary greatly in length, from under a hundred base pairs to tens of thousands or more! Not all of the base pairs in a gene code for protein; the non-coding regions within a gene are called Introns. Conversely, the sequences of DNA or RNA that are kept in the final mRNA are called Exons. As you might have guessed, the introns need to be removed or else they would be translated into amino acids, which would ultimately alter the shape and therefore the function of the protein. Small nuclear Ribonucleoproteins (snRNPs, amazingly pronounced as 'snurps') combine with the RNA strand to form a Spliceosome, where introns are removed and exons are spliced (connected) together.
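Splicing can be pictured as cutting out the intron segments and joining the exons that remain. The sketch below is a simplification written in Python: the function name splice, the short sequences, and the exon positions are all invented for illustration, and it does not model how a Spliceosome actually locates intron boundaries.

```python
def splice(pre_mrna: str, exons: list[tuple[int, int]]) -> str:
    """Join the exon segments in order, leaving out everything between them.

    Each exon is a (start, end) pair of 0-based positions, end not included.
    """
    return "".join(pre_mrna[start:end] for start, end in exons)

# Hypothetical pre-mRNA: exon 1, intron 1, exon 2, intron 2, exon 3.
pre_mrna = "AUGGCC" + "GUAAGUUUUCAG" + "UUUGAA" + "GUACGUCCCAG" + "CCAUAA"
exon_positions = [(0, 6), (18, 24), (35, 41)]  # assumed to be known in advance

mature = splice(pre_mrna, exon_positions)
print(mature)  # AUGGCCUUUGAACCAUAA
```

If an intron were left in, the extra bases would sit in the middle of the message and change the amino acids that follow, which is the problem described above.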
It has been found that about 3/4 of all genes can go through a mechanism called Alternative Splicing, where different combinations of exons are spliced together (some exons are included in one version of the mRNA and skipped in another, while the remaining exons keep their original order). Having a different set of exons results in a different sequence of amino acids, resulting in different proteins. As a result, many different proteins can be encoded by a single gene.
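To see how one gene can give several mRNAs, the Python sketch below lists every transcript that keeps the first and last exon and either includes or skips each exon in between. This is a toy assumption: in a real cell, which exons are kept is regulated, and not every combination is produced. The function name isoforms and the exon sequences are hypothetical.

```python
from itertools import combinations

def isoforms(exons: list[str]) -> list[str]:
    """List transcripts that keep the first and last exon and any subset of
    the exons in between, always in their original order."""
    first, *middle, last = exons
    results = []
    for how_many in range(len(middle) + 1):
        # combinations() keeps the chosen exons in their original order.
        for kept in combinations(middle, how_many):
            results.append("".join((first, *kept, last)))
    return results

exons = ["AUGGCC", "UUUGAA", "GGGCUC", "CCAUAA"]  # hypothetical exons
for transcript in isoforms(exons):
    print(transcript)
# AUGGCCCCAUAA
# AUGGCCUUUGAACCAUAA
# AUGGCCGGGCUCCCAUAA
# AUGGCCUUUGAAGGGCUCCCAUAA
```

With only two optional exons this toy gene already gives four different mRNAs, and the count doubles with each additional optional exon.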
p.319-324#1-6