
From Gene to Protein



You should have a working knowledge of the following terms:

  • aminoacyl-tRNA
  • aminoacyl-tRNA synthetase
  • anticodon
  • antiparallel
  • codon
  • DNA polymerase
  • gene regulation
  • genetic code
  • helicase
  • lagging strand
  • leading strand
  • ligase
  • messenger RNA (mRNA)
  • Okazaki fragment
  • origins of replication
  • primase
  • primer
  • promoter
  • release factor protein
  • replication fork
  • RNA polymerase
  • ribosomal RNA (rRNA)
  • ribosome
  • semiconservative replication
  • single-strand binding protein
  • stop codon
  • template strand
  • terminator sequence
  • transcription
  • transcription factor
  • translation
  • transfer RNA (tRNA)

Introduction and Goals

 The last tutorial concluded with a discussion of nucleic acid structure. Watson and Crick concluded their 1953 paper on the structure of DNA with the following statement:

"It has not escaped our notice that the specific pairing we have postulated immediately suggests a possible copying mechanism for the genetic material."




Figure 1. The structure of DNA. Notice the base pairing rules: A always pairs with T and G always pairs with C.

 The "specific pairing" they mentioned is the hydrogen-bond association between adenine and thymine, and guanine and cytosine. The "copying mechanism" is DNA replication, an amazingly simple yet surprisingly accurate series of steps by which DNA carries the instructions for its own reproduction.

This tutorial will cover DNA replication first, then the synthesis of RNA, and conclude with protein synthesis. By the end of the tutorial you should have a basic understanding of:

  • DNA replication
  • RNA transcription
  • Protein translation 

DNA Replication

DNA replication can be divided into several basic components. The double helix unwinds, and each strand of the existing molecule serves as a template along which complementary nucleotides are joined into a new strand (nucleotides are added to only one end of a growing strand), resulting in two copies from one original.

Figure 2. Semiconservative DNA Replication. Each new molecule of DNA is composed of one parental strand and one newly synthesized strand.

This type of replication is termed semiconservative replication because each newly formed molecule of DNA has one strand conserved from the parent molecule and one newly synthesized strand (depicted in Figure 2).
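The idea of semiconservative replication can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: the sequence is hypothetical, the names `new_partner` and `replicate` are invented for this example, and strand direction is ignored so that only the pairing rules (A with T, G with C) matter.

```python
# Base-pairing rules: A pairs with T, G pairs with C.
PAIRING = {"A": "T", "T": "A", "G": "C", "C": "G"}

def new_partner(parental_strand):
    """Build a new strand by pairing a base with each base of a parental strand."""
    return "".join(PAIRING[base] for base in parental_strand)

def replicate(duplex):
    """Split a duplex and return two daughters, each keeping one parental strand."""
    strand_a, strand_b = duplex
    return [
        {"parental": strand_a, "new": new_partner(strand_a)},
        {"parental": strand_b, "new": new_partner(strand_b)},
    ]

parent = ("ATGCCGTA", "TACGGCAT")  # hypothetical duplex, aligned position by position
for daughter in replicate(parent):
    print(daughter)
```

Each daughter holds one strand taken unchanged from the parent and one strand built by base pairing, and both daughters contain the same pair of sequences as the parent.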

This is an animation of an overview of DNA replication.




For a given piece of DNA, replication begins at numerous origins of replication. An origin of replication is a specific DNA sequence at which a team of enzymes and other proteins involved in DNA replication assembles. Helicases unwind the DNA helix, single-strand binding proteins keep the strands separate while primases initiate replication, and DNA polymerase adds nucleotides along the unwound parent strands. While the fundamentals of replication are simple, one feature of DNA structure makes things a bit more complicated: the strands have opposite chemical polarities. This can be hard to comprehend because the strands seem identical. However, close inspection reveals that the hydrogen bonding between bases is achieved only if the strands run in opposite directions. This arrangement of strands is antiparallel, with one strand designated the 3'-to-5' strand and the other the 5'-to-3' strand.
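Because the strands are antiparallel, the partner of a strand reads in the opposite direction. The sketch below uses a hypothetical sequence and an invented helper name, `partner_strand_5_to_3`; it pairs each base and then reverses the result so that both strands are written 5' to 3'.

```python
# Base-pairing rules: A pairs with T, G pairs with C.
PAIRING = {"A": "T", "T": "A", "G": "C", "C": "G"}

def partner_strand_5_to_3(strand_5_to_3):
    """Return the partner of a strand, with both strands written 5' to 3'."""
    # Pairing base by base yields the partner in the 3'-to-5' direction.
    paired_3_to_5 = [PAIRING[base] for base in strand_5_to_3]
    # Reverse it so the partner is also read from its own 5' end.
    return "".join(reversed(paired_3_to_5))

top = "ATGC"  # hypothetical strand, 5'-ATGC-3'
print(partner_strand_5_to_3(top))  # GCAT, i.e. 3'-TACG-5' read from its 5' end
```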

Figure 3. DNA structure.

DNA polymerase can only add nucleotides to the 3' end of a growing strand, so a new strand is always built in the 5'-to-3' direction. Therefore, nucleotide addition is a smooth, continuous process along one of the strands (the leading strand) of DNA. The other strand (the lagging strand) has a discontinuous mode of replication because DNA polymerase can only work by starting from the replication fork (where DNA is unwinding) and progressing outward (until it runs into a previously synthesized fragment). An added wrinkle on the lagging strand is created by the lack of a continuous new strand: DNA polymerase can only add nucleotides to an existing 3' nucleotide. How, then, is synthesis on the lagging strand started? Primase has the ability to synthesize a short primer made of a few nucleotides of RNA. DNA polymerase can then add DNA nucleotides to the end of this primer sequence and synthesize relatively short stretches of DNA known as Okazaki fragments. An additional enzyme (ligase) seals the fragments into a continuous strand of DNA. Figure 4 provides an overview of the enzymes involved, antiparallelism, and the overall direction of DNA replication. All of this takes place in the nucleus of eukaryotic cells (cells that have membrane-bound nuclei and organelles). Be sure that you study this figure carefully. You could see this figure unlabeled and be asked to name the enzymes and their functions on an exam.

Figure 4. An overview of DNA replication.

This is an animation of the steps of DNA replication.


In spite of the rules of base-pairing, mistakes are sometimes made in the synthesis of new DNA molecules. This occurs about once in every 10,000 base pairs. This fraction may seem small, but left unchecked it could be disastrous. However, various repair mechanisms fix these errors and, in the end, the actual observed error rate is very low (often less than one mistake per 10 million bases). Numerous specialized proteins are known to repair base-pairing errors and damage to existing molecules of DNA.
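To get a feel for these numbers, the following sketch applies the two rates quoted above to a stretch of DNA. The length of 3 million base pairs is an arbitrary assumption chosen for illustration, and the function name `expected_errors` is hypothetical.

```python
# Rates quoted in the text, expressed as "one error per N bases".
BASES_PER_ERROR_BEFORE_REPAIR = 10_000      # about one mistake per 10,000 base pairs
BASES_PER_ERROR_AFTER_REPAIR = 10_000_000   # at most one mistake per 10 million bases

def expected_errors(bases_copied, bases_per_error):
    """Average number of mistakes expected when copying `bases_copied` bases."""
    return bases_copied / bases_per_error

dna_length = 3_000_000  # assumption: an arbitrary length used only for this example
before = expected_errors(dna_length, BASES_PER_ERROR_BEFORE_REPAIR)
after = expected_errors(dna_length, BASES_PER_ERROR_AFTER_REPAIR)
print(f"before repair: {before:.1f} expected errors")
print(f"after repair: {after:.1f} expected errors (upper bound)")
```

Under these assumptions, repair lowers the expected number of mistakes by a factor of at least 1,000.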

Gene To Protein: The Central Dogma

Figure 5. Transcription and Translation in Prokaryotic Cells versus Eukaryotic Cells.

The central dogma (prevailing theme) of molecular biology is: DNA is transcribed into messenger RNA, and messenger RNA is translated into proteins. In other words, DNA codes for the synthesis of polypeptides, which in a specific conformation and perhaps in conjunction with other polypeptides make up proteins. This is accomplished via two processes: transcription and translation. First, one of the DNA strands is used as a template to make messenger RNA (mRNA). This single strand of bases is complementary to one strand of the DNA molecule and is identical (except for the substitution of uracil for thymine) to the other strand. Second, the mRNA is translated into a polypeptide.

Transcription occurs in the nucleus of eukaryotes, and the mRNA then leaves the nucleus to be translated into protein. The image above outlines these steps, along with where they occur in prokaryotic and eukaryotic cells. Note that because prokaryotes do not have a nucleus, the mRNA has nowhere to travel, and proteins can be translated from mRNA immediately.

Gene To Protein: The Triplet Code

It should be clear that it is the actual sequence of nucleotides (adenines, guanines, cytosines, and thymines) in a DNA strand that determines the type of protein that is synthesized (depicted here). However, there are four possible bases and twenty possible amino acids that join to form a polypeptide chain, so one base cannot code for one amino acid; that scenario would yield only four possibilities. A two-base code would provide only sixteen possibilities, so a minimum of three bases, which gives sixty-four possibilities, is needed to specify a particular amino acid. In 1961, Marshall Nirenberg "cracked" the genetic code by determining the codon (sequence of three bases) that specifies the amino acid phenylalanine. The amino acids lysine, glycine and proline followed, and now the sequence of bases that codes for each of the amino acids found in proteins is known. The image below illustrates the "dictionary" of the genetic code for amino acids by RNA bases (complementary to the original DNA template strand). The way in which a sequence of DNA bases is transcribed into complementary RNA bases and then translated into corresponding amino acids is illustrated in the image above.
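The counting argument above can be checked directly. This short sketch simply enumerates every possible "word" of one, two, and three bases; the helper name `code_words` is invented for the example.

```python
from itertools import product

BASES = "ACGU"     # the four RNA bases
AMINO_ACIDS = 20   # number of amino acids that must be specified

def code_words(length):
    """List every possible sequence of `length` bases."""
    return ["".join(bases) for bases in product(BASES, repeat=length)]

for length in (1, 2, 3):
    count = len(code_words(length))
    verdict = "enough" if count >= AMINO_ACIDS else "too few"
    print(f"{length}-base code: {count} possibilities ({verdict} for {AMINO_ACIDS} amino acids)")
```

Only the three-base code provides at least twenty distinct words, and it provides more than are strictly needed.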

Figure 6. The genetic code is a triplet code.


Figure 7. The Genetic Code.

Two adjectives that are often used to describe the genetic code are "unambiguous" and "redundant." Unambiguous means that the codons are fixed and that each codon specifies only one amino acid. For example, the DNA template triplet ACC (mRNA codon UGG) codes for tryptophan and nothing else. However, the code is redundant, meaning that several codons may specify the same amino acid. For example, the DNA template triplets CAA, CAC, CAG, and CAT (mRNA codons GUU, GUG, GUC, and GUA) all code for a single amino acid (valine). This image illustrates how codons are unambiguous and redundant.
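Both properties show up naturally if the code is stored as a lookup table. The sketch below uses only a small excerpt of the standard genetic code, written as mRNA codons (GUU, GUC, GUA, and GUG for valine correspond to the DNA template triplets CAA, CAG, CAT, and CAC; UGG for tryptophan corresponds to ACC). The function name is invented for illustration.

```python
from collections import defaultdict

# Excerpt of the standard genetic code as mRNA codons (not the full table).
CODON_TABLE = {
    "GUU": "Val", "GUC": "Val", "GUA": "Val", "GUG": "Val",
    "UGG": "Trp",
    "AUG": "Met",
}

def codons_by_amino_acid(table):
    """Group codons by the amino acid they specify."""
    groups = defaultdict(list)
    for codon, amino_acid in table.items():
        groups[amino_acid].append(codon)
    return dict(groups)

# Unambiguous: looking up one codon always gives exactly one amino acid.
print(CODON_TABLE["UGG"])
# Redundant: one amino acid can be reached from several codons.
print(codons_by_amino_acid(CODON_TABLE)["Val"])
```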

Gene To Protein: Transcription

How are proteins made? First, one of the two strands of DNA is transcribed into a single strand of complementary RNA termed messenger RNA (mRNA). As you should now understand, this RNA is complementary to only one of the DNA strands.

As mentioned, the process of transcription occurs in the nucleus of eukaryotic cells and requires that the two strands of DNA separate, or open up enough that complementary RNA nucleotides can be added along one side of the DNA molecule. The strand that is copied is the template strand. The enzyme RNA polymerase separates the DNA strands and joins the RNA nucleotides along the exposed DNA template. This process is initiated when certain proteins, transcription factors, bind to a specific starting point, the promoter. The promoter is actually a sequence of DNA bases that signals the beginning of RNA synthesis. RNA polymerase then adds nucleotides to the 3' end of the elongating RNA molecule. The enzyme moves down the DNA strand, unwinding as it goes and allowing the DNA helix to reform after a sequence has been transcribed. This continues until a specific sequence is transcribed. This sequence, the terminator sequence, signals the end of RNA synthesis. Transcription is broken down into three stages: initiation, elongation, and termination.
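The pairing step of transcription can be sketched as a simple lookup. This is a deliberately reduced model: it ignores promoters, transcription factors, and terminator sequences, the template sequence is hypothetical, and the function name `transcribe` is invented for the example.

```python
# DNA template base -> RNA base that pairs with it (uracil is used instead of thymine).
DNA_TO_RNA = {"A": "U", "T": "A", "G": "C", "C": "G"}

def transcribe(template_3_to_5):
    """Pair RNA bases along a template strand written 3' to 5'.

    Because RNA polymerase adds to the 3' end of the RNA, the result
    is the mRNA written 5' to 3'.
    """
    return "".join(DNA_TO_RNA[base] for base in template_3_to_5)

template = "TACAAACCCATT"     # hypothetical template strand, written 3' to 5'
non_template = "ATGTTTGGGTAA"  # the other DNA strand, written 5' to 3'
mrna = transcribe(template)
print(mrna)  # AUGUUUGGGUAA
```

The mRNA is complementary to the template strand and matches the other DNA strand except that U appears wherever that strand has T.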

This animation shows the process of transcription.

Gene To Protein: Translation Ingredients

Figure 8. Overview of Involvement of Ribosomes, tRNA and mRNA in Translation.

In addition to mRNA, two other types of RNA are needed for protein synthesis. These are ribosomal RNA (rRNA) and transfer RNA (tRNA). Ribosomal RNA combines with proteins to form ribosomes. Ribosomes are cellular structures where polypeptides form. Ribosomes actually consist of two subunits: one large and one small. As illustrated in this figure, tRNA molecules transport amino acids to the growing polypeptide chain. Each tRNA molecule has an amino acid attachment site for a particular amino acid and an anticodon (a sequence of three nucleotides that is complementary to a sequence of bases in the mRNA strand).

An additional component, the enzyme aminoacyl-tRNA synthetase, ensures that a given tRNA molecule picks up only its particular amino acid. Aminoacyl-tRNA synthetase has specific sites that bind amino acids and tRNA, and energy is required to bring these raw materials together. (This explains the yellow ATP in this figure; to be discussed in future tutorials.) Also, be aware that each tRNA has its own anticodon, which pairs with the matching codon. To ensure high fidelity of protein translation, each amino acid has a corresponding aminoacyl-tRNA synthetase. Once you're comfortable with all the necessary components (mRNA, ribosomes, tRNA, amino acids), proceed to protein synthesis.

Figure 9. The Structure of Transfer RNA (tRNA).

Figure 10. Aminoacyl-tRNA Synthetase's Role in Translation.

Gene To Protein: Translation

Figure 11. Overview of Involvement of Ribosomes, tRNA and mRNA in Translation.

Polypeptide construction does not occur in the nucleus. The ribosomes, aminoacyl-tRNA synthetases and amino acids are located outside of the nucleus. Therefore, mRNA, rRNA and tRNA must all travel out of the nucleus before translation can begin.

The basic concept of translation is depicted in the left image. See how mRNA, ribosomes, tRNA and amino acids all interact to make a polypeptide chain. Just like transcription, this takes place in three stages: initiation, elongation, and termination.

Initiation: In initiation, mRNA binds to the small subunit of a ribosome. An initiation codon, AUG, binds with an initiator tRNA molecule that bears the anticodon UAC and the amino acid methionine. Then a large subunit of a ribosome attaches, making the initiation complex complete and allowing translation to begin.
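The codon-anticodon match can be illustrated with the RNA pairing rules (A with U, G with C). This sketch pairs bases position by position, as the text does when it matches AUG with UAC; it assumes strict pairing at all three positions, and the function name `anticodon_for` is invented for the example.

```python
# RNA base-pairing rules: A pairs with U, G pairs with C.
RNA_PAIRING = {"A": "U", "U": "A", "G": "C", "C": "G"}

def anticodon_for(codon):
    """Return the bases that pair with a codon, position by position.

    The codon is read 5' to 3', so the returned anticodon is
    written in the opposite (3' to 5') direction.
    """
    return "".join(RNA_PAIRING[base] for base in codon)

print(anticodon_for("AUG"))  # UAC, the anticodon of the initiator tRNA
```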

Figure 12. Initiation of Translation.

This figure illustrates that there are actually three attachment sites on a ribosome. They are the exit site, the peptidyl-tRNA binding site, and the aminoacyl-tRNA binding site (also called the E, P, and A sites). 

Figure 13. The Three Attachment Sites On a Ribosome.

Elongation: In the elongation stage, the peptide grows by addition of amino acids according to the sequence of bases in the mRNA molecule. This is accomplished through codon recognition, peptide bond formation, and translocation. Basically, a tRNA carrying the appropriate amino acid (an aminoacyl-tRNA) binds to the A-site, and a peptide bond forms between the new amino acid and the end of the growing polypeptide. Then everything shifts: the P-site tRNA is bumped to the E-site, where it dissociates from the ribosome; the A-site tRNA moves into the P-site; and a new aminoacyl-tRNA attaches to the now open A-site.

Termination: Elongation continues until an mRNA stop codon reaches the A-site of the ribosome. The stop codons are UAA, UAG and UGA; they do not code for any amino acid and signal the end of translation. Instead of a tRNA, a release factor protein binds to the stop codon, and the newly synthesized polypeptide is liberated from the ribosome.

Figure 14. Peptide Chain Elongation During Translation.

This new polypeptide will undergo coiling and folding to form its secondary and tertiary structures, and it might combine with additional polypeptide chains to form its quaternary structure (go back and review protein structure, if necessary), finally producing a functional protein.
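The three stages of translation can be mimicked with a short loop. This is a rough sketch, not a model of the ribosome: the mRNA sequence is hypothetical, the codon table holds only the handful of standard codons the example needs, and the function name `translate` is invented for illustration.

```python
# Partial codon table (assumption: only the codons used in this example).
CODON_TABLE = {"AUG": "Met", "UUU": "Phe", "AAA": "Lys", "GGG": "Gly", "CCC": "Pro"}
STOP_CODONS = {"UAA", "UAG", "UGA"}

def translate(mrna):
    """Read an mRNA (5' to 3') codon by codon and return the amino acids in order."""
    # Initiation: find the first start codon, AUG.
    start = mrna.find("AUG")
    if start == -1:
        return []
    peptide = []
    # Elongation: step through the message three bases at a time.
    for position in range(start, len(mrna) - 2, 3):
        codon = mrna[position:position + 3]
        # Termination: a stop codon adds no amino acid and ends the chain.
        if codon in STOP_CODONS:
            break
        # A codon missing from this partial table raises KeyError.
        peptide.append(CODON_TABLE[codon])
    return peptide

message = "GCAUGUUUAAAGGGCCCUAAGC"  # hypothetical mRNA
print("-".join(translate(message)))  # Met-Phe-Lys-Gly-Pro
```

Bases before the start codon and after the stop codon are not translated, which is why the leading GC and trailing GC in the example contribute nothing to the peptide.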

This is an animation of translation.

Gene To Protein: Gene Regulation

Recall that we discussed promoter regions of DNA and how these promoters bind specific proteins to initiate transcription, thereby setting the stage for protein synthesis. Different genes may have different promoters that respond to different transcription factor proteins. Gene regulation describes how genes can be "turned on" to synthesize a needed protein, or "turned off" to stop synthesis of a protein that is no longer needed.



This tutorial examined DNA replication, RNA transcription, and protein translation. These processes are pivotal to life, so it will be important that you have a firm grasp on the basic aspects of these processes (i.e., to the level presented in this tutorial).

All cells that divide need to replicate their DNA so that each daughter cell contains a full complement of all the parent's genetic information. In the process of replication, the two strands are replicated with remarkable fidelity. To appreciate this process, keep a couple of things in mind. First, DNA is composed of two antiparallel strands. The polarity of each strand is due to the manner in which the nucleotides are linked together. Second, DNA is synthesized in only one direction: 5' to 3'. Be sure that you understand that one strand is synthesized continuously and the other discontinuously, that you understand why this is so, and that you are familiar with the basic steps involved in this process.

The process of RNA transcription has some similarities to DNA replication (e.g., synthesis occurs in the 5' to 3' direction), but it also has some important differences. First, only one strand of DNA is used as a template for RNA synthesis. Second, ribonucleotides are used instead of deoxyribonucleotides. Not all DNA in a genome is transcribed at once. Rather, via the action of transcription factors, only selected genes are transcribed at a given time. Be sure that you understand the basic aspects of this process.

Protein translation is the process by which messenger RNA (mRNA) supplies the necessary information for the linear synthesis of proteins. There are three basic components to a cell's translational machinery: mRNA, tRNA, and ribosomes. Messenger RNA provides the template that will be used for ordering the correct sequence of amino acids. Fidelity of the translational process is assured, in part, by the fact that each amino acid has its own transfer RNA (tRNA), and each tRNA is bound to its appropriate amino acid. For example, a tRNA that has an anticodon of "UAC" will bind to the triplet on the mRNA with the complementary sequence "AUG." Thus, each tRNA delivers the appropriate amino acid to the ribosome; the ordering of amino acids is determined by the linear arrangement of codons in the mRNA. Be sure that you understand the relationship between these three components of the cell's translational machinery.