Introduction and Goals
At each cell division, the DNA of a cell is precisely copied and each daughter cell receives an identical copy of the genome. In a complex multicellular organism, such as a human, this happens millions of times in the lifetime of the individual. The process is remarkably rapid and accurate. This tutorial describes the molecular mechanisms of DNA replication. In addition, the mechanisms for the repair of damaged and mismatched DNA will be discussed. Finally, the knowledge of the molecular details of DNA replication has been used to develop two techniques that have revolutionized molecular biology: the chain termination method of DNA sequencing and the polymerase chain reaction (PCR), both of which are described in this tutorial.
By the end of this tutorial you should know:
- That each strand of a DNA molecule is used as a template during replication
- The activities of DNA polymerase in replication
- The role of RNA primers in DNA replication
- The unique mechanism of DNA replication at the telomeres
- The mechanism of repairing mismatched and damaged DNA
- How nucleotide sequences are determined using the chain termination method of DNA sequencing
- How DNA is amplified using the polymerase chain reaction (PCR)
DNA Replication is Semiconservative and Initiates at Origins of Replication
The structure of double-stranded DNA supports a model for DNA replication (see Tutorial entitled DNA and Chromosomes), in which each strand of the DNA is used as a template (is copied to make the complementary strand). This is, in fact, the mechanism of DNA replication, and it is referred to as semiconservative replication because each new double-stranded DNA molecule is composed of one strand of the conserved "old" DNA and one strand of newly synthesized DNA (see Figure 1). The template DNA is used as a blueprint for the sequence of the new strand, following the rules of complementary bases (e.g. if the nucleotide is adenine in the template DNA, then a thymine is added in the opposite strand). DNA polymerase (the enzyme that carries out DNA synthesis) uses one strand of DNA as a template and adds a new nucleotide to the 3' end of the new elongating strand. DNA polymerase takes a deoxyribonucleoside triphosphate, cleaves off its two terminal phosphates, and uses the free energy released to form a phosphodiester bond between the 5' phosphate of the incoming nucleotide and the 3' hydroxyl end of the last nucleotide in the strand (Figure 2). Therefore, the newly synthesized DNA strand can only elongate in one direction, 5' to 3'.
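The pairing rules and the semiconservative outcome described above can be sketched in a few lines of Python. This is only an illustration: the function names and the string representation (top strand written 5' to 3', bottom strand written 3' to 5' so the two line up position by position) are assumptions made for the example, not part of the tutorial.

```python
# Base-pairing rules from the text: A pairs with T, G pairs with C.
PAIR = {"A": "T", "T": "A", "G": "C", "C": "G"}


def synthesize_complement(template_3_to_5):
    """Return the new strand, written 5'->3', for a template read 3'->5'.

    Each nucleotide is appended to the right-hand (3') end of the growing
    string, mirroring the 5'->3' direction of elongation.
    """
    new_strand = ""
    for base in template_3_to_5:
        new_strand += PAIR[base]
    return new_strand


def replicate(duplex):
    """Copy a duplex semiconservatively.

    `duplex` is (top, bottom): top is written 5'->3', bottom 3'->5'.
    Each returned daughter keeps one old strand and gains one new strand.
    """
    top, bottom = duplex
    # Read the top strand from its 3' end, then flip the product so it is
    # written 3'->5' and lines up with the top strand again.
    new_for_top = synthesize_complement(top[::-1])[::-1]
    # The bottom strand is already written 3'->5', so it is read as is.
    new_for_bottom = synthesize_complement(bottom)
    return [(top, new_for_top), (new_for_bottom, bottom)]


parent = ("ATGC", "TACG")
for daughter in replicate(parent):
    print(daughter)
```

Both daughters carry the same sequence as the parent, and each contains exactly one of the two parental strands.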
Recall, in chromosomes, DNA replication is initiated at origins of replication (ORI) scattered along the length of the chromosomes (Figure 3). At the ORI, the DNA is locally denatured and a complex of proteins is assembled on the DNA to carry out DNA synthesis. Although the DNA sequence of any individual ORI may vary, there are some common features: the ORI sequence is composed of multiple, short, repeated sequences; proteins bind to these sequences and recruit DNA polymerase and the other proteins necessary for DNA replication; and the stretches of DNA that flank the ORI have a high percentage of the nucleotides adenine and thymine, which facilitate the unwinding of the DNA molecule. Once assembled at the ORI, the DNA replication protein complex will travel along the DNA, unwinding and copying it in what is termed the replication fork (see Figure 3). Replication forks move from the ORI in both directions (to the right and left of the ORI) in a fashion referred to as bidirectional replication. At each replication fork, the double-stranded DNA is unwound and each strand of DNA serves as a template. If one examines the overall direction of replication (5' to 3') and the polarity of the two template strands (see Figure 3), it becomes clear that the replication fork is asymmetric. Remember that DNA polymerase can only elongate DNA in the 5' to 3' direction (that is, read the template in the 3' to 5' direction), so one template strand can be read (and direct continuous synthesis of its complementary strand) in the same direction as the overall direction of replication. The template strand used for the continuous synthesis of its complementary strand is termed the leading strand. The other strand is termed the lagging strand, and the synthesis of its complementary strand occurs in the opposite direction of the overall direction of replication. 
The lagging strand template directs the discontinuous elongation of its complementary DNA, resulting in short fragments of complementary DNA termed Okazaki fragments. The Okazaki fragments are eventually linked together to generate a single, continuous strand of DNA (described later in this tutorial).
The Activity of DNA Polymerase
Normal DNA replication is extremely accurate; on average, just one error occurs for every 10 million nucleotides copied. This low error rate arises from the activity of DNA polymerase itself. In addition to the 5' to 3' polymerase activity that elongates a DNA strand, the polymerase contains a proofreading activity that can recognize and remove a misplaced nucleotide. Although A-T and G-C are the normal base pairs in double-stranded DNA, mismatched base pair combinations are possible (albeit less stable) and occasionally do arise during replication. As replication proceeds, however, the proofreading activity of DNA polymerase edits and corrects any newly formed mismatched base pairs. DNA polymerase has two catalytic sites: one site containing the 5' to 3' polymerization activity and the other site containing a DNA degradation activity that can cleave a nucleotide from the 3' end of a DNA strand (3' to 5' exonuclease). Before a new nucleotide can be added to an elongating DNA chain, DNA polymerase checks the last base pair formed to ensure it is correct. If it is correct, replication proceeds. If there is a mismatch, the exonuclease activity of the polymerase cleaves the incorrect nucleotide from the elongating strand and another (hopefully correct) nucleotide is added.
The Role of RNA Primers in DNA Replication
The proofreading mechanism of DNA polymerase increases the accuracy of the enzyme; however, this also means that the enzyme can only elongate DNA by adding a new nucleotide to the free 3' hydroxyl end of a correctly base-paired nucleotide. So, how is DNA synthesis initiated? What does DNA polymerase add the first incoming nucleotide to? An additional enzyme in DNA replication, the enzyme primase, synthesizes a strand of nucleic acid complementary to the template strand. Primase generates RNA primers (short RNA sequences, about 10 nucleotides long, that are complementary to the DNA) and does not require a free 3' hydroxyl. In the replication fork, DNA is unwound and replication is initiated by primase, which makes RNA primers complementary to both the leading and lagging strands. Since this is an RNA primer, the nucleotide uracil is used in place of thymine and is base-paired with adenine. For the leading strand, a single RNA primer is made and is then used by DNA polymerase to elongate the complementary strand. For the lagging strand, many RNA primers are used, one for each Okazaki fragment. However, none of the RNA primers will remain in the DNA; they are removed by an exonuclease and the gap that is created in the newly synthesized DNA is filled by a repair DNA polymerase, which extends the sequence of the adjacent Okazaki fragment (Figure 4). The two Okazaki fragments (now containing only DNA) are joined by the enzyme DNA ligase, which can form a phosphodiester bond between the 5' end of one DNA fragment and the 3' end of another DNA fragment.
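The sequence of events on the lagging strand (priming, extension, primer removal, gap filling and ligation) can be traced with a small Python sketch. It is a simplification under stated assumptions: the fragment length of 30 is a hypothetical value chosen to keep the output short, lowercase letters stand for RNA, the order in which fragments are made is not modelled, and gap filling is computed directly from the template rather than by extending the neighbouring fragment.

```python
DNA_PAIR = {"A": "T", "T": "A", "G": "C", "C": "G"}
# Lowercase marks RNA; uracil (u) is paired with adenine, as in the text.
RNA_PAIR = {"A": "u", "T": "a", "G": "c", "C": "g"}

PRIMER_LEN = 10  # primer length quoted in the tutorial


def okazaki_fragment(template_chunk):
    """Primase lays down an RNA primer, then DNA polymerase extends it."""
    primer = "".join(RNA_PAIR[b] for b in template_chunk[:PRIMER_LEN])
    dna = "".join(DNA_PAIR[b] for b in template_chunk[PRIMER_LEN:])
    return primer + dna


def replace_primer(fragment, template_chunk):
    """Remove the RNA primer and fill the gap with DNA.

    Simplification: the fill-in is derived straight from the template here.
    """
    primer_len = sum(1 for ch in fragment if ch.islower())
    fill = "".join(DNA_PAIR[b] for b in template_chunk[:primer_len])
    return fill + fragment[primer_len:]


def lagging_strand(template_3_to_5, fragment_len=30):
    """Return (primed fragments, ligated all-DNA strand written 5'->3')."""
    chunks = [template_3_to_5[i:i + fragment_len]
              for i in range(0, len(template_3_to_5), fragment_len)]
    fragments = [okazaki_fragment(c) for c in chunks]
    repaired = [replace_primer(f, c) for f, c in zip(fragments, chunks)]
    # Joining the repaired fragments stands in for DNA ligase.
    return fragments, "".join(repaired)


fragments, joined = lagging_strand("ATGC" * 15)
for fragment in fragments:
    print(fragment)
print(joined)
```

Each printed fragment starts with a lowercase RNA primer; the final line contains DNA only, matching the statement that no primer remains in the finished strand.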
The Protein Complex at the Replication Fork
DNA replication requires the activity of DNA polymerase, as well as other enzymes such as primase and ligase. In fact, a multi-protein complex is recruited to the replication fork and it acts coordinately to accurately copy the DNA. In addition to DNA polymerase and primase, there are two other proteins in the replication complex that are required to unwind the double-stranded DNA and keep it unwound. Helicase is the enzyme that locally unwinds the DNA, and this must occur before any of the other proteins can act. Once the DNA becomes single-stranded, single-strand binding protein (SSBP) binds the DNA, which prevents the two strands from reannealing. This leaves the DNA available to be copied by primase into short RNA primers that are used by DNA polymerase to extend the strand of DNA. In the Okazaki fragments, the RNA primers are removed, the gaps are filled and the fragments are joined by ligase. A complex of proteins (helicase, SSBP, primase, DNA polymerase and ligase) travels together along the DNA as the replication fork extends down the length of the DNA. This is illustrated in Figure 5. In a chromosome that contains multiple ORIs, the replication fork travels along the chromosome until it reaches a replication fork traveling in the opposite direction. The DNA copied from a single ORI, moving bidirectionally, is referred to as a replicon.
Replicating the Telomeres
There is a problem that arises as the replication fork approaches the telomeres; on the lagging strand, the synthesis of DNA does not extend to the very end of the molecule (Figure 6). Consequently, each time the chromosome is replicated, a small portion of its end is not copied and is lost. Over many rounds of replication, the chromosome will shorten. This "end problem" of replication is solved by telomerase (an enzyme that is specific for replication at the telomeres). Telomerase is a DNA polymerase composed of protein and a short RNA template complementary to the repeated DNA sequence at the telomere. Telomerase uses its RNA as a template to extend the lagging strand of DNA by adding more repeats to the end. The newly extended lagging strand DNA is then used by primase and DNA polymerase to synthesize the complementary strand at the end of the chromosome (Figure 6).
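The end problem and its solution can be illustrated with a toy model in Python. The numbers are assumptions made for the example: the repeat TTAGGG is the human telomere repeat (the tutorial does not name a sequence), and the loss of six nucleotides per round is a hypothetical value chosen so that one added repeat exactly offsets it.

```python
TELOMERE_REPEAT = "TTAGGG"  # human repeat; an assumption, not from the tutorial


def replicate_chromosome_end(end_sequence, lost_per_round, telomerase_repeats=0):
    """One round of replication at a chromosome end.

    Telomerase (if active) first adds whole repeats to the end; the last
    `lost_per_round` nucleotides are then left uncopied and are lost.
    """
    extended = end_sequence + TELOMERE_REPEAT * telomerase_repeats
    kept = max(0, len(extended) - lost_per_round)
    return extended[:kept]


def after_rounds(end_sequence, rounds, lost_per_round, telomerase_repeats=0):
    """Apply several successive rounds of replication."""
    for _ in range(rounds):
        end_sequence = replicate_chromosome_end(
            end_sequence, lost_per_round, telomerase_repeats)
    return end_sequence


start = "ACGTACGT" + TELOMERE_REPEAT * 5
print(len(start))
print(len(after_rounds(start, 3, 6)))                        # no telomerase
print(len(after_rounds(start, 3, 6, telomerase_repeats=1)))  # with telomerase
```

Without telomerase the end shortens every round; with one repeat added per round the length is maintained in this model.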
Post-Replication DNA Mismatch, Damage and Repair
Although DNA polymerase has a very low error rate, in part due to its proofreading activity, some mismatched base pairs do escape correction during replication. In addition, DNA can be damaged in other ways that alter the DNA sequence. These DNA mismatches and damages, if not corrected or repaired, will result in a mutation (a permanent change in the DNA) in the next round of replication, which may be detrimental to the cell. For example, the genetic disorder sickle cell anemia results from a single nucleotide change (adenine changed to thymine) in the gene encoding one of the two globin proteins that make up hemoglobin. This nucleotide change causes a change in the amino acid sequence and consequently a change in the shape of the mutant protein in the red blood cells. These alterations to the protein cause it to aggregate, forming large complexes that distort the shape of the red blood cells. These so-called "sickle cells" are more fragile than normal red blood cells and are more likely to break in the bloodstream, resulting in fewer red blood cells (anemia).
Fortunately, cells possess mechanisms for restoring mismatched base pairs and repairing DNA damage. The DNA mismatch repair system corrects 99% of the mismatched base pairs that were not removed by DNA polymerase during replication. Mismatch repair proteins recognize and bind to the mismatched base pair, nucleotides from the newly synthesized strand of DNA are excised, a repair DNA polymerase fills in the gap, and then DNA ligase rejoins this stretch of nucleotides to the rest of the DNA strand. Because the error lies in the newly synthesized strand, the DNA mismatch repair system must distinguish between that strand and the template strand; however, the complete mechanism for this is not entirely understood.
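The two accuracy figures quoted in this tutorial (one polymerase error per 10 million nucleotides, and 99% of the remaining mismatches corrected by mismatch repair) combine multiplicatively. The short Python calculation below works this through; the genome size of 3 × 10^9 nucleotides is an assumption added only to give the rates a concrete scale, and the function names are made up for the example.

```python
from fractions import Fraction

# Figures taken from the tutorial text.
POLYMERASE_ERROR_RATE = Fraction(1, 10_000_000)   # 1 error per 10^7 nucleotides
MISMATCH_REPAIR_EFFICIENCY = Fraction(99, 100)    # 99% of mismatches corrected

# Assumed genome size (not given in the tutorial), for scale only.
ASSUMED_GENOME_SIZE = 3_000_000_000


def residual_error_rate(error_rate, repair_efficiency):
    """Errors per nucleotide that remain after mismatch repair."""
    return error_rate * (1 - repair_efficiency)


def expected_errors(genome_size, rate):
    """Expected number of errors for one copy of the genome."""
    return genome_size * rate


final_rate = residual_error_rate(POLYMERASE_ERROR_RATE, MISMATCH_REPAIR_EFFICIENCY)
print(final_rate)
print(expected_errors(ASSUMED_GENOME_SIZE, POLYMERASE_ERROR_RATE))
print(expected_errors(ASSUMED_GENOME_SIZE, final_rate))
```

Under these assumptions, mismatch repair lowers the error rate a hundredfold, from one error per 10^7 nucleotides to one per 10^9.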
Mutations can arise in DNA through damage that alters the chemical structure of the nucleotide. This can occur spontaneously, or in response to some environmental factor. The two most common types of damage that arise from spontaneous reactions are depurination (the loss of the base from the purine nucleotides adenine or guanine) and deamination (typically the conversion of cytosine to uracil). Depurination results in a nucleotide having no base; consequently, DNA polymerase will skip this nucleotide when synthesizing the complementary strand during replication. This leads to a nucleotide loss in the next round of replication. In some cases the depurinated nucleotide is base-paired with a mismatched nucleotide on the other strand, resulting in a base pair change. Deamination results in a change of nucleotide sequence in the next round of replication, from a C-G base pair to an A-T base pair. Another common type of damage to DNA is thymine dimer formation. This is most often induced by exposure to ultraviolet light and consists of a covalent bond between the bases of adjacent thymines in one strand of DNA. This causes a block in DNA replication and can result in mutations. The rate of these types of damage is very high (e.g. depurination occurs 500 times per cell per day); however, there are numerous pathways for repairing the damaged DNA, and, as a result, the heritable mutation rate remains low. Repair generally occurs in three steps: the damaged DNA is recognized and removed; a repair DNA polymerase fills the gap; and DNA ligase reseals the DNA strand. DNA repair is a complex pathway, involving many different proteins and activities that are not completely understood by biologists. However, the importance of these proteins is highlighted by several human disorders in which DNA repair proteins are defective and which exhibit a much higher incidence of cancer due to the inability of the cells to efficiently repair damaged or mismatched DNA.
Most cancers arise through mutations in key growth-regulating genes that accumulate throughout the lifetime of the individual. Individuals with defective DNA repair have a much greater chance of mutations arising in the cancer-causing genes.
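The way unrepaired deamination and depurination become permanent sequence changes, as described earlier in this section, can be followed step by step in Python. This is a minimal sketch under simplifying assumptions: strands are written position by position without tracking 5'/3' orientation, "-" is a made-up marker for a nucleotide that has lost its base, and no repair takes place.

```python
# U (from deaminated C) pairs with A, which is how the change is passed on.
PAIR = {"A": "T", "T": "A", "G": "C", "C": "G", "U": "A"}
ABASIC = "-"  # hypothetical marker for a depurinated nucleotide


def copy_strand(template):
    """Copy a template position by position; abasic sites are skipped."""
    return "".join(PAIR[b] for b in template if b != ABASIC)


def deaminate(strand, position):
    """Convert the cytosine at `position` to uracil."""
    if strand[position] != "C":
        raise ValueError("deamination is modelled for cytosine only")
    return strand[:position] + "U" + strand[position + 1:]


def depurinate(strand, position):
    """Remove the base from the purine (A or G) at `position`."""
    if strand[position] not in "AG":
        raise ValueError("only purines can be depurinated")
    return strand[:position] + ABASIC + strand[position + 1:]


original = "GACT"

# Deamination: the C-G pair at the third position ends up as T-A.
damaged = deaminate(original, 2)      # "GAUT"
first_copy = copy_strand(damaged)     # U directs an A into the new strand
second_copy = copy_strand(first_copy)
print(damaged, first_copy, second_copy)

# Depurination: the copy is one nucleotide shorter than the template.
print(copy_strand(depurinate(original, 1)))
```

After two rounds of copying, the position that held C now holds T, and the depurinated template yields a strand that has lost a nucleotide.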
The Chain Termination Method of DNA Sequencing is Based on DNA Synthesis
The study of DNA replication, in particular the isolation and investigation of DNA polymerase, has led to the development of two technological advances in molecular biology: the chain termination method of DNA sequencing and the polymerase chain reaction (PCR). The chain termination method (or dideoxy method) of DNA sequencing was first described by Fred Sanger in 1977, and in 1980 he was awarded the Nobel Prize for this work. This method of sequencing is based on a DNA synthesis reaction carried out in a test tube, using the DNA that is to be sequenced as a template. In addition to the normal deoxyribonucleotides (dATP, dCTP, dGTP and dTTP), DNA synthesis occurs in the presence of dideoxyribonucleotides (ddATP, ddCTP, ddGTP and ddTTP); they are capable of being added to newly synthesized DNA, but terminate any further extension of the strand because they lack the 3' hydroxyl necessary to form a phosphodiester bond with the next incoming nucleotide. Therefore, reactions are set up with denatured DNA template (the DNA fragment to be sequenced), a synthetic single-stranded DNA primer (analogous to the RNA primer during DNA replication), deoxyribonucleotides and DNA polymerase. In addition, each reaction contains a small amount of one of the four dideoxyribonucleotides. DNA polymerase uses the primer and makes a partial copy of the template; however, as soon as a dideoxyribonucleotide is incorporated into the newly synthesized strand, the new strand is terminated and DNA polymerase cannot add any additional nucleotides. Therefore, in the reaction containing ddATP, there will be many newly synthesized fragments of DNA of different sizes that are complementary to the template, all of which end with the nucleotide adenine (A). In the reaction run in the presence of ddCTP, all fragments will terminate with cytosine (C), and in the reactions containing ddGTP and ddTTP they will terminate with guanine (G) and thymine (T), respectively.
The products of each reaction are next separated by gel electrophoresis, which separates DNA fragments based on their lengths. The sequence is determined from the position of each fragment on the gel (which reflects its length) and from the terminating nucleotide of that fragment. For example, if one observes a fragment of 10 nucleotides ending with the nucleotide A, a fragment of 11 nucleotides ending with the nucleotide T, and a fragment of 12 nucleotides ending with the nucleotide C, then the sequence at positions 10 through 12 is ATC. By reading up the gel (from bottom to top), one can read the sequence of the DNA (5' to 3') that was copied in the reaction. This method has been greatly enhanced in recent years through the use of automated sequencing machines and fluorescently labeled dideoxyribonucleotides. These advances, and others, greatly facilitated the large-scale effort to sequence the entire human genome (the bulk of which was completed by the end of the year 2000).
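The logic of reading a sequence from the four chain termination reactions can be sketched in Python. The sketch is idealized and its names are made up for the example: it assumes every possible terminated fragment is present in its lane, and it counts fragment lengths from the first nucleotide added after the primer.

```python
PAIR = {"A": "T", "T": "A", "G": "C", "C": "G"}


def new_strand_for(template_3_to_5):
    """Strand synthesized 5'->3' from a template read 3'->5'."""
    return "".join(PAIR[b] for b in template_3_to_5)


def termination_lanes(new_strand):
    """Fragment lengths expected in each of the four reactions.

    In the ddATP reaction, chains end wherever the new strand has an A,
    so that lane holds one fragment length per A (likewise for C, G, T).
    """
    lanes = {base: [] for base in "ACGT"}
    for index, base in enumerate(new_strand):
        lanes[base].append(index + 1)  # fragment length ending at this base
    return lanes


def read_gel(lanes):
    """Read bands from shortest (bottom of the gel) to longest (top)."""
    bands = sorted(
        (length, base) for base, lengths in lanes.items() for length in lengths
    )
    return "".join(base for _, base in bands)


template = "TAGCTA"  # written 3'->5'
lanes = termination_lanes(new_strand_for(template))
print(lanes)
print(read_gel(lanes))
```

Reading the bands in order of increasing length recovers the newly synthesized strand 5' to 3', which is the complement of the template.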
The Polymerase Chain Reaction (PCR)
The polymerase chain reaction (PCR) is a technique that rapidly amplifies specific DNA sequences. PCR is a very sensitive technique that requires only small amounts of DNA (as little as 1-2 molecules of DNA), which are amplified and copied over a billion times. Because of its sensitivity, PCR has revolutionized many aspects of molecular biology and genetics, including diagnoses of diseases and forensic science. PCR is essentially successive rounds of DNA denaturation, annealing of DNA primers, and DNA synthesis. It requires several components: the template DNA to be amplified (referred to as the target sequence), two DNA primers that flank the DNA sequence to be amplified, thermostable DNA polymerase and nucleotides. PCR is only feasible because of the discovery and isolation of thermostable DNA polymerase. The first thermostable DNA polymerase was isolated from the bacterium Thermus aquaticus, which normally grows in the water of hot springs. This enzyme, called Taq DNA polymerase, is optimally active at 72°C and can retain its activity after successive heating to denature the template DNA. With each successive cycle of DNA synthesis, the number of copies of the target sequence is doubled. Typically 20-30 cycles of DNA synthesis will be performed, resulting in 10^6 to 10^9 copies of the target sequence.
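The doubling arithmetic behind these copy numbers is easy to check. The Python snippet below assumes ideal amplification, in which every copy is duplicated in every cycle; real reactions fall short of this, and the function name is made up for the example.

```python
def pcr_copies(initial_copies, cycles):
    """Copies of the target after `cycles` rounds of perfect doubling."""
    return initial_copies * 2 ** cycles


for cycles in (20, 30):
    print(cycles, pcr_copies(1, cycles))
```

Starting from a single molecule, 20 cycles give 1,048,576 copies (about a million) and 30 cycles give 1,073,741,824 copies (about a billion).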
Summary
DNA replication is semiconservative, and each strand of the double helix serves as a template for synthesis of a new strand. Replication of DNA is initiated at specific origins of replication (ORI), and replication forks move bidirectionally. DNA synthesis is catalyzed by the enzyme DNA polymerase, which is only capable of elongating a strand of DNA in the 5' to 3' direction. As a result, the synthesis of one strand (the leading strand) is continuous, and the synthesis of the other strand (the lagging strand) is discontinuous; the latter produces Okazaki fragments. In addition, DNA polymerase has a proofreading activity that detects and removes mismatched nucleotides before proceeding with synthesis. DNA polymerase can extend a DNA strand but cannot initiate synthesis. RNA primers that are complementary to the template DNA are synthesized by primase and are used by DNA polymerase to elongate the strand by incorporating additional nucleotides. The RNA primers are excised from the newly synthesized DNA fragments and the gaps in the sequence are filled by repair DNA polymerase and resealed by ligase. DNA replication is initiated by additional proteins, which assemble at the ORI. These include helicase, which locally unwinds the DNA, and single-strand binding protein (SSBP), which binds the single-stranded portions of the DNA and inhibits renaturation. Replication of the DNA at the telomeres depends on telomerase. This enzyme contains an RNA template that it uses to extend the lagging strand, which, in turn, is used by primase and DNA polymerase to synthesize the complementary strand. In the absence of telomerase, the replication of the lagging strand always falls short of the very end, so with subsequent rounds of replication the chromosome shortens.
DNA damage, mismatched base pairs and altered base pairs can arise after replication. Some mismatched base pairs occur during replication and escape repair. Other types of DNA damage occur spontaneously, or occur due to exposure to ultraviolet light. The most common alterations to DNA include deamination, depurination and thymine dimers. If not corrected, these alterations in DNA result in mutations. The DNA repair pathway normally repairs most of this damage. This pathway is capable of recognizing and removing the mismatched or altered nucleotide and replacing it with the correct nucleotide. The isolation and investigation of DNA polymerases has led to the development of two extremely important applications, the chain termination method (dideoxy method) of sequencing and the polymerase chain reaction (PCR). The chain termination method of sequencing determines the sequence of a fragment of DNA; this DNA synthesis reaction is performed in a test tube in the presence of dideoxyribonucleotides that terminate DNA synthesis once they are incorporated. The products of this DNA synthesis reaction (each terminated at a specific nucleotide) are separated by electrophoresis. PCR also takes advantage of the properties of DNA polymerase, but uses thermostable DNA polymerase. PCR is a method for amplifying a region of DNA through successive rounds of denaturation, annealing of primers and DNA synthesis.