Introduction and Goals
Gene expression in any given cell is regulated by controlling which genes are transcribed and translated, and to what extent. This tutorial describes the mechanisms of transcriptional regulation of gene expression. A single gene may be transcribed highly in one cell type, yet not expressed at all in another. The organization of promoters, together with the associated regulatory DNA sequences and proteins that modulate the rate of transcription, is described. Experimental methods that can monitor the levels of gene expression are also explained. By the end of the tutorial you should know:
- How activators and repressors regulate transcription
- The regulation of prokaryotic operons via end-product repression and substrate activation
- The characteristics of eukaryotic enhancers and silencers
- The regulation of tissue-specific expression
- How DNA microarrays are used to profile patterns of gene expression
- The ability to reprogram gene expression through the processes of nuclear transfer and cloning
Different Cells Express Different Proteins
The DNA of most cells in a multicellular organism is identical; however, it is not transcribed in the same way in different cell types. Any given cell does not transcribe all of its genes; rather, some genes are expressed (transcribed into mRNA and translated into protein), whereas others are not. Gene expression describes the profile of which genes are transcribed and translated, and the extent to which they are expressed. The pattern of gene expression is cell or tissue specific. For example, an individual's neurons will express different proteins than their liver cells. This reflects the different structures, activities and functions of these two very distinct cell types. Even for a single cell, the gene expression pattern may change during the different phases of the cell cycle or in response to extracellular signals. Some genes are expressed in all cells; for example, those that encode structural proteins and proteins involved in common cellular pathways such as DNA replication and translation. These are commonly referred to as housekeeping genes. Other genes encode more specialized proteins that may only be expressed in one cell type (e.g. insulin in the islet cells of the pancreas).
The regulation of gene expression can occur at several levels: which DNA is transcribed and to what extent; alternative splicing to make different isoforms of the proteins; and the efficiency with which the mRNA is translated into protein. In this tutorial the first level, the regulation of transcription, is described.
The Regulation of Gene Expression: Activators and Repressors
The expression of some genes is constitutive, meaning there is no specific regulation and the gene is always transcribed. Other genes are inducible, meaning the expression can be "turned on" or "turned off." The expression of inducible genes is controlled by regulatory proteins that bind to regulatory DNA sequences and either promote or inhibit transcription of the gene. Regulatory DNA sequences are short sequences that act as binding sites for regulatory proteins, recruiting either activators or repressors to modulate the levels of transcription. Activators bind to regulatory DNA sequences and promote transcription. Repressors bind to regulatory DNA sequences and inhibit transcription. Often, regulatory DNA sequences are located upstream from the promoters of genes. Several genes regulated by the same activator will possess the same regulatory DNA sequence. A single gene may, and often does, have binding sites for multiple activators as well as binding sites for repressors. Thus, a gene is maximally expressed when the repressors fail to bind and the activators do bind to the appropriate regulatory DNA. This tutorial describes the logic of gene expression by selecting a few examples of prokaryotic and eukaryotic genes. Although their mechanisms are similar (both using activators and repressors), the situation is more complex for eukaryotic genes.
The Regulation of Gene Expression in Bacteria by End-Product Repression
The organization of genes in bacteria is somewhat distinct. Often the genes that encode the enzymes of a single biochemical pathway are expressed as a single mRNA, with a single promoter region. This organization is termed an operon (illustrated in Figure 1). Transcription is initiated from the promoter, and the single mRNA that is made encodes several different proteins (referred to as polycistronic mRNA). This is not to be confused with splicing, which is the processing of mRNA and does not occur in bacteria. The clustering of these genes in one operon ensures that the different proteins involved in one biochemical pathway are uniformly expressed. For example, the trp operon shown in Figure 1 encodes five enzymes (Trp A-E) that catalyze reactions in the biosynthetic pathway of the amino acid tryptophan. Expression of all five genes is equivalent because they are linked on a single mRNA and regulated by a single promoter. The expression of this operon is regulated by the levels of tryptophan, the final product of the biosynthetic pathway encoded by the operon. This type of regulation is termed end-product repression, whereby high levels of tryptophan result in the repression of this operon. Repression is mediated by a regulatory DNA sequence referred to as the operator (embedded within the promoter region), which is recognized and bound by a repressor protein that inhibits transcription. The repressor protein is expressed constitutively, but is only active when it is bound to tryptophan. The repressor is an allosteric protein, and tryptophan is an effector of this protein. When there are high levels of tryptophan, the repressor is active and it binds to the operator, preventing RNA polymerase from binding to the promoter and initiating transcription. Thus, the mechanism of end-product repression ensures that the operon is streamlined and only active when there are low levels of tryptophan in the cell.
When the levels of tryptophan increase, the operon is "shut off" because there is no need to make additional tryptophan.
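The logic of end-product repression can be sketched as a small Boolean model. This is only an illustrative simplification: the function name and the treatment of tryptophan as simply "high" or "low" are assumptions made for the sketch, and a real cell responds to graded concentrations.

```python
def trp_operon_transcribed(tryptophan_high: bool) -> bool:
    """Toy model: return True if the trp operon is transcribed."""
    # The repressor is made constitutively, but it is active only
    # when it is bound to tryptophan (its effector).
    repressor_active = tryptophan_high
    # An active repressor binds the operator and keeps RNA polymerase
    # from binding the promoter.
    operator_bound = repressor_active
    return not operator_bound


for tryptophan_high in (False, True):
    state = "on" if trp_operon_transcribed(tryptophan_high) else "off"
    print(f"tryptophan high = {tryptophan_high}: operon {state}")
```

In this sketch the operon is "on" only when tryptophan is low, which mirrors the description above.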
The Regulation of Gene Expression in Bacteria by Substrate Activation
The bacterial genes that encode the proteins involved in catabolic pathways are also organized in an operon. The best-studied example is the lac operon, which encodes the three genes required for the transport and breakdown of the disaccharide lactose into galactose and glucose (illustrated in Figure 2). This operon's regulation is different from the trp operon; the lac operon is regulated by substrate activation. Also, the lac operon is coordinately regulated by the available levels of lactose (the substrate of the enzymes of the lac operon) and glucose (the preferred carbon source for bacteria). The lac operon is highly expressed if lactose is present and glucose is not.
When bacterial cells are grown in the absence of lactose, a constitutively expressed repressor (the lac repressor) binds to the operator of the lac operon and represses transcription by preventing RNA polymerase from binding to the promoter. This is logical because in the absence of lactose in the media, there is no need for the expression of this operon or for proteins it encodes - their sole function is to metabolize lactose. However, when bacterial cells are grown in the presence of lactose, repression by the lac repressor is alleviated. Lactose is a negative effector of the lac repressor protein and it inactivates the repressor, thus preventing it from binding to the operator.
When glucose is present in the media, even if lactose is present, there is only very low-level expression of the operon. RNA polymerase does not bind to the promoter very well in the absence of an activating protein. Glucose regulates the operon through the activity of the catabolite activator protein (CAP), which interacts with RNA polymerase at the promoter and stimulates transcription. CAP is active when bound to cAMP; in this state it binds to a DNA regulatory sequence upstream from the promoter. When glucose levels are low, cAMP levels are high and CAP is active. When glucose levels are high, cAMP levels are low, and CAP is inactive and unable to bind to the regulatory DNA and stimulate transcription. Only when lactose is present (inactivating the repressor) and glucose is absent (resulting in high cAMP levels and activated CAP) is there significant transcription of the lac operon.
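The combined effect of lactose and glucose on the lac operon can be summarized as a truth table. The sketch below is a simplified model, not a quantitative one: the function name and the three output labels ("off", "low", "high") are assumptions chosen for illustration, and the small basal level of transcription that occurs in a real repressed operon is ignored.

```python
from itertools import product


def lac_operon_expression(lactose: bool, glucose: bool) -> str:
    """Toy model of lac operon output for one growth condition."""
    # Lactose inactivates the lac repressor, so the repressor
    # occupies the operator only when lactose is absent.
    repressor_bound = not lactose
    # Low glucose -> high cAMP -> CAP is active and can bind
    # its site upstream from the promoter.
    cap_active = not glucose

    if repressor_bound:
        return "off"   # RNA polymerase is blocked at the promoter
    if cap_active:
        return "high"  # repressor released and CAP stimulating
    return "low"       # repressor released, but no CAP stimulation


for lactose, glucose in product((False, True), repeat=2):
    level = lac_operon_expression(lactose, glucose)
    print(f"lactose={lactose!s:5} glucose={glucose!s:5} -> {level}")
```

Only the combination of lactose present and glucose absent gives "high" in this model, matching the condition described above for significant transcription.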
Gene Expression in Eukaryotes - Enhancers and Silencers
The expression of eukaryotic genes also requires regulatory proteins and regulatory DNA sequences; however, eukaryotic gene expression has some important distinct features. First, genes are not organized into operons; genes are coordinately expressed through the action of regulatory sequences associated with several genes that bind to common activators or repressors. Second, the substrate and end-product regulations associated with the lac and trp operons, respectively, do not exist in eukaryotes. There are repressors and activators, but they are much more varied in a eukaryotic cell and they regulate transcription in response to a host of different conditions (including phases of the cell cycle, hormonal regulation and environmental stresses on the cell).
Enhancers and Silencers
Transcription initiation of a eukaryotic gene occurs through the recruitment and binding of the general transcription factors (TFII A, B, D, E, F and H) and RNA polymerase II to the promoter. Despite the large number of proteins included in the initiation complex, it is not sufficient to promote transcription to the levels normally occurring in cells. Eukaryotic transcription requires the activity of additional activators. Some activator proteins bind to proximal regulatory DNA sequences (~50-200 nucleotides) upstream from the promoter. These activators help recruit the general transcription factors and RNA polymerase II to the promoter. Other activators act by binding to regulatory sequences at great distances from the promoter. These regulatory sequences are referred to as enhancers and their presence enhances transcription. Enhancers are unique in that they can act upstream or downstream from the promoter. They can act at great distances (>10,000 nucleotides away) from the promoter and their orientation relative to the promoter is not important. Some have even been detected within an intron sequence of a gene (introns are discussed in the tutorial entitled Transcription). In addition to enhancers, there are also silencers, regulatory DNA sequences that recruit repressors and inactivate transcription at a distance. Silencers have many of the same properties as enhancers.
How do enhancers and silencers affect the activity of RNA polymerase at such great distances? An example of an activator bound to a distant enhancer stimulating transcription is illustrated in Figure 3. The DNA, with the activator bound to the enhancer sequence, can loop out so that the enhancer is now in close proximity to the general transcription factors and RNA polymerase II at the promoter. Often the long-distance activators do not bind directly to the general transcription factors but do so indirectly through a mediator (co-activator) that binds to the long-distance activator as well as the general transcription factors (but does not bind to DNA directly).
Tissue-Specific Gene Expression
Sometimes the regulatory proteins (activators or repressors) that are recruited to the DNA regulatory site (enhancers or silencers, respectively) are tissue specific; that is, the regulatory protein is expressed in only one cell type. For example, a liver-specific activator called hepatocyte nuclear factor (HNF-1) activates the transcription of over 40 different genes in the liver. The genes regulated by HNF-1 are present in every cell, as are the enhancer sequences that are bound by HNF-1, but these genes are preferentially expressed in liver cells because the HNF-1 protein is predominantly expressed in liver cells and not in other cells. The HNF-1 protein regulates the expression of many genes that encode a variety of different proteins. These genes all have an HNF-1 binding site in their regulatory DNA (an enhancer that is bound by HNF-1).
A single gene will likely have many enhancers and silencers in its regulatory DNA sequences. The precise pattern of expression of that gene in any given cell is a function of the combined activities of the activators and repressors expressed in that cell. This is referred to as the combinatorial control of gene expression. Differential expression of a gene in different tissues is due to the presence of different combinations of activators and repressors in each cell type.
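Combinatorial control can be illustrated with a toy calculation. The sketch below assumes a deliberately crude scoring rule (each bound activator adds one, each bound repressor subtracts one); real regulatory inputs are not simply additive. Apart from HNF-1, every name in it (`ActA`, `RepX`, the example gene and the two cell types' factor sets) is hypothetical and chosen only for illustration.

```python
def expression_score(gene_sites: dict, cell_factors: set) -> int:
    """Toy score: activators present minus repressors present."""
    # A regulatory protein acts only if the gene has a site for it
    # AND the cell actually expresses that protein.
    bound_activators = gene_sites["activators"] & cell_factors
    bound_repressors = gene_sites["repressors"] & cell_factors
    return len(bound_activators) - len(bound_repressors)


# Hypothetical gene with two enhancers and one silencer.
example_gene = {
    "activators": {"HNF-1", "ActA"},
    "repressors": {"RepX"},
}

# Hypothetical sets of regulatory proteins expressed in each cell type.
cell_types = {
    "liver cell": {"HNF-1", "ActA"},
    "neuron": {"ActA", "RepX"},
}

for name, factors in cell_types.items():
    print(f"{name}: score {expression_score(example_gene, factors)}")
```

The same gene, with the same regulatory DNA, scores differently in the two cell types because each cell expresses a different combination of activators and repressors.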
Remodeling the Chromatin
A unique feature of eukaryotic DNA is its assembly into nucleosomes and other higher-order chromatin. The organization of DNA into nucleosomes acts as a barrier for the binding of transcription factors and RNA polymerase II to the promoter by reducing the accessibility of the DNA. One way in which activators and repressors can modulate gene expression is to recruit proteins that remodel the chromatin structure in and around the promoter of a gene. Activators can stimulate transcription by locally recruiting enzymes that modify the histones (via acetylation) and alter the nucleosome structure in and around the promoter so that the chromatin is more accessible to transcription factors. Alternatively, activators can recruit other protein complexes that alter the packing and positioning of the nucleosomes, thereby increasing the accessibility of the DNA. Repressors act in a reciprocal fashion by recruiting proteins to the promoter that facilitate the tight packing of DNA into chromatin, so that the transcription factors have little access to the promoter.
Monitoring Gene Expression with DNA Microarrays
Until somewhat recently, it was not feasible to monitor all of the genes expressed in a particular cell type or tissue. Examining the expression of a single gene in many tissues was only possible through a procedure termed Northern blots (described in the tutorial on Recombinant DNA Technology). Briefly, total mRNA is isolated from tissue, separated by gel electrophoresis, transferred to a nylon membrane, and then hybridized to a labeled DNA probe specific for the gene of interest. Typically this would be done for one gene at a time. With the advent of DNA microarrays, one can simultaneously examine the expression of thousands of genes. A DNA microarray is a small chip, made of glass or plastic (about the size of a postage stamp), which has been spotted with thousands of single-stranded DNA fragments (in a precise and non-overlapping fashion) that correspond to different genes. For a relatively simple eukaryote such as yeast, which has only 6000 genes, a microarray with 6000 unique spots of DNA would be sufficient to determine the expression pattern of the whole genome. Typically, microarrays are used to examine the changes in gene expression in cells or tissues in two different states. For example, a microarray might be used to compare gene expression of cells before and after exposure to a particular hormone.
Let's consider an example of yeast grown with or without glucose in the media. The mRNA is isolated from yeast grown in glucose (normal conditions) and labeled with green fluorescent molecules, whereas mRNA isolated from a second sample of cells grown in the absence of glucose (starved conditions) is labeled with red fluorescent molecules. The differentially labeled mRNA populations are mixed with the DNA on the microarray, allowing the mRNA to anneal to its complementary DNA sequence; then all of the unbound labeled mRNA is removed. Each mRNA anneals to its complementary DNA spot on the microarray chip, leaving a bright fluorescent spot on the chip. The fluorescence is measured using a scanning-laser microscope. Spots that anneal to mRNA isolated predominantly from normal yeast cells are green, spots that anneal to mRNA isolated predominantly from starved yeast cells are red, and spots that anneal to roughly equivalent amounts of mRNA isolated from both samples are yellow. The red and green fluorescence for each spot is measured and, because the identity of the DNA for each spot is known, it is possible to compare the expression of each gene in normal and starved cells. Thus, a profile of all the genes "turned on" or "turned off" during starvation can be determined. An example of an actual microarray is shown in Figure 4. It is possible to generate profiles of gene expression in cancer cells, diseased cells, cells treated with hormones and many other conditions.
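The comparison of red and green fluorescence for each spot is often expressed as a log ratio. The sketch below shows one plausible way to do this; it is not the analysis used to produce Figure 4. The gene names and intensity values are invented for illustration, and the twofold cutoff (a log2 ratio of 1) is an assumed threshold, not one given in this tutorial.

```python
import math


def classify_spot(red: float, green: float, threshold: float = 1.0) -> str:
    """Classify one spot from its red and green intensities.

    Assumes both intensities are positive. `threshold` is a log2
    ratio, so the default of 1.0 means a twofold difference.
    """
    log_ratio = math.log2(red / green)
    if log_ratio >= threshold:
        return "red"     # more mRNA from starved cells
    if log_ratio <= -threshold:
        return "green"   # more mRNA from normal cells
    return "yellow"      # roughly equal in both samples


# Invented (red, green) intensities for three hypothetical spots.
spots = {
    "gene_A": (800.0, 100.0),
    "gene_B": (120.0, 950.0),
    "gene_C": (400.0, 420.0),
}

for gene, (red, green) in spots.items():
    print(f"{gene}: {classify_spot(red, green)}")
```

Using a log ratio treats increases and decreases symmetrically: an eightfold increase and an eightfold decrease give values of the same size with opposite signs.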
Gene Expression, Embryonic Development and Cloning
Consider the normal development of an embryo. Initially the egg and sperm fuse, resulting in a diploid cell. This cell will divide and eventually give rise to all of the cells of the organism; therefore, the fertilized egg is considered totipotent, that is, capable of giving rise to any cell. As the embryo grows and develops, groups of cells will acquire a distinct fate and become restricted in their potential. Finally these determined cells will differentiate; that is, they will assume a unique cell shape, identity, and activity (e.g. red blood cells). The processes of cell fate determination and differentiation occur through the regulation of gene expression. As the cells acquire a unique identity, some genes are expressed and others are silenced. The pattern of gene expression is stable and is inherited by cells derived from the original diploid cell. Although some genes are silenced and will never be expressed in the differentiated cell, their DNA is still present and has the potential to be expressed. This point has been dramatically demonstrated by experiments in which animals have been cloned.
The first mammal to be cloned from an adult cell was a sheep, cloned by Ian Wilmut and his colleagues at the Roslin Institute in Scotland in 1996, through a process called nuclear transfer. (In this context, a clone is an exact genetic copy of an individual, produced from the genetic material of the original animal.) During nuclear transfer, an unfertilized egg had its nucleus removed and replaced with the nucleus of a donor cell, a differentiated mammary gland cell from a female sheep (ewe). The new fused cell was subjected to an electric current that activated the cell to divide, as well as reactivated the DNA from the donor cell to reprogram its gene expression. The embryo was then transplanted into a surrogate mother, resulting in the birth of a ewe that was genetically identical to the ewe that provided the donor nucleus. This cloned sheep (derived from a mammary cell) was named Dolly after the buxom Country and Western singer Dolly Parton.
The birth of Dolly confirmed that the DNA of a differentiated cell is intact and that when placed in the correct environment (the unfertilized egg), it still has the potential to express all the genes of the genome. In fact, cloning experiments with frogs (performed by John Gurdon in the late 1960s) had already demonstrated that frog skin cells could be used in nuclear transplant experiments to clone frogs. These frog clones, however, survived only to the tadpole stage; at the time, it was believed that nuclear transplantation could not work in mammals because their differentiation altered the DNA in some fashion so that the patterns of gene expression could not be changed. This is clearly not the case. Since the first cloning of Dolly, many other animals (e.g. mice, cows, horses and cats) have been cloned using similar techniques.
Summary
Gene expression relates to which genes are transcribed and at what rates they are transcribed. Housekeeping genes are expressed at approximately equivalent levels in all cells. Inducible genes can be "turned on" or "turned off," depending on the conditions in the cell and the cell type. The regulation of gene expression is mediated through the action of regulatory proteins (activators and repressors) that bind to specific DNA sequences to activate or repress, respectively, the rate of transcription.
In prokaryotes, genes are often organized into operons, where several genes of related function are transcribed into a single polycistronic mRNA and regulated by a single promoter. Operons typically serve either to break down a target molecule (a catabolic operon) or to synthesize a biological molecule (an anabolic operon). Two common means of operon regulation are end-product repression and substrate activation. In both cases, a repressor regulates expression of the operon by binding to the operator and restricting RNA polymerase from initiating transcription. The trp operon is an example of an anabolic operon that employs end-product repression. The end-product (tryptophan) regulates the expression of the operon through the binding and activation of the trp repressor. The lac operon is an example of a catabolic operon that employs substrate activation. Lactose alleviates the repression of the lac repressor by binding to and inactivating the repressor. Expression of the lac operon also requires activation by CAP, which binds to a site upstream from the promoter and thereby stimulates transcription. Active CAP stimulates transcription of the operon when glucose levels are low.
Eukaryotic genes are also regulated by activators and repressors, which in some cases act by binding to enhancers and silencers (respectively) that can be great distances from the promoter. These long-distance activators and repressors interact with the transcription initiation complex at the promoter by looping the DNA to bring the proteins together. Activators and repressors can also recruit proteins to remodel the chromatin and increase or decrease (respectively) the accessibility of the DNA around the promoter. Many activators and repressors are tissue specific and regulate tissue-specific gene expression. The expression of an individual gene, in any given cell, is a function of the combination of activators and repressors expressed in the cell.
DNA microarrays allow the expression of thousands of genes to be monitored simultaneously by comparing the levels of mRNA in cells in two different states or in two different cell types. The cloning of mammals has demonstrated that the changes in gene expression that normally occur during the process of differentiation are reversible.