Genes are segments of DNA that contain the information required to make a protein. Gene expression refers to all the processes involved in converting the genetic information in a DNA sequence into a functional product, either RNA or protein. In prokaryotes, just two processes are required: transcription and translation. In eukaryotes, an additional step, RNA processing (including splicing), intervenes between them.
The synthesis of a single-stranded RNA molecule using DNA as a template is referred to as transcription. The enzyme that catalyzes this reaction is known as RNA polymerase. Although the subunit structure and details of the process differ significantly in prokaryotes and eukaryotes, the chemical reaction is identical.
Bacterial RNA polymerases are composed of multiple subunits. The core enzyme consists of two α subunits, one β subunit and one β′ subunit, and contains the catalytic activity for RNA synthesis. Association of the core enzyme with the σ subunit is required for initiation of transcription; the resulting complex is referred to as the holoenzyme. The holoenzyme binds to DNA nonspecifically and scans for a promoter sequence. In E. coli, the most common type of promoter contains two conserved sequences centered at -10 and -35. The σ subunit interacts specifically with these sequences in the double-stranded DNA, positioning the active site at +1 (the transcription start site). This complex is referred to as the closed complex.
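The promoter search described above can be illustrated with a small Python sketch. The hexamer consensus sequences (TTGACA at -35 and TATAAT at -10) and the 16 to 18 bp spacer are commonly cited values for E. coli promoters but are not given in this article, so they should be read as assumptions; the function names and the mismatch threshold are hypothetical.

```python
# Assumed consensus hexamers for the most common E. coli promoter type
# (not stated in the article text).
MINUS_35 = "TTGACA"
MINUS_10 = "TATAAT"


def mismatches(a, b):
    """Count positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


def find_promoters(seq, max_mismatch=1, spacers=range(16, 19)):
    """Return (start of -35 hexamer, start of -10 hexamer) index pairs.

    A pair is reported when both hexamers match their consensus with at
    most `max_mismatch` differences and are separated by an allowed spacer.
    """
    seq = seq.upper()
    hits = []
    for i in range(len(seq) - 5):
        if mismatches(seq[i:i + 6], MINUS_35) > max_mismatch:
            continue
        for gap in spacers:
            j = i + 6 + gap  # first base of the candidate -10 hexamer
            if j + 6 > len(seq):
                break
            if mismatches(seq[j:j + 6], MINUS_10) <= max_mismatch:
                hits.append((i, j))
    return hits
```

Scanning a sequence this way mimics only the sequence-recognition part of the process; it says nothing about how strongly a real σ subunit would bind a given site.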
The second step in transcription initiation involves the unwinding of the DNA between the -10 region and the start site of transcription. This is referred to as the open complex.
The first two rNTPs complementary to the DNA bind to the active site of RNA polymerase and the first phosphodiester bond is formed. Synthesis is always in the 5’ to 3’ direction and involves the nucleophilic attack by the 3’ OH of the growing chain on the innermost (α) phosphate of the incoming rNTP’s 5’ triphosphate, with the release of pyrophosphate. After approximately 8-10 ribonucleotides are added, the σ factor is released and the RNA polymerase core enzyme moves away from the promoter, synthesizing RNA as it goes.
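The elongation steps above can be sketched in Python as a simple bookkeeping exercise. This is a toy model only: the function name is hypothetical, the template is assumed to be written 3’ to 5’ so that the RNA grows 5’ to 3’, and the release of σ at a chain length of 9 is an arbitrary choice within the 8-10 range mentioned in the text.

```python
# Base pairing between a DNA template base and the incoming ribonucleotide.
PAIRING = {"A": "U", "T": "A", "G": "C", "C": "G"}


def elongate(template_3to5, sigma_release_at=9):
    """Build an RNA 5'->3' from a DNA template strand written 3'->5'.

    Returns the RNA, the number of pyrophosphates released, and the chain
    length at which sigma was released (None if never reached).
    """
    rna = ""
    ppi_released = 0
    sigma_released_at = None
    for base in template_3to5.upper():
        rna += PAIRING[base]  # the new nucleotide joins the 3' end
        if len(rna) > 1:
            # Every phosphodiester bond releases one pyrophosphate; the
            # first nucleotide keeps its 5' triphosphate.
            ppi_released += 1
        if sigma_released_at is None and len(rna) >= sigma_release_at:
            sigma_released_at = len(rna)
    return rna, ppi_released, sigma_released_at
```

For a template of n bases the sketch forms n - 1 phosphodiester bonds, one fewer than the number of nucleotides incorporated.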
Transcription continues until a termination signal is reached. In the current model, part of the newly made RNA dissociates from the DNA-RNA hybrid and folds back on itself to form a stem-loop structure. The remaining AU base pairs are not strong enough to keep the RNA hybridized with the DNA. Dissociation of the RNA destabilizes the RNA polymerase complex and transcription terminates.
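The termination signal described above, a stem-loop followed by weak A-U pairing, can be searched for with a short Python sketch. The stem length, loop sizes and length of the U run used here are illustrative assumptions, not values from the text, and the function names are hypothetical. Only perfectly paired stems are detected.

```python
RNA_PAIR = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(rna):
    """Return the sequence that would pair with `rna`, written 5'->3'."""
    return "".join(RNA_PAIR[b] for b in reversed(rna))


def find_terminators(rna, stem=6, loop_range=range(3, 9), u_run=4):
    """Return (stem start, stem-loop end) pairs for candidate terminators.

    A candidate is a perfect inverted repeat of length `stem`, separated
    by a loop of allowed size, immediately followed by `u_run` U residues.
    """
    rna = rna.upper()
    hits = []
    for i in range(len(rna) - 2 * stem):
        left = rna[i:i + stem]
        for loop in loop_range:
            k = i + stem + loop      # start of the right arm of the stem
            end = k + stem           # first base after the stem-loop
            tail = rna[end:end + u_run]
            if len(tail) < u_run:
                break                # larger loops would run off the end
            if rna[k:end] == reverse_complement(left) and tail == "U" * u_run:
                hits.append((i, end))
    return hits
```

A real terminator search would also score imperfect stems and base-pairing stability; this sketch only shows the shape of the signal.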
The basic features of RNA synthesis are shared between prokaryotes and eukaryotes; however, transcription in eukaryotes is significantly more complex. First, rather than having a single RNA polymerase, eukaryotes have three different RNA polymerases, each of which transcribes a different set of genes. RNA polymerase I transcribes three types of rRNA (the 18S, 5.8S, and 28S species), RNA polymerase II transcribes mRNA, and RNA polymerase III transcribes tRNA and the smallest rRNA (the 5S species). Each eukaryotic RNA polymerase is a multisubunit enzyme, and two of its subunits correspond to the β and β′ subunits of prokaryotic RNA polymerases.
All three eukaryotic polymerases have some homology to the E. coli core polymerase (α2ββ′), but also contain additional subunits not found in prokaryotes. In total, approximately 12-17 subunits (varies with the species) are required for activity in vivo, making eukaryotic polymerases much more complex than prokaryotic polymerases.
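The division of labour among the three eukaryotic polymerases described above can be written down as a simple lookup table. The sketch below uses only the assignments given in the text; the dictionary and function names are hypothetical.

```python
# Products of each eukaryotic RNA polymerase, as listed in the text.
POLYMERASE_PRODUCTS = {
    "RNA polymerase I": ["18S rRNA", "5.8S rRNA", "28S rRNA"],
    "RNA polymerase II": ["mRNA"],
    "RNA polymerase III": ["tRNA", "5S rRNA"],
}


def polymerase_for(product):
    """Return the polymerase that transcribes `product`, or None if unlisted."""
    for polymerase, products in POLYMERASE_PRODUCTS.items():
        if product in products:
            return polymerase
    return None
```

For example, `polymerase_for("5S rRNA")` returns `"RNA polymerase III"`, while an RNA class not covered by the text returns `None`.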
RNA Polymerase II
Near the carboxy terminus of Pol II’s largest subunit lies a unique region, the carboxy-terminal domain (CTD). This region contains a stretch of 7 amino acids that is repeated between 26 and 52 times (the number of repeats differs between species). During transcription initiation, several amino acids in the repeat become phosphorylated.
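Counting the CTD repeats in a protein sequence can be sketched in a few lines of Python. The heptad consensus used here, YSPTSPS, is the commonly cited sequence but is not given in this article, so it is an assumption. Real CTDs also contain repeats that deviate from the consensus, which this exact-match sketch would not count. The function name is hypothetical.

```python
import re

# Assumed consensus heptad of the Pol II CTD (not stated in the text).
HEPTAD = "YSPTSPS"


def longest_heptad_run(protein):
    """Return the largest number of consecutive exact consensus heptads."""
    best = 0
    for match in re.finditer(f"(?:{HEPTAD})+", protein.upper()):
        best = max(best, len(match.group()) // len(HEPTAD))
    return best
```

Applied to a sequence containing 26 back-to-back consensus heptads, the function returns 26, the lower end of the range quoted above.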
A typical eukaryotic promoter is located from approximately -40 to +50 relative to the start site (+1). This is referred to as a “core promoter”, with additional regulatory sites (enhancers and silencers) located nearby or at a large distance. Within the core promoter several conserved sequences have been identified. The most prominent (though not found in all promoters) is the TATA motif (5'-TATAAA-3'), also called the TATA box or Goldberg-Hogness box. Located at approximately -30, this highly conserved sequence is easy to identify. However, some promoters lack the TATA box, and analysis of these promoters has revealed other, less conserved sequences. One is located around +1 and is called the Initiator element (Inr).
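Locating a TATA box relative to the start site can be sketched as follows. The motif and its approximate position come from the text; the search window of -40 to -20 and the function names are assumptions made for illustration. Promoter coordinates have no position 0, so the base immediately upstream of +1 is -1.

```python
TATA = "TATAAA"  # TATA motif as given in the text


def to_promoter_coord(index, tss_index):
    """Convert a 0-based string index to a promoter coordinate (no zero)."""
    offset = index - tss_index
    return offset + 1 if offset >= 0 else offset


def find_tata(seq, tss_index, window=(-40, -20)):
    """Return promoter coordinates of TATA motifs that start in `window`.

    `tss_index` is the 0-based index of the +1 base in `seq`.
    """
    seq = seq.upper()
    hits = []
    start = seq.find(TATA)
    while start != -1:
        pos = to_promoter_coord(start, tss_index)
        if window[0] <= pos <= window[1]:
            hits.append(pos)
        start = seq.find(TATA, start + 1)
    return hits
```

An empty result corresponds to the TATA-less promoters mentioned above, for which other elements such as the Inr would have to be examined.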
Although Pol II is very large and complex, it can neither recognize specific promoter sequences on its own nor catalyze all the steps required for transcription initiation. It requires the help of several factors known as general transcription factors (GTFs). The factors required at most promoters are TFIIA, TFIIB, TFIID, TFIIE, TFIIF and TFIIH.
TFIIE, TFIIF and TFIIH associate with RNA Polymerase II and another protein complex (the mediator) to form a holoenzyme. TFIIA and TFIIB are sometimes found together with the holoenzyme, and in other experiments they seem to bind independently. In any event, the holoenzyme is unable to recognize promoter sequences and depends on TFIID binding at the TATA box. TFIIA and TFIIB bind to the DNA adjacent to the TATA box and also interact with TFIID.
TFIID is composed of a single polypeptide known as the TATA binding protein or TBP, plus approximately 14 additional polypeptides known as TBP Associated Factors or TAFIIs. TBP specifically binds to the TATA box, causing a sharp bend in the DNA. It is this binding of TBP with its associated factors that begins the assembly of the remaining GTFs and the holoenzyme at the promoter.
TFIIE and TFIIH
After binding of the general transcription machinery to the promoter, TFIIE and TFIIH assist Pol II in unwinding the DNA at the start site of transcription. One of the subunits of TFIIH is a helicase responsible for the unwinding, and TFIIE binds to and stabilizes the single-stranded DNA.
Another subunit of TFIIH contains a kinase activity responsible for phosphorylating the CTD region of Pol II. Although this activity can be influenced by many factors, in simple systems it is believed to correlate with the initiation of transcription and the movement of RNA polymerase away from the promoter.
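The order of events at a Pol II promoter described in this section can be summarized as a small dependency table. This is a deliberately simplified sketch: the step names and function name are hypothetical, TFIIA and TFIIB are folded into the recruitment step because the text notes that their mode of binding varies, and treating promoter clearance as requiring both unwinding and CTD phosphorylation is an assumption based on the correlation described above.

```python
# Each step maps to the set of steps that must already be complete.
PREREQUISITES = {
    "tfiid_binds_tata": set(),
    "holoenzyme_recruited": {"tfiid_binds_tata"},
    "dna_unwound": {"holoenzyme_recruited"},         # TFIIH helicase, TFIIE
    "ctd_phosphorylated": {"holoenzyme_recruited"},  # TFIIH kinase
    "promoter_clearance": {"dna_unwound", "ctd_phosphorylated"},
}


def ready_steps(done):
    """Return, in alphabetical order, the steps that can happen next."""
    done = set(done)
    return sorted(
        step
        for step, needs in PREREQUISITES.items()
        if step not in done and needs <= done
    )
```

With nothing completed, only TFIID binding is possible; once the holoenzyme has been recruited, unwinding and CTD phosphorylation both become available.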
Summary of Transcription
Although eukaryotic transcription is much more complex than prokaryotic transcription, initiation involves the same basic steps in both:
- Promoter Recognition
- Unwinding DNA around +1
- Synthesis of the first few nucleotides of RNA
- Release of polymerase from promoter
In prokaryotes, termination occurs at discrete sites and translation can begin even before transcription has terminated. In eukaryotes, transcription continues past the site where poly-A addition occurs and then terminates randomly (frequently 500 bases or so downstream). RNA processing and splicing are completed in the nucleus before the mRNA is transported to the cytoplasm for translation.
Eukaryotic RNA Processing
- Stryer, Lubert. Biochemistry, 4th ed. New York: W. H. Freeman and Company, 1995
- Tjian, Robert. "Molecular Machines that Control Genes." Scientific American 272 (1995): 54–61.
- Lifton RP, Goldberg ML, Karp RW, Hogness DS. (1978). The organization of the histone genes in Drosophila melanogaster: functional and evolutionary implications. Cold Spring Harb Symp Quant Biol. 42, 1047-1051. PMID 98262