The contents of this lecture can be viewed on YouTube: Life Science Lectures for You
1. Contents of this lecture
This lecture covers the evolution of spliceosomal introns, which originated from eubacterial Group II introns. This article contains 28 figures.
In this lecture, I will explain:
- Intron transposition by enzymatic activities encoded in Group I and Group II introns.
- The relationship between prokaryotic Group II introns and the introns in the nuclear genome.
- The splicing reaction of prokaryotic Group I introns.
Key Words:
Group I, Group II, intron, spliceosomal, intronic ORF, eubacteria, archaea, chloroplast, mitochondria, ribozyme, snRNA, U1RNA, U6RNA, IBS, EBS, non-LTR retrotransposon, reverse transcription, retrohoming, intron transfer
2. Various introns in eukaryotic cell
I suppose you are all familiar with introns found in eukaryotic genes that are spliced out by the spliceosome.
However, in the genomes of true bacteria, mitochondria, and chloroplasts, there are entirely different types of introns.
Here are schematic diagrams of Group I introns and Group II introns. These introns, which have such three-dimensional structures, and encode proteins within themselves, are found in abundance in these genomes.
3. Spliceosomal intron : Introns within protein-coding genes in the nuclear genome
This figure illustrates an overview of splicing for protein-coding genes encoded in the nuclear genome of eukaryotes. Messenger RNA that has just been transcribed in the nucleus contains introns.
Messenger RNA in this state is called immature messenger RNA. The apparatus that removes these introns and joins the remaining exon regions is called the spliceosome. The spliceosome is a complex of five types of short RNAs called U1, U2, U4, U5, and U6, and more than 100 types of proteins.
Messenger RNA that has completed the splicing reaction is called mature messenger RNA. Mature messenger RNA is transported through nuclear pores to the cytoplasm, where it is translated.
4. Distribution of introns in bacteria
First, no bacterial genome contains introns that are spliced out by a spliceosome, but bacterial genomes do contain Group I and Group II introns.
Bacteria are classified into eubacteria and archaea. Archaea are well known as the host cells in an evolutionary event where they engulfed eubacteria, leading to symbiosis and eventually giving rise to eukaryotes.
Archaea have very few introns. Only rarely are Group II introns discovered in them. On the other hand, many Group I and Group II introns have been found in various eubacteria.
5. Group I, Group II-introns bear 3D structures
6. Three types of introns
Let’s summarize the three types of introns once again. Group I introns fold into a structure built from hairpins and are characterized by an open reading frame within the intron. This open reading frame encodes a homing enzyme, also called a homing endonuclease, which is a DNase that cuts specific sequences in the genomic DNA. Group I introns are widely found in the genomes of eubacteria, chloroplasts, and mitochondria. In eukaryotic nuclear genomes, they are found only as exceptions, within ribosomal RNA genes.
Group II introns have a larger and more complex tertiary structure than Group I introns. Like Group I introns, they also contain an open reading frame within the intron. This open reading frame encodes reverse transcriptase and endonuclease. Regarding their distribution, they are widely found in eubacteria and very rarely in archaea. They are also numerous in chloroplast and mitochondrial genomes.
On the other hand, spliceosomal introns, which are removed by the spliceosome, do not contain open reading frames within the intron. Additionally, the intron portion does not have a tertiary structure. They are embedded within protein-coding genes in eukaryotic genomes.
These are the general characteristics of the three types of introns.
7. Comparison of intron frequencies within the mitochondrial genome
As I explained, there are Group I and Group II introns in the mitochondrial genome. Let’s examine this in more detail. We find that there is a significant bias in the frequency of these introns among different groups of organisms.
Regarding multicellular animals, one Group I intron has been exceptionally found in sea anemones, but there are no reports of either Group I or Group II introns in the mitochondria of other multicellular animals.
In protozoans, while Group I and Group II introns are not very frequent, they have been reported. Similarly, for algal mitochondria, there are reports of Group I and Group II introns, but their frequency is not particularly high. In contrast, in fungi, both Group I and Group II introns are frequently discovered.
8. Fifteen Group I introns within the mitochondrial cox1 gene of the fungus Podospora anserina
This is the cox1 gene of the fungus Podospora anserina. The white areas with hatching indicate Group I introns.
There are a total of 15 Group I introns within this cox1 gene. Additionally, there is one Group II intron, which is shown in gray.
On the other hand, the red areas indicate exons. The total length of the exons in the cox1 gene is 1.6 kilobases, whereas the total length of the introns is 22.9 kilobases. As in this example, numerous Group I and Group II introns have been discovered in the mitochondria of fungi.
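To make the proportions concrete, here is a small Python sketch of the arithmetic. It uses only the two lengths quoted above and assumes, for simplicity, that the gene consists of nothing but these exons and introns.

```python
# Lengths quoted in the lecture for the Podospora anserina cox1 gene (kilobases).
EXON_KB = 1.6
INTRON_KB = 22.9


def intron_fraction(exon_kb: float, intron_kb: float) -> float:
    """Return the fraction of the total gene length occupied by introns."""
    return intron_kb / (exon_kb + intron_kb)


# Assumption: the gene length is simply exons plus introns.
gene_kb = EXON_KB + INTRON_KB
fraction = intron_fraction(EXON_KB, INTRON_KB)

print(f"Gene length: {gene_kb:.1f} kb")
print(f"Intron share of the gene: {fraction:.1%}")
print(f"Intron-to-exon length ratio: {INTRON_KB / EXON_KB:.1f}")
```

Under this assumption, introns make up well over nine tenths of the gene, which is why this cox1 gene is such a striking example.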
9. Expression of the ORF encoded in intronic region
10. Splicing reaction mechanism in Group II and Spliceosomal intron
On the other hand, this is the splicing reaction of Group II introns. The 2’-OH oxygen of an adenosine within the intron acts as a nucleophile, attacking the exon-intron boundary. This cleaves the exon-intron boundary, and simultaneously forms a lariat structure. The upstream 3’-OH then attacks the downstream exon-intron boundary in a nucleophilic reaction, joining the two exons and simultaneously ejecting the intron with its lariat structure.
In other words, Group II introns and spliceosomal introns are excised through exactly the same chemical reactions. The difference is that in Group II introns, the intron portion forms a tertiary structure and functions as a ribozyme, whereas in spliceosomal introns, the intron portion does not form a tertiary structure and does not function as a ribozyme. Therefore, a spliceosome, an apparatus with RNA cleavage and ligation activity, is essential.
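The two reaction steps described above can be sketched as a toy model in Python. This is only an illustration of the order of events, not a chemical simulation: the sequences are hypothetical placeholders, the function name `splice_group2` is invented for this sketch, and the 2’-5’ branch of the lariat is recorded only as an index.

```python
from dataclasses import dataclass


@dataclass
class SplicingResult:
    mrna: str          # the two exons joined together
    lariat: str        # excised intron sequence (the branch itself is not modelled)
    branch_index: int  # position of the branch-point adenosine within the intron


def splice_group2(exon1: str, intron: str, exon2: str, branch_index: int) -> SplicingResult:
    """Toy model of the two splicing steps shared by Group II and spliceosomal introns."""
    if intron[branch_index] != "A":
        raise ValueError("the branch point must be an adenosine")

    # Step 1: the 2'-OH of the branch-point adenosine attacks the upstream
    # exon-intron boundary. Exon 1 is released with a free 3'-OH, and the
    # intron (still joined to exon 2) forms the lariat intermediate.
    free_exon1 = exon1
    lariat_intermediate = intron + exon2

    # Step 2: the 3'-OH of exon 1 attacks the downstream exon-intron boundary,
    # joining the exons and releasing the intron lariat.
    lariat = lariat_intermediate[: len(intron)]
    released_exon2 = lariat_intermediate[len(intron):]
    return SplicingResult(free_exon1 + released_exon2, lariat, branch_index)


# Hypothetical sequences chosen only for illustration.
result = splice_group2("AUGGCU", "GUGCGGCUACAU", "GGUUAA", branch_index=8)
print("mature mRNA:", result.mrna)
print("excised intron:", result.lariat)
```

The same function describes both intron types because, as noted above, the chemistry is the same; what differs is whether the intron itself or the spliceosome provides the catalytic structure.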
11. Comparison of the Group I and Group II splicing reactions
Now, using this diagram, we will compare the splicing reactions of Group I introns and Group II introns.
In the case of Group I introns, the oxygen atom of the 3′-OH of a GTP bound to the intron nucleophilically attacks and cleaves the upstream exon-intron boundary. Next, the 3′-OH oxygen atom at the newly freed end of the upstream exon nucleophilically attacks the downstream exon-intron boundary. As a result, the intron is excised and ejected while the two exons are joined together.
This reaction proceeds due to the ribozyme activity that arises from the Group I intron forming a three-dimensional structure. In this way, the excision reaction of Group I introns differs from the excision reactions of Group II introns and spliceosomal introns.
12. Structural similarity between the spliceosomal small nuclear RNAs and Group II introns
Next, let’s focus on structural similarity between the three-dimensional structure of spliceosomal small nuclear RNAs and that formed in Group II introns.
13. Spliceosome
The spliceosome is a complex composed of five types of small RNAs called U1, U2, U4, U5, and U6, along with approximately 50 to 100 different proteins.
The small RNAs are about 150 nucleotides in length and are rich in uracil bases, which is why they are given the prefix ‘U’. The number of proteins varies depending on the species. This is an overview of the spliceosome.
14. Structure of Group II intron
On the other hand, here is a two-dimensional schematic of Group II intron RNA, which has six arms, or domains, named D1 to D6. This is a diagram of the three-dimensional structure formed by these arms.
15. Catalytic active centre of spliceosome
We have discussed that in Group II introns, if the RNA folds correctly, it can act as a ribozyme that can excise itself and join exons.
It has been discovered that small nuclear RNAs within the spliceosome can also function as ribozymes, when they fold correctly and associate with each other. This figure depicts a complex formed by the association of U2 small nuclear RNA and U6 small nuclear RNA, binding to the adenosine located at the branch point of the lariat structure to be excised. As shown in this figure, in the spliceosome, splicing reactions occur when small nuclear RNAs form complexes and attach to messenger RNA.
Although the spliceosome is a complex of proteins and RNA, the active site for the splicing reaction is on the RNA side, not on the protein side. Proteins are thought to primarily serve the purpose of helping small nuclear RNAs form the correct three-dimensional structure and positioning them correctly.
16. Maturase coded in Group II intron
As I mentioned earlier, small nuclear RNAs within the spliceosome require protein assistance to fold correctly. The same is true for the three-dimensional structures of Group II introns.
The protein used for this purpose is the product of the intronic open reading frame (ORF). This figure shows the activities of a protein encoded within the intronic ORF of a Group II intron. This protein has several different activities, each associated with a specific region. One of these activities is called the maturase activity, which is encoded in the X-domain.
As the name ‘maturase’ suggests, it has an activity that helps something mature. Specifically, the maturase has the activity to mature the three-dimensional structure of the intronic RNA into its correct form. Thus, the intron carries within its own intronic ORF the protein necessary for maturing its own three-dimensional structure.
17. Structural similarities of the core reaction center
Group II introns, when correctly folded, possess ribozyme activity and proceed with splicing. Similarly, small nuclear RNAs within the spliceosome can function as ribozymes that carry out splicing reactions when properly folded. Notably, the chemical reactions for intron excision in these two systems are identical.
While these two types of introns already share remarkably similar properties, recent findings have revealed structural similarities between their intronic RNAs as well.
This figure shows domains 5 and 6 of the Group II intron. In domain 6, there is an adenine base that corresponds to the knot of the lariat structure. On the other hand, this image depicts the U6-small nuclear RNA and U2-small nuclear RNA complex bound to the exon-intron boundary of the messenger RNA to be spliced in the spliceosome. Here too, an adenine base corresponding to the lariat structure’s knot is present. You can see that the body structures of both RNAs are very similar.
18. Birth of spliceosomal Introns from Group II Intron
The theory that spliceosomal introns originated from Group II introns can be explained as follows: As is widely known, eukaryotes emerged when an archaeon engulfed and established a symbiotic relationship with a bacterium. In present-day archaea, Group II introns are extremely rare, and Group I introns have not been found at all. In contrast, eubacteria that were engulfed by archaea have been found to contain numerous Group II and Group I introns.
It is also known that in early eukaryotes, many genes from the engulfed eubacterium were transferred to the archaeal nuclear genome. This gene transfer would naturally include the transfer of Group I and Group II introns. If we consider that the Group II introns transferred to the archaeal nuclear genome eventually evolved into the spliceosomal introns we see today, we can better understand the distribution of Group II introns across the biological world, and the similarities in the splicing mechanisms between the two types of introns.
This explanation effectively accounts for the origin of spliceosomal introns from Group II introns, considering the evolutionary history of eukaryotes, and the distribution and characteristics of different intron types. However, Group I introns are currently found only in ribosomal RNA genes of eukaryotes. The reason for this limited distribution of Group I introns in eukaryotic genes remains unexplained.
19. Retrohoming of Group II intron: Translocation to homologous genes without intron
Next, I will explain the intron transposition reaction, in which a spliced out Group II intron is inserted into a specific site with a particular nucleotide sequence in a genome through a reverse reaction of intron splicing.
In many cases, such specific nucleotide sequences are found in homologous genes that do not contain the intron. This transposition reaction is called ‘Retrohoming’, because the intron RNA returns to the site corresponding to its original position and is reverse-transcribed there. Retrohoming is a reaction that does not occur with spliceosomal introns.
20. Reverse splicing to target DNA is retrohoming of Group II intron
I will now explain the retrohoming of Group II introns. In this figure, this is a gene containing a Group II intron, and here is its homologous gene without the intron.
First, messenger RNA is produced from the gene containing the intron, and the Group II intron is spliced out from this immature messenger RNA. The excised intron RNA is then inserted into the intron-less gene, at the position corresponding to the exon-intron boundary in the intron-containing gene, through a reverse of the splicing reaction.
The site where the intron is newly inserted is the same as where the original intron was located, hence it’s called ‘homing’. Additionally, because this reaction involves reverse transcriptase, the prefix ‘retro’ is added, and the process is termed ‘retrohoming’. I will explain this reaction in more detail.
21. Retrohoming reaction of Group II intron
Group II intron retrohoming is a complex process involving several steps. The retrohoming process occurs as follows:
This is a spliced-out Group II intron with a lariat structure. The protein that assisted the splicing is still attached to the spliced-out intronic RNA. In addition to its RNA-folding activity, this protein has reverse transcriptase activity and endonuclease activity. The target DNA sequence for reverse splicing is formed by the linked IBS1 and IBS2 sequences. I will explain these sequences later. What is characteristic of the reverse-splicing reaction is the insertion of intronic RNA into the target DNA.
In this reaction, first, an endonuclease cuts one strand of the target DNA. Then, the intronic RNA is ligated to the cut site. Following this reaction, the other DNA strand is cleaved. Then, cDNA synthesis of the intronic RNA begins at the 3’ end of the cleaved DNA. When the initial cDNA synthesis is complete, the intronic RNA is degraded, and the gap in the DNA strand is filled in, so that the intronic RNA sequence is eventually converted into double-stranded DNA.
In this reaction, the endonuclease activity and reverse transcriptase activity are provided by the protein attached to the spliced-out Group II intron.
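The net outcome of retrohoming can be sketched in Python as a string operation. This models only the end result (the intron sequence appearing as DNA in the intron-less allele), not the individual cleavage and cDNA synthesis steps. The sequences and the function name `retrohome` are hypothetical, and the sketch assumes the target is a contiguous IBS2-IBS1 stretch with the insertion point immediately downstream of IBS1.

```python
def retrohome(intronless_dna: str, ibs2: str, ibs1: str, intron_dna: str) -> str:
    """Return the allele after the intron has been inserted at its target site.

    Assumption of this sketch: the target site is the contiguous IBS2+IBS1
    sequence, and the intron is inserted directly after IBS1.
    """
    target = ibs2 + ibs1
    site = intronless_dna.find(target)
    if site == -1:
        raise ValueError("no IBS2-IBS1 target site in this allele")
    insertion_point = site + len(target)
    # After reverse splicing, cDNA synthesis, RNA degradation and gap filling,
    # the intron is present as double-stranded DNA at this position.
    return intronless_dna[:insertion_point] + intron_dna + intronless_dna[insertion_point:]


# Hypothetical sequences; the intron is written in lowercase to make it visible.
intronless_allele = "AAACCCGGGTTT"
homed_allele = retrohome(intronless_allele, ibs2="CCC", ibs1="GGG", intron_dna="gtgcgac")
print(homed_allele)
```

An allele without the target sequence raises an error in this sketch, which mirrors the point that insertion depends on a specific nucleotide sequence at the target site.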
22. Enzyme activities coded in Domain IV-ORF Group II intron
This is an immature messenger RNA containing a Group II intron. In the upstream exon region, there is an RNA sequence where IBS1 and IBS2 (intron-binding sites) are connected. On the other hand, in the intron part, there is EBS1 (exon-binding site 1), which has a sequence complementary to IBS1, and EBS2, which has a sequence complementary to IBS2.
For the excision of Group II introns, it is important that stable hydrogen bonds are formed between these RNA sequences. This refers to the hydrogen bond formation between IBS1 and EBS1, and between IBS2 and EBS2. In contrast, during the reverse-splicing reaction, EBS1 and EBS2, which are RNA sequences of the excised Group II intron, need to form hydrogen bonds with DNA sequences homologous to IBS1 and IBS2, respectively.
In the splicing reaction, the intron-exon boundary was determined by RNA-RNA hydrogen bonding, whereas in reverse-splicing, the insertion site of the intron is determined by RNA-DNA hydrogen bonding.
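The difference between RNA-RNA and RNA-DNA recognition can be illustrated with a short Python check for complementarity. This is a simplified sketch: it considers only Watson-Crick pairs (G-U wobble pairs are ignored), all sequences are written 5’ to 3’, and the example sequences and the function name `pairs_antiparallel` are hypothetical.

```python
# Partner base for each RNA base when the other strand is RNA or DNA.
RNA_PARTNER = {"A": "U", "U": "A", "G": "C", "C": "G"}
DNA_PARTNER = {"A": "T", "U": "A", "G": "C", "C": "G"}


def pairs_antiparallel(ebs_rna: str, target: str, target_is_dna: bool) -> bool:
    """Check whether an EBS (RNA) is fully complementary to an IBS-like target.

    Both sequences are written 5'->3', so the target is read in reverse to
    model antiparallel pairing. Only Watson-Crick pairs are accepted.
    """
    if len(ebs_rna) != len(target):
        return False
    partner = DNA_PARTNER if target_is_dna else RNA_PARTNER
    return all(partner[e] == t for e, t in zip(ebs_rna, reversed(target)))


# Hypothetical example sequences.
ebs1 = "CGACAC"      # in the intron RNA
ibs1_rna = "GUGUCG"  # in the upstream exon of the immature messenger RNA
ibs1_dna = "GTGTCG"  # the homologous sequence in the target DNA

print("splicing (RNA-RNA):", pairs_antiparallel(ebs1, ibs1_rna, target_is_dna=False))
print("reverse splicing (RNA-DNA):", pairs_antiparallel(ebs1, ibs1_dna, target_is_dna=True))
```

The same EBS sequence recognizes both targets; only the pairing partner of A changes (U in RNA, T in DNA).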
23. Reverse transcriptase and Endonuclease coded in an intronic ORF
Group II intron reverse-splicing requires activities such as reverse transcriptase and endonuclease.
Let’s now explain where these enzymatic activities are encoded.
24. Enzyme activities coded in Domain IV-ORF Group II intron
These are encoded in the intronic Open Reading Frame located in domain 4 of the Group II intron.
This single protein has reverse transcriptase activity, endonuclease activity, and DNA-binding activity. The intronic ORF is translated before intron splicing occurs, and the resulting protein is used for both the splicing and reverse-splicing processes.
25. Selfish-genetic elements
Group II introns are considered selfish genetic elements, as they increase their own copy number through reverse splicing. There are other selfish genetic elements in the genome that use similar mechanisms to multiply their own sequences.
One such example is the Long Interspersed Nuclear Element, called LINE for short, which belongs to the non-LTR (non-long terminal repeat) retrotransposon class. L1, a type of LINE found in the human genome, contains open reading frames that encode an endonuclease and a reverse transcriptase. The proteins produced from these open reading frames attach to the messenger RNA that encoded them, forming a complex. This complex then randomly attacks genomic DNA. The endonuclease first cleaves one strand of the double-stranded DNA, and reverse transcription of the LINE begins from this cleavage point. Eventually, the LINE messenger RNA is converted into double-stranded DNA and integrated into the genomic DNA. Through this process, LINEs can increase their own copies at random locations within the genome.
Interestingly, the reverse transcriptase possessed by LINE shows high homology with the reverse transcriptase of Group II introns. Molecules that have the ability to reverse transcribe their own messenger RNA into double-stranded DNA exhibit characteristics of selfish genetic elements.
26. Homing reaction in Group I intron
27. Homing enzyme as a tool for genetic engineering
Homing enzymes with such long recognition sequences are now used as tools in genetic engineering. The example shown here is a type of homing enzyme named I-SceI. The recognition sequence of this homing enzyme is 18 base pairs long.
This enzyme is encoded in the open reading frame of a Group I intron in the mitochondria of the yeast Saccharomyces cerevisiae. Since 4 to the power of 18 is about 68.7 billion, a simple probability calculation suggests that such a sequence would appear only once in about 68.7 billion base pairs.
Considering that the human genome is about 3 billion base pairs long, this demonstrates that it is an enzyme with extremely high specificity. Therefore, it is used in cases where there is a need to fragment the genome into very large pieces.
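The simple probability calculation can be written out in Python. The sketch assumes that the four bases are equally frequent and independent at every position, and it uses the round figure of 3 billion base pairs for the human genome, so the result is only an order-of-magnitude estimate.

```python
RECOGNITION_LENGTH = 18          # base pairs recognized by I-SceI
HUMAN_GENOME_BP = 3_000_000_000  # approximate size used in the lecture


def expected_sites(recognition_length: int, genome_bp: int) -> float:
    """Expected number of occurrences of one specific sequence in a random genome.

    Assumption: each of the four bases is equally likely and positions are
    independent, so one specific sequence of length n occurs with probability 1 / 4**n.
    """
    return genome_bp / 4 ** recognition_length


possible_sites = 4 ** RECOGNITION_LENGTH
expected_hits = expected_sites(RECOGNITION_LENGTH, HUMAN_GENOME_BP)

print(f"4^18 = {possible_sites:,}")
print(f"Expected sites in the human genome: {expected_hits:.3f}")
```

An expected count far below one means that, under this simple model, a genome of this size would usually contain no such site by chance.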
28. Coexistence of intron-containing and intronless homologous genes : Bacterial conjugation
Group II intron retrohoming and Group I intron homing require the coexistence of an intron-containing gene and an intronless homologous gene within the same cell. However, in many bacteria, each gene typically exists as a single copy, and there is no situation where intron-containing and intronless genes are simultaneously encoded in the genome. Let’s explore how the coexistence of intron-containing genes and intronless homologous genes can occur.
In bacteria, conjugation is a well-known process. Conjugation refers to the sharing and recombination of genomic DNA between bacteria of the same species through mating. Even in bacteria, there is a form of sex, where DNA can be transferred from a male bacterium to a female bacterium. The female and male bacteria connect through a sex pilus. DNA sharing begins from the male bacterium to the female bacterium through this sex pilus. In this way, a situation can arise where the intronless gene from the male bacterium and the intron-containing gene from the female bacterium coexist within the same cell.
In eukaryotes, during fertilization, not only do the nuclei fuse, but mitochondria and chloroplasts also merge. If the male’s mitochondrial DNA genes are intronless while the female’s mitochondrial DNA genes contain introns, a situation can occur where intronless genes and intron-containing genes coexist. Under such circumstances, retrohoming or homing can take place.
This concludes the explanation of this lecture.