Nature Reviews Genetics | Analysis
Key points
- Recent advances in next-generation sequencing methods and quantitative mass spectrometry have renewed interest in RNA biology and in the genome-wide investigation of post-transcriptional gene regulatory proteins. A global census that systematically lists the factors involved in post-transcriptional gene regulation (PTGR) is not currently available. Here, we summarize the proteins that interact with all classes of RNAs, based on current knowledge of PTGR; this summary will guide future systems-wide studies of PTGR.
- RNA-binding proteins (RBPs) are evolutionarily deeply conserved, and their structural domains diversified early in evolution.
- RBPs are among the most abundant proteins in the cell and are generally ubiquitously expressed, which mirrors their central and conserved role in gene regulation.
- Only ~2% of RBPs are tissue-specific, and most of these are mRNA- and non-coding RNA-binding proteins.
- Diseases involving RBPs show characteristic phenotypes depending on the type of RNA (for example, mRNA, ribosomal RNA and tRNA) predominantly bound by the RBPs.
- Correlated expression of RBPs across developmental processes can identify factors in shared PTGR pathways.
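The last key point, that correlated expression can flag factors acting in a shared pathway, can be illustrated with a minimal sketch. It is not the analysis performed in this article: the gene names (`RBP_A`, `RBP_B`, `RBP_C`), the expression values, and the 0.9 cutoff are hypothetical and chosen only to show the idea of pairing factors by Pearson correlation across developmental stages.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length profiles."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        return 0.0  # a flat profile carries no correlation signal
    return cov / (sd_x * sd_y)


def correlated_pairs(profiles, threshold=0.9):
    """Return (gene1, gene2, r) for every pair with r >= threshold."""
    pairs = []
    for (g1, p1), (g2, p2) in combinations(sorted(profiles.items()), 2):
        r = pearson(p1, p2)
        if r >= threshold:
            pairs.append((g1, g2, r))
    return pairs


# Hypothetical expression values across four developmental stages
# (toy numbers, not measurements from any data set).
profiles = {
    "RBP_A": [1.0, 2.0, 4.0, 8.0],
    "RBP_B": [2.0, 4.0, 8.0, 16.0],
    "RBP_C": [8.0, 4.0, 2.0, 1.0],
}

candidates = correlated_pairs(profiles)
```

In this toy input only `RBP_A` and `RBP_B` rise together, so they are the only pair reported; `RBP_C` follows the opposite trend and is excluded. Real analyses would typically use normalized expression data and correct for multiple testing.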
Introduction
Figure 1: Overview of the main post-transcriptional gene regulation pathways in eukaryotes.
An overview is given for the biogenesis, decay and function of the most abundant RNAs: tRNAs, ribosomal RNAs (rRNAs), small nuclear RNAs (snRNAs), small nucleolar RNAs (snoRNAs), mRNAs, microRNAs (miRNAs), PIWI-interacting RNAs (piRNAs) and long non-coding RNAs (lncRNAs). Processes are described from left to right. Referenced gene names and complexes in the figure are listed in Supplementary information S3 (table) and within the listed references.
a | tRNAs are transcribed by RNA polymerase III (Pol III); the 5′ leader and 3′ trailer sequences are removed, introns are spliced, and the ends are joined. CCA nucleotides are added to 3′ ends, and nucleotide modifications, such as methylation (M), pseudouridylation (ψ) and deamination of adenosines to inosines (I), are introduced before tRNA aminoacylation (ref. 195).
b | The 5S rRNA is transcribed by Pol III, whereas 28S, 18S and 5.8S rRNAs are transcribed as one transcript by Pol I. The precursor is processed by RNA exonucleases, endonucleases and the ribonucleoprotein (RNP) RNase MRP, guided by U3 small nucleolar RNP (snoRNP). Nucleotide modifications are introduced by snoRNPs. rRNAs are assembled together with ribosomal proteins into ribosomal precursor complexes in the nucleus and transported to the cytoplasm, where they mature to functional ribosomes (refs 92, 196, 197).
c | Most snRNAs are transcribed by Pol II, capped and processed in the nucleus. When exported to the cytoplasm, they undergo methylation and assemble with LSM proteins into small nuclear ribonucleic particles (snRNPs) in a process aided by the survival motor neuron 1 (SMN1) protein. These snRNPs are re-imported into the Cajal body (CB) within the nucleus, where they undergo final maturation and snRNP assembly (ref. 81). U6 and U6atac snRNAs are transcribed by Pol III and are alternatively processed in the nucleus and the nucleolus (ref. 198). Mature snRNPs form the core of the spliceosome.
d | snoRNAs and small Cajal body-specific RNAs (scaRNAs) are processed from mRNA introns, capped and modified before they assemble into snoRNPs or scaRNPs in the CB. snoRNPs and scaRNPs carry out methylation and pseudouridylation of rRNAs, snoRNAs and snRNAs, or function in rRNA processing (for example, processing of U3 snoRNA) (ref. 81).
e | mRNAs are transcribed by Pol II, capped, spliced, edited and polyadenylated in the nucleus. Correctly matured mRNAs are exported into the cytoplasm. Regulatory RNA-binding proteins (RBPs) control correct translation, monitor stability, decay and localization, and shuttle mRNAs between actively translating ribosomes, stress granules and P bodies (refs 37, 141, 142, 199, 200, 201, 202).
f | miRNAs are either transcribed from separate genes by Pol II as long primary miRNA (pri-miRNA) transcripts or expressed from mRNA introns (mirtrons) and processed into hairpin pre-miRNAs in the nucleus. After transport into the cytoplasm, they are processed into 21-nucleotide-long double-stranded RNAs. One strand is incorporated into Argonaute (AGO) proteins (forming miRNA-containing RNPs (miRNPs)) and guides them to partially complementary target mRNAs to recruit deadenylases and repress translation (ref. 203).
g | piRNAs are ~28-nucleotide-long, germline-specific small RNAs. Primary piRNAs are directly processed and assembled from long, Pol II-transcribed precursor transcripts, whereas secondary piRNAs are generated in the 'ping pong' cycle by the cleavage of complementary transcripts by PIWI proteins. Mature piRNAs are 2′-O-methylated and incorporated into PIWI proteins. The piRNA–PIWI complexes (piRNPs) silence transposable elements (TEs) either by endonucleolytic cleavage in the cytoplasm or through transcriptional silencing at their genomic loci in the nucleus (ref. 107).
h | Most lncRNAs are transcribed and processed in a similar way to mRNAs. Nuclear lncRNAs play an active part in gene regulation by directing proteins to specific gene loci, where they recruit chromatin modification complexes and induce transcriptional silencing or activation (ref. 185). Other non-coding RNAs (for example, 7SK RNA) regulate transcription elongation rates (ref. 204) or induce the formation of paraspeckles (PS) (ref. 205). Cytoplasmic non-coding RNAs can modulate mRNA translation (ref. 206).
i | Incorrectly processed RNAs are recognized by several complexes in the nucleus and cytoplasm that initiate and execute their degradation (refs 207, 208).
CPSF, cleavage and polyadenylation specificity factor; EJC, exon junction complex; hnRNP, heterogeneous nuclear RNP; NGD, no-go decay; NMD, nonsense-mediated RNA decay; NSD, non-stop decay; PABP, poly(A)-binding protein.