UTRs of mRNAs: crucial checkpoints of gene regulation
Authors: Subodh Kumar Sinha and Basavaprabhu L. Patil
ICAR-National Research Centre on Plant Biotechnology, Pusa, New Delhi-110012


Because the cell must tightly regulate its use of energy, the majority of the genome is devoted to gene regulation and only a small fraction of it actually encodes proteins. For instance, only about 1.5% of human genetic material codes for proteins. The regulatory elements of a genome act either at the transcriptional level, controlling whether a transcript is synthesized and, if so, to what extent, or at the post-transcriptional level, controlling the fate of the transcript, including its stability, translational efficiency and sub-cellular localization. Transcriptional control is mediated by transcription factors and RNA polymerase acting in association with cis-acting elements such as promoters, enhancers and silencers, and it results in a pre-mRNA molecule. The pre-mRNA undergoes several processing steps to become a mature mRNA, which the translational machinery can then use to produce a specific protein. A mature mRNA typically has three regions: the 5’ untranslated region (5’UTR), the coding region, made up of triplet codons that each specify an amino acid of the protein, and the 3’ untranslated region (3’UTR). The two UTRs play crucial roles in regulating the expression of their cognate transcript post-transcriptionally. Unlike DNA-based regulation, UTR-mediated regulation relies on both the primary sequence and the secondary structure of the UTRs, which interact with RNA-binding proteins and with other RNA molecules. In this article we briefly discuss the multi-faceted roles of UTRs as significant checkpoints that control how a transcript is finally expressed as protein.

General structural features of UTRs

Studies show that 5’ and 3’UTRs both vary in length, with 3’UTRs being longer and more variable than 5’UTRs. Within a species, the length of either UTR can range from about a dozen nucleotides to a few thousand. The rice genome is about three times the size of the Arabidopsis genome; nevertheless, UTRs make up almost the same proportion of the genome in both plants, although rice UTRs are longer than those of Arabidopsis. Genes located in large GC-rich regions of a chromosome (GC-rich isochores) have shorter 5’ and 3’UTRs than genes located in GC-poor isochores. A similar correlation has been shown for coding sequences and introns. The genomic regions corresponding to UTRs may also contain introns, which are more prevalent in 5’UTRs than in 3’UTRs. Alternative UTRs can be formed from the same gene when different transcription initiation sites, polyadenylation sites or splice donor/acceptor sites are used. The Shine-Dalgarno sequence of prokaryotes (AGGAGGU; 3-10 nucleotides upstream of the initiation codon) and the Kozak consensus sequence of eukaryotes (ACCAUGG; contains the initiation codon) are essential elements associated with the 5’UTR. The base composition of the two UTRs also differs: the G+C content of 5’UTRs is greater than that of 3’UTRs. In addition, UTRs are known to contain several types of repeats, including SINEs (short interspersed elements), LINEs (long interspersed elements), minisatellites and microsatellites.
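
As a minimal sketch of the composition comparison described above, the following Python snippet splits a transcript into its three regions and computes the G+C fraction of each UTR. The function names (split_mrna, gc_content), the CDS coordinates and the toy sequence are illustrative assumptions, not data from the studies referred to in this article.

```python
def split_mrna(mrna: str, cds_start: int, cds_end: int):
    """Split an mRNA into (5'UTR, CDS, 3'UTR).

    cds_start and cds_end are 0-based, end-exclusive coordinates of the
    coding region (hypothetical inputs, e.g. taken from an annotation file).
    """
    if not 0 <= cds_start < cds_end <= len(mrna):
        raise ValueError("CDS coordinates must lie within the transcript")
    return mrna[:cds_start], mrna[cds_start:cds_end], mrna[cds_end:]


def gc_content(seq: str) -> float:
    """Return the G+C fraction of a sequence (0.0 for an empty sequence)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


# Toy transcript (made up for illustration): 5'UTR + CDS + 3'UTR
toy_mrna = "GGCGCCACC" + "AUGGCUUAA" + "AUUUAUUUAAUAAA"
five_utr, cds, three_utr = split_mrna(toy_mrna, cds_start=9, cds_end=18)

print(f"5'UTR G+C: {gc_content(five_utr):.2f}")
print(f"3'UTR G+C: {gc_content(three_utr):.2f}")
```

On real data, the same two functions could be applied to every annotated transcript and the per-UTR values averaged before comparing 5’ and 3’UTRs.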

UTR-mediated gene regulation

UTRs mediate post-transcriptional control of gene expression in several ways, including modulation of the export of mRNAs from the nucleus, their translational efficiency, their stability and their sub-cellular localization.

The abundance of an mRNA does not always correlate with the abundance of its encoded protein, which indicates that different mRNAs are translated at different rates. In general, 5’UTRs that are longer than average, that contain upstream initiation codons or upstream open reading frames (uORFs; ORFs lying within the 5’UTR of the mRNA), or that form stable secondary structures are associated with reduced translation efficiency. The sequence flanking an AUG codon also matters, because it influences whether the scanning 40S ribosomal subunit recognizes that AUG as an initiation codon. The sequence context of the first AUG, in particular that of a uORF, can therefore modulate the efficiency of translation initiation. After a uORF has been translated, the 40S subunit may remain bound to the mRNA, resume scanning and reinitiate translation at a downstream AUG codon, or it may leave the mRNA altogether and thereby impair translation of the main ORF. In addition, a 5’UTR with a very stable secondary structure (∆G below -50 kcal/mol) involving the AUG stalls the migrating 40S subunit more often than one with a moderately stable secondary structure, and so decreases the efficiency of translation.
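
The uORF and start-codon context ideas above can be sketched in Python as follows. This is a simplified illustration: find_uorfs and kozak_context are hypothetical helper names, only uORFs that terminate inside the 5’UTR are reported, and the strong/adequate/weak labels follow a commonly used rule of thumb (a purine at position -3 and a G at position +4, as in ACCAUGG) rather than anything specified in this article.

```python
STOP_CODONS = {"UAA", "UAG", "UGA"}


def find_uorfs(five_utr: str):
    """Return (start, end) pairs, 0-based and end-exclusive, for every
    upstream AUG that is followed by an in-frame stop codon inside the 5'UTR."""
    utr = five_utr.upper().replace("T", "U")
    uorfs = []
    for start in range(len(utr) - 2):
        if utr[start:start + 3] != "AUG":
            continue
        # Walk codon by codon from the AUG until the first in-frame stop.
        for pos in range(start + 3, len(utr) - 2, 3):
            if utr[pos:pos + 3] in STOP_CODONS:
                uorfs.append((start, pos + 3))
                break
    return uorfs


def kozak_context(seq: str, aug_pos: int) -> str:
    """Classify the context of the AUG starting at aug_pos.

    Assumed rule of thumb: purine (A/G) at -3 and G at +4 -> 'strong',
    only one of the two -> 'adequate', neither -> 'weak'.
    """
    seq = seq.upper().replace("T", "U")
    minus3 = seq[aug_pos - 3] if aug_pos >= 3 else ""
    plus4 = seq[aug_pos + 3] if aug_pos + 3 < len(seq) else ""
    hits = (minus3 in ("A", "G")) + (plus4 == "G")
    return ("weak", "adequate", "strong")[hits]


# Toy 5'UTR (made up for illustration) containing a single short uORF
toy_five_utr = "GGAUGGCCUAAGGCACC"
for start, end in find_uorfs(toy_five_utr):
    print(start, end, toy_five_utr[start:end], kozak_context(toy_five_utr, start))
```

Estimating the stability (∆G) of 5’UTR secondary structure is deliberately left out of this sketch, since it requires a dedicated RNA folding tool.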

The turnover of an mRNA directly affects the abundance of the corresponding protein. An mRNA can be degraded following shortening or removal of the poly(A) tail at the 3’ end and/or removal of the m7G cap at the 5’ end. Cis-acting elements located in the 3’UTR, such as AU-rich elements (AREs), play a very important role in mRNA turnover: they respond to various intra- and extra-cellular signals by controlling cytoplasmic deadenylation of the mRNA. Degradation by endonuclease activity, independent of deadenylation and decapping, has also been reported, for example for the mRNA encoding the transferrin receptor, a protein that mediates iron uptake by the cell; the sequence motif required for this degradation is located in the 3’UTR. Upstream AUGs and uORFs are also involved in mRNA decay through the nonsense-mediated mRNA decay (NMD) pathway. If the stop codon of a uORF lies upstream of an exon junction complex (EJC), it can be recognized as a premature termination codon (PTC) and activate degradation of the mRNA through NMD. Termination at such a codon leaves a long stretch of RNA downstream, which in effect acts as a long 3’UTR. As a result, the eukaryotic release factors eRF1 and eRF3 can no longer interact with the complex formed by the cap-binding proteins and the cytoplasmic poly(A)-binding protein 1 (PABPC1); they interact instead with NMD factors or other proteins, which finally triggers translational repression and mRNA decay.
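
As a rough illustration of how a cis-acting element can be located in a 3’UTR, the sketch below lists every occurrence of the AUUUA pentamer, the core motif commonly associated with AREs. The pentamer, the function name find_are_pentamers and the toy sequence are assumptions introduced here for illustration; real ARE annotation relies on more elaborate motif classes.

```python
import re

# Core pentamer commonly associated with AREs (an assumption of this sketch,
# not a motif given in the article).
ARE_CORE = "AUUUA"


def find_are_pentamers(three_utr: str):
    """Return 0-based start positions of all AUUUA pentamers, including
    overlapping ones, in a 3'UTR sequence (DNA input is converted to RNA)."""
    utr = three_utr.upper().replace("T", "U")
    # A zero-width lookahead lets overlapping matches such as AUUUAUUUA be found.
    return [m.start() for m in re.finditer(f"(?={ARE_CORE})", utr)]


toy_three_utr = "CCAUUUAUUUAGG"  # made-up sequence for illustration
positions = find_are_pentamers(toy_three_utr)
print(f"{len(positions)} AUUUA pentamers at positions {positions}")
```

A count of such motifs is only a crude indicator; whether a given 3’UTR actually promotes deadenylation depends on the proteins that bind it.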

The asymmetric localization of some mRNAs within the cell can lead to asymmetric localization of the corresponding proteins. Such mRNAs are often localized as ribonucleoprotein complexes, together with proteins of the translation machinery. In these cases the UTRs, most commonly the 3’UTR, play an important role in localizing the mRNA. For instance, a 21-nucleotide sequence in the 3’UTR of the myelin basic protein mRNA is required for its transport and localization. Specific cis-acting elements in the 3’UTR have also been reported to mediate location-specific degradation and stabilization of mRNA in the cell. Similarly, UTR-mediated localization of mRNA through diffusion and entrapment is well documented, e.g. for Bicoid mRNA in Drosophila.

In conclusion, the UTRs of mRNAs perform diverse roles in controlling gene expression. Further characterization of the sequence motifs present in UTRs will help to decipher their precise roles in the regulation of gene expression.

References:

1. Srivastava et al. (2017). Trends Plant Sci. (In Press) doi.org/10.1016/j.tplants.2017.11.003

2. Mignone et al. (2002). Genome Biol. 3(3): reviews0004


About Author / Additional Info:
I work as Sr. Scientist at ICAR-National Research Centre on Plant Biotechnology, Pusa Campus, New Delhi