The transcriptome is the collection of all RNA transcribed by an individual cell or a population of cells in a given biological state. Transcriptome studies focus on mRNAs, which encode proteins, and non-coding RNAs (ncRNAs), which act as cellular regulators. Transcriptomic studies interpret gene function and gene structure from a holistic view, revealing the molecular mechanisms of specific biological processes and of disease occurrence. Transcriptome sequencing can also discover unknown and rare transcripts, accurately identify alternative splicing sites and cSNPs (coding-sequence single nucleotide polymorphisms), and provide more comprehensive transcriptional information.
Transcriptome sequencing uses high-throughput sequencing technologies to capture almost all transcripts of a specific tissue or cell type in a given state through comprehensive and rapid cDNA sequencing, and it serves as the basis and starting point for gene expression research. The main objectives of transcriptome research include cataloguing all transcriptional products, determining the transcriptional structure of genes (such as transcription start sites, 5′ and 3′ ends, splicing patterns, and post-transcriptional modifications), and quantifying changes in the expression levels of individual transcripts during development. This technology has been widely used in biological research, medical and clinical research, and drug development.
Advantages of Transcriptome Sequencing
RNA-Seq has significant advantages over other transcriptomic technologies. First, RNA-Seq is highly sensitive and can detect and quantify almost all transcripts in a cell or cell population, including rare transcripts present at only a few copies. Its detection range spans 6 orders of magnitude. Second, RNA-Seq resolves each transcript at single-nucleotide resolution, without the cross-hybridization and fluorescence background noise that affect gene microarray technologies. Third, RNA-Seq can profile the transcriptome of any species, because it does not require probes designed against known sequences. With all these advantages, RNA-Seq is widely used in transcriptome studies.
A Comparison Between RNA-Seq and Other Transcriptomic Technologies
Technology | Gene Microarray | SAGE / MPSS | RNA-Seq
Principle | Hybridization | Sanger sequencing | High-throughput sequencing
Resolution | A few to 100 base pairs | Single nucleotide | Single nucleotide
Throughput | High | Low | High
Background Signal | High | Low | Low
Cost | High | High | Relatively low
RNA Amount Required | Large | Large | Small
Applications of Transcriptome Sequencing
Study of Transcript Structure and Variant Discovery
RNA-Seq can greatly enrich many aspects of gene annotation, including the determination of exon/intron boundaries, verification or amendment of previously annotated 5′ and 3′ gene boundaries, identification of ORFs contained within transcripts, identification of UTRs and of new transcribed regions, cSNP profiling, detection of alternative splicing, and fusion gene identification. This sheds light on the complexity of transcription and provides an extensively revised annotation of the transcriptome of interest.
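As a small illustration of what determining exon/intron boundaries means in coordinate terms, the sketch below derives intron intervals from the exons of a single transcript. It assumes 1-based, inclusive coordinates on one strand; the function name `introns_from_exons` and the example coordinates are hypothetical and chosen only for illustration. It is not how an RNA-Seq aligner or annotation pipeline finds the exons in the first place.

```python
def introns_from_exons(exons):
    """Return intron intervals (1-based, inclusive) between consecutive exons.

    `exons` is a list of (start, end) tuples for one transcript.
    Hypothetical helper for illustration only.
    """
    ordered = sorted(exons)  # order exons by start coordinate
    introns = []
    for (_, prev_end), (next_start, _) in zip(ordered, ordered[1:]):
        # An intron exists only if there is a gap between adjacent exons.
        if next_start > prev_end + 1:
            introns.append((prev_end + 1, next_start - 1))
    return introns


# Made-up exon coordinates for a three-exon transcript.
exons = [(301, 450), (100, 200), (600, 700)]
introns = introns_from_exons(exons)
print(introns)
```

Reads that span one of these gaps (split reads) are the kind of evidence that lets sequencing data confirm or correct an annotated boundary.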
Study of Gene Expression Levels
Gene microarray technologies have difficulty detecting low-abundance targets and small changes in gene expression, both of which are critical for studying biological responses to stimulation or environmental alterations. Because RNA-Seq can quantitatively capture the dynamic changes of transcriptomes across different tissues or states, it can determine RNA expression levels more accurately than microarrays. RNA-Seq makes it possible to determine the absolute number of each transcript in individual cells and allows direct comparison among different assays.
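To make the idea of quantifying expression from read counts concrete, here is a minimal sketch of one common normalization, transcripts per million (TPM). TPM is not named in the text above and is used here as an assumption; it yields relative abundances rather than absolute copy numbers. The function name `tpm`, the gene names, and the counts are all hypothetical.

```python
def tpm(counts, lengths_bp):
    """Compute transcripts per million from raw read counts.

    counts:     dict mapping transcript ID -> mapped read count
    lengths_bp: dict mapping transcript ID -> transcript length in base pairs
    Hypothetical helper for illustration only.
    """
    # Step 1: normalize each count by transcript length in kilobases.
    rpk = {t: counts[t] / (lengths_bp[t] / 1000) for t in counts}
    total = sum(rpk.values())
    if total == 0:
        raise ValueError("no reads mapped to any transcript")
    # Step 2: rescale so that all values sum to one million.
    return {t: value / total * 1e6 for t, value in rpk.items()}


# Made-up example: geneB has more reads but is four times longer.
counts = {"geneA": 100, "geneB": 200, "geneC": 0}
lengths = {"geneA": 1000, "geneB": 4000, "geneC": 2000}
values = tpm(counts, lengths)
for gene, value in values.items():
    print(gene, round(value, 1))
```

Dividing by length before rescaling means a long transcript is not scored as more highly expressed simply because it produces more fragments, so geneA ranks above geneB here despite having fewer reads.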
Functions of Non-Coding RNAs
An important field of transcriptomics research is the discovery and analysis of ncRNAs. High-throughput sequencing has revealed that at least 93% of the human genome is transcribed into RNA; less than 2% of the genome encodes proteins, and the remaining 91% is transcribed into non-protein-coding RNAs (ncRNAs). ncRNAs can be divided into housekeeping ncRNAs and regulatory ncRNAs. The former, which mainly include tRNA, rRNA, snRNA, and snoRNA, are usually stably expressed and perform functions that are essential to cell physiology. The latter mainly include lncRNAs and small ncRNAs (represented by microRNAs), which regulate gene expression at the epigenetic, transcriptional, and post-transcriptional levels.