Next-generation sequencing (NGS), also known as high-throughput sequencing, evolved from PCR and gene microarray technologies. It introduced reversible terminators, which allow sequencing to proceed concurrently with synthesis: the DNA sequence is determined by detecting the specifically labeled bases as they are added during DNA replication. Roche introduced the first next-generation sequencer, the Roche 454, in 2005, marking the start of the high-throughput sequencing era. Illumina has since become the most popular sequencing platform.
This technology is of great importance in life science and pharmaceutical research because it rapidly generates large amounts of data from relatively short reads. The NGS workflow begins with constructing a library from the sample DNA, followed by sequencing on the instrument and data analysis. Library construction is a pivotal stage that largely determines the success of an NGS study: the quality of the input DNA and of the library products strongly influences library conversion rate, sequencing depth, complexity, and uniformity. Stringent quality control is therefore essential throughout the process.
DNA library construction encompasses the preparatory steps applied to DNA/RNA samples before sequencing. Raw nucleic acid samples cannot be sequenced directly; only processed samples meet the sequencing platform's requirements. Preparation includes attaching the required adapters (also called junctions) to the ends of the DNA fragments, which is a prerequisite for sequencing. When the amount of sample is insufficient, PCR amplification is used to reach the quantity the instrument requires. DNA library construction forms the foundation of next-generation sequencing library technology.
For more on quality control during sequencing and analysis, see our article: Quality Control in NGS Library Preparation Workflow.
The three main steps of next-generation sequencing: library preparation and amplification, sequencing, and data analysis. (Selvakumar et al., 2022)
DNA library construction lies at the core of next-generation sequencing library technology. Its key step is adding adapters (junctions) to the ends of the fragments to be sequenced. Several construction methods exist, distinguished by how the adapters are attached:
TA Cloning Junction Library Construction
TA cloning junction library construction is currently the most common method. Adapters carrying a 3' T overhang are synthesized in advance, the fragmented sample DNA is end-repaired and given a 3' A tail, and DNA ligase then joins the two through T/A pairing.
Swift Method
The Swift library construction method is similar to the TA cloning junction method in that it adds P5 and P7 junction sequences to the two ends of the fragment. After end repair, the P7 junction is first ligated to the 3' end, and the P5 junction is then ligated to the 5' end. PCR amplification then increases the library yield.
Transposase Library Construction
Transposase library construction is built on the Tn5 transposon, a DNA element that encodes its own transposase. Traditional library construction requires DNA fragmentation, end repair, adapter ligation, library amplification, and multiple purification steps. Using Tn5 streamlines this: the transposase fragments the DNA and attaches adapter sequences in a single reaction.
The in vitro Tn5 transposition reaction used for library construction requires four components: the transposon terminal sequences, the target DNA, the transposase (Tnp), and Mg2+ as an activator.
PCR Amplicon Library Construction
The amplicon library construction method uses PCR to add junctions to the ends of the target fragments, so only two rounds of PCR and purification are needed to obtain the library. In the first round, primers that carry a universal sequence in addition to the target-specific sequence amplify the target region. In the second round, the sequencing junctions are attached through a further PCR reaction.
Flat-End Junction Library Construction
The flat-end (blunt-end) adapter ligation method attaches specific adapters to the ends of fragmented DNA. Its steps are DNA sample fragmentation, end repair, adapter ligation, size selection of the DNA fragments, PCR enrichment of the library, and purification of the PCR product. Libraries built this way are typically sequenced on semiconductor platforms such as Ion Torrent, where each base incorporation causes a transient drop in pH in the microenvironment that is detected and recorded by a pH sensor to produce the read.
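To illustrate how pH signals can become base calls, the sketch below decodes a toy flowgram. It assumes that nucleotides are flowed over the chip in a fixed repeating order and that the signal in each flow is roughly proportional to the number of identical bases incorporated. The flow order `TACG`, the signal values, and the function name `decode_flowgram` are illustrative assumptions, not instrument specifications.

```python
def decode_flowgram(signals, flow_order="TACG"):
    """Convert per-flow signal intensities into a base sequence.

    Each signal is rounded to the nearest integer and interpreted as the
    number of bases incorporated during that nucleotide flow.
    """
    bases = []
    for i, signal in enumerate(signals):
        nucleotide = flow_order[i % len(flow_order)]
        count = max(0, round(signal))  # a signal near 0 means no incorporation
        bases.append(nucleotide * count)
    return "".join(bases)


# Toy flowgram (assumed values): T x1, A x0, C x2, G x1, T x0, A x3
toy_signals = [1.04, 0.08, 1.93, 0.97, 0.12, 3.1]
print(decode_flowgram(toy_signals))
```

Real base callers also correct for signal decay and phasing; the rounding step here only conveys the principle that longer homopolymers give proportionally larger pH changes.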
Please read our article: Illumina Next-Generation Sequencing (NGS): Principles and Workflow for more information.
Fragmentation
Sequencing instruments have limited read lengths, so extracted DNA cannot be sequenced directly and must first be cut into fragments of suitable length. Common approaches are mechanical methods, such as ultrasonic fragmentation, and enzymatic digestion. Mechanical methods tend to involve greater sample loss and a more complex workflow, whereas enzymatic methods, particularly those using the Tn5 transposase, are preferred for their low cost and simplicity. The Tn5 transposon, originally identified in E. coli, comprises IS50 sequences, Outside End (OE) sequences, Inside End (IE) sequences, and drug resistance genes.
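As a rough illustration of fragmentation followed by size selection, the sketch below cuts a toy sequence into consecutive pieces of random length and keeps those inside a size window. The normal length distribution, the 200-400 bp window, and the function names are assumptions chosen for the example; real fragment size profiles depend on the instrument or enzyme used.

```python
import random


def fragment_dna(sequence, mean_len=300, sd=50, seed=0):
    """Cut a sequence into consecutive fragments of random length."""
    rng = random.Random(seed)
    fragments = []
    pos = 0
    while pos < len(sequence):
        length = max(1, int(rng.gauss(mean_len, sd)))
        fragments.append(sequence[pos:pos + length])
        pos += length
    return fragments


def size_select(fragments, min_len=200, max_len=400):
    """Keep only fragments whose length falls inside the chosen window."""
    return [f for f in fragments if min_len <= len(f) <= max_len]


rng = random.Random(1)
genome = "".join(rng.choice("ACGT") for _ in range(10_000))  # toy 10 kb molecule
fragments = fragment_dna(genome)
selected = size_select(fragments)
print(len(fragments), "fragments,", len(selected), "within 200-400 bp")
```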
End Repair/Addition of A-tail
After fragmentation, DNA ends may be blunt or ragged. In this step the ends are repaired and a single A base is added to the 3' end, creating a sticky end to which the junction can attach.
Fragments from the previous step, which carry 5' or 3' overhangs or blunt ends, undergo end repair so that all ends become blunt. For TA ligation, the 5' ends must be phosphorylated and an "A" added to the 3' ends so that they pair with junctions carrying "T" overhangs. These reactions are carried out together by T4 DNA polymerase (blunting), T4 polynucleotide kinase (5' phosphorylation), and Taq DNA polymerase (A-tailing).
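The order of operations in this step can be sketched as a small state model. The `Fragment` class, its field names, and the three functions are hypothetical and exist only to show that A-tailing presupposes blunt ends and that TA ligation needs both the 3' A and the 5' phosphate.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Fragment:
    insert: str
    end_type: str = "ragged"            # "ragged", "blunt", or "A-tailed"
    five_prime_phosphate: bool = False


def end_repair(frag):
    """Blunt the ends and phosphorylate the 5' ends."""
    return replace(frag, end_type="blunt", five_prime_phosphate=True)


def a_tail(frag):
    """Add a single 3' A; only valid on blunt, repaired fragments."""
    if frag.end_type != "blunt":
        raise ValueError("A-tailing requires blunt ends; run end_repair first")
    return replace(frag, end_type="A-tailed")


def can_ligate_to_t_adapter(frag):
    """TA ligation needs a 3' A overhang and a 5' phosphate on the fragment."""
    return frag.end_type == "A-tailed" and frag.five_prime_phosphate


raw = Fragment(insert="ACGTACGTAC")
ready = a_tail(end_repair(raw))
print(can_ligate_to_t_adapter(raw), can_ligate_to_t_adapter(ready))
```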
DNA library preparation using a transposase-based method (Nextera) developed by Illumina. (Head et al., 2014)
Junctions
A-tailed DNA fragments carry a protruding 3' A that pairs with the T overhang of the junction. The purpose of adding junctions is to attach library tags and the oligonucleotide sequences that complement the sequencing platform to the ends of the fragmented DNA.
Junctions are central to the library, and the Y-type junctions of the Illumina platform are widely used in sequencing. They contain P5/P7, Index, and Rd1/Rd2 SP sequences. The P5/P7 sequences pair with the oligonucleotides on the flow cell, anchoring the fragments so that bridge amplification can take place. The Index distinguishes samples within a pooled library sequenced in the same run, and Rd1/Rd2 SP are the primer binding regions for Read 1 and Read 2 sequencing. Junction ligation typically uses T4 DNA ligase, which seals nicks in double-stranded DNA by joining adjacent nucleotides. During ligation, the "T" overhang of the junction pairs with the "A" overhang of the fragment, and the ligase joins them into a complete double strand.
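The arrangement of these elements on a finished library molecule can be sketched as follows. The element order is the commonly described layout for a dual-indexed library and should be treated as an assumption here; all sequences are placeholders invented for the example and are not real platform oligonucleotides.

```python
# Placeholder sequences for illustration only; NOT real Illumina oligos.
ELEMENTS = {
    "P5": "AAAAACCCCC",
    "Rd1_SP": "GGGGGTTTTT",
    "Rd2_SP": "CCCCCAAAAA",
    "P7": "TTTTTGGGGG",
}


def build_library_molecule(insert, i5_index, i7_index, elements=ELEMENTS):
    """Return the (name, sequence) parts of one library strand, 5' to 3'."""
    return [
        ("P5", elements["P5"]),          # anchors the molecule on the flow cell
        ("i5", i5_index),                # sample index
        ("Rd1_SP", elements["Rd1_SP"]),  # Read 1 primer binding region
        ("insert", insert),              # the fragment being sequenced
        ("Rd2_SP", elements["Rd2_SP"]),  # Read 2 primer binding region
        ("i7", i7_index),                # sample index
        ("P7", elements["P7"]),          # anchors the molecule on the flow cell
    ]


parts = build_library_molecule("ACGTTGCAACGT", i5_index="AGCTAGCT", i7_index="TCGATCGA")
molecule = "".join(seq for _, seq in parts)
for name, seq in parts:
    print(f"{name:>7}: {seq}")
print("total length:", len(molecule))
```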
PCR Amplification
Because junctions have already been added, the library can be amplified directly with primers complementary to the junction sequences. The arms of the "Y" junctions are not complementary to each other, so this amplification step is needed before sequencing to produce molecules with complete junction sequences at both ends. To sequence multiple samples in one run, indexes (barcodes) can be added at this stage to tell the samples apart. PCR thus both labels each library for later analysis and introduces the P5/P7 oligonucleotide sequences complementary to the sequencer at both ends.
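How many PCR cycles a library needs can be estimated with a simple exponential model. The sketch below assumes a constant per-cycle efficiency, so that the yield after n cycles is input x (1 + efficiency)^n; the 0.9 default and the function name `cycles_needed` are illustrative assumptions, and real reactions plateau in later cycles.

```python
import math


def cycles_needed(input_ng, target_ng, efficiency=0.9):
    """Smallest cycle count n with input * (1 + efficiency)**n >= target."""
    if input_ng <= 0 or target_ng <= 0:
        raise ValueError("amounts must be positive")
    if not 0 < efficiency <= 1:
        raise ValueError("efficiency must be in (0, 1]")
    if target_ng <= input_ng:
        return 0  # enough material already; no amplification required
    return math.ceil(math.log(target_ng / input_ng) / math.log(1 + efficiency))


# Toy numbers: 1 ng of ligated library, 500 ng wanted
print(cycles_needed(input_ng=1, target_ng=500))
```

Under these assumptions the toy case needs 10 cycles. Keeping the cycle number as low as the target yield allows is a common way to limit amplification bias and duplicates.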
First Breakthrough: Direct Reading
Second Breakthrough: Automation
Third Breakthrough: Scale-Up
Fourth Breakthrough: Massively Parallel Sequencing
This represents a substantial leap forward: DNA is sequenced in a massively parallel manner, so that many fragments are analyzed rapidly and simultaneously. A key consequence of massively parallel sequencing is the precipitous drop in sequencing costs.
1. What are the primary steps involved in DNA library construction?
The main steps of DNA library construction are fragmentation, end repair and A-tailing, junction (adapter) ligation, and PCR amplification.
Note: Bead purification steps are omitted here.
Constructing a library from RNA samples requires one extra step because of the nature of RNA: the RNA must first be reverse-transcribed into complementary DNA (cDNA), after which the library construction process above applies. The fundamental principle remains the same.
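At the sequence level, reverse transcription can be pictured as writing the DNA complement of the RNA template. The sketch below is a string-level illustration only; the function name `first_strand_cdna` is hypothetical, and the enzymatic details (priming, second-strand synthesis) are not modeled.

```python
RNA_TO_DNA_COMPLEMENT = {"A": "T", "U": "A", "G": "C", "C": "G"}


def first_strand_cdna(rna):
    """Return the first-strand cDNA, written 5' to 3', for an RNA template."""
    rna = rna.upper()
    try:
        # Complement each base, reading the template backwards so that the
        # result is reported in the conventional 5' to 3' direction.
        return "".join(RNA_TO_DNA_COMPLEMENT[base] for base in reversed(rna))
    except KeyError as err:
        raise ValueError(f"unexpected RNA base: {err.args[0]}") from None


print(first_strand_cdna("AUGGCCUAA"))
```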
2. Why is it necessary to fragment gDNA samples prior to NGS library preparation?
Illumina sequencers typically read fragments within the range of 50-600 base pairs, although this range can vary depending on the sequencing chips and reagents used. Conversely, most intact human genomic DNA (gDNA) samples exceed 10 kilobases (kb) in length. Therefore, breaking down these large fragments into smaller pieces is essential for successful library construction. This fragmentation prerequisite holds true for the MGI platform as well.
Table 1 Recommended Sequencing Read Lengths for Different Applications on Illumina Sequencing Platforms
DNA Sequencing | |
Sequencing Application | Recommended Read Length |
Whole Genome Sequencing | 2 x 150 bp |
Whole Exome Sequencing | 2 x 150 bp |
Targeted Capture Sequencing | 2 x 150 bp |
Amplicon Sequencing | The whole amplicon insert length |
De novo Sequencing | 2 x 150 - 2 x 300 bp |
RNA Sequencing | |
Sequencing Application | Recommended Read Length |
Transcriptome Analysis | 2 x 75 bp |
Gene Expression Profiling | 1 x 50 bp |
Small RNA Sequencing | 1 x 50 bp |
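Read length also determines how many reads are needed to reach a given sequencing depth. The sketch below uses the standard relation depth = total bases sequenced / genome size; the genome size and depth in the example are illustrative values, and the function name `read_pairs_needed` is an assumption.

```python
import math


def read_pairs_needed(genome_size_bp, depth, read_length_bp, paired=True):
    """Estimate reads (or read pairs) required for a target average depth."""
    bases_per_unit = read_length_bp * (2 if paired else 1)
    return math.ceil(genome_size_bp * depth / bases_per_unit)


# Toy example: a 3.1 Gb genome at 30x depth with 2 x 150 bp reads
print(read_pairs_needed(3_100_000_000, depth=30, read_length_bp=150))
```

This is an average over the whole genome; actual coverage at any one position varies with library uniformity and duplication.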
3. Why is it essential to incorporate an adapter following DNA fragmentation? What significance does it hold?
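A complete adapter comprises three components: universal primers (P5 and P7), which anchor the library to the sequencing chip; indexes (i7/i5), which distinguish samples in a pooled library; and sequencing primers (SP1 and SP2), which provide the binding regions for Read 1 and Read 2. Without these sequences, a fragment can be neither anchored on the sequencer nor read.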
Note: While there are multiple methods to add a complete junction to a sample, the resulting library's junction sequence remains consistent. We intend to dedicate an article to elucidate the diverse structures and connections of junctions in the future.
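The role of the index can be illustrated with a minimal demultiplexing sketch that assigns pooled reads to samples by their index sequence. The index sequences, sample names, and the one-mismatch tolerance are hypothetical choices for the example, not settings of any particular software.

```python
def hamming(a, b):
    """Number of mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def demultiplex(reads, sample_indexes, max_mismatches=1):
    """Assign (index_seq, read_seq) pairs to samples by index sequence.

    Reads whose index matches no sample, or more than one sample, within
    the mismatch limit are placed in "undetermined".
    """
    bins = {name: [] for name in sample_indexes}
    bins["undetermined"] = []
    for index_seq, read_seq in reads:
        hits = [
            name
            for name, idx in sample_indexes.items()
            if len(idx) == len(index_seq) and hamming(idx, index_seq) <= max_mismatches
        ]
        target = hits[0] if len(hits) == 1 else "undetermined"
        bins[target].append(read_seq)
    return bins


# Hypothetical 6-nt indexes and toy reads
samples = {"sample_A": "ACGTAC", "sample_B": "TGCATG"}
reads = [
    ("ACGTAC", "read1"),  # exact match to sample_A
    ("TGCATC", "read2"),  # one mismatch to sample_B
    ("GGGGGG", "read3"),  # matches neither index
]
result = demultiplex(reads, samples)
print({name: len(rs) for name, rs in result.items()})
```

Allowing a mismatch only works when the indexes in a pool differ from each other at several positions, which is why index sets are designed with a minimum pairwise distance.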
4. Why do PCR and PCR-free libraries exist, and which is more prevalent?
PCR is used to increase the amount of library. When the starting input is insufficient, PCR amplification is needed to reach the library quantity required for sequencing. With ample starting material, junction ligation and purification can be followed directly by sequencing, without PCR amplification.
However, PCR libraries are more widespread due to several reasons:
5. Why do different kits have different requirements for sample input amount?
The input amount a kit can handle depends mainly on the sensitivity, specificity, and stability of its enzymes. Kits designed for low-input samples (e.g., nasopharyngeal swab samples for COVID-19 testing) require higher-performing enzymes, which is reflected in higher prices. Such kits are also often covered by proprietary patents and offer distinct advantages, which further raises their cost. Please contact our technical team for more information.
References: