The rapid decline of global biodiversity has spurred growing attention to its preservation. Conservation genetics is a crucial tool for safeguarding endangered species and has enriched our understanding of many aspects of conservation biology. Nonetheless, several key questions in conservation biology, such as the evolutionary history of at-risk plants, the causes and mechanisms of endangerment, and the mechanisms of adaptive evolution, still require deeper investigation.
In recent years, the integration of high-throughput sequencing technology with conservation genetics has given rise to conservation genomics. This emerging field offers new techniques and perspectives for addressing these questions in greater depth. One prominent technique in conservation genomics, whole-genome resequencing, has made notable progress in the study of endangered plant phylogeny and population genetics. It is used to examine genomic diversity, the evolutionary history of populations, adaptive evolution, and inbreeding depression. These studies have informed the taxonomic classification of endangered species and the delineation of conservation units, and they have shed light on the species' evolutionary history, the causes of their endangerment, and their history of adaptive evolution.
Classical conservation genetics studies have relied on methods such as allelic and microsatellite genotyping, as well as mitochondrial DNA sequencing, to yield valuable insights into natural populations. However, these approaches provide data from only a limited number of genetic markers. In the 21st century, the rapid advancement of sequencing technologies, particularly next-generation sequencing (NGS) and long-read sequencing, has paved the way for conservation genomics. At present, the prevailing sequencing approaches in conservation genomics fall into two main categories: reduced representation genome sequencing (RRGS) and whole-genome sequencing.
Reduced representation genome sequencing (RRGS), also known as partial genome sequencing, sequences only a subset of the genome and thereby substantially reduces its complexity. As a result, it lowers both sequencing costs and computational demands. The approach offers several advantages: cost-effectiveness, stable results, simpler library construction, shorter experimental periods, a large yield of single nucleotide polymorphisms (SNPs), and no requirement for a reference genome. Consequently, it is widely used in conservation work on endangered plant and animal species.
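To make the idea of reduced complexity concrete, the following is a minimal sketch of an in silico restriction digest. It assumes one common form of reduced representation, in which only the DNA flanking restriction-enzyme cut sites is sequenced. The toy random genome, the function names, and the 100 bp flank length are illustrative assumptions; GAATTC is the EcoRI recognition motif. This is not a description of any specific published protocol.

```python
import random


def cut_sites(genome: str, motif: str) -> list[int]:
    """Return the start position of every occurrence of the recognition motif."""
    sites = []
    pos = genome.find(motif)
    while pos != -1:
        sites.append(pos)
        pos = genome.find(motif, pos + 1)
    return sites


def sampled_fraction(genome: str, motif: str, flank: int) -> float:
    """Fraction of bases within `flank` bp of a motif occurrence (motif included)."""
    if not genome:
        return 0.0
    covered = [False] * len(genome)
    for site in cut_sites(genome, motif):
        start = max(0, site - flank)
        end = min(len(genome), site + len(motif) + flank)
        for i in range(start, end):
            covered[i] = True
    return sum(covered) / len(genome)


if __name__ == "__main__":
    random.seed(1)
    # Hypothetical toy genome: 200 kb of uniformly random bases (an assumption).
    toy_genome = "".join(random.choice("ACGT") for _ in range(200_000))
    n_sites = len(cut_sites(toy_genome, "GAATTC"))
    frac = sampled_fraction(toy_genome, "GAATTC", flank=100)
    print(f"{n_sites} cut sites; {frac:.1%} of bases fall in sequenced flanks")
```

Because only the flanking regions are sequenced, the fraction reported here is a rough proxy for how much of the genome such a library would sample; real genomes are not random, so actual site counts differ.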
RRGS encompasses various techniques, such as restriction site-associated DNA sequencing (RAD-seq), RNA sequencing (RNA-seq), and whole-exome sequencing (WES). These methods share a common characteristic: they typically examine only a fraction of the genome. Because coverage is inherently incomplete and some genotypes are missing, data acquired through RRGS pose challenges for subsequent population genetic analyses. In contrast, whole-genome resequencing, which relies on a reference genome, substantially increases both the number and the quality of the genetic markers obtained compared with reduced-representation methods.
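The missing-data problem can be illustrated with a small sketch of a per-site call-rate filter, a common preprocessing step before population genetic analysis. The genotype encoding (0, 1, 2 for the number of alternate alleles, `None` for a missing call), the function names, and the 0.8 default threshold are assumptions chosen for illustration, not values taken from the studies discussed here.

```python
from typing import Optional

# Assumed encoding: 0, 1, 2 = number of alternate alleles; None = missing call.
Genotype = Optional[int]


def call_rate(site: list[Genotype]) -> float:
    """Fraction of individuals with a non-missing genotype at one site."""
    if not site:
        return 0.0
    called = sum(g is not None for g in site)
    return called / len(site)


def filter_sites(
    matrix: list[list[Genotype]], min_call_rate: float = 0.8
) -> list[list[Genotype]]:
    """Keep only sites (rows) whose call rate meets the threshold."""
    return [site for site in matrix if call_rate(site) >= min_call_rate]


if __name__ == "__main__":
    # Hypothetical toy matrix: 3 sites (rows) by 4 individuals (columns).
    toy = [
        [0, 1, 2, None],
        [0, None, None, 1],
        [2, 2, 1, 0],
    ]
    kept = filter_sites(toy, min_call_rate=0.75)
    print(f"kept {len(kept)} of {len(toy)} sites")
```

A stricter threshold removes more sites, which is one reason sparse RRGS data can leave few usable markers for downstream analyses.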
Whole-genome sequencing encompasses two primary categories: de novo whole-genome sequencing and whole-genome resequencing. De novo sequencing involves assembling an entirely new genome sequence from scratch. The difficulty and success of this assembly depend on factors such as genome size and complexity, available computational resources, and bioinformatics expertise. Currently, de novo whole-genome sequencing relies predominantly on third-generation sequencing technologies. These include single-molecule real-time (SMRT) sequencing and High Fidelity (HiFi) reads from Pacific Biosciences, as well as nanopore sequencing from Oxford Nanopore Technologies (ONT). Following sequencing, Hi-C (high-throughput chromosome conformation capture) data are used to scaffold the assembled sequences to the chromosome level.
The objective of whole-genome resequencing, on the other hand, is to analyze genomic variation across individuals and populations. Sequencing technology is used to generate large numbers of short reads, which are then aligned to a reference genome. In this way, population-level single nucleotide polymorphism (SNP) data are obtained, and subsequent population genetic analyses are based on these SNP data.
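As an illustration of the kind of population genetic analysis that can follow, the sketch below computes allele frequency, observed and expected heterozygosity, and a simple inbreeding coefficient from biallelic SNP genotypes. The 0/1/2 genotype encoding, the function names, and the toy data are assumptions; the estimator F = 1 - Ho/He is used here without any small-sample correction, so it should be read as a simplified example rather than as the method of any particular study.

```python
def allele_frequency(genotypes: list[int]) -> float:
    """Alternate-allele frequency from diploid genotypes coded 0, 1, or 2."""
    return sum(genotypes) / (2 * len(genotypes))


def expected_heterozygosity(genotypes: list[int]) -> float:
    """Expected heterozygosity under Hardy-Weinberg proportions: 2p(1 - p)."""
    p = allele_frequency(genotypes)
    return 2 * p * (1 - p)


def observed_heterozygosity(genotypes: list[int]) -> float:
    """Fraction of individuals that are heterozygous (genotype code 1)."""
    return sum(g == 1 for g in genotypes) / len(genotypes)


def inbreeding_coefficient(sites: list[list[int]]) -> float:
    """F = 1 - Ho/He, with Ho and He summed across sites (no bias correction)."""
    ho = sum(observed_heterozygosity(s) for s in sites)
    he = sum(expected_heterozygosity(s) for s in sites)
    return 1 - ho / he if he > 0 else float("nan")


if __name__ == "__main__":
    # Hypothetical toy data: 2 SNP sites genotyped in 4 diploid individuals.
    toy_sites = [
        [0, 1, 1, 2],  # heterozygotes at the Hardy-Weinberg expectation
        [0, 0, 2, 2],  # no heterozygotes despite p = 0.5
    ]
    print(f"F = {inbreeding_coefficient(toy_sites):.2f}")
```

For this toy input the script prints F = 0.50, reflecting the deficit of heterozygotes at the second site; with dense resequencing data the same quantities would be averaged over many thousands of sites.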