De novo sequencing refers to sequencing the genome of a species without any reference genome information, and using bioinformatics analysis methods to splice and assemble the genome sequence map of the species to advance the subsequent research of the species.
CD Genomics has the professional technical team with extensive experience in experimental operations and bioinformatics analysis. We provide accurate, efficient, comprehensive and complete species characterization to ensure reliable experimental results.We provide de novo sequencing of plants and animals, mainly for plant and animal species with unknown genome sequences or poorly assembled reference genomes, by constructing different types of genomic DNA libraries and performing sequence determination and bioinformatics analysis to help our clients map the complete genome sequence information of the species.
Sample type | Amount | Purity |
---|---|---|
Genomic DNA | ≥ 200 ng | A 260/280=1.8-2.0 no degradation, no contamination |
Genomic DNA (PCR free non-350bp) |
≥ 5 μg | |
Genomic DNA (PCR free -350bp) |
≥ 1.2 μg | |
HMW genomic DNA | ≥ 15 μg (Concentration ≥ 70 ng/μL) |
A 260/280=1.8-2.0 A 260/230=1.5-2.6 NC/QC=1.00-2.20 Fragments should be ≥ 30 kb |
From obtaining the sample genomic DNA, checking the quality of the sample genomic DNA, to constructing a library of genomic DNA that meets the requirements, and dividing the library into small fragment library and large fragment library according to the fragment size, and checking the library quality of the constructed library, to analyzing the sequencing data, data quality control, and bioinformatics analysis. Each step is carefully controlled and scientifically designed to ensure the correct results.
Fig 1. De novo sequencing workflow
Sequencing Plan | Sequencing Platforms | Sequencing Strategy | Deliverables | Bioinformatics Analysis |
---|---|---|---|---|
Simple genome | Long-read sequencing | 20 kb long-read sequencing (80X) + Hi-C | Contig N50 ≥ 1Mb |
|
Long-read sequencing + NGS | Long-read sequencing (40X) + NGS (PE, 100X) | Contig N50 ≥ 500Kb | ||
Complex genome | Long-read sequencing | 20 kb long-read sequencing (100X) + Hi-C | - | |
Long-read sequencing + NGS | Long-read sequencing (40X) + NGS (PE, 100X) | Contig N50 ≥ 200Kb | ||
Mammals and Birds | Long-read sequencing | 20 kb long-read sequencing (80X) | Contig N50 ≥ 3Mb | |
Long-read sequencing + NGS | Long-read sequencing (40X) + NGS (PE, 100X) | Contig N50 ≥ 1Mb |
CD Genomics offers a full suite of state-of-the-art sequencing equipment and software, high-quality sequencing reagents and industry-leading data quality, as well as integrated workflows that streamline de novo sequencing from library preparation to data analysis, ensuring our customers get the fastest and most accurate results while helping them save more time on their research and ensuring a high turnaround on their experiments. If you are interested in us, please feel free to contact us.
How do I decide between genome resequencing and de novo sequencing?
(1) When dealing with a species lacking a reference genome, de novo sequencing is necessary to assemble the genome from scratch.
(2) If a species possesses an incomplete reference genome, opting for de novo sequencing becomes essential to assemble a more comprehensive genome.
(3) In cases where a single reference genome falls short of capturing all the genetic information within a species, de novo sequencing is required to assemble a pan-genome that encompasses greater genetic diversity.
What sets apart a genomic framework map from a fine map?
A framework map provides comprehensive coverage, encompassing 90% of the autosomal genome and 95% of the gene region. It specifies Contig N5 to 5 kb, Scaffold N50 up to 20 kb, and maintains a remarkable single-base error rate of less than 1 in 100,000.
On the other hand, a fine map exhibits even greater precision, covering 95% of the autosomal genome and an impressive 98% of the gene region. It delineates Contig N5 to 20 kb, and Scaffold N50 up to 300 kb, with a single-base error rate consistently below 1 in 100,000. This nuanced contrast highlights the enhanced resolution and detailed information offered by a fine map compared to a genomic framework map.
Why construct diverse libraries for sequencing plant and animal genomes?
The complexity and size of these genomes, with numerous repetitive regions, necessitate specialized gradient sequencing libraries. This approach, combined with BAC or Fosmid libraries and multiple sequencing platforms, ensures accurate and comprehensive mapping, avoiding mis-splicing caused by repetitive sequences and enhancing overall genome integrity in higher plants and animals.
What sequencing platforms do you offer in animal and plant de novo sequencing service?
CD Genomics offers a comprehensive range of sequencing platforms, including Illumina, Nanopore and PacBio SMRT Sequencing.
The Illumina platform excels in its cost-effectiveness and large sequencing data volume, providing high coverage for genome sequencing. However, its short read lengths and GC preferences pose challenges in navigating repetitive and GC-rich regions. Sequencing data volumes of 100X are generally recommended.
Our long-read sequencing platforms, PacBio RS II and Sequel, stand out with long reads (average length > 10 kb), rapid sequencing (2-6 hours for a single SMRT cell), high throughput (averaging ~8G of valid data per SMRT cell), absence of GC preference, and the ability to detect base modification information, including methylation. This makes them especially well-suited for the comprehensive sequencing of plant and animal genomes from scratch. PacBio HiFi data, 30X~50X (recommended); PacBio CLR data, recommended 100X.
For any general inquiries, please fill out the form below.
CD Genomics is propelling the future of agriculture by employing cutting-edge sequencing and genotyping technologies to predict and enhance multiple complex polygenic traits within breeding populations.