Before commencing a Bisulfite-Seq project, several crucial factors must be taken into consideration:
• Methylation Rate: It is essential to assess whether the species under investigation exhibits a low or high methylation rate, as this can influence the experimental design and data analysis.
• Genome Completeness: The completeness and quality of the species' genome assembly are crucial factors, as they can impact the accuracy of the Bisulfite-Seq data and its comparison with other BS-Seq datasets.
• Genome Complexity: Factors such as high GC content, heterozygosity, presence of transposons, and repetitive regions in the genome can pose challenges during data analysis and interpretation. Understanding and addressing these complexities beforehand is essential.
By carefully considering these factors, a well-planned Bisulfite-Seq project can yield reliable and valuable insights into the DNA methylation patterns of the species in question.
Bisulfite-Seq depends heavily on the completeness of the reference genome, and the accuracy of downstream analysis is influenced by the quality of the genome assembly and its annotation. It is therefore better suited to species for which complete genome information is available.
Self-built libraries can be sequenced directly. When a standard Illumina kit is used, the kit type and catalog number must be specified. When an Illumina kit is not used, partners must provide an information sheet describing the library construction method, including the kit and brand used, the adapter primer sequences used for library construction, and the expected library fragment size.
If the library contains index sequences, it is essential to indicate the index location and sequence. Moreover, if only a portion of the library construction process has been completed, clear indication of this status is necessary. For libraries that are PCR product-based or contain specific sequences within the insert fragments, it is vital to provide this information in the sample information sheet, as the omission may significantly impact the quality of the resulting data.
For partner-supplied libraries, the suitability of the fragment sizes can be checked on an Agilent 2100 Bioanalyzer, and qPCR can be used to quantify the library accurately before it is loaded onto the sequencer.
By adhering to these procedures, we can effectively gauge the library's quality and ensure reliable sequencing outcomes.
Commonly used plant DNA methylation sequencing methods include conventional whole-genome bisulfite sequencing (WGBS) and single-cell WGBS (scWGBS), as well as simplified genome (reduced-representation) methylation sequencing for plants and methylated DNA immunoprecipitation sequencing (MeDIP-seq). Each suits different research needs.
In plant genomes, cytosines in the CG, CHG, and CHH sequence contexts (where H is A, C, or T) can all be methylated. The primary research method at present is bisulfite sequencing of the whole plant genome. There is no optimal method for examining methylation at single genes, although methylation-specific PCR (MSP) can be used for preliminary exploration. There are isolated reports of using bisulfite sequencing PCR (BSP) to detect methylation at single genes, but it is not highly recommended. If the species' genome is not large, direct whole-genome methylation sequencing is the preferable option, offering better cost-effectiveness and accuracy, so there is no need to examine individual genes.
Whether a plant target gene can be PCR-amplified with specific primers for methylation analysis depends on several considerations. Firstly, it is not known in advance whether a given "C" (cytosine) in the target gene is methylated. If it is unmethylated, bisulfite conversion changes it to "U" (uracil), whereas a methylated "C" remains intact.
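The three sequence contexts named above can be told apart from the two bases downstream of each cytosine. The following is a minimal sketch, assuming a plain forward-strand sequence string containing only A, C, G, and T; the function name `cytosine_contexts` is hypothetical and is not part of any particular tool.

```python
H_BASES = set("ACT")  # IUPAC code H: any base except G


def cytosine_contexts(seq):
    """Return (position, context) for every forward-strand C in seq.

    Positions are 0-based. Cytosines too close to the end of the
    sequence, or followed by an ambiguous base, are labeled "unknown".
    """
    seq = seq.upper()
    result = []
    for i, base in enumerate(seq):
        if base != "C":
            continue
        downstream = seq[i + 1:i + 3]  # up to two bases after the C
        context = "unknown"
        if downstream[:1] == "G":
            context = "CG"
        elif len(downstream) == 2 and downstream[0] in H_BASES:
            if downstream[1] == "G":
                context = "CHG"
            elif downstream[1] in H_BASES:
                context = "CHH"
        result.append((i, context))
    return result


print(cytosine_contexts("ACGTCAGCTTC"))
# [(1, 'CG'), (4, 'CHG'), (7, 'CHH'), (10, 'unknown')]
```

Cytosines on the reverse strand would be handled the same way after reverse-complementing the sequence; that step is left out here for brevity.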
Secondly, if a primer itself contains a "C", it is unclear whether the template base at that position will read as "C" or as "T" (thymine) after conversion. Ideally, neither the upstream nor the downstream primer should contain any "C". If that cannot be achieved, each primer should contain no more than 1-2 "C"s, and a primer must be prepared for every possible combination so that all scenarios are covered. For instance, a primer with 2 "C"s has four variants: CC/TC/CT/TT.
Moreover, for large regions, primer design has to account for many such combinations. This is time-consuming and does not always yield effective amplification of the desired product, because PCR amplification after bisulfite conversion is difficult in itself. If methylation were limited to CpG islands, single or multiple genes could be targeted successfully. In plants, however, methylation can occur in the CG, CHG, and CHH contexts simultaneously, and non-CG methylation such as CHH has a major influence, which makes successful PCR amplification harder still.
Whole-genome methylation sequencing of plants avoids the challenges of designing primers for target gene methylation. Genomic DNA is first fragmented by sonication, and T4 DNA ligase is used to attach adapter sequences to the fragments. Bisulfite treatment then converts unmethylated cytosine (C) in the adapter-ligated products to uracil (U), and adapter-mediated PCR subsequently replaces uracil (U) with thymine (T).
An advantage of whole-genome methylation sequencing is that it does not require locus-specific amplification of genomic sequences, which bypasses the difficulties of primer design for targeted gene methylation.
A WGBS library requires a starting amount of at least 1 μg of DNA. Libraries with concentrations below 1 ng/μl cannot be detected reliably on the Agilent 2100 Bioanalyzer, and the concentration of a WGBS library should not fall below 3 ng/μl if sequencing results are to be optimal.
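The CC/TC/CT/TT example generalizes: a primer with n "C"s has 2^n variants. A minimal sketch of that enumeration follows, assuming only the C/T ambiguity described above; the function name `primer_variants` and the example primer are hypothetical.

```python
from itertools import product


def primer_variants(primer):
    """List every primer obtained by leaving each C as C or replacing it with T."""
    primer = primer.upper()
    # Each C contributes two choices; every other base contributes one.
    choices = [("C", "T") if base == "C" else (base,) for base in primer]
    return ["".join(combo) for combo in product(*choices)]


print(primer_variants("CC"))           # ['CC', 'CT', 'TC', 'TT']
print(len(primer_variants("ACGCTC")))  # 8 variants for three Cs
```

The count doubles with every additional "C", which is why the text advises keeping "C"s out of the primers where possible.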
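The two chemical steps described above (bisulfite turns unmethylated C into U, then PCR turns U into T) can be mimicked on a single strand in a few lines. This is an illustrative sketch only, assuming a known set of methylated positions and complete conversion; the function name `bisulfite_convert` is hypothetical.

```python
def bisulfite_convert(seq, methylated_positions):
    """Simulate bisulfite treatment followed by PCR on one DNA strand.

    methylated_positions holds 0-based indices of methylated cytosines.
    Returns (sequence after bisulfite, sequence after PCR).
    """
    methylated = set(methylated_positions)
    after_bisulfite = "".join(
        "U" if base == "C" and i not in methylated else base
        for i, base in enumerate(seq.upper())
    )
    # PCR copies uracil as thymine.
    after_pcr = after_bisulfite.replace("U", "T")
    return after_bisulfite, after_pcr


print(bisulfite_convert("ACGTCC", {1}))  # ('ACGTUU', 'ACGTTT')
```

Only the methylated cytosine at position 1 survives as "C"; the other two cytosines end up as "T" in the sequenced product.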
Typically, the bisulfite conversion rate exceeds 99%. When the DNA sample contains no naturally unmethylated DNA to serve as a reference (chloroplast DNA in Arabidopsis thaliana, for example, is unmethylated), unmethylated control DNA is added to the sample so that the bisulfite conversion rate can be verified.
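Because every cytosine in an unmethylated reference should be converted, the conversion rate can be estimated as the share of reads showing T at those cytosines. The sketch below assumes per-site read counts are already available; the function name `bisulfite_conversion_rate` and the counts in the example are hypothetical, made-up values.

```python
def bisulfite_conversion_rate(control_counts):
    """Estimate the conversion rate from an unmethylated control.

    control_counts: iterable of (c_reads, t_reads) pairs, one pair per
    cytosine position in the control sequence.
    """
    counts = list(control_counts)
    unconverted = sum(c for c, _ in counts)  # reads still showing C
    converted = sum(t for _, t in counts)    # reads showing T
    total = unconverted + converted
    if total == 0:
        raise ValueError("no reads cover the control cytosines")
    return converted / total


rate = bisulfite_conversion_rate([(1, 199), (0, 200), (3, 197)])
print(f"{rate:.2%}")  # 99.33%
```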
Methylation-supportive reads in Bisulfite-Seq can be defined based on the following criterion: For a specific C site, if it possesses a methylation modification, it will remain unchanged during the bisulfite treatment and be detected as a C in the sequencing results. Conversely, if the C site lacks methylation, it will be converted to a T after bisulfite treatment, leading to T in the sequencing reads aligned to this site. In summary, reads supporting methylation will exhibit C, whereas reads lacking methylation will display T.
Bisulfite-Seq can determine the methylation frequency at each site.
In tissue sequencing, where different cells carry different methylation states, the methylation rate of a site is the proportion of cells in which that site is methylated. It is estimated as the number of reads supporting methylation at the site divided by the total number of reads covering it.
Fewer genes show differential hydroxymethylation than differential methylation because hydroxymethylation levels are generally lower than methylation levels. Hydroxymethylation also varies over a narrower range, typically 0 to 0.25, whereas methylation ranges from 0 to 1. As a result, the number of genes showing differential hydroxymethylation is significantly lower than the number showing differential methylation, and this is considered normal.
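The read-count ratio described above is a one-line calculation per site. A minimal sketch, assuming the C and T read counts at a cytosine have already been extracted from the alignment; the function name `methylation_level` and the `min_depth` parameter are hypothetical.

```python
def methylation_level(c_reads, t_reads, min_depth=1):
    """Fraction of reads supporting methylation at one cytosine.

    c_reads: reads showing C (methylated); t_reads: reads showing T
    (unmethylated). Returns None when coverage is below min_depth.
    """
    depth = c_reads + t_reads
    if depth < min_depth:
        return None
    return c_reads / depth


print(methylation_level(15, 5))               # 0.75
print(methylation_level(2, 1, min_depth=10))  # None (too few reads)
```

Requiring a minimum depth keeps sites covered by only a handful of reads from being reported with misleadingly precise values.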
Moreover, unlike gene expression, differentially methylated genes do not have a specific value associated with them. The objective is to identify genome-wide differentially methylated regions (DMRs) and differentially hydroxymethylated regions (DhMRs) and determine which genes they modify. Consequently, there is no singular methylation value for each gene, but rather methylation values for DMRs and DhMRs. It's important to note that one gene may be modified by multiple DMRs or DhMRs, and conversely, one DMR or DhMR may modify more than one gene. Thus, the relationship is not strictly one-to-one.
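The many-to-many relationship between regions and genes can be made concrete with a simple interval-overlap check. This is a sketch under stated assumptions: coordinates are 0-based and half-open, a region is assigned to a gene only when the two overlap, and the function name `map_dmrs_to_genes` and all example coordinates are hypothetical.

```python
def map_dmrs_to_genes(dmrs, genes):
    """Link DMRs (or DhMRs) to the genes they overlap.

    dmrs, genes: lists of (name, chromosome, start, end).
    Returns (gene_to_dmrs, dmr_to_genes) as dictionaries of lists.
    """
    gene_to_dmrs = {}
    dmr_to_genes = {}
    for d_name, d_chrom, d_start, d_end in dmrs:
        for g_name, g_chrom, g_start, g_end in genes:
            same_chrom = d_chrom == g_chrom
            overlaps = d_start < g_end and g_start < d_end
            if same_chrom and overlaps:
                gene_to_dmrs.setdefault(g_name, []).append(d_name)
                dmr_to_genes.setdefault(d_name, []).append(g_name)
    return gene_to_dmrs, dmr_to_genes


genes = [("geneA", "chr1", 100, 500), ("geneB", "chr1", 450, 900)]
dmrs = [("DMR1", "chr1", 120, 180),
        ("DMR2", "chr1", 460, 520),
        ("DMR3", "chr1", 300, 350)]
by_gene, by_dmr = map_dmrs_to_genes(dmrs, genes)
print(by_gene)  # {'geneA': ['DMR1', 'DMR2', 'DMR3'], 'geneB': ['DMR2']}
print(by_dmr["DMR2"])  # ['geneA', 'geneB']
```

Here geneA is covered by three regions while DMR2 touches two genes, so neither direction of the mapping is one-to-one. The nested loop is fine for illustration; genome-scale data would normally use an interval index instead.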
In RRBS library construction, CEGX Precision Methylation and Hydroxymethylation kits are used for chemical oxidation and bisulfite treatment. Quality control spike-in sequences carrying different modifications are also added to the libraries. The conversion rate reaches approximately 95%, and the oxidation rate is at least 90%.
The gold standard for detecting DNA methylation in plants is whole-genome bisulfite sequencing (WGBS). When the budget is limited, simplified genome (reduced-representation) methylation sequencing of plants is a viable alternative, and the Plant-RRBS solution is a noteworthy option. Published studies have shown it to be effective at detecting genome-wide methylation patterns with extensive cytosine coverage, especially in large genomes or plant populations.
• The species must be eukaryotic.
• The species should not be hypomethylated. Known hypomethylated species include Drosophila, red flour beetle, brewer's yeast, fission yeast, nematode, and Aspergillus flavus.
• The genome of the species must be sufficiently complete, with at least a scaffold-level assembly and comprehensive annotation, to support Bisulfite-Seq alignment.
• Complicating factors in the genome, such as high GC content, high heterozygosity, numerous transposons, and repetitive regions, must be taken into account, as they can affect alignment and lead to unsatisfactory final results.
For WGBS methylation studies, a sequencing depth of 30X with two biological replicates is generally recommended. If the study investigates similar cells, a depth of 5-15X is sufficient. Detecting short differentially methylated regions (DMRs) calls for a depth of more than 15X, whereas long DMRs can be analyzed at a depth as low as 1-2X.
It is important to note that having at least two biological replicates is highly recommended when analyzing DMRs to ensure robust and reliable results.
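A depth target translates into a read count once the genome size and read length are fixed. The following back-of-the-envelope sketch assumes depth = total sequenced bases / genome size and ignores unmapped and duplicate reads; the function name `read_pairs_needed`, the 135 Mb genome size, and the 150 bp paired-end layout are illustrative assumptions.

```python
import math


def read_pairs_needed(genome_size_bp, depth, read_length_bp, paired=True):
    """Number of reads (or read pairs) required to reach a raw depth."""
    bases_required = genome_size_bp * depth
    bases_per_fragment = read_length_bp * (2 if paired else 1)
    return math.ceil(bases_required / bases_per_fragment)


# Hypothetical 135 Mb genome, 30X depth, 2 x 150 bp reads.
print(read_pairs_needed(135_000_000, 30, 150))  # 13500000
```

Since WGBS mapping rates tend to be low, the raw read count would in practice be scaled up by the expected mapping rate to reach the same effective depth.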
WGBS projects typically show a lower mapping rate than other sequencing projects. During WGBS library construction, bisulfite treatment converts unmethylated cytosine to uracil (U), while methylated cytosine (mC) remains unchanged. After PCR amplification, mC sites still read as C, whereas unmethylated sites read as thymine (T), with adenine (A) at the corresponding position on the complementary strand. Consequently, the original double-stranded DNA gives rise to four distinct strands.
To align the sequencing data to the reference genome, Bismark converts every cytosine (C) in both the reference genome and the reads to thymine (T) and, for the complementary strand, every guanine (G) to adenine (A). This conversion reduces sequence complexity, so more positions become candidates for mismapping, which contributes to the lower mapping rate observed in WGBS projects.
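The effect of this conversion on mappability can be seen with plain string replacement. The sketch below only illustrates the idea of aligning in a reduced alphabet; it is not Bismark's code, and the helper names `c_to_t` and `g_to_a` and the example sequences are hypothetical.

```python
def c_to_t(seq):
    """Replace every C with T (forward-strand conversion)."""
    return seq.upper().replace("C", "T")


def g_to_a(seq):
    """Replace every G with A (complementary-strand conversion)."""
    return seq.upper().replace("G", "A")


reference = "ACGTCGA"
read = "ACGTTGA"  # same locus; the second C was unmethylated and reads as T

# After conversion the read matches the reference regardless of methylation.
print(c_to_t(reference) == c_to_t(read))  # True

# Two different genomic sequences collapse onto the same converted string,
# which is why more loci compete for each read.
print(c_to_t("ACCT"), c_to_t("ATTT"))  # ATTT ATTT
```

Once a read has been placed in the converted space, the original read bases are compared with the original reference to decide which cytosines were methylated.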