DNA methylation constitutes a pivotal epigenetic modification mechanism that involves the enzymatic activity of DNA methyltransferases (DNMTs), which catalyze the addition of methyl groups to specific regions of the DNA molecule, particularly CpG islands. CpG islands are defined as DNA sequences rich in CpG dinucleotides (cytosine-phosphate-guanine sites). While DNA methylation does not alter the underlying nucleotide sequence, it plays a crucial role in the regulation of gene expression.
DNA methylation has broad applications across diverse fields, including:
CD Genomics currently offers a range of DNA methylation products and services built on several technologies: Whole Genome Bisulfite Sequencing (WGBS), Reduced Representation Bisulfite Sequencing (RRBS), Enzymatic Methyl-seq (EM-seq), and methylation microarrays (850K/935K), among others. Each technology rests on distinct principles and suits different applications; this article compares them systematically to help researchers select the most suitable method for their specific research requirements.
Technology | WGBS | RRBS | EM-seq | 850K/935K Microarrays | Methylation Screening Array (MSA 270K) |
Detection Principle | Bisulfite conversion coupled with high-throughput sequencing | Dual-enzyme digestion followed by bisulfite conversion and high-throughput sequencing | Enzymatic conversion with high-throughput sequencing | Chip-based analysis using bisulfite-converted DNA | Chip-based analysis using bisulfite-converted DNA |
Sample Applicability | Applicable to any species with a genome | Restricted to mammalian tissues | Applicable to any species with a genome | Human samples only | Human samples only |
Required DNA Input | 1–5 μg | 3–5 μg | >200 ng | 3–5 μg | 3–5 μg |
FFPE Compatibility | Yes | Yes | Yes | Yes | Yes |
Detection Scope | Genome-wide | Promoter regions; approximately 60% of CpG islands, covering 10–15% of the genome | Genome-wide | Targeted known loci (850K/935K sites), covering approximately 3–4% of the genome | Targeted known loci (270K sites), covering relevant traits and disease phenotypes across a wide range of domains |
WGBS is a comprehensive technique for detecting DNA methylation across the entire genome. The method relies on the chemical conversion of unmethylated cytosine residues (C) to uracil (U) through bisulfite treatment, while methylated cytosine residues (5-mC) remain unmodified. During subsequent polymerase chain reaction (PCR) amplification, uracil is replaced by thymine (T), allowing differentiation between methylated and unmethylated cytosines upon high-throughput sequencing and alignment to a reference genome. This approach provides a single-base resolution methylation profile across the entire genome.
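The conversion logic described above can be illustrated with a minimal Python sketch. It is a toy model rather than a production pipeline: the function names, the 12-base reference, and the three hypothetical molecules are assumptions made for illustration, and the reads are taken to be already aligned to the top strand at offset 0.

```python
def bisulfite_convert(seq, methylated_positions):
    """Simulate bisulfite treatment plus PCR on the top strand.

    Unmethylated C is read as T; methylated C (0-based positions)
    is protected and is still read as C.
    """
    methylated = set(methylated_positions)
    return "".join(
        "T" if base == "C" and i not in methylated else base
        for i, base in enumerate(seq)
    )


def call_methylation(reference, reads):
    """Return {position: fraction of reads showing C} for each reference C.

    Assumes every read covers the whole reference and starts at offset 0.
    """
    levels = {}
    for pos, ref_base in enumerate(reference):
        if ref_base != "C":
            continue
        # Only C (methylated) or T (converted) calls are informative.
        calls = [read[pos] for read in reads if read[pos] in "CT"]
        if calls:
            levels[pos] = calls.count("C") / len(calls)
    return levels


reference = "ACGTTCGACCGA"  # cytosines at positions 1, 5, 8 and 9
# Three hypothetical molecules with different methylation states.
reads = [
    bisulfite_convert(reference, {1, 5}),
    bisulfite_convert(reference, {1}),
    bisulfite_convert(reference, {1, 5, 9}),
]
levels = call_methylation(reference, reads)
```

In this toy data set, position 1 is methylated in every molecule, position 8 in none, and positions 5 and 9 in some, which is the per-base methylation fraction that a single-base resolution profile reports.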
WGBS is among the highest-resolution techniques available for methylation detection, offering an extensive and detailed methylation landscape at a genome-wide scale. The method excels in capturing the breadth of methylation patterns, including novel methylation sites and regions, thereby making it invaluable for exploratory studies seeking to discover new epigenomic markers.
The approach requires relatively high amounts of starting DNA (1–5 μg), making it less suitable for low-input samples. WGBS is also associated with considerable technical complexity, high operational costs, and extensive data analysis requirements due to the large volume of sequencing data generated.
Reduced representation bisulfite sequencing (RRBS) is a targeted DNA methylation sequencing method that selectively enriches DNA fragments containing CpG islands through restriction enzyme digestion. These CpG-rich fragments typically represent methylation-regulated regions, such as gene promoters, which are functionally significant for gene regulation. Following enrichment, fragments undergo bisulfite treatment and sequencing to capture methylation patterns within these targeted regions. This approach reduces genomic complexity while concentrating on methylation sites with known regulatory relevance.
RRBS reduces both sequencing costs and data volume by focusing on CpG-rich regions of the genome, such as CpG islands, and on specific methylation contexts (e.g., CHG and CHH sites). This targeted analysis is highly effective for examining methylation changes associated with gene expression regulation. The dual-enzyme digestion strategy (using MspI and ApeKI) increases fragment diversity, which improves the coverage and accuracy of the methylation analysis.
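The enrichment step can be sketched as an in silico digestion. The example below models only MspI, which recognizes CCGG and cuts after the first C (C^CGG); the ApeKI digestion, end repair, and adapter ligation are omitted. The function names, the toy sequence, and the 8–12 bp selection window are hypothetical and chosen so the example stays readable; real size-selection windows are much longer and depend on the protocol.

```python
import re

MSPI_SITE = "CCGG"  # MspI recognition site, cut as C^CGG
CUT_OFFSET = 1      # the cut falls after the first C of the site


def mspi_digest(seq):
    """Return the fragments produced by a complete MspI digestion."""
    cuts = [m.start() + CUT_OFFSET for m in re.finditer(MSPI_SITE, seq)]
    bounds = [0] + cuts + [len(seq)]
    return [seq[a:b] for a, b in zip(bounds, bounds[1:])]


def size_select(fragments, min_len, max_len):
    """Keep fragments whose length lies inside the selection window."""
    return [f for f in fragments if min_len <= len(f) <= max_len]


toy_sequence = "ATCCGGACGCGTCCGGTA"
fragments = mspi_digest(toy_sequence)
selected = size_select(fragments, min_len=8, max_len=12)
```

Because every internal fragment starts and ends at a CCGG site, the retained fragments are guaranteed to contain CpGs at both ends, which is why the selected fraction is biased toward CpG-dense regions.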
RRBS is currently optimized for animal samples and requires a relatively high starting amount of DNA (1–5 μg). It does not provide coverage of the entire genome, and thus, certain genomic regions may remain unassessed.
Figure 1 Comparison of sequencing-based methods for genome-wide methylation analysis. (Daniel B. Lipka et al., 2014)
Enzymatic Methyl-seq is a highly efficient methodology for detecting DNA methylation at single-base resolution across the entire genome. This technique leverages the enzymatic action of Tet methylcytosine dioxygenase 2 (TET2) and T4 bacteriophage beta-glucosyltransferase (T4-BGT) to protect 5-methylcytosine (5mC) and 5-hydroxymethylcytosine (5hmC) from deamination. Following this protection step, the APOBEC3A enzyme selectively converts unmodified cytosine residues to uracil, while methylated cytosine residues remain unaltered. During polymerase chain reaction (PCR) amplification, uracil is subsequently replaced by thymine, thus enabling differentiation of methylated versus unmethylated cytosine residues through high-throughput sequencing of PCR products. By aligning these sequenced reads with a reference genome, methylation statuses at CpG, CHG, and CHH sites can be precisely determined.
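The three sequence contexts mentioned above are defined by the bases immediately downstream of the cytosine, where H stands for A, C, or T. The following sketch shows one way to classify them; the function name and the example sequence are illustrative assumptions, and only the top strand of an unambiguous A/C/G/T sequence is considered.

```python
def cytosine_context(seq, pos):
    """Classify the top-strand cytosine at seq[pos] as CpG, CHG or CHH.

    Returns None if seq[pos] is not a C, or if the sequence ends
    before the context can be decided.
    """
    if seq[pos] != "C":
        return None
    downstream = seq[pos + 1 : pos + 3]
    if downstream[:1] == "G":
        return "CpG"
    if len(downstream) < 2:
        return None  # too close to the end to tell CHG from CHH
    return "CHG" if downstream[1] == "G" else "CHH"


sequence = "ACGTCAGCTTCC"
contexts = {
    pos: ctx
    for pos in range(len(sequence))
    if (ctx := cytosine_context(sequence, pos)) is not None
}
```

Methylation callers typically report levels separately for each context, since CHG and CHH methylation is prominent in plants while CpG methylation dominates in mammals.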
EM-seq offers several advantages over traditional bisulfite sequencing (BS-seq) for DNA methylation analysis. This technique requires a lower quantity of starting DNA, with as little as 200 ng, making it suitable for low-input samples. EM-seq also preserves high conversion efficiency without the DNA damage commonly associated with bisulfite treatment, which can cause fragmentation and selective enrichment issues that compromise sequence integrity. By eliminating these artifacts, EM-seq enables cost-effective sequencing while providing high-fidelity methylation data with genome-wide coverage. This approach is particularly advantageous for studies involving plant DNA, where DNA extraction is often challenging, as well as other applications involving low-quantity sample types.
As a recently developed technology, EM-seq has limited validation outside human and murine models. Although several plant and non-model organisms have been successfully processed, including Brassica leaves, Myrica fruit, Rehmannia root, and Arabidopsis thaliana, additional optimization and validation are required to expand its applicability across a broader range of species.
Figure 2 Enzymatic Methyl-seq mechanism of action and workflow (Romualdas Vaisvila et al., 2019)
DNA methylation microarrays provide a high-throughput approach to assess methylation status at targeted CpG sites by hybridizing bisulfite-treated DNA with designed probes. This technique requires 0.5–1 μg of genomic DNA, which is first subjected to bisulfite conversion to distinguish methylated from unmethylated cytosines. Two types of methylation-specific probes are then hybridized with the converted DNA: one specific to methylated cytosine and the other specific to unmethylated cytosine. Hybridized probes are extended at their 3' end, at the CpG position, with a labeled nucleotide (ddNTP), and the signal is developed by fluorescent staining. Using the Illumina iScan platform, fluorescence intensity ratios are measured to quantify methylation levels.
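The intensity ratio is commonly summarized as a beta value, the methylated signal divided by the total signal. The sketch below follows a widely used convention rather than a method stated in this article: the stabilizing offset of 100 and the logit-transformed M-value are assumptions, and the intensities are hypothetical numbers.

```python
import math


def beta_value(meth, unmeth, offset=100.0):
    """Methylation level in [0, 1) from the two probe intensities.

    The offset (assumed default: 100) damps the ratio when both
    intensities are low and mostly reflect background noise.
    """
    return meth / (meth + unmeth + offset)


def m_value(beta):
    """Logit (base 2) transform of a beta value, often used for statistics."""
    return math.log2(beta / (1.0 - beta))


# Hypothetical intensities for one CpG probe in two samples.
mostly_methylated = beta_value(meth=9000, unmeth=900)   # 0.9
half_methylated = beta_value(meth=450, unmeth=450)      # 0.45
```

Beta values are easier to interpret biologically because they approximate the fraction of methylated molecules, whereas M-values tend to behave better in differential methylation tests near 0 and 1.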
The 850K array covers over 850,000 CpG sites, while the enhanced 935K array provides coverage of 935,000 CpG sites. The 270K array, in contrast, includes a subset of 270,000 CpG loci. The human genome contains approximately 28 million CpG sites, an estimated 60–80% of which are methylated, so each array interrogates a curated fraction of roughly 1–3% of all CpG sites, selected for known regulatory or disease relevance, rather than the full methylome.
Figure 3 Infinium Methylation Screening Array (image source illumina)
Compared to whole-genome methylation sequencing, DNA methylation arrays require lower DNA input (0.5–1 μg), exhibit a shorter analysis cycle, and incur reduced costs. Furthermore, microarray technology is compatible with formalin-fixed, paraffin-embedded (FFPE) samples, making it suitable for large-scale studies, such as those involving hundreds to thousands of samples.
Current DNA methylation microarrays are restricted to human samples and are limited to detecting preselected, fixed methylation sites, which may not encompass the entirety of genomic methylation variability.
Microarray | Species | Coverage | Platform |
Infinium™ MethylationEPIC v2.0 (935K) Array | Human | Detects over 935,000 CpG sites, covering CpG islands, promoters, coding regions, and enhancers comprehensively. | Illumina |
Infinium MethylationEPIC (850K) | Human | Over 850,000 methylation sites per sample at single-nucleotide resolution | Illumina |
Human CpG Island Microarray | Human | 27,800 CpG islands covering 21 MB | Agilent |
Human DNA Methylation Microarrays | Human | 27,627 expanded CpG islands and 5081 UMR regions | Agilent |
Mouse CpG Island Microarray | Mouse | 15,342 CpG islands | Agilent |
Infinium Methylation Screening Array 270K | Human | Covers approximately 270,000 methylation sites, with core applications in disease-specific cohort research and large-scale health screening. While maintaining high precision, this microarray raises single-array throughput to 48 samples, a six-fold increase over the Infinium MethylationEPIC v2.0, thereby achieving higher throughput and lower costs. | Illumina |
Technical Characteristics:
WGBS is the gold standard for DNA methylation analysis, providing single-base resolution across the entire genome. This technique targets CpG, CHG, and CHH sites and is suitable for diverse species, including humans, animals, plants, and fungi. WGBS is compatible with various sample types, such as cultured cells, whole blood, tissue samples, cell-free DNA (cfDNA), and formalin-fixed, paraffin-embedded (FFPE) specimens. High sequencing depth (≥30X) is typically required, making WGBS the most expensive option, especially for large genomes such as those of humans and other mammals.
Applications:
WGBS is widely applied in high-resolution methylation studies across human and mammalian samples, as well as in agricultural, forestry, and aquaculture research. It supports exploratory research requiring comprehensive DNA methylation profiling, including studies of gene regulation, biomarker discovery, genetic breeding, and the identification of disease-associated methylation patterns.
Technical Characteristics:
RRBS achieves single-base resolution and is also regarded as a gold-standard technique. It selectively enriches for CpG-dense regions, including promoter regions and CpG islands, which are often involved in methylation regulation. Although RRBS samples loci distributed across the genome, it does not cover the genome in full; it focuses on functionally significant regions, reducing both genomic complexity and sequencing requirements. The method is optimized for human and mammalian samples and can also be applied to fish samples. It is compatible with cells, whole blood, and fresh-frozen tissue, but is not suitable for fragmented samples, such as cfDNA, or for plant-based research.
Applications:
RRBS is highly effective for studies involving human, mammalian, and fish DNA methylation profiling. It is frequently employed in studies exploring gene regulatory mechanisms, molecular disease subtyping, and biomarker discovery. Its cost-effectiveness and targeted approach make it advantageous for clinical samples and large genomes.
Technical Characteristics:
EM-seq combines single-base resolution with low DNA input requirements. By avoiding the DNA fragmentation and selective enrichment biases seen in bisulfite sequencing (BS-seq), EM-seq maintains DNA integrity and provides high CpG site coverage at lower sequencing depths. It is compatible with WGBS workflows and suitable for all species, allowing genome-wide methylation analysis with greater efficiency. The reduced sequencing depth (15X) required for EM-seq yields CpG coverage comparable to 30X WGBS, decreasing costs while maintaining high data fidelity.
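The cost implication of the depth difference can be made concrete with a short calculation. This is a back-of-the-envelope sketch: the approximate human genome size of 3.1 Gb is an assumption not stated in the article, and the estimate ignores duplicates, adapter content, and unmapped reads.

```python
HUMAN_GENOME_BP = 3.1e9  # assumed approximate human genome size in base pairs


def required_gigabases(genome_bp, depth):
    """Raw sequencing output (in Gb) needed for a target mean depth."""
    return genome_bp * depth / 1e9


wgbs_gb = required_gigabases(HUMAN_GENOME_BP, depth=30)
emseq_gb = required_gigabases(HUMAN_GENOME_BP, depth=15)
savings_fraction = 1 - emseq_gb / wgbs_gb
```

Under these assumptions, a human sample needs about 93 Gb of raw data at 30X and about 46.5 Gb at 15X, so halving the depth halves the sequencing output per sample.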
Applications:
EM-seq is particularly valuable for low-input samples and for non-model organisms. It is highly suited to studying methylation changes associated with development, aging, and disease. EM-seq's capacity for producing high-fidelity methylation maps makes it ideal for research into cellular aging and DNA methylation alterations in developmental and disease contexts.
Technical Characteristics:
High-density methylation arrays offer single-base resolution and deliver precise methylation measurements without the dependency on sequencing depth. Arrays are well-suited for FFPE and other challenging sample types and cover the genome-wide CpG landscape. The 935K array, an upgrade of the 850K array, includes an additional 186,000 CpG sites. It provides enhanced coverage of enhancer regions, super-enhancers, CTCF binding sites, CNV detection regions, and CpG islands associated with cancer. Arrays are a cost-effective solution with high reproducibility across samples, ideal for large cohort studies.
Applications:
The 935K methylation array's broad coverage of CpG islands, promoter regions, and enhancer regions makes it a powerful tool for cancer research, complex disease studies, and aging-related methylation clocks. The Infinium Methylation Screening Array (MSA 270K), covering approximately 270,000 methylation sites, provides high-resolution, accurate methylation data that facilitates a deep understanding of DNA methylation's role in gene regulation, cell differentiation, and disease progression. The array's affordability and efficiency make it accessible to many research institutions for large-scale methylation studies.
Each DNA methylation technology described has distinct features that make it suitable for different applications and research requirements. Researchers should select the most appropriate method based on their study goals, sample types, and species of interest to achieve optimal results.
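As a rough illustration of this selection process, the sketch below encodes only two rows of the comparison table (sample applicability and detection scope) and filters the methods accordingly. The class, the sample categories, and the function name are hypothetical, and real decisions would also weigh DNA input, budget, and sample quality.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Method:
    name: str
    species: str       # applicability label: "any", "mammal" or "human"
    genome_wide: bool  # True if the detection scope is genome-wide


# Values transcribed from the comparison table in this article.
METHODS = [
    Method("WGBS", "any", True),
    Method("RRBS", "mammal", False),
    Method("EM-seq", "any", True),
    Method("850K/935K array", "human", False),
    Method("MSA 270K array", "human", False),
]

# Sample categories admitted by each applicability label (assumed grouping).
ADMITS = {
    "any": {"human", "mammal", "plant", "other"},
    "mammal": {"human", "mammal"},
    "human": {"human"},
}


def candidate_methods(sample, need_genome_wide):
    """List methods compatible with the sample type and required scope."""
    return [
        m.name
        for m in METHODS
        if sample in ADMITS[m.species]
        and (m.genome_wide or not need_genome_wide)
    ]


plant_options = candidate_methods("plant", need_genome_wide=True)
mouse_options = candidate_methods("mammal", need_genome_wide=False)
```

For a plant sample that needs genome-wide coverage, this filter leaves WGBS and EM-seq; for a non-human mammal without that requirement, RRBS also qualifies while the human-only arrays are excluded.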
References: