DNA methylation, a chemical modification of DNA, can influence gene expression without altering the DNA sequence. It involves the covalent addition of a methyl group to the 5-carbon position of cytosine in genomic CpG dinucleotides, catalyzed by DNA methyltransferases (DNMTs). This process has been found to induce changes in chromatin structure, DNA conformation, DNA stability, and DNA-protein interactions, thereby regulating gene expression.
For data mining, DNA methylation analysis typically follows three steps:
How, then, is differential methylation analysis carried out to screen for differentially methylated CpG sites (DMCs), regions (DMRs), and genes (DMGs)?
(1) DMC/DMR Identification
In this step, the main goal is to identify specific sites (DMCs) and regions (DMRs) in the genome where differential methylation occurs between different experimental groups. This can involve comparing methylation levels between disease and control samples, treated and untreated samples, or different time points in a time-course experiment.
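As a rough illustration of the site-level comparison, the sketch below tests each CpG with a two-sided Fisher's exact test on methylated and unmethylated read counts from two groups, then applies Benjamini-Hochberg correction. This is one common approach, not necessarily the method used in this workflow. The cut-offs `Q_CUTOFF` and `MIN_DIFF`, the function names, and the toy read counts are all assumptions made for the example.

```python
from math import comb

# Assumed cut-offs for illustration only; the text does not specify them.
Q_CUTOFF = 0.05   # maximum adjusted p-value
MIN_DIFF = 0.2    # minimum absolute difference in methylation level


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x methylated reads in group 1.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum the probabilities of all tables no more likely than the observed one.
    total = sum(p for p in map(prob, range(lo, hi + 1)) if p <= p_obs * (1 + 1e-9))
    return min(1.0, total)


def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    n = len(pvalues)
    order = sorted(range(n), key=lambda i: pvalues[i])
    adjusted = [0.0] * n
    running_min = 1.0
    for rank in range(n, 0, -1):  # walk from the largest p-value down
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * n / rank)
        adjusted[i] = running_min
    return adjusted


def call_dmcs(counts):
    """counts maps site -> (meth_1, unmeth_1, meth_2, unmeth_2) read counts.

    Every site is assumed to have non-zero coverage in both groups.
    """
    sites = list(counts)
    pvals = [fisher_exact_two_sided(*counts[s]) for s in sites]
    qvals = benjamini_hochberg(pvals)
    dmcs = []
    for site, q in zip(sites, qvals):
        m1, u1, m2, u2 = counts[site]
        diff = m1 / (m1 + u1) - m2 / (m2 + u2)
        if q <= Q_CUTOFF and abs(diff) >= MIN_DIFF:
            dmcs.append((site, diff, q))
    return dmcs


# Toy read counts (made up for the example).
toy_counts = {
    "chr1:1000": (18, 2, 3, 17),
    "chr1:1050": (10, 10, 11, 9),
}
dmcs = call_dmcs(toy_counts)
print(dmcs)
```

Pooling reads per group, as done here, ignores variation between biological replicates; replicate-aware models are usually preferred when replicates are available.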
(2) DMC/DMR Transcription Factor Binding Analysis (TF Binding Motif)
The focus in this step is to investigate whether DMCs/DMRs overlap with known transcription factor (TF) binding sites, particularly in regulatory regions like promoters and enhancers.
TF Binding Motif Analysis: Bioinformatics tools and databases are utilized to predict TF binding motifs at DMCs/DMRs. This information can provide insights into the potential regulatory factors involved in the differential methylation.
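The overlap check described in this step can be sketched as a simple interval intersection. The example assumes that TF binding sites (predicted or taken from a database) are already available as genomic intervals; the function names, the half-open coordinate convention, and the toy coordinates are illustrative assumptions, not part of any specific tool.

```python
def intervals_overlap(a_start, a_end, b_start, b_end):
    """True if half-open intervals [a_start, a_end) and [b_start, b_end) intersect."""
    return a_start < b_end and b_start < a_end


def tfs_overlapping_dmrs(dmrs, tf_sites):
    """Map each DMR to the sorted names of TFs whose sites overlap it.

    dmrs:     list of (chrom, start, end)
    tf_sites: list of (chrom, start, end, tf_name)
    """
    hits = {}
    for chrom, start, end in dmrs:
        names = {
            name
            for site_chrom, s, e, name in tf_sites
            if site_chrom == chrom and intervals_overlap(start, end, s, e)
        }
        hits[(chrom, start, end)] = sorted(names)
    return hits


# Toy coordinates, made up for the example.
toy_dmrs = [("chr1", 1000, 1500), ("chr2", 300, 400)]
toy_tf_sites = [
    ("chr1", 1450, 1470, "CTCF"),
    ("chr1", 1490, 1510, "SP1"),
    ("chr1", 2000, 2020, "SP1"),
    ("chr2", 100, 120, "CTCF"),
]
tf_hits = tfs_overlapping_dmrs(toy_dmrs, toy_tf_sites)
print(tf_hits)
```

The nested scan is quadratic in the number of intervals; for genome-scale data an interval tree or a sorted sweep would be the more practical choice.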
(3) Analysis Strategy of Temporal Methylation Data (Time Course)
If the experimental design includes multiple time points or time-course data, the analysis strategy is adapted to compare changes in methylation over time.
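One simple way to summarize change over time, shown here only as a sketch, is to fit a least-squares line to the methylation level of each site or region across time points and inspect the slope. The text does not prescribe a particular model, so the linear fit, the function names, and the toy values below are assumptions.

```python
def methylation_slope(times, levels):
    """Least-squares slope of methylation level against time.

    Replicates can be passed as repeated time values.
    """
    n = len(times)
    if n != len(levels) or n < 2:
        raise ValueError("need at least two paired observations")
    mean_t = sum(times) / n
    mean_m = sum(levels) / n
    sxx = sum((t - mean_t) ** 2 for t in times)
    if sxx == 0:
        raise ValueError("all observations share the same time point")
    sxy = sum((t - mean_t) * (m - mean_m) for t, m in zip(times, levels))
    return sxy / sxx


def slopes_per_site(series):
    """series maps site -> list of (time, methylation_level) pairs."""
    return {
        site: methylation_slope([t for t, _ in obs], [m for _, m in obs])
        for site, obs in series.items()
    }


# Toy time course (hours, methylation level); values are made up.
toy_series = {
    "chr1:1000": [(0, 0.2), (1, 0.4), (2, 0.6), (3, 0.8)],
    "chr1:1050": [(0, 0.5), (1, 0.5), (2, 0.5), (3, 0.5)],
}
slopes = slopes_per_site(toy_series)
print(slopes)
```

A positive slope indicates gain of methylation over the time course and a negative slope indicates loss; a straight line will miss transient changes, which need pairwise comparisons between time points or a more flexible model.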
(4) DMC/DMR Distribution on Gene Elements
In this step, we examine where the differentially methylated CpGs (DMCs) and differentially methylated regions (DMRs) fall in the genome, focusing on their distribution across gene elements.
DMC/DMR analysis combines computational methods, statistical testing, and the integration of genomic annotation. Together these reveal patterns of DNA methylation change and suggest their likely functional consequences, giving insight into the epigenetic mechanisms that regulate gene expression in biological processes and disease.
Functional analysis of differentially methylated genes (DMGs) is a crucial step in understanding the biological processes and pathways influenced by DNA methylation changes. DMGs are genes that have at least one differentially methylated region (DMR) annotated to their promoter or gene body. DMRs in the promoter region have the potential to influence gene transcription, while DMRs in the gene body region often show a positive correlation with gene expression levels. Analyzing these DMGs can provide valuable insights into the regulatory functions of DNA methylation.
(1) Categorization of DMGs
(2) Promoter-DMG and Genebody-DMG
(3) Functional Enrichment Analysis
Functional enrichment analysis places DNA methylation changes in their biological context. Gene Ontology (GO) analysis identifies the biological processes, molecular functions, and cellular components enriched among the differentially methylated genes (DMGs), and KEGG pathway analysis identifies the biological pathways in which DMGs are significantly enriched.
Reactome pathway analysis identifies the specific molecular pathways and signaling cascades in which DMGs are implicated. DisGeNET and Disease Ontology analyses link DMGs to particular diseases and disease-related terms, providing clues to the potential role of DNA methylation in disease pathogenesis.
Together, these functional analyses help researchers understand how DNA methylation changes relate to biological processes and formulate hypotheses about the regulatory roles of DNA methylation in physiological and pathological contexts. Integrating epigenetic and functional genomics data in this way contributes to a fuller understanding of gene regulation.
The differential methylation detection and analysis workflow described above involves several key steps: DMR detection, DMR annotation, functional analysis of differentially methylated genes (DMGs), and visualization of DMRs. This approach allows researchers to identify regions with significant DNA methylation changes, understand their potential functional implications, and visualize the results for easier interpretation.
(1) Differential Methylation Region (DMR) Detection
DMRs are detected using the metilene software, which combines a binary segmentation algorithm with two statistical tests: the Mann-Whitney U (MWU) test and the two-dimensional Kolmogorov-Smirnov (2D KS) test.
The following criteria are used to define DMRs:
CpG sites that meet these criteria are used to define the differentially methylated regions.
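To give a feel for one of the two tests named above, the sketch below computes a Mann-Whitney U statistic and a two-sided p-value for the methylation levels of two groups within a candidate region. It is not metilene's implementation: it uses a normal approximation without continuity or tie correction, which is a simplifying assumption, and the function name and toy values are illustrative.

```python
from math import erfc, sqrt


def mann_whitney_u(x, y):
    """Return (U for sample x, two-sided p-value by normal approximation)."""
    combined = sorted([(v, 0) for v in x] + [(v, 1) for v in y])
    ranks = [0.0] * len(combined)
    i = 0
    while i < len(combined):
        j = i
        while j < len(combined) and combined[j][0] == combined[i][0]:
            j += 1
        # Tied values share the average of ranks i+1 .. j.
        average_rank = (i + 1 + j) / 2
        for k in range(i, j):
            ranks[k] = average_rank
        i = j
    n1, n2 = len(x), len(y)
    rank_sum_x = sum(r for r, (_, group) in zip(ranks, combined) if group == 0)
    u_x = rank_sum_x - n1 * (n1 + 1) / 2
    mean_u = n1 * n2 / 2
    sd_u = sqrt(n1 * n2 * (n1 + n2 + 1) / 12)  # variance without tie correction
    z = (u_x - mean_u) / sd_u
    p_value = erfc(abs(z) / sqrt(2))
    return u_x, p_value


# Toy methylation levels for CpGs in one candidate region (made up).
group_a = [0.80, 0.90, 0.85, 0.95]
group_b = [0.10, 0.20, 0.15, 0.25]
u_stat, p_val = mann_whitney_u(group_a, group_b)
print(u_stat, p_val)
```

With samples this small an exact test would be more appropriate than the normal approximation; the approximation is used here only to keep the example short.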
(2) Differential Methylation Region (DMR) Annotation
The identified DMRs are annotated to genebody and promoter regions, respectively. This step helps link the DMRs to specific genes and understand their potential regulatory implications on gene expression.
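The annotation step can be sketched as an interval lookup against a gene table. The promoter definition used here (2 kb upstream of the transcription start site, strand-aware) is an assumption for illustration, as are the function names and the toy gene models; the text does not state which definition the workflow uses.

```python
PROMOTER_UPSTREAM = 2000  # assumed promoter length in bp, not from the text


def promoter_interval(gene):
    """Half-open promoter interval upstream of the TSS, respecting strand."""
    if gene["strand"] == "+":
        return max(0, gene["start"] - PROMOTER_UPSTREAM), gene["start"]
    return gene["end"], gene["end"] + PROMOTER_UPSTREAM


def annotate_dmr(dmr, genes):
    """Return (gene_name, element) pairs for every element the DMR overlaps."""
    chrom, start, end = dmr
    labels = []
    for gene in genes:
        if gene["chrom"] != chrom:
            continue
        p_start, p_end = promoter_interval(gene)
        if start < p_end and p_start < end:
            labels.append((gene["name"], "promoter"))
        if start < gene["end"] and gene["start"] < end:
            labels.append((gene["name"], "genebody"))
    return labels


# Hypothetical gene models with made-up coordinates.
toy_genes = [
    {"name": "geneA", "chrom": "chr1", "start": 5000, "end": 9000, "strand": "+"},
    {"name": "geneB", "chrom": "chr1", "start": 20000, "end": 25000, "strand": "-"},
]
for toy_dmr in [("chr1", 4000, 4500), ("chr1", 4900, 5100), ("chr1", 25500, 25600)]:
    print(toy_dmr, annotate_dmr(toy_dmr, toy_genes))
```

Genes with at least one promoter or genebody label would then form the promoter-DMG and genebody-DMG sets used in the functional analysis.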
(3) Functional Analysis of Differentially Methylated Genes (DMGs)
GO (Gene Ontology) and KEGG (Kyoto Encyclopedia of Genes and Genomes) enrichment analyses are performed to study the functions of the DMGs. Enrichment of DMGs in GO terms and KEGG pathways is assessed with a hypergeometric test. This analysis provides insights into the biological processes and pathways influenced by differential DNA methylation.
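The hypergeometric test asks how likely it is to see at least k DMGs annotated to a term when n DMGs are drawn from a background of N genes, K of which carry that term. A minimal sketch follows; the function name and the example numbers are assumptions for illustration.

```python
from math import comb


def hypergeom_enrichment_p(background, in_term, dmg_total, dmg_in_term):
    """One-sided p-value P(X >= dmg_in_term) under the hypergeometric model.

    background:  N, number of annotated background genes
    in_term:     K, background genes annotated to the GO term or pathway
    dmg_total:   n, number of DMGs
    dmg_in_term: k, DMGs annotated to the term
    """
    upper = min(in_term, dmg_total)
    favourable = sum(
        comb(in_term, i) * comb(background - in_term, dmg_total - i)
        for i in range(dmg_in_term, upper + 1)
    )
    return favourable / comb(background, dmg_total)


# Made-up numbers: 40 of 200 DMGs fall in a term covering 500 of 20000 genes.
p_enrich = hypergeom_enrichment_p(20000, 500, 200, 40)
print(p_enrich)
```

Because many terms are tested at once, the resulting p-values are normally corrected for multiple testing before terms are reported as enriched.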
(4) Visualization of Differentially Methylated Regions (DMRs)
Because the number of DMRs is large, only the top 20 DMRs ranked by Q-value (adjusted p-value) are selected for visualization.
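Selecting the DMRs to plot is a simple ranking step. The sketch below assumes that "top" means the smallest Q-values and that each DMR is a record with a `qvalue` field; both the field name and the toy records are assumptions made for the example.

```python
from heapq import nsmallest


def top_dmrs_by_qvalue(dmrs, n=20):
    """Return the n DMRs with the smallest Q-values, most significant first."""
    return nsmallest(n, dmrs, key=lambda dmr: dmr["qvalue"])


# Thirty toy DMRs with distinct, made-up Q-values between 0.001 and 0.030.
toy_dmr_table = [
    {"chrom": "chr1", "start": 1000 * i, "end": 1000 * i + 300,
     "qvalue": (i * 7 % 30 + 1) / 1000}
    for i in range(30)
]
top20 = top_dmrs_by_qvalue(toy_dmr_table)
print(len(top20), top20[0]["qvalue"], top20[-1]["qvalue"])
```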
By following this workflow, researchers can identify and characterize differentially methylated regions, gain insights into the potential functional consequences of DNA methylation changes, and effectively visualize the results to aid in data interpretation and hypothesis generation.