PacBio sequencing uses four fluorescent labels, one for each of the four types of dNTP, and the polymerase is anchored at the bottom of the sequencing chip. Sequencing begins when a DNA strand binds to the enzyme: as each fluorescent dNTP is incorporated, it briefly forms a complex with the enzyme and the DNA template. When the polymerase encounters a methylated A, C, or other modified base on the template, polymerization slows significantly and the corresponding signal features change, which makes it possible to directly measure methylation of A, methylation of C, hydroxymethylation of C, and so on.
There are two cases when obtaining 5mC methylation information from HiFi data: HiFi data that has already been sequenced, and HiFi data that is yet to be sequenced.
For HiFi sequencing, the BAM file produced by primrose analysis carries the methylation information for the whole genome sequence. Starting from this file, we first align the reads to the reference genome (using minimap software) and then run pb-CpG-tools on the resulting aligned BAM file to obtain the methylation probability at a given position on the genome.
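To make the per-read methylation information more concrete, here is a minimal sketch of how 5mC calls could be decoded from the MM and ML base-modification tags defined in the SAM optional-fields specification. It is not the pb-CpG-tools implementation. The function name `decode_5mc_calls` is hypothetical, and the sketch assumes that the read sequence is given in its original (unaligned) orientation and that each MM entry carries a single modification code.

```python
def decode_5mc_calls(seq, mm_tag, ml_values):
    """Return [(read_position, probability)] for forward-strand 5mC calls.

    seq       -- read sequence in its original orientation (assumption)
    mm_tag    -- MM tag value, e.g. "C+m,0,1;"
    ml_values -- ML tag values, integers in 0..255
    """
    calls = []
    ml_index = 0
    for entry in mm_tag.strip(";").split(";"):
        if not entry:
            continue
        fields = entry.split(",")
        header = fields[0]
        skips = [int(x) for x in fields[1:]]
        # Header layout: base, strand, modification code, optional '?' or '.'
        base, strand, code = header[0], header[1], header[2:].rstrip("?.")
        if base == "C" and strand == "+" and code == "m":
            # Each skip value counts unmodified C bases to pass over
            # before the next reported C.
            c_positions = [i for i, b in enumerate(seq) if b == "C"]
            idx = -1
            for skip in skips:
                idx += skip + 1
                # An ML value N covers the probability interval
                # [N/256, (N+1)/256); use the midpoint.
                prob = (ml_values[ml_index] + 0.5) / 256
                calls.append((c_positions[idx], prob))
                ml_index += 1
        else:
            # Skip the ML values that belong to other modification types.
            ml_index += len(skips)
    return calls


example = decode_5mc_calls("ACGTCCGA", "C+m,0,1;", [255, 0])
for position, probability in example:
    print(position, round(probability, 3))
```

In this toy read, the two reported cytosines are the ones followed by G, so each call corresponds to a CpG site. Aggregating such calls over all reads covering a reference position gives the kind of per-site probability that the pipeline reports.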
According to the official test results, when HiFi coverage is above 10×, methylation results from HiFi data agree well with WGBS results, and the correlation coefficient gradually levels off as sequencing depth increases. A sequencing depth of 10-15× is therefore sufficient for HiFi detection of 5mC.
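The agreement described here is typically expressed as a correlation over CpG sites covered by both methods. The following is a minimal sketch of that calculation, assuming per-site results are available as dictionaries mapping a site to a `(methylation_fraction, depth)` pair. The function names, the data layout, and the example values are all hypothetical and are not taken from the official tests.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def concordance(hifi, wgbs, min_depth=10):
    """Correlate methylation fractions at sites covered by both methods.

    hifi, wgbs -- dict: site -> (methylation_fraction, depth)
    Returns (correlation, number_of_sites_used).
    """
    shared = sorted(
        site for site in hifi
        if site in wgbs
        and hifi[site][1] >= min_depth
        and wgbs[site][1] >= min_depth
    )
    xs = [hifi[site][0] for site in shared]
    ys = [wgbs[site][0] for site in shared]
    return pearson(xs, ys), len(shared)


# Made-up values for illustration only.
hifi_sites = {
    ("chr1", 100): (0.9, 12),
    ("chr1", 200): (0.1, 15),
    ("chr1", 300): (0.5, 11),
    ("chr1", 400): (0.8, 4),   # below min_depth, excluded
}
wgbs_sites = {
    ("chr1", 100): (0.8, 30),
    ("chr1", 200): (0.2, 25),
    ("chr1", 300): (0.5, 28),
    ("chr1", 400): (0.1, 30),
}
r, n_sites = concordance(hifi_sites, wgbs_sites)
print(n_sites, round(r, 3))
```

Repeating the calculation at different `min_depth` values is one way to see where the correlation stops improving with depth.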
Using human samples as test data, we compared how consistently WGBS, ONT, HiFi, and other methods detect 5mC. The correlation coefficient between the HiFi results and the results of the other methods was over 90%, which supports the sensitivity and accuracy of HiFi for detecting 5mC.
Existing methylation detection technologies either have read lengths that are too short or are not accurate enough in sequencing, which makes it difficult to distinguish haplotype methylation information. HiFi 5mC, by contrast, retains the advantages of those technologies and can also distinguish haplotype methylation information, making it possible to analyze methylation differences at the haplotype level.
As described above, HiFi detection of 5mC has a low depth requirement and wide coverage, and it supports methylation phasing. A single HiFi sequencing run can therefore serve genome assembly, variant detection, and methylation analysis together, providing comprehensive support for in-depth characterization of a species. In addition, based on the results of methylation phasing, we can explore differential methylation between alleles, trace parental imprinting, and search for imprinted genes, which helps researchers analyze the genetic mechanisms of species development, disease causes, and important phenotypes at a higher level.
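As an illustration of what differential methylation between alleles could look like in practice, the sketch below averages per-read methylation probabilities by haplotype at each site and reports sites where the two haplotypes differ strongly. It assumes that reads have already been phased and that each call is available as a `(site, haplotype, probability)` tuple. The function names, the input layout, and the `min_delta` threshold of 0.5 are hypothetical choices for this example, not part of any specific tool.

```python
from collections import defaultdict


def haplotype_methylation(calls):
    """Mean methylation probability per (site, haplotype).

    calls -- iterable of (site, haplotype, probability) tuples
    """
    sums = defaultdict(lambda: [0.0, 0])
    for site, hap, prob in calls:
        acc = sums[(site, hap)]
        acc[0] += prob
        acc[1] += 1
    return {key: total / count for key, (total, count) in sums.items()}


def allele_specific_sites(calls, min_delta=0.5):
    """Sites where haplotype 1 and haplotype 2 differ by at least min_delta.

    Returns dict: site -> (mean of haplotype 1) - (mean of haplotype 2).
    Sites seen on only one haplotype are ignored.
    """
    means = haplotype_methylation(calls)
    sites = {site for site, _ in means}
    result = {}
    for site in sorted(sites):
        if (site, 1) in means and (site, 2) in means:
            delta = means[(site, 1)] - means[(site, 2)]
            if abs(delta) >= min_delta:
                result[site] = delta
    return result


# Made-up calls for illustration only.
phased_calls = [
    (100, 1, 0.90), (100, 1, 0.95), (100, 2, 0.05), (100, 2, 0.10),
    (200, 1, 0.50), (200, 2, 0.55),
    (300, 1, 0.90),  # no haplotype 2 coverage, ignored
]
for site, delta in allele_specific_sites(phased_calls).items():
    print(site, round(delta, 3))
```

A real analysis would also need a minimum read count per haplotype and a statistical test rather than a fixed threshold. The sketch only shows how phased calls make the per-allele comparison possible.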
For research purposes only, not intended for clinical diagnosis, treatment, or individual health assessments.