Advances in HLA Typing: Methods, Applications, and Clinical Implications

Human leukocyte antigen (HLA) typing identifies which HLA alleles an individual carries. The HLA gene complex, located on human chromosome 6, encodes glycoprotein molecules on the cell surface and plays a fundamental role in the immune system. HLA typing is important in organ transplantation, disease association research, anthropological studies, and immunotherapy.

The HLA gene complex encompasses multiple gene loci, which are classified into three categories: class I, class II, and class III genes. Class I genes, including HLA-A, HLA-B, and HLA-C, are primarily expressed on the surface of nucleated cells. The molecules they encode present endogenous antigens to CD8⁺ T cells, thereby participating in the cellular immune response and playing a vital role in identifying and eliminating virus-infected cells and tumor cells. Class II genes, such as HLA-DR, HLA-DQ, and HLA-DP, are mainly expressed on the surface of antigen-presenting cells (e.g., macrophages, dendritic cells, and B cells). They are responsible for presenting exogenous antigens to CD4⁺ T cells, initiating the immune response, and are of key importance in regulating the intensity and type of immune response. Class III genes encode other molecules related to immunity, such as complement components, which are involved in inflammatory reactions and immune regulation.

HLA Typing Methods

HLA typing methods fall into three broad groups. Serological methods detect lymphocyte surface antigens using anti-HLA antibodies of known specificity. Cellular methods type cells by observing their reactions in mixed lymphocyte culture. Molecular biological methods, such as PCR-SSP and PCR-SSOP, amplify and analyze the HLA genes directly and offer high accuracy.

Serological method: The traditional serological method relies on the binding of anti-HLA antibodies of known specificity to HLA antigens on the lymphocyte surface. Binding is detected through complement-dependent cytotoxicity tests (CDC) or microlymphocytotoxicity tests (MLCT). In the CDC test, the lymphocytes under test are mixed with a series of known anti-HLA sera. If the antibodies in a serum bind to the corresponding HLA antigens on the lymphocyte surface, the lymphocytes are damaged or killed in the presence of complement. The HLA type can then be determined by observing which sera cause cell death. MLCT is a more sensitive serological detection method capable of detecting weak antigen-antibody reactions. However, serological methods have limitations. They can only detect HLA antigens that have already been identified, so rare or new alleles may not be recognized accurately. In addition, the specificity and affinity of the antibodies vary, which affects the accuracy of the results.

PCR-based technology: Polymerase chain reaction (PCR)-related technologies are widely utilized in HLA typing. For example, sequence-specific primer PCR (PCR-SSP) employs primers designed for specific HLA allele sequences to amplify the sample under test. If an allele sequence complementary to the primer exists in the sample, a specific amplification product will be generated. The HLA type can be determined by detecting the presence or absence of the amplification product through gel electrophoresis. This method offers high specificity and relatively straightforward operation. Nevertheless, it requires the design of a large number of primers and can only detect known alleles. Sequence-specific oligonucleotide probe hybridization (PCR-SSOP) involves amplifying HLA gene fragments first and then hybridizing them with a series of labeled sequence-specific oligonucleotide probes. The HLA type is judged based on the hybridization signal. Its advantage lies in the ability to detect multiple alleles simultaneously, but the operation is complex, and the optimization of hybridization conditions significantly impacts the accuracy of the results.
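
The presence/absence logic of PCR-SSP can be illustrated with a small sketch. The primer panel below is entirely hypothetical (real panels contain many more primer mixes and allele groups), and the function names are illustrative only. The sketch lists every two-allele genotype whose expected band pattern equals the pattern observed on the gel.

```python
from itertools import combinations_with_replacement

# Hypothetical primer panel: each primer mix amplifies the listed allele groups.
PRIMER_PANEL = {
    "mix1": {"A*01"},
    "mix2": {"A*02"},
    "mix3": {"A*03"},
    "mix4": {"A*01", "A*03"},
}


def expected_bands(genotype, panel):
    """Return the primer mixes expected to yield a product for a genotype."""
    carried = set(genotype)
    return {mix for mix, targets in panel.items() if targets & carried}


def compatible_genotypes(observed_bands, panel):
    """List all allele pairs whose expected bands equal the observed bands."""
    alleles = sorted(set().union(*panel.values()))
    observed = set(observed_bands)
    return [
        pair
        for pair in combinations_with_replacement(alleles, 2)
        if expected_bands(pair, panel) == observed
    ]
```

Calling `compatible_genotypes({"mix1", "mix2", "mix4"}, PRIMER_PANEL)` returns the single pair carrying A*01 and A*02. When several genotypes produce the same pattern, the function returns all of them, which mirrors the ambiguity that a limited primer set can leave unresolved.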

Amplicon sequencing scheme of PCR for HLA typing (Erlich, 2012)
Preparation of HLA typing by clonal amplification by PCR (Erlich, 2012)

Method based on DNA sequencing: Direct sequencing of the HLA gene is an extremely accurate approach that can determine the nucleotide sequence of the HLA gene, enabling precise identification of alleles. For instance, the Sanger sequencing method can detect both known and unknown allele variations by sequencing the amplified products of HLA genes. Next-generation sequencing (NGS) technology has made it possible to perform high-throughput sequencing of multiple HLA loci in a single reaction, substantially enhancing the detection efficiency and resolution. It can comprehensively analyze the polymorphism of HLA genes. However, it comes with relatively high costs and more complex data analysis.

An Accurate Method for Clinical HLA Typing

A study published in HLA Immune Response Genetics introduced a personalized HLA typing method. This method can identify new HLA alleles and tumor-specific HLA variants from whole exome sequencing (WES) data, presenting a more accurate and comprehensive solution for the application of HLA typing in clinical and research settings.

Research background

HLA full-length typing is of significant clinical and research value. However, due to the complexity and polymorphism of the HLA region, achieving accurate typing remains challenging. NGS data are commonly used for HLA typing, and the methods can be divided into HLA targeted sequencing and standard NGS sequencing (such as WES, whole genome sequencing (WGS), and RNA-Seq), each with its own pros and cons. Most NGS-based HLA typing tools rely on database matching, which has limitations, particularly in detecting new alleles. In cancer, a comprehensive characterization of the HLA status is crucial for immunotherapy, but existing methods are insufficient.

Experimental method

  • Closest-match typing against the HLA database: The WES reads were compared with the known HLA alleles in the IPD-IMGT/HLA database using the "OncoHLA" method. The closest matching HLA alleles and sequences were determined by an integer linear programming algorithm, at a resolution of four fields.
  • Integrating germline variant detection to achieve personalized HLA typing: The WES reads were aligned with GSNAP to the closest matching HLA sequences generated in the previous step. GATK-HaplotypeCaller and Strelka2 were then used to detect germline variants, and the variants that passed filtering were carried forward into personalized HLA typing.
  • Identification of tumor-specific HLA variants by integrated somatic mutation detection: The somatic mutation detection pipeline was expanded, and six tools were used to detect somatic mutations in the HLA region. High-quality candidate variants were selected, and their impact on gene products was evaluated using VEP.
  • Reconstruction of personalized HLA alleles and tumor-specific HLA alleles: The phase relationship of the HLA variants was determined using WhatsHap, and the impact of the variants was evaluated with Haplosaurus. Haplosaurus was extended to reconstruct candidate variant HLA sequences, which were then used for another round of HLA typing. Tumor-specific HLA alleles were reconstructed in a similar way, except that all possible alleles were retained.
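
The reconstruction step can be pictured as applying a set of phased variants to the closest matching reference allele. The following is a minimal sketch of that idea only. It is not the logic of Haplosaurus or of the published pipeline, and the variant representation (0-based position, reference bases, alternate bases) and the function name are assumptions made for illustration.

```python
def apply_variants(reference, variants):
    """Apply non-overlapping variants from one haplotype to a reference sequence.

    Each variant is a tuple (pos, ref, alt) with a 0-based position.
    Variants are applied from right to left so that insertions and
    deletions do not shift the coordinates of variants still to be applied.
    """
    seq = reference
    for pos, ref, alt in sorted(variants, reverse=True):
        if seq[pos:pos + len(ref)] != ref:
            raise ValueError(f"reference mismatch at position {pos}")
        seq = seq[:pos] + alt + seq[pos + len(ref):]
    return seq
```

For example, `apply_variants("ACGTACGT", [(1, "C", "T"), (4, "AC", "A")])` applies one substitution and one single-base deletion and yields `"ATGTAGT"`. Overlapping variants are not handled; a real pipeline would also have to decide which variants belong to which haplotype, which is the role phasing plays in the workflow described above.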

Experimental results

  • Discovery of simulated new HLA alleles: Simulation experiments introduced different combinations of germline variants into three HLA alleles, for a total of 1800 trials. The overall success rate was 83%, and it was highest when a single mutation was added. The success rate fell when co-occurring mutations were simulated, especially co-occurring insertions and deletions. When the number of HLA alleles was expanded, the overall success rate rose to 97%, indicating that the method has the potential to accurately identify germline variants affecting HLA alleles from WES data and to discover new HLA sequences.
  • Verification of WES-based HLA typing at the protein-coding sequence level: WES data from the peripheral blood mononuclear cells (PBMC) of 10 donors were HLA typed using NeoOncoHLA and compared with the results of OncoHLA and targeted HLA sequencing. NeoOncoHLA was 100% consistent with the OncoHLA and targeted HLA sequencing results at the first and second field resolutions. At the third and fourth fields, the performance of OncoHLA declined, while NeoOncoHLA performed significantly better because it integrates germline variant detection.
  • Integrated germline variant detection enhances personalized HLA typing and enables new allele discovery: The performance of personalized HLA typing was notably better at the third and fourth field resolutions. NeoOncoHLA corrected the HLA typing in five patient samples that carried at least one HLA allele with a germline variant, and a new class I HLA allele, HLA-B*44:02:01:52, was also identified.
  • Evaluation of integrated somatic mutation detection for HLA alleles: NeoOncoHLA and POLYSOLVER were compared for their ability to detect somatic HLA mutations. In 740 simulation experiments, NeoOncoHLA's detection success rate was 91%, higher than POLYSOLVER's 75%, and it performed better across different alleles, mutation types, and simulated variant allele frequencies (VAF).
  • Detection of somatic variants in class I HLA alleles using personalized HLA alleles as a reference: The WES data of 14 metastatic melanoma samples were analyzed by NeoOncoHLA, revealing 15 somatic mutations distributed across HLA-A, HLA-B, and HLA-C alleles, with varying functional effects, some in coding regions and some in non-coding regions. The average VAF was low, indicating tumor genome heterogeneity. One high-VAF mutation was confirmed, and its expression level was high. Orthogonal RNA-Seq tests confirmed some of the exonic variants; the alleles carrying unconfirmed variants were expressed at low levels or not at all. POLYSOLVER detected more variants but reported some of them in duplicate. NeoOncoHLA improved the accuracy of variant detection through several refinements.
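
Comparisons "at the second field" or "at the fourth field" amount to truncating allele names before comparing them. The sketch below illustrates this with hypothetical calls. The function names are illustrative, and the position-by-position pairing of calls and reference results is a simplification, since a real comparison has to match the two alleles of each locus in whichever order agrees best.

```python
def truncate_fields(allele, n_fields):
    """Keep only the first n_fields colon-separated fields of an allele name."""
    gene, _, fields = allele.partition("*")
    return gene + "*" + ":".join(fields.split(":")[:n_fields])


def concordance(calls, truth, n_fields):
    """Fraction of calls that equal the reference result at a given field resolution."""
    if len(calls) != len(truth):
        raise ValueError("calls and truth must have the same length")
    matches = sum(
        truncate_fields(c, n_fields) == truncate_fields(t, n_fields)
        for c, t in zip(calls, truth)
    )
    return matches / len(truth)
```

With the hypothetical calls `["HLA-A*02:01:01:01", "HLA-B*44:02:01:01"]` and reference results `["HLA-A*02:01:01:01", "HLA-B*44:02:01:52"]`, the concordance is 1.0 at two and three fields and 0.5 at four fields, which shows how two methods can agree fully at low resolution and diverge only at the fourth field.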

HLA variant simulation in different patients (Anzar et al., 2022)
Somatic HLA variant simulation results (Anzar et al., 2022)

Discussion

The HLA typing method described in this paper is based on NGS data combined with comparison against databases of known HLA alleles. It can identify new alleles and tumor-specific variants and improve typing accuracy. Compared with other methods, it has advantages in the discovery of new alleles and the detection of tumor-specific variants. However, it depends on sequence similarity to known alleles and has mainly been tested on classical class I alleles. New methods may be required to detect highly divergent new alleles, and the germline variant detection steps can be further optimized. The method helps enrich the HLA database and improve the accuracy of HLA typing in clinical and research applications, and it is of great significance for understanding the mechanisms of tumor immune escape and for developing cancer immunotherapy strategies.

A New Economical Method for HLA Typing

A paper published in Bioinformatics introduced a novel HLA genotyping algorithm called OptiType. This algorithm can accurately predict HLA genotypes from NGS data without specific enrichment of the HLA region and performs well across multiple datasets, providing a fast, accurate, and economical method for HLA genotyping. It is expected to play an important role in clinical applications.

Research background

The HLA gene cluster is of utmost importance in adaptive immunity and is closely related to vaccinology, regenerative medicine, transplantation medicine, autoimmune diseases, and other fields. HLA alleles exhibit high polymorphism and sequence similarity. Traditional HLA typing methods have limitations, such as being labor-intensive, time-consuming, and yielding unclear results. Although there are HLA typing methods based on NGS, issues such as cumbersome preparation, long processing times, or insufficient accuracy still exist. In particular, the computational methods for determining HLA genotypes from conventional sequencing data are not accurate enough.

Experimental method

  • To evaluate clinical samples, exome sequencing datasets from 10 patients with acute lymphoblastic leukemia were used. Some samples were sequenced after enriching HLA loci with different kits.
  • Simulation of coverage depth: Random subsets of different sizes were selected from the benchmark samples of the 1000 Genomes project to simulate various coverage depth conditions and study their influence on prediction accuracy.
  • Performance evaluation index: The percentage of correctly predicted HLA alleles (at two-digit and four-digit levels) in each sample was used as the basis for evaluation, and the correctness of zygosity prediction was taken as an independent performance index.
  • Implementation and usability: The NGS analysis pipeline was implemented in Python 2.7, supported by the Pandas module and HDF5. Read mapping was performed using RazerS3 and Bowtie 2, integer linear programming (ILP) was carried out with Pyomo and solved by ILOG CPLEX 12.5 or an open-source ILP solver (such as GLPK). Statistical analysis was conducted in R. OptiType follows the BSD open-source license, and its source code can be downloaded from GitHub.
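
The optimization behind this kind of typing can be pictured as choosing the pair of alleles that explains the largest number of mapped reads. The sketch below is a brute-force stand-in for an integer linear program, written for a single locus with hypothetical read-to-allele hits. It is not OptiType's actual formulation and does not use Pyomo, CPLEX, or GLPK.

```python
from itertools import combinations_with_replacement


def best_allele_pair(read_hits):
    """Pick the allele pair that explains the most reads at one locus.

    read_hits maps a read identifier to the set of alleles it aligns to.
    A read counts as explained if at least one allele of the pair is in
    its hit set. Homozygous pairs are listed before heterozygous pairs
    that share their first allele, so ties favour the homozygous call.
    """
    alleles = sorted(set().union(*read_hits.values()))

    def explained(pair):
        return sum(1 for hits in read_hits.values() if hits & set(pair))

    return max(combinations_with_replacement(alleles, 2), key=explained)
```

Given the hypothetical hits `{"r1": {"A*01:01"}, "r2": {"A*01:01", "A*02:01"}, "r3": {"A*02:01"}, "r4": {"A*02:01", "A*03:01"}}`, the function returns `("A*01:01", "A*02:01")`, the only pair that explains all four reads. Enumerating every pair is feasible only for small allele sets, which is one reason an ILP solver is the more practical choice at the scale of a full HLA database.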

Comparison of HLA typing algorithms in different samples (Szolek et al., 2014)
Performance comparison of HLA typing algorithms (Szolek et al., 2014)

Experimental results

  • Overall performance: On 361 benchmark samples, OptiType achieved an accuracy of 97.1% (95% confidence interval: 96.1%-97.8%) at the four-digit level and 99.3% (95% confidence interval: 98.7%-99.7%) at the two-digit level, outperforming other similar methods on multiple datasets.
  • Impact of intron reconstruction: A modified version of OptiType that only used exon 2 and 3 sequences as a reference was tested on 1000 Genomes datasets. It was found that the error rate increased compared to the default OptiType pipeline that utilized intron sequences, indicating the significant impact of intron sequences on prediction performance.
  • The influence of HLA enrichment and coverage depth: In a specific HLA-enriched sample, as low as approximately 0.3% of the total reads (about 12-fold coverage) could achieve completely correct genotype prediction. In the simulation experiment of 1000 Genomes exome sequencing samples, an accuracy rate of 95% could be reached when the average coverage depth exceeded 10-fold, and the length of the read had little effect on the prediction accuracy.
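
Fold coverage and read subsampling can be related with the usual approximation that mean depth equals the number of reads times the read length divided by the length of the target region. The sketch below uses that approximation. The function names and every number in the usage note are hypothetical and are not taken from the study.

```python
import math
import random


def mean_coverage(n_reads, read_length, region_length):
    """Approximate mean depth over a region covered by n_reads reads."""
    return n_reads * read_length / region_length


def reads_needed(target_depth, read_length, region_length):
    """Smallest number of reads that reaches the target mean depth."""
    return math.ceil(target_depth * region_length / read_length)


def subsample(reads, n_keep, seed=0):
    """Draw a reproducible random subset of reads to simulate lower depth."""
    return random.Random(seed).sample(reads, n_keep)
```

For instance, 1200 reads of 100 bp over a hypothetical 10,000 bp target give `mean_coverage(1200, 100, 10000) == 12.0`, and `reads_needed(12, 100, 10000)` returns 1200. The approximation assumes that every read falls entirely within the target region and ignores uneven coverage, so it is only a rough guide.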

Discussion

OptiType is a fast and accurate HLA typing method based on NGS data. It can perform automatic typing with four-digit resolution and performs well on different types of sequencing data, making it suitable for NGS data from various sources. It is superior to previous computational methods. Coverage depth above a certain level has minimal impact on its performance. Short read lengths increase mapping ambiguity but do not affect the method's performance. Incorrect predictions are mainly caused by issues such as uncovered sequences, failed zygosity detection, and ambiguous typing of small loci. Ensuring that each allele has an equal opportunity to be identified is crucial. Using phylogenetic methods to reconstruct intron sequences can help improve performance. With the increase in the number of fully sequenced HLA alleles, the reference sequence can be expanded, which is expected to enhance the prediction accuracy. OptiType provides an alternative to common HLA genotyping methods, although it can only predict known alleles.

New Insights into Disease Treatment

A study on HLA typing focused on the differences in HLA typing among hospitalized children with sickle cell disease (SCD). The details are as follows:

Research background

Treatment: Hematopoietic stem cell transplantation (HSCT) is an effective treatment for SCD, and HLA typing is the critical first step in HSCT.

Current situation: The proportion of SCD patients who have completed HLA typing is unclear. Socio-economic factors may influence the HLA typing of SCD families and their access to HSCT, resulting in a medical disparity.

Research method

  • Study design: A prospective study was conducted on SCD patients hospitalized at Children's National Hospital in 2020. Patient information was collected through the REDCap database to determine each patient's HLA typing status, and the relatives of patients who had not been typed were surveyed and advised to pursue typing.
  • HLA typing: HLA typing was performed by comparing with the IPD-IMGT/HLA database.
  • Socio-economic status: The 2018 Area Deprivation Index (ADI) score from the University of Wisconsin School of Medicine and Public Health was used as a measure of socio-economic status.
  • Statistical analysis: Chi-square tests and t-tests were used to compare the characteristics of patients with and without HLA typing, and a logistic regression model was employed to evaluate the multivariate association between baseline HLA typing and patient characteristics.
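
A chi-square comparison of two groups on a yes/no characteristic can be sketched for a 2x2 table as follows. The counts in the usage note are hypothetical and are not the study's data, the function name is illustrative, and the sketch applies no continuity correction.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square test for the 2x2 table [[a, b], [c, d]].

    Returns the test statistic and the p-value for one degree of
    freedom, for which the chi-square survival function reduces to
    erfc(sqrt(x / 2)).
    """
    n = a + b + c + d
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    if denominator == 0:
        raise ValueError("every row and column must have a non-zero total")
    stat = n * (a * d - b * c) ** 2 / denominator
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value
```

With hypothetical counts of 10 typed and 20 untyped patients in one group and 30 typed and 40 untyped in another, `chi_square_2x2(10, 20, 30, 40)` gives a statistic of about 0.79 and a p-value well above 0.05, so this made-up table would show no significant difference. A multivariate analysis such as the logistic regression mentioned above would normally be run in a statistics package rather than written by hand.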

Research results

  • Characteristics of patients: Patients who had completed HLA typing were more likely to have a severe SCD genotype, a previous hospitalization history in the intensive care unit (ICU), be receiving hydroxyurea or chronic blood transfusion treatment, have more hospitalizations and emergency visits in the past year, and have a low ADI score.
  • Differences among physicians: Baseline HLA typing rates varied among the patients of different hematologists, ranging from 8% to 50%.
  • Regional differences: The likelihood of HLA typing in patients living in poor areas (with high ADI scores) was significantly reduced.

Different ADI of HLA typing in hospitalized patients with HSCT for SCD (Takor et al., 2024)
Comparison of ADI in hospitalized patients and patients who underwent HSCT for SCD (Takor et al., 2024)

Discussion

Typing status: Most hospitalized SCD patients (>75%) had not undergone HLA typing. Patients with more severe disease were more likely to be typed, and patients living in poorer areas were less likely to be typed, which points to a gap in care related to socio-economic status.

Acceptance: Most (69%) families who had not had HLA typing at baseline expressed interest in testing, suggesting that doctors may not have fully informed patients about HLA typing and HSCT options.

Limitations of the study: The study only involved one institution, and the results may not be generalizable. In the future, it is necessary to ensure that all SCD patients can receive equal treatment.

With the continuous advancement of technology, HLA typing is expected to become more accurate, efficient, and widely used in the future. New molecular biology techniques and data analysis algorithms will continue to emerge, further enhancing the resolution and accuracy of typing, reducing costs, and shortening detection times. Additionally, the sharing and collaborative research of HLA typing data worldwide will continue to strengthen, contributing to a deeper understanding of the diversity and function of human HLA genes and promoting the development of related fields.

References

  1. H Erlich. "HLA DNA typing: past, present, and future." Tissue Antigens (2012) 1-11. doi: 10.1111/j.1399-0039.2012.01881.x
  2. Irantzu Anzar, Angelina Sverchkova, Pubudu Samarakoon and Trevor Clancy. "Personalized HLA typing leads to the discovery of novel HLA alleles and tumor-specific HLA variants." HLA Immune Response Genetics (2022) 313-327. DOI: 10.1111/tan.14562
  3. Andras Szolek, Benjamin Schubert and Oliver Kohlbacher. "OptiType: precision HLA typing from next-generation sequencing data." Bioinformatics (2014) https://www.researchgate.net/publication/264989140
  4. Arrey-Takor Ayuk-Arrey, Olufunke Y Martin and Robert Sheppard Nickel. "Disparity in HLA Typing Rates Among Hospitalized Pediatric Patients with Sickle Cell Disease." Transplantation and Cellular Therapy (2023) 107-113. https://doi.org/10.1016/j.jtct.2023.10.020
For research purposes only, not intended for clinical diagnosis, treatment, or individual health assessments.



Copyright © CD Genomics. All Rights Reserved.