CRISPR-Net: An sgRNA Off-Target Scoring Tool

Introduction to CRISPR-Net

The CRISPR/Cas9 gene-editing system has emerged as a revolutionary tool in the field of genome editing, demonstrating exceptional efficiency in targeted DNA cleavage across various organisms, including humans, mice, and plants. However, a notable limitation of this system is its propensity for unintended binding of the Cas9-sgRNA complex to non-target regions, leading to off-target cleavage events. These off-target effects have been documented in numerous studies, and they pose a significant risk by potentially inducing undesirable consequences, thereby limiting the broader application of CRISPR/Cas9 in clinical settings.

Given these challenges, designing sgRNAs that combine high on-target efficiency with minimal off-target activity has become an urgent research priority, and accurate quantification of off-target effects is critical to the safety and efficacy of the CRISPR/Cas9 system. Earlier tools such as Cas-OFFinder identify candidate off-target sites, but they cannot precisely estimate how likely off-target activity is at each of those sites.

Addressing this limitation, Lin et al. at City University of Hong Kong developed CRISPR-Net, a tool built on a recurrent convolutional neural network (RCNN). It quantifies off-target effects caused by mismatches, insertions, or deletions (indels) between the sgRNA and target DNA during CRISPR/Cas9 editing. This gives researchers a more comprehensive and refined evaluation of off-target risk when assessing the safety of CRISPR-mediated genome editing.

Figure 1 illustrates the architecture of CRISPR-Net. The input is a pair of sequences: the on-target sequence (XON) and the off-target sequence (XOFF). These sequences are encoded into binary matrices, which are fed into an Inception-based convolutional layer containing 10 filters of varying sizes. The output of the Inception layer is combined with the encoded matrices through feature fusion and passed into the recurrent layer. The output of that layer is then reduced in dimension and passed to two non-linear dense layers of 80 and 20 neurons, respectively. The output neuron uses the sigmoid activation function, while the remaining layers use the ReLU activation function.
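The dense head described above can be sketched in plain Python. This is a toy illustration rather than the authors' implementation: the weights are random, the input length of 16 is an arbitrary placeholder for the features arriving from the earlier layers, and the helper names (make_layer, dense, score) are hypothetical. Only the layer widths (80 and 20) and the activation choices come from the description.

```python
import math
import random

def relu(values):
    """ReLU activation: negative values are clipped to zero."""
    return [max(0.0, v) for v in values]

def sigmoid(value):
    """Sigmoid activation: maps any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-value))

def make_layer(n_in, n_out, rng):
    """Random weights and zero biases for a dense layer (illustration only)."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases

def dense(inputs, layer):
    """Affine transform: one weighted sum plus bias per output neuron."""
    weights, biases = layer
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]

def score(features, layers):
    """Two ReLU dense layers followed by a single sigmoid output neuron."""
    hidden_1, hidden_2, output = layers
    h1 = relu(dense(features, hidden_1))
    h2 = relu(dense(h1, hidden_2))
    return sigmoid(dense(h2, output)[0])

rng = random.Random(0)
N_FEATURES = 16  # placeholder size for the incoming features (assumption)
layers = (
    make_layer(N_FEATURES, 80, rng),  # first dense layer: 80 neurons
    make_layer(80, 20, rng),          # second dense layer: 20 neurons
    make_layer(20, 1, rng),           # single output neuron
)
features = [rng.random() for _ in range(N_FEATURES)]
off_target_score = score(features, layers)
print(f"off-target score: {off_target_score:.4f}")
```

Because the final activation is a sigmoid, the score always lies strictly between 0 and 1, which is what allows it to be read as an off-target likelihood.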

Structure and Performance of CRISPR-Net

Novel Encoding Scheme

In this study, the authors introduced an encoding scheme that transforms each sgRNA-target DNA pair into binary matrices, which serve as the input data for CRISPR-Net. As a classifier, CRISPR-Net applies a binary cross-entropy loss function to the sigmoid output neuron, generating an off-target score for each sgRNA-target pair. As a regressor, the model instead applies the mean squared error (MSE) loss function to the output neuron to compute off-target scores.
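To make the idea concrete, here is a minimal sketch of how an aligned sequence pair could be turned into a binary matrix, together with the two loss functions named above. The bit layout is an assumption chosen for illustration (five symbol bits for A, T, G, C and a gap marker "_", plus two bits recording the direction of a difference). This article does not spell out the exact layout, so it should not be read as the authors' definitive scheme. The function names are hypothetical.

```python
import math

ALPHABET = "ATGC_"  # four bases plus '_' for an indel gap (assumed symbol set)

def encode_pair(on_seq, off_seq):
    """Encode an aligned sgRNA-target pair as a binary matrix, one row per position.

    Each row holds 5 bits marking which symbols occur at that position in either
    sequence, plus 2 bits recording which sequence carries the earlier-alphabet
    symbol when the two differ. This layout is an illustrative assumption.
    """
    if len(on_seq) != len(off_seq):
        raise ValueError("sequences must be aligned to the same length")
    matrix = []
    for on_base, off_base in zip(on_seq.upper(), off_seq.upper()):
        i, j = ALPHABET.index(on_base), ALPHABET.index(off_base)
        row = [0] * (len(ALPHABET) + 2)
        row[i] = 1
        row[j] = 1
        if i < j:
            row[-2] = 1
        elif i > j:
            row[-1] = 1
        matrix.append(row)
    return matrix

def binary_cross_entropy(labels, probs, eps=1e-12):
    """Classification loss: mean negative log-likelihood of 0/1 labels."""
    total = 0.0
    for y, p in zip(labels, probs):
        p = min(max(p, eps), 1.0 - eps)  # keep log() away from 0
        total -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total / len(labels)

def mean_squared_error(targets, preds):
    """Regression loss: mean of squared differences."""
    return sum((t - p) ** 2 for t, p in zip(targets, preds)) / len(targets)

# Toy 6-nt pair with one mismatch (G vs C) and one gap ('_' vs A).
for row in encode_pair("ATGC_A", "ATCCAA"):
    print(row)
print(binary_cross_entropy([1, 0], [0.5, 0.5]))
print(mean_squared_error([1.0, 0.0], [0.5, 0.5]))
```

Matching positions set a single symbol bit, while mismatches and gaps set two symbol bits and one direction bit, so a single matrix can represent mismatches and indels alike.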

Model Performance

The study provides a detailed evaluation of CRISPR-Net and two optimized off-target prediction models (CNN_std and Gradient Boosting Regression Trees) on two benchmark datasets: CIRCLE-Seq and GUIDE-Seq. Results from the ROC (Receiver Operating Characteristic) and PRC (Precision-Recall Curve) analyses demonstrate that CRISPR-Net outperformed the other models on the CIRCLE-Seq dataset. Furthermore, when evaluated on an independent GUIDE-Seq dataset, which only contains mismatch-based off-target effects, CRISPR-Net exhibited superior predictive performance compared to six existing off-target prediction models: AttnToMismatch_CNN, CRISPRoff, Elevation-Score, CFD, Ensemble SVM, and CNN_std.
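For readers unfamiliar with these two metrics, the sketch below computes the area under the ROC curve and the average precision (a common summary of the PRC) from labels and scores. The labels and scores are invented for illustration and are not results from the study. In practice a library such as scikit-learn would normally be used, and this version does not treat tied scores specially when computing average precision.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via the rank-sum (Mann-Whitney) identity:
    the fraction of positive/negative pairs in which the positive scores higher."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))

def average_precision(labels, scores):
    """Mean of the precision values measured at the rank of each true positive."""
    n_pos = sum(labels)
    if n_pos == 0:
        raise ValueError("need at least one positive example")
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    hits, total = 0, 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            hits += 1
            total += hits / rank
    return total / n_pos

# Made-up example: 1 = validated off-target site, 0 = inactive site.
labels = [1, 0, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.7, 0.3, 0.2, 0.1]
print("ROC AUC:", roc_auc(labels, scores))
print("Average precision:", round(average_precision(labels, scores), 4))
```

Off-target datasets are typically highly imbalanced, with far more inactive sites than active ones, which is why the precision-recall view is reported alongside the ROC curve.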

CRISPR-Net represents a significant advance in off-target prediction for CRISPR/Cas9 gene editing. Its superior performance on benchmark datasets and its encoding scheme, which covers indels as well as mismatches, contribute to improving the precision and safety of CRISPR-based genetic interventions.

Figure 2. Performance comparison between CRISPR-Net and two improved off-target prediction models using the CIRCLE-Seq dataset (I/1). The ROC curve is presented on the left, and the PRC curve on the right.

Figure 3. Performance comparison between CRISPR-Net and two improved off-target prediction models using the GUIDE-Seq dataset (I/2). The ROC curve is presented on the left, and the PRC curve on the right.

Figure 4. Performance comparison between CRISPR-Net and six existing off-target prediction models using the GUIDE-Seq dataset (II/5). The ROC curve is presented on the left, and the PRC curve on the right.

User Guide for CRISPR-Net

The authors provide a command-line tool, CRISPR_Net.py, that predicts off-target scores for sgRNA-target pairs containing mismatches and insertions or deletions (indels). The tool can be accessed via the following link:
https://codeocean.com/capsule/9553651/tree/v3

Usage Instructions

The following are the detailed steps for using CRISPR-Net:

Prerequisites:

Before running CRISPR-Net, ensure that Python is installed on the computer (version 3.6 or higher is recommended). Additionally, the required libraries, including scipy, numpy, pandas, scikit-learn, TensorFlow, and Keras, must be installed.
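A quick way to confirm these prerequisites is a short standard-library check like the one below. It is only a convenience sketch: the function name check_environment is hypothetical, and the import names are the usual ones for these packages (scikit-learn is imported as sklearn). It checks that each package can be found, not that its version is compatible with the tool.

```python
import importlib.util
import sys

# Import names for the libraries listed above; scikit-learn is imported as "sklearn".
REQUIRED_MODULES = ["scipy", "numpy", "pandas", "sklearn", "tensorflow", "keras"]

def check_environment(modules=REQUIRED_MODULES, min_python=(3, 6)):
    """Return (python_ok, missing_modules) without importing the heavy libraries."""
    python_ok = sys.version_info >= min_python
    missing = [name for name in modules if importlib.util.find_spec(name) is None]
    return python_ok, missing

python_ok, missing = check_environment()
print("Python version OK:", python_ok)
print("Missing modules:", ", ".join(missing) if missing else "none")
```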

Preparing the Input File:

Prepare an input file named input.csv. In this file, the first column should contain the on-target sequences (i.e., sequences where precise editing is desired), while the second column should list the off-target sequences (i.e., sequences that may be affected non-specifically). Ensure that the data are correctly formatted to allow the tool to process and analyze the sequences efficiently.
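The two-column layout can be produced with Python's csv module, as in the sketch below. Several details are assumptions rather than documented behavior of the tool: the sequences are invented 23-nt examples, "_" is used as the gap marker for an indel, paired sequences are required to have equal length, and no header row is written. Check these points against the tool's own documentation before relying on them. The function name write_input_csv is hypothetical.

```python
import csv

VALID_SYMBOLS = set("ACGT_")  # '_' as an indel gap marker is an assumption

def write_input_csv(pairs, path="input.csv"):
    """Write (on-target, off-target) pairs as two CSV columns, one pair per row."""
    for on_seq, off_seq in pairs:
        for seq in (on_seq, off_seq):
            if not set(seq.upper()) <= VALID_SYMBOLS:
                raise ValueError(f"unexpected character in sequence: {seq}")
        if len(on_seq) != len(off_seq):
            raise ValueError(f"pair is not the same length: {on_seq} / {off_seq}")
    with open(path, "w", newline="") as handle:
        csv.writer(handle).writerows(pairs)

# Made-up 23-nt sequences for illustration only.
example_pairs = [
    ("GACCTTGAACGGTCATCAGGTGG", "GACCTTGAACGGTCATCAGGAGG"),  # one mismatch
    ("GACCTTGAACGGTCATCAGGTGG", "GACCTTGA_CGGTCATCAGGTGG"),  # one gap
]
write_input_csv(example_pairs)
```

Validating the sequences before writing the file catches stray characters and misaligned pairs early, instead of letting them surface as errors inside the model.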

Running CRISPR-Net:

Execute CRISPR-Net by entering the following command in the terminal:

$> python ./CRISPR_net.py input.csv

After execution, the tool writes an output file whose third column contains the predicted off-target scores generated by the model.
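Once the scores are available, a common next step is to rank candidate sites from highest to lowest predicted risk. The sketch below does this for CSV text with the score in the third column. The sample contents, the header names, and the function name rank_by_score are made up for illustration; the real output file name and exact layout should be taken from the tool itself.

```python
import csv
import io

def rank_by_score(csv_text):
    """Return (on_target, off_target, score) rows sorted by descending score.

    Rows whose third column is not numeric (for example a header) are skipped.
    """
    rows = []
    for record in csv.reader(io.StringIO(csv_text)):
        if len(record) < 3:
            continue
        try:
            score = float(record[2])
        except ValueError:
            continue
        rows.append((record[0], record[1], score))
    return sorted(rows, key=lambda row: row[2], reverse=True)

# Made-up output contents; real scores come from the model.
sample_output = """on_seq,off_seq,score
GACCTTGAACGGTCATCAGGTGG,GACCTTGAACGGTCATCAGGAGG,0.12
GACCTTGAACGGTCATCAGGTGG,GACCTTGA_CGGTCATCAGGTGG,0.87
"""
for on_seq, off_seq, score in rank_by_score(sample_output):
    print(f"{score:.2f}  {off_seq}")
```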

Conclusion

CRISPR-Net provides a robust method for scoring candidate off-target sites across the genome before any editing experiment is run, allowing researchers to mitigate risk and improve the safety and efficacy of CRISPR system design and application. Even after suitable sgRNAs have been selected from these predictions, it remains critical to validate their off-target effects rigorously with experimental methods.

As a leader in the field of gene editing technology services, CD Genomics is dedicated to providing comprehensive and high-quality technical support to researchers. We offer a wide range of services, including in vivo off-target detection using GUIDE-seq, in vitro detection with AID-seq, and amplicon-based sequencing for editing efficiency evaluation.

References

  1. Lin J, Zhang Z, Zhang S, Chen J, Wong K. CRISPR-Net: A Recurrent Convolutional Network Quantifies CRISPR Off-Target Activities with Mismatches and Indels. Adv Sci. 2020;7:1903562. doi: 10.1002/advs.201903562.
  2. Lin J, Wong KC. Off-target predictions in CRISPR-Cas9 gene editing using deep learning. Bioinformatics. 2018 Sep 1;34(17):i656-i663. doi: 10.1093/bioinformatics/bty554.
  3. Frock RL, Hu J, Meyers RM, Ho YJ, Kii E, Alt FW. Genome-wide detection of DNA double-stranded breaks induced by engineered nucleases. Nat Biotechnol. 2015 Feb;33(2):179-86. doi: 10.1038/nbt.3101.
  4. Listgarten J, Weinstein M, Kleinstiver BP, Sousa AA, Joung JK, Crawford J, Gao K, Hoang L, Elibol M, Doench JG, Fusi N. Prediction of off-target activities for the end-to-end design of CRISPR guide RNAs. Nat Biomed Eng. 2018 Jan;2(1):38-47. doi: 10.1038/s41551-017-0178-6.
  5. Kleinstiver BP, Prew MS, Tsai SQ, Topkar VV, Nguyen NT, Zheng Z, Gonzales AP, Li Z, Peterson RT, Yeh JR, Aryee MJ, Joung JK. Engineered CRISPR-Cas9 nucleases with altered PAM specificities. Nature. 2015 Jul 23;523(7561):481-5. doi: 10.1038/nature14592.

