On-target and off-target scoring for CRISPR gRNAs (2023)

Jean-Philippe Fortin1*, Aaron Lun1** and Luke Hoberecht1***

1Data Science and Statistical Computing, gRED, Genentech

*fortin946@gmail.com
**infinite.monkeys.with.keyboards@gmail.com
***lukehob3@gmail.com

2023-04-25

The crisprScore package provides R wrappers of several on-target and off-target scoring methods for CRISPR guide RNAs (gRNAs). The following nucleases are supported: SpCas9, AsCas12a, enAsCas12a, and RfxCas13d (CasRx). The available on-target cutting efficiency scoring methods are RuleSet1, RuleSet3, Azimuth, DeepHF, DeepSpCas9, DeepCpf1, enPAM+GB, CRISPRscan and CRISPRater. Both the CFD and MIT scoring methods are available for off-target specificity prediction. The package also provides a Lindel-derived score to predict the probability of a gRNA to produce indels inducing a frameshift for the Cas9 nuclease. Note that DeepHF, DeepCpf1 and enPAM+GB are not available on Windows machines.

Our work is described in a recent bioRxiv preprint: “The crisprVerse: A comprehensive Bioconductor ecosystem for the design of CRISPR guide RNAs across nucleases and technologies”

Our main gRNA design package crisprDesign utilizes the crisprScore package to add on- and off-target scores to user-designed gRNAs; check out our Cas9 gRNA tutorial page to learn how to use crisprScore via crisprDesign.

2.1 Software requirements

2.1.1 OS Requirements

This package is supported for macOS, Linux and Windows machines. Some functionalities are not supported for Windows machines. Packages were developed and tested on R version 4.2.

2.2 Installation from Bioconductor

crisprScore can be installed from the Bioconductor devel branch using the following commands in a fresh R session:

if (!require("BiocManager", quietly = TRUE))
    install.packages("BiocManager")
BiocManager::install(version="devel")
BiocManager::install("crisprScore")

2.3 Installation from GitHub

Alternatively, the development version of crisprScore and its dependencies can be installed by typing the following commands inside of an R session:

install.packages("devtools")
library(devtools)
install_github("crisprVerse/crisprScoreData")
install_github("crisprVerse/crisprScore")

When calling one of the scoring methods for the first time after package installation, the underlying Python module and conda environment will be automatically downloaded and installed without the need for user intervention. This may take several minutes, but this is a one-time installation.

Note that RStudio users will need to add the following line to their .Rprofile file in order for crisprScore to work properly:

options(reticulate.useImportHook=FALSE)

We load crisprScore in the usual way:

library(crisprScore)
## Warning: replacing previous import 'utils::findMatches' by
## 'S4Vectors::findMatches' when loading 'AnnotationDbi'

The scoringMethodsInfo data.frame contains a succinct summary of scoring methods available in crisprScore:

data(scoringMethodsInfo)
print(scoringMethodsInfo)
##        method   nuclease left right       type      label len
## 1    ruleset1     SpCas9  -24     5  On-target   RuleSet1  30
## 2     azimuth     SpCas9  -24     5  On-target    Azimuth  30
## 3      deephf     SpCas9  -20     2  On-target     DeepHF  23
## 4      lindel     SpCas9  -33    31  On-target     Lindel  65
## 5         mit     SpCas9  -20     2 Off-target        MIT  23
## 6         cfd     SpCas9  -20     2 Off-target        CFD  23
## 7    deepcpf1   AsCas12a   -4    29  On-target   DeepCpf1  34
## 8     enpamgb enAsCas12a   -4    29  On-target    EnPAMGB  34
## 9  crisprscan     SpCas9  -26     8  On-target CRISPRscan  35
## 10    casrxrf      CasRx   NA    NA  On-target   CasRx-RF  NA
## 11   crisprai     SpCas9  -19     2  On-target   CRISPRai  22
## 12 crisprater     SpCas9  -20    -1  On-target CRISPRater  20
## 13 deepspcas9     SpCas9  -24     5  On-target DeepSpCas9  30
## 14   ruleset3     SpCas9  -24     5  On-target   RuleSet3  30

Each scoring algorithm requires a different contextual nucleotide sequence. The left and right columns indicate how many nucleotides upstream and downstream of the first nucleotide of the PAM sequence are needed for input, and the len column indicates the total number of nucleotides needed for input. The crisprDesign package provides user-friendly functionalities to extract and score those sequences automatically via the addOnTargetScores function.
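To make the coordinate convention concrete, here is a minimal base-R sketch of how the left and right offsets translate into a substring. The helper extractContext and the padded toy sequence are hypothetical and not part of crisprScore; in practice crisprDesign takes care of this step.

```r
# Hypothetical helper (not a crisprScore function): extract the sequence
# context for a scoring method, given the 1-based position of the first
# PAM nucleotide and the method's left/right offsets.
extractContext <- function(sequence, pamStart, left, right) {
    start <- pamStart + left
    end   <- pamStart + right
    # Return NA when the requested window runs off either end
    if (start < 1 || end > nchar(sequence)) {
        return(NA_character_)
    }
    substr(sequence, start, end)
}

# Toy target: 5 nt padding, 4 nt flank, 20 nt spacer, PAM, 3 nt flank, 5 nt padding
target <- paste0("GGGGG", "ACCT", "ATCGATGCTGATGCTAGATA", "AGG", "TTG", "CCCCC")
pamStart <- 30  # position of the "A" of the "AGG" PAM in the toy target

ruleset1Input <- extractContext(target, pamStart, left=-24, right=5)  # 30 nt
deephfInput   <- extractContext(target, pamStart, left=-20, right=2)  # 23 nt
```

The window length is right - left + 1, which matches the len column for these two methods.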

Predicting on-target cutting efficiency is an extensive area of research, and we try to provide in crisprScore the latest state-of-the-art algorithms as they become available.

4.1 Cas9 methods

Different algorithms require different input nucleotide sequences to predict cutting efficiency as illustrated in the figure below.

4.1.1 Rule Set 1

The Rule Set 1 algorithm is one of the first on-target efficiency methods developed for the Cas9 nuclease (Doench et al. 2014). It generates a probability (therefore a score between 0 and 1) that a given sgRNA will cut at its intended target. 4 nucleotides upstream of the protospacer sequence and 3 nucleotides downstream of the PAM sequence are needed for scoring:

flank5 <- "ACCT" #4bp
spacer <- "ATCGATGCTGATGCTAGATA" #20bp
pam    <- "AGG" #3bp
flank3 <- "TTG" #3bp
input  <- paste0(flank5, spacer, pam, flank3)
results <- getRuleSet1Scores(input)
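Because the scoring functions expect inputs of a fixed length, it can help to check sequences before scoring. The following is a minimal base-R sketch; the helper isValidCas9Context is hypothetical (not part of crisprScore) and assumes the 30-nucleotide layout used above, with an NGG PAM at positions 25 to 27.

```r
# Hypothetical pre-flight check for 30-nt Cas9 context sequences
isValidCas9Context <- function(x) {
    x <- toupper(x)
    nchar(x) == 30 &                 # 4 + 20 + 3 + 3 nucleotides
        grepl("^[ACGT]+$", x) &      # unambiguous DNA letters only
        substr(x, 26, 27) == "GG"    # NGG PAM occupies positions 25-27
}

input <- paste0("ACCT", "ATCGATGCTGATGCTAGATA", "AGG", "TTG")
# Second element is truncated to 23 nt and should fail the check
checks <- isValidCas9Context(c(input, substr(input, 1, 23)))
checks
# TRUE FALSE
```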

The Azimuth score described below is an improvement over Rule Set 1 from the same lab.

4.1.2 Azimuth

The Azimuth algorithm is an improved version of the popular Rule Set 2 score for the Cas9 nuclease (Doench et al. 2016). It generates a probability (therefore a score between 0 and 1) that a given sgRNA will cut at its intended target. 4 nucleotides upstream of the protospacer sequence and 3 nucleotides downstream of the PAM sequence are needed for scoring:

flank5 <- "ACCT" #4bp
spacer <- "ATCGATGCTGATGCTAGATA" #20bp
pam    <- "AGG" #3bp
flank3 <- "TTG" #3bp
input  <- paste0(flank5, spacer, pam, flank3)
results <- getAzimuthScores(input)

4.1.3 Rule Set 3

Rule Set 3 is an improvement over Rule Set 1 and Rule Set 2/Azimuth developed for the SpCas9 nuclease, taking into account the type of tracrRNA (DeWeirdt et al. 2022). Two types of tracrRNAs are currently offered:

GTTTTAGAGCTA-----GAAA-----TAGCAAGTTAAAAT... --> Hsu2013 tracrRNA
GTTTAAGAGCTATGCTGGAAACAGCATAGCAAGTTTAAAT... --> Chen2013 tracrRNA

Similar to Rule Set 1 and Azimuth, the input sequence requires 4 nucleotides upstream of the protospacer sequence, the protospacer sequence itself (20nt spacer sequence and PAM sequence), and 3 nucleotides downstream of the PAM sequence:

flank5 <- "ACCT" #4bp
spacer <- "ATCGATGCTGATGCTAGATA" #20bp
pam    <- "AGG" #3bp
flank3 <- "TTG" #3bp
input  <- paste0(flank5, spacer, pam, flank3)
results <- getRuleSet3Scores(input, tracrRNA="Hsu2013")
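When it is unclear which tracrRNA a vector uses, the two scaffold prefixes listed above can be told apart by simple string matching. The sketch below is only an illustration: guessTracrRNA is a hypothetical helper, not a crisprScore function, and it assumes that the dashes in the alignment above are gaps rather than nucleotides.

```r
# Hypothetical helper: guess the tracrRNA type from the start of a scaffold
guessTracrRNA <- function(scaffold) {
    scaffold <- toupper(scaffold)
    ifelse(startsWith(scaffold, "GTTTTAGAGCTA"), "Hsu2013",
           ifelse(startsWith(scaffold, "GTTTAAGAGCTATGCTG"), "Chen2013",
                  NA_character_))
}

scaffolds <- c("GTTTTAGAGCTAGAAATAGCAAGTTAAAAT",            # gaps removed
               "GTTTAAGAGCTATGCTGGAAACAGCATAGCAAGTTTAAAT")
tracrTypes <- guessTracrRNA(scaffolds)
```

The resulting label could then be supplied as the tracrRNA argument of getRuleSet3Scores.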

A more involved version of the algorithm takes into account the gene context of the target protospacer sequence (Rule Set 3 Target) and will soon be implemented in crisprScore.

4.1.4 DeepHF

The DeepHF algorithm is an on-target cutting efficiency prediction algorithm for several variants of the Cas9 nuclease (Wang et al. 2019) using a recurrent neural network (RNN) framework. Similar to the Azimuth score, it generates a probability of cutting at the intended on-target. The algorithm only needs the protospacer and PAM sequences as inputs:

spacer <- "ATCGATGCTGATGCTAGATA" #20bp
pam    <- "AGG" #3bp
input  <- paste0(spacer, pam)
results <- getDeepHFScores(input)

Users can specify for which Cas9 they wish to score sgRNAs by using the argument enzyme: “WT” for Wildtype Cas9 (WT-SpCas9), “HF” for high-fidelity Cas9 (SpCas9-HF), or “ESP” for enhanced Cas9 (eSpCas9). For wildtype Cas9, users can also specify the promoter used for expressing sgRNAs using the argument promoter (“U6” by default). See ?getDeepHFScores for more details.

4.1.5 DeepSpCas9

The DeepSpCas9 algorithm is an on-target cutting efficiency prediction algorithm for the SpCas9 nuclease (Kim et al. 2019). Similar to the Azimuth score, it generates a probability of cutting at the intended on-target. 4 nucleotides upstream of the protospacer sequence and 3 nucleotides downstream of the PAM sequence are needed in addition to the protospacer sequence for scoring:

flank5 <- "ACCT" #4bp
spacer <- "ATCGATGCTGATGCTAGATA" #20bp
pam    <- "AGG" #3bp
flank3 <- "TTG" #3bp
input  <- paste0(flank5, spacer, pam, flank3)
results <- getDeepSpCas9Scores(input)

4.1.6 CRISPRscan

The CRISPRscan algorithm, also known as the Moreno-Mateos score, is an on-target efficiency method for the SpCas9 nuclease developed for sgRNAs expressed from a T7 promoter, and trained on zebrafish data (Moreno-Mateos et al. 2015). It generates a probability (therefore a score between 0 and 1) that a given sgRNA will cut at its intended target. 6 nucleotides upstream of the protospacer sequence and 6 nucleotides downstream of the PAM sequence are needed for scoring:

flank5 <- "ACCTAA" #6bp
spacer <- "ATCGATGCTGATGCTAGATA" #20bp
pam    <- "AGG" #3bp
flank3 <- "TTGAAT" #6bp
input  <- paste0(flank5, spacer, pam, flank3)
results <- getCRISPRscanScores(input)

4.1.7 CRISPRater

The CRISPRater algorithm is an on-target efficiency method for the SpCas9 nuclease (Labuhn et al. 2018). It generates a probability (therefore a score between 0 and 1) that a given sgRNA will cut at its intended target. Only the 20bp spacer sequence is required.

spacer <- "ATCGATGCTGATGCTAGATA" #20bp
results <- getCRISPRaterScores(spacer)

4.1.8 CRISPRai

The CRISPRai algorithm was developed by the Weissman lab to score SpCas9 gRNAs for CRISPRa and CRISPRi applications (Horlbeck et al. 2016), for the human genome. The function getCrispraiScores requires several inputs.

First, it requires a data.frame specifying the genomic coordinates of the transcription starting sites (TSSs). An example of such a data.frame is provided in the crisprScore package:

head(tssExampleCrispri)
##       tss_id gene_symbol promoter     transcripts position strand   chr
## 1    A1BG_P1        A1BG       P1 ENST00000596924 58347625      - chr19
## 2    A1BG_P2        A1BG       P2 ENST00000263100 58353463      - chr19
## 3    KRAS_P1        KRAS       P1 ENST00000311936 25250929      - chr12
## 4 SMARCA2_P1     SMARCA2       P1 ENST00000357248  2015347      +  chr9
## 5 SMARCA2_P2     SMARCA2       P2 ENST00000382194  2017615      +  chr9
## 6 SMARCA2_P3     SMARCA2       P3 ENST00000635133  2158470      +  chr9

It also requires a data.frame specifying the genomic coordinates of the gRNA sequences to score. An example of such a data.frame is provided in the crisprScore package:

head(sgrnaExampleCrispri)
##     grna_id  tss_id pam_site strand        spacer_19mer
## 1 A1BG_P1_1 A1BG_P1 58347601      - CTCCGGGCGACGTGGAGTG
## 2 A1BG_P1_2 A1BG_P1 58347421      - GGGCACCCAGGAGCGGTAG
## 3 A1BG_P1_3 A1BG_P1 58347624      - TCCACGTCGCCCGGAGCTG
## 4 A1BG_P1_4 A1BG_P1 58347583      - GCAGCGCAGGACGGCATCT
## 5 A1BG_P1_5 A1BG_P1 58347548      - AGCAGCTCGAAGGTGACGT
## 6 A1BG_P2_1 A1BG_P2 58353455      - ATGATGGTCGCGCTCACTC

All columns present in tssExampleCrispri and sgrnaExampleCrispri are mandatory for getCrispraiScores to work.
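Since every column is mandatory, a quick check of user-supplied tables before scoring can save a failed run. The sketch below uses base R only; missingColumns is a hypothetical helper, and the column names are copied from the two example tables shown above.

```r
requiredTssCols   <- c("tss_id", "gene_symbol", "promoter", "transcripts",
                       "position", "strand", "chr")
requiredSgrnaCols <- c("grna_id", "tss_id", "pam_site", "strand",
                       "spacer_19mer")

# Hypothetical helper: report which mandatory columns are absent
missingColumns <- function(df, required) {
    setdiff(required, colnames(df))
}

# Toy inputs mimicking the first row of each example table
tss <- data.frame(tss_id="A1BG_P1", gene_symbol="A1BG", promoter="P1",
                  transcripts="ENST00000596924", position=58347625,
                  strand="-", chr="chr19")
sgrna <- data.frame(grna_id="A1BG_P1_1", tss_id="A1BG_P1",
                    pam_site=58347601, strand="-")  # spacer_19mer left out

tssMissing   <- missingColumns(tss, requiredTssCols)      # nothing missing
sgrnaMissing <- missingColumns(sgrna, requiredSgrnaCols)  # "spacer_19mer"
```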

Two additional arguments are required: fastaFile, to specify the path of the fasta file of the human reference genome, and chromatinFiles, which is a list of length 3 specifying the path of files containing the chromatin accessibility data needed for the algorithm in hg38 coordinates. The chromatin files can be downloaded from Zenodo. The fasta file for the human genome (hg38) can be downloaded directly from here: https://hgdownload.soe.ucsc.edu/goldenPath/hg38/bigZips/hg38.fa.gz

One can obtain the CRISPRai scores using the following command:

results <- getCrispraiScores(tss_df=tssExampleCrispri,
                             sgrna_df=sgrnaExampleCrispri,
                             modality="CRISPRi",
                             fastaFile="your/path/hg38.fa",
                             chromatinFiles=list(mnase="path/to/mnaseFile.bw",
                                                 dnase="path/to/dnaseFile.bw",
                                                 faire="path/to/faireFile.bw"))

The function works identically for CRISPRa applications, with modality replaced by CRISPRa.

4.2 Cas12a methods

Different algorithms require different input nucleotide sequences to predict cutting efficiency as illustrated in the figure below.

Figure 2: Sequence inputs for Cas12a scoring methods

4.2.1 DeepCpf1 score

The DeepCpf1 algorithm is an on-target cutting efficiency prediction algorithm for the Cas12a nuclease (Kim et al. 2018) using a convolutional neural network (CNN) framework. It generates a score between 0 and 1 to quantify the likelihood of Cas12a to cut for a given sgRNA. 3 nucleotides upstream of the PAM sequence and 4 nucleotides downstream of the protospacer sequence are needed for scoring:

flank5 <- "ACC" #3bp
pam    <- "TTTT" #4bp
spacer <- "AATCGATGCTGATGCTAGATATT" #23bp
flank3 <- "AAGT" #4bp
input  <- paste0(flank5, pam, spacer, flank3)
results <- getDeepCpf1Scores(input)

4.2.2 enPAM+GB score

The enPAM+GB algorithm is an on-target cutting efficiency prediction algorithm for the enhanced Cas12a (enCas12a) nuclease (DeWeirdt et al. 2020) using a gradient-boosting (GB) model. The enCas12a nuclease has an extended set of active PAM sequences in comparison to the wildtype Cas12a nuclease (Kleinstiver et al. 2019), and the enPAM+GB algorithm takes PAM activity into account in the calculation of the final score. It generates a probability (therefore a score between 0 and 1) of a given sgRNA to cut at the intended target. 3 nucleotides upstream of the PAM sequence and 4 nucleotides downstream of the protospacer sequence are needed for scoring:

flank5 <- "ACC" #3bp
pam    <- "TTTT" #4bp
spacer <- "AATCGATGCTGATGCTAGATATT" #23bp
flank3 <- "AAGT" #4bp
input  <- paste0(flank5, pam, spacer, flank3)
results <- getEnPAMGBScores(input)
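For Cas12a the PAM sits upstream of the spacer, so the component order differs from the Cas9 inputs. As a sanity check, the sketch below splits a 34-nucleotide input back into its parts, following the layout used in the code examples above (3 + 4 + 23 + 4 nucleotides). The helper splitCas12aContext is hypothetical and not part of crisprScore.

```r
# Hypothetical helper: decompose a 34-nt Cas12a context sequence
splitCas12aContext <- function(x) {
    stopifnot(length(x) == 1, nchar(x) == 34)
    c(flank5 = substr(x, 1, 3),    # 3 nt upstream of the PAM
      pam    = substr(x, 4, 7),    # 4-nt PAM
      spacer = substr(x, 8, 30),   # 23-nt spacer
      flank3 = substr(x, 31, 34))  # 4 nt downstream of the protospacer
}

input <- paste0("ACC", "TTTT", "AATCGATGCTGATGCTAGATATT", "AAGT")
parts <- splitCas12aContext(input)
```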

4.3 Cas13d methods

4.3.1 CasRxRF

The CasRxRF method was developed to characterize on-target efficiency of the RNA-targeting nuclease RfxCas13d, abbreviated as CasRx (Wessels et al. 2020).

It requires as an input the mRNA sequence targeted by the gRNAs, and returns as an output on-target efficiency scores for all gRNAs targeting the mRNA sequence.

As an example, we predict on-target efficiency for gRNAs targeting the mRNA sequence stored in the file test.fa:

fasta <- file.path(system.file(package="crisprScore"),
                   "casrxrf/test.fa")
mrnaSequence <- Biostrings::readDNAStringSet(filepath=fasta,
                                             format="fasta",
                                             use.names=TRUE)
results <- getCasRxRFScores(mrnaSequence)

Note that the function has a default argument directRepeat set to aacccctaccaactggtcggggtttgaaac, specifying the direct repeat used in the CasRx construct (see Wessels et al. 2020). The function also has an argument binaries that specifies the file path of the binaries for three programs required by the CasRxRF algorithm:

  • RNAfold: available as part of the ViennaRNA package
  • RNAplfold: available as part of the ViennaRNA package
  • RNAhybrid: available as part of the RNAhybrid package

Those programs can be installed from their respective websites: ViennaRNA and RNAhybrid.

If the argument is NULL, the binaries are assumed to be available on the PATH.
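Before relying on the PATH, one can check from R whether the three programs are actually visible. This is a small base-R sketch using Sys.which; the variable names are illustrative only.

```r
programs <- c("RNAfold", "RNAplfold", "RNAhybrid")

# Sys.which returns the full path of each program, or "" when not found
found <- Sys.which(programs)
notFound <- names(found)[!nzchar(found)]

if (length(notFound) > 0) {
    message("Not found on the PATH: ", paste(notFound, collapse=", "))
}
```

If any program is reported as missing, its location would need to be passed through the binaries argument instead.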

For CRISPR knockout systems, off-targeting effects can occur when the CRISPR nuclease tolerates some levels of imperfect complementarity between gRNA spacer sequences and protospacer sequences of the targeted genome. Generally, a greater number of mismatches between spacer and protospacer sequences decreases the likelihood of cleavage by a nuclease, but the nature of the nucleotide substitution can modulate the likelihood as well. Several off-target specificity scores were developed to predict the likelihood of a nuclease to cut at an unintended off-target site given a position-specific set of nucleotide mismatches.

We provide in crisprScore two popular off-target specificity scoring methods for CRISPR/Cas9 knockout systems: the MIT score (Hsu et al. 2013) and the cutting frequency determination (CFD) score (Doench et al. 2016).

5.1 MIT score

The MIT score was an early off-target specificity prediction algorithm developed for the CRISPR/Cas9 system (Hsu et al. 2013). It predicts the likelihood that the Cas9 nuclease will cut at an off-target site using position-specific mismatch tolerance weights. It also takes into consideration the total number of mismatches, as well as the average distance between mismatches. However, it does not take into account the nature of the nucleotide substitutions. The exact formula used to estimate the cutting likelihood is

\[\text{MIT} = \biggl(\prod_{p \in M}{w_p}\biggr)\times\frac{1}{\frac{19-d}{19}\times4+1}\times\frac{1}{m^2}\]

where \(M\) is the set of positions for which there is a mismatch between the sgRNA spacer sequence and the off-target sequence, \(w_p\) is an experimentally-derived mismatch tolerance weight at position \(p\), \(d\) is the average distance between mismatches, and \(m\) is the total number of mismatches. As the number of mismatches increases, the cutting likelihood decreases. In addition, off-targets with more adjacent mismatches will have a lower cutting likelihood.
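To see how the three terms of the formula interact, here is a minimal base-R sketch. It is not the crisprScore implementation: mitFormula is a hypothetical function, the weights are placeholders rather than the experimentally derived values, \(d\) is taken as the mean gap between consecutive mismatches (one reading of "average distance"), and the distance term is assumed to be 1 when there is a single mismatch.

```r
# positions: integer vector of mismatch positions (1 to 20)
# w:         vector of 20 position-specific weights w_p
mitFormula <- function(positions, w) {
    m <- length(positions)
    if (m == 0) {
        return(1)                      # no mismatch, no penalty
    }
    weightTerm <- prod(w[positions])   # product of w_p over mismatches
    if (m == 1) {
        distTerm <- 1                  # assumption: no distance penalty
    } else {
        d <- (max(positions) - min(positions)) / (m - 1)
        distTerm <- 1 / ((19 - d) / 19 * 4 + 1)
    }
    weightTerm * distTerm * (1 / m^2)
}

w <- rep(0.5, 20)  # placeholder weights, NOT the published values

spread   <- mitFormula(c(5, 15), w)  # two mismatches, 10 nt apart
adjacent <- mitFormula(c(5, 6), w)   # two adjacent mismatches
```

With identical weights, the adjacent pair receives the lower value, in line with the statement above that adjacent mismatches reduce the cutting likelihood.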

The getMITScores function takes as argument a character vector of 20bp sequences specifying the spacer sequences of sgRNAs (spacers argument), as well as a vector of 20bp sequences representing the protospacer sequences of the putative off-targets in the targeted genome (protospacers argument). PAM sequences (pams) must also be provided. If only one spacer sequence is provided, it will be reused for all provided protospacers.

The following code will generate MIT scores for 3 off-targets with respect to the sgRNA ATCGATGCTGATGCTAGATA:

spacer <- "ATCGATGCTGATGCTAGATA"
protospacers <- c("ACCGATGCTGATGCTAGATA",
                  "ATCGATGCTGATGCTAGATT",
                  "ATCGATGCTGATGCTAGATA")
pams <- c("AGG", "AGG", "AGA")
getMITScores(spacers=spacer,
             protospacers=protospacers,
             pams=pams)
##                 spacer          protospacer      score
## 1 ATCGATGCTGATGCTAGATA ACCGATGCTGATGCTAGATA 1.00000000
## 2 ATCGATGCTGATGCTAGATA ATCGATGCTGATGCTAGATT 0.41700000
## 3 ATCGATGCTGATGCTAGATA ATCGATGCTGATGCTAGATA 0.06944444

5.2 CFD score

The CFD off-target specificity prediction algorithm was initially developed for the CRISPR/Cas9 system, and was shown to be superior to the MIT score (Doench et al. 2016). Unlike the MIT score, position-specific mismatch weights vary according to the nature of the nucleotide substitution (e.g. an A->G mismatch at position 15 has a different weight than an A->T mismatch at position 15).
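The information CFD relies on, namely where the mismatches are and which nucleotides are involved, can be listed with a few lines of base R. The helper describeMismatches below is hypothetical and only tabulates mismatches; it does not compute a CFD score.

```r
# Hypothetical helper: tabulate mismatches between a spacer and a protospacer
describeMismatches <- function(spacer, protospacer) {
    s <- strsplit(spacer, "")[[1]]
    p <- strsplit(protospacer, "")[[1]]
    stopifnot(length(s) == length(p))
    pos <- which(s != p)
    data.frame(position=pos,
               spacer=s[pos],
               protospacer=p[pos],
               stringsAsFactors=FALSE)
}

mm <- describeMismatches("ATCGATGCTGATGCTAGATA",
                         "ACCGATGCTGATGCTAGATA")
```

Here mm has a single row: position 2, with T in the spacer and C in the protospacer.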

Similar to the getMITScores function, the getCFDScores function takes as argument a character vector of 20bp sequences specifying the spacer sequences of sgRNAs (spacers argument), as well as a vector of 20bp sequences representing the protospacer sequences of the putative off-targets in the targeted genome (protospacers argument). pams must also be provided. If only one spacer sequence is provided, it will be used for all provided protospacers.

The following code will generate CFD scores for 3 off-targets with respect to the sgRNA ATCGATGCTGATGCTAGATA:

spacer <- "ATCGATGCTGATGCTAGATA"
protospacers <- c("ACCGATGCTGATGCTAGATA",
                  "ATCGATGCTGATGCTAGATT",
                  "ATCGATGCTGATGCTAGATA")
pams <- c("AGG", "AGG", "AGA")
getCFDScores(spacers=spacer,
             protospacers=protospacers,
             pams=pams)
##                 spacer          protospacer      score
## 1 ATCGATGCTGATGCTAGATA ACCGATGCTGATGCTAGATA 0.85714286
## 2 ATCGATGCTGATGCTAGATA ATCGATGCTGATGCTAGATT 0.60000000
## 3 ATCGATGCTGATGCTAGATA ATCGATGCTGATGCTAGATA 0.06944444

6.1 Lindel score (Cas9)

Non-homologous end-joining (NHEJ) plays an important role in double-strand break (DSB) repair of DNA. Error patterns of NHEJ can be strongly biased by sequence context, and several studies have shown that microhomology can be used to predict indels resulting from CRISPR/Cas9-mediated cleavage. Among other useful metrics, the frequency of frameshift-causing indels can be estimated for a given sgRNA.

Lindel (Chen et al. 2019) is a logistic regression model that was trained to use local sequence context to predict the distribution of mutational outcomes. In crisprScore, the function getLindelScores returns the proportion of “frameshifting” indels estimated by Lindel. By chance, assuming a random distribution of indel lengths, frameshifting proportions should be roughly around 0.66. A Lindel score higher than 0.66 indicates a higher than by chance probability that a sgRNA induces a frameshift mutation.
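The 0.66 baseline follows from a simple count: an indel shifts the reading frame unless its length is a multiple of 3, so if all lengths were equally likely, about two thirds of indels would be frameshifting. A short check in base R, with an arbitrarily chosen range of lengths:

```r
# Illustrative range of indel lengths; any upper bound that is a
# multiple of 3 gives exactly the same proportion
indelLengths <- 1:30

# An indel is frameshifting when its length is not divisible by 3
isFrameshift <- indelLengths %% 3 != 0
chanceProportion <- mean(isFrameshift)
chanceProportion
# 0.6666667
```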

The Lindel algorithm requires nucleotide context around the protospacer sequence; the following full sequence is needed: [13bp upstream flanking sequence][23bp protospacer sequence][29bp downstream flanking sequence], for a total of 65bp. The function getLindelScores takes as inputs such 65bp sequences:

flank5 <- "ACCTTTTAATCGA" #13bp
spacer <- "TGCTGATGCTAGATATTAAG" #20bp
pam    <- "TGG" #3bp
flank3 <- "CTTTTAATCGATGCTGATGCTAGATATTA" #29bp
input <- paste0(flank5, spacer, pam, flank3)
results <- getLindelScores(input)

The project as a whole is covered by the MIT license. The code for all underlying Python packages, with their original licenses, can be found in inst/python. We made sure that all licenses are compatible with the MIT license, and we indicated the changes that we have made to the original code.

sessionInfo()
## R version 4.3.0 RC (2023-04-13 r84269)
## Platform: x86_64-pc-linux-gnu (64-bit)
## Running under: Ubuntu 22.04.2 LTS
## 
## Matrix products: default
## BLAS:   /home/biocbuild/bbs-3.17-bioc/R/lib/libRblas.so 
## LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.10.0
## 
## locale:
##  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
##  [3] LC_TIME=en_GB              LC_COLLATE=C              
##  [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
##  [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
##  [9] LC_ADDRESS=C               LC_TELEPHONE=C            
## [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       
## 
## time zone: America/New_York
## tzcode source: system (glibc)
## 
## attached base packages:
## [1] stats     graphics  grDevices utils     datasets  methods   base     
## 
## other attached packages:
## [1] crisprScore_1.4.0     crisprScoreData_1.3.0 ExperimentHub_2.8.0  
## [4] AnnotationHub_3.8.0   BiocFileCache_2.8.0   dbplyr_2.3.2         
## [7] BiocGenerics_0.46.0   BiocStyle_2.28.0     
## 
## loaded via a namespace (and not attached):
##  [1] tidyselect_1.2.0              dplyr_1.1.2                  
##  [3] blob_1.2.4                    filelock_1.0.2               
##  [5] Biostrings_2.68.0             bitops_1.0-7                 
##  [7] fastmap_1.1.1                 RCurl_1.98-1.12              
##  [9] promises_1.2.0.1              digest_0.6.31                
## [11] mime_0.12                     lifecycle_1.0.3              
## [13] ellipsis_0.3.2                KEGGREST_1.40.0              
## [15] interactiveDisplayBase_1.38.0 RSQLite_2.3.1                
## [17] magrittr_2.0.3                compiler_4.3.0               
## [19] rlang_1.1.0                   sass_0.4.5                   
## [21] tools_4.3.0                   utf8_1.2.3                   
## [23] yaml_2.3.7                    knitr_1.42                   
## [25] bit_4.0.5                     curl_5.0.0                   
## [27] reticulate_1.28               grid_4.3.0                   
## [29] stats4_4.3.0                  fansi_1.0.4                  
## [31] xtable_1.8-4                  cli_3.6.1                    
## [33] rmarkdown_2.21                crayon_1.5.2                 
## [35] generics_0.1.3                httr_1.4.5                   
## [37] DBI_1.1.3                     cachem_1.0.7                 
## [39] stringr_1.5.0                 zlibbioc_1.46.0              
## [41] parallel_4.3.0                AnnotationDbi_1.62.0         
## [43] BiocManager_1.30.20           XVector_0.40.0               
## [45] basilisk_1.12.0               vctrs_0.6.2                  
## [47] Matrix_1.5-4                  jsonlite_1.8.4               
## [49] dir.expiry_1.8.0              bookdown_0.33                
## [51] IRanges_2.34.0                S4Vectors_0.38.0             
## [53] bit64_4.0.5                   jquerylib_0.1.4              
## [55] glue_1.6.2                    stringi_1.7.12               
## [57] BiocVersion_3.17.1            later_1.3.0                  
## [59] GenomeInfoDb_1.36.0           tibble_3.2.1                 
## [61] pillar_1.9.0                  basilisk.utils_1.12.0        
## [63] rappdirs_0.3.3                htmltools_0.5.5              
## [65] randomForest_4.7-1.1          GenomeInfoDbData_1.2.10      
## [67] R6_2.5.1                      evaluate_0.20                
## [69] shiny_1.7.4                   Biobase_2.60.0               
## [71] lattice_0.21-8                highr_0.10                   
## [73] png_0.1-8                     memoise_2.0.1                
## [75] httpuv_1.6.9                  bslib_0.4.2                  
## [77] Rcpp_1.0.10                   xfun_0.39                    
## [79] pkgconfig_2.0.3

Chen, Wei, Aaron McKenna, Jacob Schreiber, Maximilian Haeussler, Yi Yin, Vikram Agarwal, William Stafford Noble, and Jay Shendure. 2019. “Massively Parallel Profiling and Predictive Modeling of the Outcomes of CRISPR/Cas9-Mediated Double-Strand Break Repair.” Nucleic Acids Research 47 (15): 7989–8003.

DeWeirdt, Peter C, Abby V McGee, Fengyi Zheng, Ifunanya Nwolah, Mudra Hegde, and John G Doench. 2022. “Accounting for Small Variations in the tracrRNA Sequence Improves sgRNA Activity Predictions for CRISPR Screening.” bioRxiv. https://doi.org/10.1101/2022.06.27.497780.

DeWeirdt, Peter C, Kendall R Sanson, Annabel K Sangree, Mudra Hegde, Ruth E Hanna, Marissa N Feeley, Audrey L Griffith, et al. 2020. “Optimization of AsCas12a for Combinatorial Genetic Screens in Human Cells.” Nature Biotechnology, 1–11.

Doench, John G, Nicolo Fusi, Meagan Sullender, Mudra Hegde, Emma W Vaimberg, Katherine F Donovan, Ian Smith, et al. 2016. “Optimized sgRNA Design to Maximize Activity and Minimize Off-Target Effects of CRISPR-Cas9.” Nature Biotechnology 34 (2): 184.

Doench, John G, Ella Hartenian, Daniel B Graham, Zuzana Tothova, Mudra Hegde, Ian Smith, Meagan Sullender, Benjamin L Ebert, Ramnik J Xavier, and David E Root. 2014. “Rational Design of Highly Active sgRNAs for CRISPR-Cas9–Mediated Gene Inactivation.” Nature Biotechnology 32 (12): 1262–7.

Horlbeck, Max A, Luke A Gilbert, Jacqueline E Villalta, Britt Adamson, Ryan A Pak, Yuwen Chen, Alexander P Fields, et al. 2016. “Compact and Highly Active Next-Generation Libraries for CRISPR-Mediated Gene Repression and Activation.” eLife 5.

Hsu, Patrick D, David A Scott, Joshua A Weinstein, F Ann Ran, Silvana Konermann, Vineeta Agarwala, Yinqing Li, et al. 2013. “DNA Targeting Specificity of RNA-Guided Cas9 Nucleases.” Nature Biotechnology 31 (9): 827.

Kim, Hui Kwon, Younggwang Kim, Sungtae Lee, Seonwoo Min, Jung Yoon Bae, Jae Woo Choi, Jinman Park, Dongmin Jung, Sungroh Yoon, and Hyongbum Henry Kim. 2019. “SpCas9 Activity Prediction by DeepSpCas9, a Deep Learning–Based Model with High Generalization Performance.” Science Advances 5 (11): eaax9249.

Kim, Hui Kwon, Seonwoo Min, Myungjae Song, Soobin Jung, Jae Woo Choi, Younggwang Kim, Sangeun Lee, Sungroh Yoon, and Hyongbum Henry Kim. 2018. “Deep Learning Improves Prediction of CRISPR–Cpf1 Guide RNA Activity.” Nature Biotechnology 36 (3): 239.

Kleinstiver, Benjamin P, Alexander A Sousa, Russell T Walton, Y Esther Tak, Jonathan Y Hsu, Kendell Clement, Moira M Welch, et al. 2019. “Engineered CRISPR–Cas12a Variants with Increased Activities and Improved Targeting Ranges for Gene, Epigenetic and Base Editing.” Nature Biotechnology 37 (3): 276–82.

Labuhn, Maurice, Felix F Adams, Michelle Ng, Sabine Knoess, Axel Schambach, Emmanuelle M Charpentier, Adrian Schwarzer, Juan L Mateo, Jan-Henning Klusmann, and Dirk Heckl. 2018. “Refined sgRNA Efficacy Prediction Improves Large- and Small-Scale CRISPR–Cas9 Applications.” Nucleic Acids Research 46 (3): 1375–85.

Moreno-Mateos, Miguel A, Charles E Vejnar, Jean-Denis Beaudoin, Juan P Fernandez, Emily K Mis, Mustafa K Khokha, and Antonio J Giraldez. 2015. “CRISPRscan: Designing Highly Efficient sgRNAs for CRISPR-Cas9 Targeting in Vivo.” Nature Methods 12 (10): 982–88.

Wang, Daqi, Chengdong Zhang, Bei Wang, Bin Li, Qiang Wang, Dong Liu, Hongyan Wang, et al. 2019. “Optimized CRISPR Guide RNA Design for Two High-Fidelity Cas9 Variants by Deep Learning.” Nature Communications 10 (1): 1–14.

Wessels, Hans-Hermann, Alejandro Méndez-Mancilla, Xinyi Guo, Mateusz Legut, Zharko Daniloski, and Neville E Sanjana. 2020. “Massively Parallel Cas13 Screens Reveal Principles for Guide RNA Design.” Nature Biotechnology 38 (6): 722–27.

(Video) CRISPRseek and GUIDEseq for Design of Target-Specific Guide RNAs in CRISPR-Cas9

FAQs

What is the on target score for gRNA? ›

The on-target score represents the cleavage efficiency of Cas9 [6]. You can think of the score as the probability a given gRNA will be in top 20% of cleavage activity. Note that the scoring system is not linear, and only 5% of gRNAs receive a score of 60 or higher.

What is the off-target score for CRISPR guide? ›

A higher score for an off-target site indicates a higher similarity to the original CRISPR site (and thus a higher likelihood of the CRISPR/Cas complex binding to the off target). The overall specificity score for a CRISPR site is 100% minus a weighted sum of off-target scores in the target genome.

What is off-target activity in CRISPR? ›

Off-target genome editing refers to nonspecific and unintended genetic modifications that can arise through the use of engineered nuclease technologies such as: clustered, regularly interspaced, short palindromic repeats (CRISPR)-Cas9, transcription activator-like effector nucleases (TALEN), meganucleases, and zinc ...

How would you decide whether the risk of off-target activity for a CRISPR Cas9 therapy is low enough to be considered safe? ›

How would you decide whether the risk of off-target activity for a CRISPR-Cas9 therapy is low enough to be considered safe? I would look at the specific off target effects for each person based on their genome. It would also depend if the Cas9 was going to effect all the cells in the body or just in a specific region.

What is optimal gRNA length? ›

The most commonly used gRNA is about 100 base pairs in length. By altering the 20 base pairs towards the 5' end of the gRNA, the CRISPR Cas9 system can be targeted towards any genomic region complementary to that sequence.

What is your target score? ›

A target score is the score you will shoot if you “play to your handicap.” Think of it in the same way as a professional would think of par.

What is the average size of gRNA in CRISPR-Cas9 system? ›

CRISPR-Cas9 uses a 20 nucleotide gRNA as a guide to find the complementary protospacer DNA target in a genome where it cuts the double stranded DNA precisely 3 base pairs upstream of the PAM sequence, a process that requires CRISPR-Cas9 to undergo several complicated but finely-tuned conformational changes (Fig.

What is the FDR cutoff for CRISPR screen? ›

Design and synthesis of secondary CRISPR library

An FDR cutoff of 50% and 75% was used to select can- didate positive and negative regulators, respectively.

How to choose sgRNA in CRISPR? ›

Choosing the Right sgRNA for Your CRISPR Experiments
  1. Not all sgRNA are created equal. ...
  2. Predicting on-target activity. ...
  3. Minimizing off-target effects. ...
  4. It's all a balance. ...
  5. References.
Sep 8, 2020

What is on target and off-target activity? ›

On-target refers to exaggerated and adverse pharmacologic effects at the target of interest in the test system. Off-target refers to adverse effects as a result of modulation of other targets; these may be related biologically or totally unrelated to the target of interest.

What are on and off-target effects? ›

Describes the effects that can occur when a drug binds to targets (proteins or other molecules in the body) other than those for which the drug was meant to bind. This can lead to unexpected side effects that may be harmful. Learning about the off-target effects of drugs may help in drug development.

What is the on and off switch for a gene? ›

The process of turning genes on and off is known as gene regulation. Gene regulation is an important part of normal development. Genes are turned on and off in different patterns during development to make a brain cell look and act different from a liver cell or a muscle cell, for example.

How do you avoid off-target effects of Crispr Cas9? ›

One strategy is to inactivate one of the restriction sites in Cas9 and obtain mutant D10A Cas9 and H840A Cas9 enzymes. Since mutant Cas9 can only cut single strand, two sgRNA are required to guide simultaneously, and two adjacent incisions are generated in different strands of DNA to cause double-strand DNA fracture.

How could you determine whether off targeting has occurred? ›

Off-target mutations occur when a nuclease-induced DSB is repaired by error-prone NHEJ. The most direct way to detect and quantify the off-target activity of a given nuclease is to track these breaks in the genome.

Can this Crispr Cas9 technique turn genes on and off? ›

When the target DNA is found, Cas9 – one of the enzymes produced by the CRISPR system – binds to the DNA and cuts it, shutting the targeted gene off. Using modified versions of Cas9, researchers can activate gene expression instead of cutting the DNA.

What is the ratio of gRNA to Cas9 mrna? ›

OriGene recommends Cas9:gRNA ratios between 1:3 and 1:9 for RNP formation. Incubate the RNPs for 5-10 minutes at room temperature.

What is the optimal ratio of Cas9 to sgRNA? ›

Optimize molar ratio of sgRNA and Cas9

Synthego recommends starting at a molar concentration ratio of 1:1 (sgRNA:Cas9) and testing ratios up to 9:1 for cell lines and up to 5:1 for primary cells. Experiments have shown that increasing the molar quantity of sgRNA relative to Cas9 increases the indel frequency.
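As a rough illustration of the ratio arithmetic above, the sketch below converts a chosen sgRNA:Cas9 molar ratio into the picomoles of sgRNA needed for a given amount of Cas9. The function name and the 10 pmol starting amount are assumptions made for illustration; they are not taken from any vendor protocol.

```python
def sgrna_pmol(cas9_pmol: float, ratio_sgrna_to_cas9: float) -> float:
    """Return the pmol of sgRNA needed for a given sgRNA:Cas9 molar ratio."""
    if cas9_pmol < 0 or ratio_sgrna_to_cas9 <= 0:
        raise ValueError("amount must be non-negative and ratio must be positive")
    return cas9_pmol * ratio_sgrna_to_cas9


# Hypothetical titration: 10 pmol of Cas9 at sgRNA:Cas9 ratios from 1:1 to 9:1.
for ratio in (1, 3, 5, 9):
    print(f"{ratio}:1 -> {sgrna_pmol(10, ratio):.0f} pmol sgRNA per 10 pmol Cas9")
```

Working in moles rather than mass matters here, because Cas9 protein and an sgRNA differ widely in molecular weight.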

What is a good sequencing depth for RNA seq? ›

In many cases 5 M – 15 M mapped reads are sufficient. You will be able to get a good snapshot of highly expressed genes. A higher sequencing depth generates more informational reads, which increases the statistical power to detect differential expression also among genes with lower expression levels.

What is a target score of the minimum where you can expect the control to work reliably with minor flaws or occasional re work? ›

Five is the minimum target where you can expect the control to work reliably with minor flaws or occasional rework. Six and seven are midpoints where the control is easier to operate and more reliable.

What does my score range mean? ›

In small type below your score is your Score Range. This refers to the range of scores you might expect to get if you took the SAT multiple times on different days. Some colleges look at your score range rather than your Total Score in considering your application.

Can Cas9 cut DNA without gRNA? ›

Without gRNA binding, the Cas9 protein is inactive and binds DNA weakly and nonspecifically. Structural studies also revealed there is a large conformational change in Cas9 where Helix-III (Hel-III) in the REC domain moves towards the HNH nuclease domain upon guide RNA loading, illustrated in the Figure 1 cartoon.

What is sgRNA vs gRNA in CRISPR? ›

In this guide, we have used the conventional definitions to avoid confusion: gRNA is the term that describes all CRISPR guide RNA formats, and sgRNA refers to the simpler alternative that combines both the crRNA and tracrRNA elements into a single RNA molecule.

What type of gRNA is used in CRISPR technology? ›

CRISPR-Cas9 is a complexed, two-component system using a short guide RNA (gRNA) sequence to direct the Cas9 endonuclease to the target site. Modifying the gRNA independent of the Cas9 protein confers ease and flexibility to improve the CRISPR-Cas9 system as a genome-editing tool.

Is FDR 0.1 acceptable? ›

You may use an FDR of 0.1 if the number of differentially expressed genes (DEGs) from DESeq2 is not large (large meaning 100 or more). Typically, an FDR of 0.1 means that about 10% of the genes called significant are expected to be false positives; that is, if 100 genes are called DEGs, then about 10 of them are false positives.

What is an acceptable FDR value? ›

When you report your results, you should set the FDR to 0.1 or 0.05 unless you have a good reason not to. Your report will then be consistent with the literature, because other people also use 0.1 or 0.05. If you set your FDR to something like 0.0456, your paper might be rejected for inconsistency.

Is FDR 0.1 significant? ›

P values and the associated FDR-adjusted P values are ways of expressing how confident you are that a result is real. Loosely, at an FDR of 0.1 you expect about 90% of the results you call significant to be real, given the multiple testing associated with something like RNA-seq. An FDR of 0.1 in DESeq2 therefore means the same thing as an FDR of 0.1 in any other statistical test.
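To make the FDR thresholds above concrete, here is a minimal sketch of the Benjamini-Hochberg adjustment, the default multiple-testing correction in DESeq2. The function name and the example P values are made up for illustration, and this is not DESeq2's own code.

```python
def bh_adjust(pvalues):
    """Return Benjamini-Hochberg adjusted P values, in the input order."""
    m = len(pvalues)
    # Indices of the P values, sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical P values for four genes.
pvals = [0.01, 0.04, 0.03, 0.20]
padj = bh_adjust(pvals)
significant = [p <= 0.1 for p in padj]
print(padj, significant)
```

Genes whose adjusted P value is at or below the chosen threshold (0.1 here) are the ones called significant at that FDR.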

Is sgRNA and gRNA the same? ›

The two terms are often used interchangeably. gRNA refers to the short RNA sequence that specifies the target sequence in the genome for the endonuclease in the CRISPR system; strictly, sgRNA is the single-molecule format that combines the crRNA and tracrRNA. In everyday use, the difference is mostly one of terminology.

What is the most important parameter when selecting for the best gRNA? ›

Location and sequence are important considerations for designing your gRNAs. For indels, it's not so important what location in the gene you target, but it is important that your gRNA sequence is designed to be highly active and reduce off targets.

Is sgRNA necessary for CRISPR? ›

CRISPR/Cas9 gene targeting requires a custom single guide RNA (sgRNA) that contains a targeting sequence (crRNA sequence) and a Cas9 nuclease-recruiting sequence (tracrRNA).

What is the difference between target at and aim at? ›

Target: the exact result of what you want to get. Aim: what you hope to get and you want to do this. Goal: what you hope to get, but may take a long period of time.

What is a target vs indicator? ›

The targets are interesting because they allow a more local, specific, and targeted lens through which to view each Goal. Indicators: Indicators help measure if the targets are being met. In order to do this, a tremendous amount of data needs to be collected.

What does on target group mean? ›

The target group is the particular group of people that an advertisement is intended to reach. An ad will be of no interest to a viewer or reader who is not in the target group.

What is an example of on target effects? ›

Common on-target side effects include skin rash from inhibitors of the MAP kinase pathway or ocular toxicities from MEK inhibitors, Hsp90 inhibitors, and selective FGFR inhibitors.

What are off-target modifications? ›

However, off-target modifications, which are usually defined as changes to the DNA or RNA, in regions other than the target site, are known to occur as a consequence of gene editing despite the specificity of the Cas9 enzyme and other CRISPR-associated endonucleases.

What does slightly off-target mean? ›

"Off target" is a prepositional phrase meaning inaccurate, or inaccurately predicted.

How do you know if a gene is turned on or off? ›

To go about answering these types of questions, researchers often use laboratory techniques such as a Northern blot or serial analysis of gene expression (SAGE). Both of these techniques make it possible to identify which genes are turned on and which are turned off within cells.

How does Crispr off work? ›

In the CRISPR system, a short RNA guide sequence and a variant of the Cas9 protein are introduced into cells. The RNA will bind to a matching sequence in the genome, as long as it is followed by the PAM sequence (usually NGG for SpCas9), and mark it for subsequent cutting or modification by the Cas9 protein.
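The PAM requirement described above can be sketched as a simple sequence scan. The example below assumes SpCas9 with a 20-nt spacer followed directly by an NGG PAM, and it searches only the forward strand; the function name is hypothetical.

```python
import re


def find_spcas9_sites(dna, spacer_len=20):
    """Return (start, spacer, pam) for each spacer + NGG site on the forward strand."""
    dna = dna.upper()
    # A lookahead is used so that overlapping candidate sites are all reported.
    pattern = r"(?=([ACGT]{%d})([ACGT]GG))" % spacer_len
    return [(m.start(), m.group(1), m.group(2)) for m in re.finditer(pattern, dna)]


# Toy sequence: one leading base, a 20-nt spacer, then a TGG PAM.
print(find_spcas9_sites("C" + "A" * 20 + "TGG"))
```

A fuller search would also scan the reverse complement, since a target site can lie on either strand.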

How do scientists use technology to determine if genes are turned on or off? ›

Microarray analysis involves breaking open a cell, isolating its genetic contents, identifying all the genes that are turned on in that particular cell, and generating a list of those genes. DNA microarray analysis is a technique that scientists use to determine whether genes are on or off.

How do you detect CRISPR off targets? ›

Next generation sequencing (NGS) is the recommended method for full investigation of CRISPR edits. Highly precise and accurate, NGS allows identification of even small numbers of unintended edits at both the target site and at off-target sites.

How often is CRISPR off-target? ›

No off-target modifications caused by CRISPR endonucleases were reported in 63% of the studies (67/107). In 11% (12/107) of the studies, the outcome of the off-target analysis was not reported. In 26% (28/107) of the selected literature, modifications caused by CRISPR endonucleases were reported (Table 1).

What is off-target in CRISPR-Cas9 system? ›

The off-target effects occur when Cas9 acts on untargeted genomic sites and creates cleavages that may lead to adverse outcomes. The off-target sites are often sgRNA-dependent, since Cas9 is known to tolerate up to 3 mismatches between sgRNA and genomic DNA (Fu et al., 2013; Hsu et al., 2013; Wang et al., 2016a).
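Since mismatch tolerance is what makes a site a candidate off-target, a first-pass filter can simply count mismatches between the spacer and each candidate protospacer. The sketch below uses made-up sequences, and the threshold of 3 follows the figure quoted above; the function names are illustrative.

```python
def count_mismatches(spacer, protospacer):
    """Count positions at which two equal-length sequences differ."""
    if len(spacer) != len(protospacer):
        raise ValueError("spacer and protospacer must have the same length")
    return sum(a != b for a, b in zip(spacer.upper(), protospacer.upper()))


def candidate_offtargets(spacer, protospacers, max_mismatches=3):
    """Keep protospacers that differ from the spacer but stay within the mismatch limit."""
    kept = []
    for seq in protospacers:
        n = count_mismatches(spacer, seq)
        if 0 < n <= max_mismatches:
            kept.append((seq, n))
    return kept


# Hypothetical 20-nt spacer and three candidate genomic sites.
spacer = "ACGTACGTACGTACGTACGT"
sites = [
    "ACGTACGTACGTACGTACGT",  # perfect match (the on-target site)
    "ACGTACGTACGTACGTACGA",  # 1 mismatch
    "TTTTACGTACGTACGTACGT",  # 3 mismatches
]
print(candidate_offtargets(spacer, sites))
```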

What is the off-target score? ›

An off-target score is generated (between 0 - 1) that indicates the inverse probability of off-target cutting, with a higher score denoting targets with lower off-target potential.
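The text above does not say how such a score is computed. One common way to obtain a guide-level value between 0 and 1 is to combine the per-site off-target scores as 1 / (1 + sum of scores); the sketch below assumes that formula, and the function name and inputs are illustrative only.

```python
def aggregate_specificity(offtarget_scores):
    """Combine per-site off-target scores (each in [0, 1]) into one guide-level score.

    Assumed formula: 1 / (1 + sum of per-site scores). A guide with no
    predicted off-targets scores 1.0, and the score falls toward 0 as
    off-target sites accumulate.
    """
    for s in offtarget_scores:
        if not 0.0 <= s <= 1.0:
            raise ValueError("each off-target score must be between 0 and 1")
    return 1.0 / (1.0 + sum(offtarget_scores))


print(aggregate_specificity([]))          # no predicted off-targets
print(aggregate_specificity([0.5, 0.5]))  # two moderately likely off-target sites
```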

How accurate is Crispr-Cas9? ›

To this end, they are using new amino acids in the Cas9 protein and changing its architecture as a result. For example, using the eSpCas9 and Cas9-HF1 variants, scientists have developed extremely precise Cas9 proteins which, in the case of HF1, achieve an accuracy rate of over 99.9 percent.

How do you target genes with CRISPR? ›

CRISPR-introduced single gene targeting can be achieved by delivering in vitro prepared Cas protein-sgRNA ribonucleoproteins (RNPs), Cas protein mRNA and sgRNA, or plasmid(s) encoding Cas protein and sgRNA. In these approaches, adding additional sgRNAs allows targeting multiple genes at the same time.

What is the on off switch in CRISPR? ›

MIT and UCSF researchers create CRISPR 'on-off switch' that controls gene expression without changing DNA. The gene editing system CRISPR-Cas9 makes breaks in DNA strands that are repaired by cells—a process that can be hard to control, resulting in unwanted genetic changes.

How does CRISPR-Cas9 know which gene to cut? ›

“First, Cas9 recognizes a short DNA segment next to the target – the PAM – then the target DNA is matched up with the guide RNA via Watson-Crick base-pairing. Finally, when a perfect match is identified, the last part of the protein swings into place to enable cutting and initiate genome editing.”

What are the two main ways that CRISPR-Cas9 can be used? ›

CRISPR/Cas9 Gene Editing
  • DISRUPT. If a single cut is made, a process called non-homologous end joining can result in the addition or deletion of base pairs, disrupting the original DNA sequence and causing gene inactivation.
  • DELETE. ...
  • CORRECT OR INSERT.

What is Target Priority score? ›

Target priority scores

To facilitate the identification of candidate drug targets, fitness genes are assigned a cancer type-specific and pan-cancer target priority score. Ranging from 0 to 100, the target priority score integrates multiple lines of evidence to prioritize a target.

What is CRISPR target identification? ›

CRISPR-based drug target identification works by generating a selectable phenotype upon compound treatment, usually based on proliferation assays. BDW568 is a potent IFN-I signaling activator; however, its mechanism of action is not based on proliferation.

What is CRISPR scan score? ›

CRISPRscan is a scoring algorithm from the Giraldez Lab (Yale University) that helps you select the best gRNAs.

What is CRISPR score? ›

We defined the CRISPR score (CS) as the average log2 fold-change in the abundance of all sgRNAs targeting a given gene, with replicate experiments showing a high degree of reproducibility (r=0.90) (Fig.
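The definition above reduces to a per-gene mean. The following sketch assumes the per-sgRNA log2 fold-changes have already been computed; the sgRNA names, gene names, and values are invented for illustration.

```python
from collections import defaultdict


def crispr_scores(log2_fold_changes, sgrna_to_gene):
    """Average the log2 fold-changes of all sgRNAs targeting each gene."""
    per_gene = defaultdict(list)
    for sgrna, lfc in log2_fold_changes.items():
        per_gene[sgrna_to_gene[sgrna]].append(lfc)
    return {gene: sum(values) / len(values) for gene, values in per_gene.items()}


# Hypothetical screen with three sgRNAs targeting two genes.
lfc = {"sg1": -2.0, "sg2": -1.0, "sg3": 0.5}
mapping = {"sg1": "GENE_A", "sg2": "GENE_A", "sg3": "GENE_B"}
print(crispr_scores(lfc, mapping))
```

A strongly negative score indicates that sgRNAs targeting the gene were depleted over the course of the screen.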

What are the four levels of priority? ›

Priority scales are usually defined as:
  • Critical/severe.
  • Major/high.
  • Medium.
  • Minor/low.

How do you calculate priority score? ›

Your final priority score is calculated as: Value/Effort *10; the higher the score, the better but keep in mind that your score isn't everything. You can also use a value vs. effort priority matrix to visualize your key priorities by placing 'value' on the x axis, and 'effort' on the y axis.

What is the priority grading scale? ›

Priority Score (PS) can be any number from 2 to 10. A score of 10 means the item has the highest priority and is a no-brainer to pursue. A score of 2 means it has the lowest priority and you should consider not doing it at all. PS is the sum of the Value Score and the Effort Score.

Can CRISPR target more than one gene? ›

The CRISPR systems can be easily adapted to target multiple genes because CRISPR-introduced targeting relies on base-pairing between guide and target nucleotide sequences. Therefore, multiplexing can be achieved by introducing several sgRNAs simultaneously.

How is CRISPR effectiveness measured? ›

The simplest but often the least specific method of identifying successful CRISPR genome editing is to observe phenotypes of edited cells. In some experiments, the expected phenotype from the gene editing process is known. For example, expression of a fluorescent protein may be turned on or off.

What is the CFD score in CRISPR? ›

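The CFD (cutting frequency determination) score for a guide and a given target site ranges between 0 and 1 (some tools report it rescaled to 0 to 100). The maximum indicates the strongest predicted interaction between the guide and the target, and 0 indicates the weakest interaction due to mismatches between the guide and the DNA target.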

What is gene set score? ›

An FCS method often consists of a set of common components such as a gene score that is a statistic summarizing the expression level of a gene across control and case samples, a gene set score that summarizes the expression level of genes within a gene set as a single statistic, a procedure for significance assessment, ...

What is gene enrichment score? ›

The enrichment score (ES) is the maximum deviation from zero encountered during that walk. The ES reflects the degree to which the genes in a gene set are overrepresented at the top or bottom of the entire ranked list of genes.
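The "walk" described above can be sketched as a running sum over the ranked gene list. This is a simplified, unweighted version (every hit counts equally), not the weighted statistic used by full GSEA implementations; the function name and toy inputs are assumptions.

```python
def enrichment_score(ranked_genes, gene_set):
    """Return the signed maximum deviation from zero of the running sum."""
    hits = [gene in gene_set for gene in ranked_genes]
    n_hits = sum(hits)
    n_miss = len(hits) - n_hits
    if n_hits == 0 or n_miss == 0:
        raise ValueError("need at least one gene inside and one outside the set")
    running = 0.0
    best = 0.0
    for is_hit in hits:
        # Step up on a set member, step down otherwise; the walk ends at zero.
        running += 1.0 / n_hits if is_hit else -1.0 / n_miss
        if abs(running) > abs(best):
            best = running
    return best


ranked = ["a", "b", "c", "d"]
print(enrichment_score(ranked, {"a", "b"}))  # set members at the top of the list
print(enrichment_score(ranked, {"c", "d"}))  # set members at the bottom of the list
```

A positive score means the set is concentrated at the top of the ranking, and a negative score means it is concentrated at the bottom.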

What is the score of the gene? ›

A “gene score” is any value that is applied to genes in an experiment and which represents some measure of “quality” or “interest”. Examples might be a t-test p value or fold change. These scores must be computed separately and then supplied to the software for analysis.

Videos

1. gUIDEbook™ gRNA Design - for CRISPR genome editing experiments
(Horizon Discovery)
2. Simplified Online Tools for CRISPR Cas9 Gene Editing Design and Confirmation
(Labroots)
3. Designing gRNA Oligos to Clone into Cas9 Expression Plasmids for KO Experiments
(Jacob Elmer)
4. Design high specificity CRISPR Cas9 gRNAs principles and tools
(GenScript USA Inc.)
5. How to design gRNA for CRISPR genome editing
(Genomics Lab)
6. CRISPR-Cas9 Target Specific gRNA Design for High Efficiency Target Cleavage
(Rahul Patharkar)
Article information

Author: Edwin Metz

Last Updated: 08/05/2023
