From computational models of the splicing code to regulatory mechanisms and therapeutic implications

Denti, M. A., Viero, G., Provenzani, A., Quattrone, A. & Macchi, P. mRNA fate: life and death of the mRNA in the cytoplasm. RNA Biol. 10, 360–366 (2013).Article 
CAS 
PubMed 
PubMed Central 

Google Scholar 
Li, Y. I. et al. RNA splicing is a primary link between genetic variation and disease. Science 352, 600–604 (2016).Article 
CAS 
PubMed 
PubMed Central 

Google Scholar 
Nikom, D. & Zheng, S. Alternative splicing in neurodegenerative disease and the promise of RNA therapies. Nat. Rev. Neurosci. 24, 457–473 (2023).Article 
CAS 
PubMed 

Google Scholar 
Bradley, R. K. & Anczuków, O. RNA splicing dysregulation and the hallmarks of cancer. Nat. Rev. Cancer 23, 135–155 (2023).Article 
CAS 
PubMed 
PubMed Central 

Google Scholar 
Ule, J. & Blencowe, B. J. Alternative splicing regulatory networks: functions, mechanisms, and evolution. Mol. Cell 76, 329–345 (2019).Article 
CAS 
PubMed 

Google Scholar 
Rogalska, M. E., Vivori, C. & Valcárcel, J. Regulation of pre-mRNA splicing: roles in physiology and disease, and therapeutic prospects. Nat. Rev. Genet. 24, 251–269 (2022).Article 
PubMed 

Google Scholar 
Wang, Z. & Burge, C. B. Splicing regulation: from a parts list of regulatory elements to an integrated splicing code. RNA 14, 802–813 (2008).Article 
CAS 
PubMed 
PubMed Central 

Google Scholar 
Ule, J. et al. An RNA map predicting Nova-dependent splicing regulation. Nature 444, 580–586 (2006).Article 
CAS 
PubMed 

Google Scholar 
Wang, Z., Xiao, X., Van Nostrand, E. & Burge, C. B. General and specific functions of exonic splicing silencers in splicing control. Mol. Cell 23, 61–70 (2006).Article 
CAS 
PubMed 
PubMed Central 

Google Scholar 
de Boer, C. G. & Taipale, J. Hold out the genome: a roadmap to solving the cis-regulatory code. Nature 625, 41–50 (2024).Article 
PubMed 

Google Scholar 
Scotti, M. M. & Swanson, M. S. RNA mis-splicing in disease. Nat. Rev. Genet. 17, 19–32 (2016).Article 
CAS 
PubMed 

Google Scholar 
Rowlands, C. F., Baralle, D. & Ellingford, J. M. Machine learning approaches for the prioritization of genomic variants impacting pre-mRNA splicing. Cells 8, 1513 (2019).Article 
CAS 
PubMed 
PubMed Central 

Google Scholar 
Eraslan, G., Avsec, Ž., Gagneur, J. & Theis, F. J. Deep learning: new computational modelling techniques for genomics. Nat. Rev. Genet. 20, 389–403 (2019).Article 
CAS 
PubMed 

Google Scholar 
Hwang, H., Jeon, H., Yeo, N. & Baek, D. Big data and deep learning for RNA biology. Exp. Mol. Med. 56, 1293–1321 (2024).Article 
CAS 
PubMed 
PubMed Central 

Google Scholar 
Shapiro, M. B. & Senapathy, P. RNA splice junctions of different classes of eukaryotes: sequence statistics and functional implications in gene expression. Nucleic Acids Res. 15, 7155–7174 (1987).Article 
CAS 
PubMed 
PubMed Central 

Google Scholar 
Carmel, I., Tal, S., Vig, I. & Ast, G. Comparative analysis detects dependencies among the 5′ splice-site positions. RNA 10, 828–840 (2004).Article 
CAS 
PubMed 
PubMed Central 

Google Scholar 
Berglund, J. A., Abovich, N. & Rosbash, M. A cooperative interaction between U2AF65 and mBBP/SF1 facilitates branchpoint region recognition. Genes Dev. 12, 858–867 (1998).Article 
CAS 
PubMed 
PubMed Central 

Google Scholar 
Paggi, J. M. & Bejerano, G. A sequence-based, deep learning model accurately predicts RNA splicing branchpoints. RNA 24, 1647–1658 (2018).Article 
CAS 
PubMed 
PubMed Central 

Google Scholar 
Breathnach, R., Benoist, C., O’Hare, K., Gannon, F. & Chambon, P. Ovalbumin gene: evidence for a leader sequence in mRNA and DNA sequences at the exon-intron boundaries. Proc. Natl Acad. Sci. USA 75, 4853–4857 (1978).Article 
CAS 
PubMed 
PubMed Central 

Google Scholar 
Yoshida, H. et al. Elucidation of the aberrant 3′ splice site selection by cancer-associated mutations on the U2AF1. Nat. Commun. 11, 4744 (2020).Article 
CAS 
PubMed 
PubMed Central 

Google Scholar 
Ast, G. How did alternative splicing evolve? Nat. Rev. Genet. 5, 773–782 (2004).Article 
CAS 
PubMed 

Google Scholar 
Parker, M. T. et al. m6A modification of U6 snRNA modulates usage of two major classes of pre-mRNA 5′ splice site. eLife 11, e78808 (2022).Article 
CAS 
PubMed 
PubMed Central 

Google Scholar 
Shenasa, H. & Bentley, D. L. Pre-mRNA splicing and its cotranscriptional connections. Trends Genet. 39, 672–685 (2023).Article 
CAS 
PubMed 
PubMed Central 

Google Scholar 
Izaurralde, E. et al. A nuclear cap binding protein complex involved in pre-mRNA splicing. Cell 78, 657–668 (1994).Article 
CAS 
PubMed 

Google Scholar 
Cooke, C., Hans, H. & Alwine, J. C. Utilization of splicing elements and polyadenylation signal elements in the coupling of polyadenylation and last-intron removal. Mol. Cell. Biol. 19, 4971–4979 (1999).Article 
CAS 
PubMed 
PubMed Central 

Google Scholar 
Licatalosi, D. D. et al. HITS-CLIP yields genome-wide insights into brain alternative RNA processing. Nature 456, 464 (2008).Article 
CAS 
PubMed 
PubMed Central 

Google Scholar 
Rot, G. et al. High-resolution RNA maps suggest common principles of splicing and polyadenylation regulation by TDP-43. Cell Rep. 19, 1056–1067 (2017).Article 
CAS 
PubMed 
PubMed Central 

Google Scholar 
Fiszbein, A., Krick, K. S., Begg, B. E. & Burge, C. B. Exon-mediated activation of transcription starts. Cell 179, 1551–1565.e17 (2019).Article 
CAS 
PubMed 
PubMed Central 

Google Scholar 
Furger, A., O’Sullivan, J. M., Binnie, A., Lee, B. A. & Proudfoot, N. J. Promoter proximal splice sites enhance transcription. Genes Dev. 16, 2792–2799 (2002).Article 
CAS 
PubMed 
PubMed Central 

Google Scholar 
Reimer, K. A., Mimoso, C. A., Adelman, K. & Neugebauer, K. M. Co-transcriptional splicing regulates 3′ end cleavage during mammalian erythropoiesis. Mol. Cell 81, 998–1012.e7 (2021).Article 
CAS 
PubMed 
PubMed Central 

Google Scholar 
Tilgner, H. et al. Nucleosome positioning as a determinant of exon recognition. Nat. Struct. Mol. Biol. 16, 996–1001 (2009).Article 
CAS 
PubMed 

Google Scholar 
Kfir, N. et al. SF3B1 association with chromatin determines splicing outcomes. Cell Rep. 11, 618–629 (2015).Article 
CAS 
PubMed 

Google Scholar 
Deutsch, M. & Long, M. Intron–exon structures of eukaryotic model organisms. Nucleic Acids Res. 27, 3219–3228 (1999).Article 
CAS 
PubMed 
PubMed Central 

Google Scholar 
Santoni, M. J. et al. Differential exon usage involving an unusual splicing mechanism generates at least eight types of NCAM cDNA in mouse brain. EMBO J. 8, 385–392 (1989).Article 
CAS 
PubMed 
PubMed Central 

Google Scholar 
Piovesan, A. et al. 1.1: a tool to summarize data from NCBI gene datasets and its application to an update of human gene statistics. Database 2016, baw153 (2016).Article 
PubMed 
PubMed Central 

Google Scholar 
Amit, M. et al. Differential GC content between exons and introns establishes distinct strategies of splice-site recognition. Cell Rep. 1, 543–556 (2012).Article 
CAS 
PubMed 

Google Scholar 
Black, D. L. Mechanisms of alternative pre-messenger RNA splicing. Annu. Rev. Biochem. 72, 291–336 (2003).Article 
CAS 
PubMed 

Google Scholar 
Witten, J. T. & Ule, J. Understanding splicing regulation through RNA splicing maps. Trends Genet. 27, 89–97 (2011).Article 
CAS 
PubMed 
PubMed Central 

Google Scholar 
Van Nostrand, E. L. et al. A large-scale binding and functional map of human RNA-binding proteins. Nature 583, 711–719 (2020).Article 
PubMed 
PubMed Central 

Google Scholar 
Erkelenz, S. et al. Position-dependent splicing activation and repression by SR and hnRNP proteins rely on common mechanisms. RNA 19, 96–102 (2013).Article 
CAS 
PubMed 
PubMed Central 

Google Scholar 
Slišković, I., Eich, H. & Müller-McNicoll, M. Exploring the multifunctionality of SR proteins. Biochem. Soc. Trans. 50, 187–198 (2022).Article 
PubMed 

Google Scholar 
Ule, J. et al. CLIP identifies nova-regulated RNA networks in the brain. Science 302, 1212–1215 (2003).Article 
CAS 
PubMed 

Google Scholar 
Hallegger, M. et al. TDP-43 condensation properties specify its RNA-binding and regulatory repertoire. Cell 184, 4680–4696.e22 (2021).Article 
CAS 
PubMed 
PubMed Central 

Google Scholar 
Sharma, D. et al. The kinetic landscape of an RNA-binding protein in cells. Nature 591, 152–156 (2021).Article 
CAS 
PubMed 
PubMed Central 

Google Scholar 
Keller, E. B. & Noon, W. A. Intron splicing: a conserved internal signal in introns of animal pre-mRNAs. Proc. Natl Acad. Sci. USA 81, 7417–7420 (1984).Article 
CAS 
PubMed 
PubMed Central 

Google Scholar 
Fairbrother, W. G., Yeh, R.-F., Sharp, P. A. & Burge, C. B. Predictive identification of exonic splicing enhancers in human genes. Science 297, 1007–1013 (2002).Article 
CAS 
PubMed 

Google Scholar 
Fairbrother, W. G. et al. RESCUE-ESE identifies candidate exonic splicing enhancers in vertebrate exons. Nucleic Acids Res. 32, W187–W190 (2004).Article 
CAS 
PubMed 
PubMed Central 

Google Scholar 
Cartegni, L. et al. A web resource to identify exonic splicing enhancers. Nucleic Acids Res. 31, 3568–3571 (2003).Article 
CAS 
PubMed 
PubMed Central 

Google Scholar 
Wang, Z. et al. Systematic identification and analysis of exonic splicing silencers. Cell 119, 831–845 (2004).Article 
CAS 
PubMed 

Google Scholar 
Kupfer, D. M. et al. Introns and splicing elements of five diverse fungi. Eukaryot. Cell 3, 1088–1100 (2004).Article 
PubMed 
PubMed Central 

Google Scholar 
Desmet, F.-O. et al. Human splicing finder: an online bioinformatics tool to predict splicing signals. Nucleic Acids Res. 37, e67 (2009).Article 
PubMed 
PubMed Central 

Google Scholar 
Sonnenburg, S., Schweikert, G., Philips, P., Behr, J. & Rätsch, G. Accurate splice site prediction using support vector machines. BMC Bioinformatics 8, (Suppl. 10), S7 (2007).Article 
PubMed 
PubMed Central 

Google Scholar 
Zhang, X. H.-F., Leslie, C. S. & Chasin, L. A. Computational searches for splicing signals. Methods 37, 292–305 (2005).Article 
CAS 
PubMed 

Google Scholar 
Yeo, G. & Burge, C. B. Maximum entropy modeling of short sequence motifs with applications to RNA splicing signals. J. Comput. Biol. 11, 377–394 (2004).Article 
CAS 
PubMed 

Google Scholar 
Salzberg, S. L. A method for identifying splice sites and translational start sites in eukaryotic mRNA. Comput. Appl. Biosci. 13, 365–376 (1997).CAS 
PubMed 

Google Scholar 
Burge, C. & Karlin, S. Prediction of complete gene structures in human genomic DNA. J. Mol. Biol. 268, 78–94 (1997).Article 
CAS 
PubMed 

Google Scholar 
Reese, M. G., Eeckman, F. H., Kulp, D. & Haussler, D. Improved splice site detection in Genie. J. Comput. Biol. 4, 311–323 (1997).Article 
CAS 
PubMed 

Google Scholar 
Barash, Y. et al. Deciphering the splicing code. Nature 465, 53–59 (2010). This paper reports the original splicing code model, describing an integrative model containing more than 1,000 input features and taking on a tissue-specific prediction task that is still challenging today.Article 
CAS 
PubMed 

Google Scholar 
Xiong, H. Y., Barash, Y. & Frey, B. J. Bayesian prediction of tissue-regulated splicing using RNA sequence and cellular context. Bioinformatics 27, 2554–2562 (2011).Article 
CAS 
PubMed 

Google Scholar 
Xiong, H. Y. et al. RNA splicing. The human splicing code reveals new insights into the genetic determinants of disease. Science 347, 1254806 (2015).Article 
PubMed 

Google Scholar 
Leung, M. K. K., Xiong, H. Y., Lee, L. J. & Frey, B. J. Deep learning of the tissue-regulated splicing code. Bioinformatics 30, i121–i129 (2014).Article 
CAS 
PubMed 
PubMed Central 

Google Scholar 
Zhang, Z. et al. Deep-learning augmented RNA-seq analysis of transcript splicing. Nat. Methods 16, 307–310 (2019).Article 
CAS 
PubMed 
PubMed Central 

Google Scholar 
Xu, Y., Wang, Y., Luo, J., Zhao, W. & Zhou, X. Deep learning of the splicing (epi)genetic code reveals a novel candidate mechanism linking histone modifications to ESC fate decision. Nucleic Acids Res. 45, 12100–12112 (2017).Article 
CAS 
PubMed 
PubMed Central 

Google Scholar 
Kim, S., Kim, H., Fong, N., Erickson, B. & Bentley, D. L. Pre-mRNA splicing is a determinant of histone H3K36 methylation. Proc. Natl Acad. Sci. USA 108, 13564–13569 (2011).Article 
CAS 
PubMed 
PubMed Central 

Google Scholar 
Bhattacharya, S. et al. The methyltransferase SETD2 couples transcription and splicing by engaging mRNA processing factors through its SHI domain. Nat. Commun. 12, 1443 (2021).Article 
CAS 
PubMed 
PubMed Central 

Google Scholar 
Kolasinska-Zwierz, P. et al. Differential chromatin marking of introns and expressed exons by H3K36me3. Nat. Genet. 41, 376–381 (2009).Article 
CAS 
PubMed 
PubMed Central 

Google Scholar 
Hu, Q., Greene, C. S. & Heller, E. A. Specific histone modifications associate with alternative exon selection during mammalian development. Nucleic Acids Res. 48, 4709–4724 (2020).Article 
CAS 
PubMed 
PubMed Central 

Google Scholar 
Jaganathan, K. et al. Predicting splicing from primary sequence with deep learning. Cell 176, 535–548.e24 (2019). This paper describes SpliceAI, which uses dilated convolutional residual neural networks for splice site prediction, enabling efficient training of deeper networks with wider sequence context, improving prediction accuracy.Article 
CAS 
PubMed 

Google Scholar 
Zeng, T. & Li, Y. I. Predicting RNA splicing from DNA sequence using Pangolin. Genome Biol. 23, 103 (2022).Article 
CAS 
PubMed 
PubMed Central 

Google Scholar 
Cheng, J. et al. MMSplice: modular modeling improves the predictions of genetic variant effects on splicing. Genome Biol. 20, 48 (2019).Article 
PubMed 
PubMed Central 

Google Scholar 
Rentzsch, P., Schubach, M., Shendure, J. & Kircher, M. CADD-Splice—improving genome-wide variant effect prediction using deep learning-derived splice scores. Genome Med. 13, 31 (2021).Article 
CAS 
PubMed 
PubMed Central 

Google Scholar 
Smith, C. & Kitzman, J. O. Benchmarking splice variant prediction algorithms using massively parallel splicing assays. Genome Biol. 24, 294 (2023). This paper shows that independent benchmarking of splicing models using MPRA data provides valuable insights into areas for future model improvement.Article 
CAS 
PubMed 
PubMed Central 

Google Scholar 
Cheng, J., Çelik, M. H., Kundaje, A. & Gagneur, J. MTSplice predicts effects of genetic variants on tissue-specific splicing. Genome Biol. 22, 94 (2021).Article 
CAS 
PubMed 
PubMed Central 

Google Scholar 
Ling, J. P. et al. ASCOT identifies key regulators of neuronal subtype-specific splicing. Nat. Commun. 11, 137 (2020).Article 
CAS 
PubMed 
PubMed Central 

Google Scholar 
Xu, C. et al. Reference-informed prediction of alternative splicing and splicing-altering mutations from sequences. Genome Res. 34, 1052–1056 (2024).Article 
PubMed 
PubMed Central 

Google Scholar 
Linder, J., Srivastava, D., Yuan, H., Agarwal, V. & Kelley, D. R. Predicting RNA-seq coverage from DNA sequence as a unifying model of gene regulation. Preprint at bioRxiv https://doi.org/10.1101/2023.08.30.555582 (2023).Celaj, A. et al. An RNA foundation model enables discovery of disease mechanisms and candidate therapeutics. Preprint at bioRxiv https://doi.org/10.1101/2023.09.20.558508 (2023).Avsec, Ž. et al. Effective gene expression prediction from sequence by integrating long-range interactions. Nat. Methods 18, 1196–1203 (2021).Article 
CAS 
PubMed 
PubMed Central 

Google Scholar 
Ghanbari, M. & Ohler, U. Deep neural networks for interpreting RNA-binding protein target preferences. Genome Res. 30, 214–226 (2020).Article 
CAS 
PubMed 
PubMed Central 

Google Scholar 
Agarwal, V., Bell, G. W., Nam, J.-W. & Bartel, D. P. Predicting effective microRNA target sites in mammalian mRNAs. eLife 4, e05005 (2015).Article 
PubMed 
PubMed Central 

Google Scholar 
Chen, K. et al. Self-supervised learning on millions of primary RNA sequences from 72 vertebrates improves sequence-based RNA splicing prediction. Brief. Bioinform. 25, bbae163 (2024).Article 
PubMed 
PubMed Central 

Google Scholar 
Karollus, A. et al. Species-aware DNA language models capture regulatory elements and their evolution. Genome Biol. 25, 83 (2024).Article 
CAS 
PubMed 
PubMed Central 

Google Scholar 
de Almeida, B. P. et al. SegmentNT: annotating the genome at single-nucleotide resolution with DNA foundation models. Preprint at bioRxiv https://doi.org/10.1101/2024.03.14.584712 (2024).Dalla-Torre, H. et al. The nucleotide transformer: building and evaluating robust foundation models for human genomics. Preprint at bioRxiv https://doi.org/10.1101/2023.01.11.523679 (2023).Zoonomia Consortium. A comparative genomics multitool for scientific discovery and conservation. Nature 587, 240–245 (2020).Article 
CAS 

Google Scholar 
Gupta, K. et al. Improved modeling of RNA-binding protein motifs in an interpretable neural model of RNA splicing. Genome Biol. 25, 23 (2024). This article describes an interpretable-by-design model, in which prior knowledge of RBP motifs is refined by convolutional neural networks that adjust in vitro-derived motif representations to more accurately represent in vivo binding.Article 
CAS 
PubMed 
PubMed Central 

Google Scholar 
Liao, S. E., Sudarshan, M. & Regev, O. Deciphering RNA splicing logic with interpretable machine learning. Proc. Natl Acad. Sci. USA 120, e2221165120 (2023).Article 
CAS 
PubMed 
PubMed Central 

Google Scholar 
McCue, K. & Burge, C. B. An interpretable model of pre-mRNA splicing for animal and plant genes. Sci. Adv. 10, eadn1547 (2024).Article 
CAS 
PubMed 
PubMed Central 

Google Scholar 
Bretschneider, H., Gandhi, S., Deshwar, A. G., Zuberi, K. & Frey, B. J. COSSMO: predicting competitive alternative splice site selection using deep learning. Bioinformatics 34, i429–i437 (2018).Article 
CAS 
PubMed 
PubMed Central 

Google Scholar 
GTEx Consortium The GTEx consortium atlas of genetic regulatory effects across human tissues. Science 369, 1318–1330 (2020).Article 

Google Scholar 
Tapial, J. et al. An atlas of alternative splicing profiles and functional associations reveals new regulatory programs and genes that simultaneously express multiple major isoforms. Genome Res. 27, 1759–1768 (2017).Article 
CAS 
PubMed 
PubMed Central 

Google Scholar 
Sibley, C. R., Blazquez, L. & Ule, J. Lessons from non-canonical splicing. Nat. Rev. Genet. 17, 407–421 (2016).Article 
CAS 
PubMed 
PubMed Central 

Google Scholar 
Glinos, D. A. et al. Transcriptome variation in human tissues revealed by long-read sequencing. Nature 608, 353–359 (2022).Article 
CAS 
PubMed 
PubMed Central 

Google Scholar 
Sharon, D., Tilgner, H., Grubert, F. & Snyder, M. A single-molecule long-read survey of the human transcriptome. Nat. Biotechnol. 31, 1009–1014 (2013).Article 
CAS 
PubMed 
PubMed Central 

Google Scholar 
Hagemann-Jensen, M. et al. Single-cell RNA counting at allele and isoform resolution using Smart-seq3. Nat. Biotechnol. 38, 708–714 (2020).Article 
CAS 
PubMed 

Google Scholar 
Salmen, F. et al. High-throughput total RNA sequencing in single cells using VASA-seq. Nat. Biotechnol. 40, 1780–1793 (2022).Article 
CAS 
PubMed 
PubMed Central 

Google Scholar 
Lebrigand, K., Magnone, V., Barbry, P. & Waldmann, R. High throughput error corrected Nanopore single cell transcriptome sequencing. Nat. Commun. 11, 4025 (2020).Article 
CAS 
PubMed 
PubMed Central 

Google Scholar 
Hardwick, S. A. et al. Single-nuclei isoform RNA sequencing unlocks barcoded exon connectivity in frozen brain tissue. Nat. Biotechnol. 40, 1082–1092 (2022).Article 
CAS 
PubMed 
PubMed Central 

Google Scholar 
Shiau, C.-K. et al. High throughput single cell long-read sequencing analyses of same-cell genotypes and phenotypes in human tumors. Nat. Commun. 14, 4124 (2023).Article 
CAS 
PubMed 
PubMed Central 

Google Scholar 
Gilbert, W. V. & Nachtergaele, S. mRNA regulation by RNA modifications. Annu. Rev. Biochem. 92, 175–198 (2023).Article 
CAS 
PubMed 
PubMed Central 

Google Scholar 
Novakovsky, G., Dexter, N., Libbrecht, M. W., Wasserman, W. W. & Mostafavi, S. Obtaining genetics insights from deep learning via explainable artificial intelligence. Nat. Rev. Genet. 24, 125–137 (2023).Article 
CAS 
PubMed 

Google Scholar 
Kainth, A. S., Haddad, G. A., Hall, J. M. & Ruthenburg, A. J. Merging short and stranded long reads improves transcript assembly. PLoS Comput. Biol. 19, e1011576 (2023).Article 
CAS 
PubMed 
PubMed Central 

Google Scholar 
Joglekar, A. et al. Single-cell long-read mRNA isoform regulation is pervasive across mammalian brain regions, cell types, and development. Preprint at bioRxiv https://doi.org/10.1101/2023.04.02.535281 (2023).Baeza-Centurion, P. et al. Deep indel mutagenesis reveals the regulatory and modulatory architecture of alternative exon splicing. Preprint at bioRxiv https://doi.org/10.1101/2024.04.21.590414 (2024).Avsec, Ž. et al. Base-resolution models of transcription-factor binding reveal soft motif syntax. Nat. Genet. 53, 354–366 (2021).Article 
CAS 
PubMed 
PubMed Central 

Google Scholar 
Karollus, A., Mauermeier, T. & Gagneur, J. Current sequence-based models capture gene expression determinants in promoters but mostly ignore distal enhancers. Genome Biol. 24, 56 (2023).Article 
PubMed 
PubMed Central 

Google Scholar 
Ji, Y., Zhou, Z., Liu, H. & Davuluri, R. V. DNABERT: pre-trained Bidirectional Encoder Representations from Transformers model for DNA-language in genome. Bioinformatics 37, 2112–2120 (2021). This paper describes the introduction of natural language processing concepts to DNA sequence modelling.Article 
CAS 
PubMed 
PubMed Central 

Google Scholar 
da Silva, P. T. et al. Nucleotide dependency analysis of DNA language models reveals genomic functional elements. Preprint at bioRxiv https://doi.org/10.1101/2024.07.27.605418 (2024).Jha, A. et al. Enhanced integrated gradients: improving interpretability of deep learning models using splicing codes as a case study. Genome Biol. 21, 149 (2020). This study goes from deep learning to testing biological insight at the bench, a great example of what is possible with crosstalk between explainable AI and experimental biology.Article 
CAS 
PubMed 
PubMed Central 

Google Scholar 
Ray, D. et al. Rapid and systematic analysis of the RNA recognition specificities of RNA-binding proteins. Nat. Biotechnol. 27, 667–670 (2009).Article 
CAS 
PubMed 

Google Scholar 
Sutandy, F. X. R. et al. In vitro iCLIP-based modeling uncovers how the splicing factor U2AF2 relies on regulation by cofactors. Genome Res. 28, 699–713 (2018).Article 
CAS 
PubMed 
PubMed Central 

Google Scholar 
Hafner, M. et al. CLIP and complementary methods. Nat. Rev. Methods Prim. 1, 20 (2021).Article 
CAS 

Google Scholar 
Briese, M. et al. A systems view of spliceosomal assembly and branchpoints with iCLIP. Nat. Struct. Mol. Biol. 26, 930–940 (2019).Article 
CAS 
PubMed 
PubMed Central 

Google Scholar 
Wolin, E. et al. SPIDR: a highly multiplexed method for mapping RNA-protein interactions uncovers a potential mechanism for selective translational suppression upon cellular stress. Preprint at bioRxiv https://www.biorxiv.org/content/10.1101/2023.06.05.543769v1 (2023).Lorenz, D. A. et al. Multiplexed transcriptome discovery of RNA-binding protein binding sites by antibody-barcode eCLIP. Nat. Methods 20, 65–69 (2023).Article 
CAS 
PubMed 

Google Scholar 
West, C. et al. nf-core/clipseq-a robust Nextflow pipeline for comprehensive CLIP data analysis. Wellcome Open Res. 8, 286 (2023).Article 
PubMed 
PubMed Central 

Google Scholar 
Katsantoni, M., van Nimwegen, E. & Zavolan, M. Improved analysis of (e)CLIP data with RCRUNCH yields a compendium of RNA-binding protein binding sites and motifs. Genome Biol. 24, 77 (2023).Article 
CAS 
PubMed 
PubMed Central 

Google Scholar 
Boyle, E. A. et al. Skipper analysis of eCLIP datasets enables sensitive detection of constrained translation factor binding sites. Cell Genom. 3, 100317 (2023).Article 
CAS 
PubMed 
PubMed Central 

Google Scholar 
Capitanchik, C. et al. Flow: a web platform and open database to analyse, store, curate and share bioinformatics data at scale. Preprint at bioRxiv https://doi.org/10.1101/2023.08.22.544179 (2023).Horlacher, M. et al. Towards in silico CLIP-seq: predicting protein-RNA interaction via sequence-to-signal learning. Genome Biol. 24, 180 (2023).Article 
CAS 
PubMed 
PubMed Central 

Google Scholar 
Zhu, H. et al. Dynamic characterization and interpretation for protein–RNA interactions across diverse cellular conditions using HDRNet. Nat. Commun. 14, 6824 (2023).Article 
CAS 
PubMed 
PubMed Central 

Google Scholar 
Shen, X. & Li, X. Reformer: deep learning model for characterizing protein-RNA interactions from sequence at single-base resolution. Preprint at bioRxiv https://doi.org/10.1101/2024.01.14.575540 (2024).Quinn, T. P., Nguyen, D., Gupta, S. & Venkatesh, S. A neural model of RNA splicing: learning motif distances with self-attention and toeplitz max pooling. Preprint at bioRxiv https://www.biorxiv.org/content/10.1101/2021.05.24.445518v1 (2021).Welzel, M., Di Liddo, A., Möckel, M. M. & Zarnack, K. FUBP1 is a general splicing factor facilitating 3′ splice site recognition and splicing of long introns. Mol. Cell 83, 2653–2672 (2023).Article 
PubMed 

Google Scholar 
Signal, B., Gloss, B. S., Dinger, M. E. & Mercer, T. R. Machine learning annotation of human branchpoints. Bioinformatics 34, 920–927 (2018).Article 
CAS 
PubMed 

Google Scholar 
Ye, R. et al. Capture RIC-seq reveals positional rules of PTBP1-associated RNA loops in splicing regulation. Mol. Cell 83, 1311–1327.e7 (2023).Article 
CAS 
PubMed 

Google Scholar 
Liu, N. et al. N6-methyladenosine-dependent RNA structural switches regulate RNA-protein interactions. Nature 518, 560–564 (2015).Article 
CAS 
PubMed 
PubMed Central 

Google Scholar 
Barrass, J. D. et al. Transcriptome-wide RNA processing kinetics revealed using extremely short 4tU labeling. Genome Biol. 16, 282 (2015).Article 
PubMed 
PubMed Central 

Google Scholar 
Spitale, R. C. & Incarnato, D. Probing the dynamic RNA structurome and its functions. Nat. Rev. Genet. 24, 178–196 (2023).Article 
CAS 
PubMed 

Google Scholar 
Rangan, R. et al. RNA structure landscape of S. cerevisiae introns. Preprint at bioRxiv https://doi.org/10.1101/2022.07.22.501175 (2024).Wang, J. et al. RNA structure profiling at single-cell resolution reveals new determinants of cell identity. Nat. Methods 21, 411–422 (2024).Article 
CAS 
PubMed 
PubMed Central 

Google Scholar 
Wang, R., Helbig, I., Edmondson, A. C., Lin, L. & Xing, Y. Splicing defects in rare diseases: transcriptomics and machine learning strategies towards genetic diagnosis. Brief. Bioinform. 24, bbad284 (2023).Article 
PubMed 
PubMed Central 

Google Scholar 
Liu, E. Y. et al. Loss of nuclear TDP-43 is associated with decondensation of LINE retrotransposons. Cell Rep. 27, 1409–1421.e6 (2019).Article 
CAS 
PubMed 
PubMed Central 

Google Scholar 
Sparber, P. et al. Deciphering the impact of coding and non-coding SCN1A gene variants on RNA splicing. Brain 147, 1278–1293 (2023).Article 

Google Scholar 
Richards, S. et al. Standards and guidelines for the interpretation of sequence variants: a joint consensus recommendation of the American College of Medical Genetics and Genomics and the Association for Molecular Pathology. Genet. Med. 17, 405–424 (2015).Article 
PubMed 
PubMed Central 

Google Scholar 
Walker, L. C. et al. Using the ACMG/AMP framework to capture evidence related to predicted and observed impact on splicing: recommendations from the ClinGen SVI Splicing Subgroup. Am. J. Hum. Genet. 110, 1046–1067 (2023).Article 
CAS 
PubMed 
PubMed Central 

Google Scholar 
Riepe et al. Benchmarking deep learning splice prediction tools using functional splice assays. Hum. Mutat. 42, 799–810 (2021).Article 
CAS 
PubMed 
PubMed Central 

Google Scholar 
Wagner, N. et al. Aberrant splicing prediction across human tissues. Nat. Genet. 55, 861–870 (2023). In this study, tissue-specific splice site usage is quantified transcriptome-wide and used to build Absplice, a model that predicts the probability that a given variant causes aberrant splicing in a given tissue.Article 
CAS 
PubMed 

Google Scholar 
Dawes, R., Joshi, H. & Cooper, S. T. Empirical prediction of variant-activated cryptic splice donors using population-based RNA-Seq data. Nat. Commun. 13, 1655 (2022).Article 
CAS 
PubMed 
PubMed Central 

Google Scholar 
Dawes, R. et al. SpliceVault predicts the precise nature of variant-associated mis-splicing. Nat. Genet. 55, 324–332 (2023).Article 
CAS 
PubMed 
PubMed Central 

Google Scholar 
Havens, M. A. & Hastings, M. L. Splice-switching antisense oligonucleotides as therapeutic drugs. Nucleic Acids Res. 44, 6549–6563 (2016).Article 
PubMed 
PubMed Central 

Google Scholar 
Baughn, M. W. et al. Mechanism of STMN2 cryptic splice-polyadenylation and its correction for TDP-43 proteinopathies. Science 379, 1140–1149 (2023).Article 
CAS 
PubMed 
PubMed Central 

Google Scholar 
Aslesh, T. & Yokota, T. Restoring SMN expression: an overview of the therapeutic developments for the treatment of spinal muscular atrophy. Cells 11, 417 (2022).Article 
CAS 
PubMed 
PubMed Central 

Google Scholar 
Villemaire, J., Dion, I., Elela, S. A. & Chabot, B. Reprogramming alternative pre-messenger RNA splicing through the use of protein-binding antisense oligonucleotides. J. Biol. Chem. 278, 50031–50039 (2003).Article 
CAS 
PubMed 

Google Scholar 
Peacey, E., Rodriguez, L., Liu, Y. & Wolfe, M. S. Targeting a pre-mRNA structure with bipartite antisense molecules modulates tau alternative splicing. Nucleic Acids Res. 40, 9836–9849 (2012).Article 
CAS 
PubMed 
PubMed Central 

Google Scholar 
Denichenko, P. et al. Specific inhibition of splicing factor activity by decoy RNA oligonucleotides. Nat. Commun. 10, 1590 (2019).Article 
PubMed 
PubMed Central 

Google Scholar 
Sergeeva, O. V., Shcherbinina, E. Y., Shomron, N. & Zatsepin, T. S. Modulation of RNA splicing by oligonucleotides: mechanisms of action and therapeutic implications. Nucleic Acid Ther. 32, 123–138 (2022).Article 
CAS 
PubMed 

Google Scholar 
Konermann, S. et al. Transcriptome engineering with RNA-targeting type VI-D CRISPR effectors. Cell 173, 665–676.e14 (2018).Article 
CAS 
PubMed 
PubMed Central 

Google Scholar 
Li, J. D., Taipale, M. & Blencowe, B. J. Efficient, specific, and combinatorial control of endogenous exon splicing with dCasRx-RBM25. Mol. Cell 84, 2573–2589 (2024).Article 
CAS 
PubMed 

Google Scholar 
Recinos, Y. et al. CRISPR-dCas13d-based deep screening of proximal and distal splicing-regulatory elements. Nat. Commun. 15, 3839 (2024).Article 
CAS 
PubMed 
PubMed Central 

Google Scholar 
Goyenvalle, A., Babbs, A., van Ommen, G.-J. B., Garcia, L. & Davies, K. E. Enhanced exon-skipping induced by U7 snRNA carrying a splicing silencer sequence: promising tool for DMD therapy. Mol. Ther. 17, 1234–1240 (2009).Article 
CAS 
PubMed 
PubMed Central 

Google Scholar 
Ling, J. P., Pletnikova, O., Troncoso, J. C. & Wong, P. C. TDP-43 repression of nonconserved cryptic exons is compromised in ALS-FTD. Science 349, 650–655 (2015). This work revealed that pathological aggregation of a splicing regulator in neurodegenerative disease results in new exons being expressed in mature mRNA, which has led to numerous potential new therapeutic approaches for these diseases.
Taskiran, I. I. et al. Cell-type-directed design of synthetic enhancers. Nature 626, 212–220 (2023).
Monteys, A. M. et al. Regulated control of gene therapies by drug-induced splicing. Nature 596, 291–295 (2021).
Ling, J. P. et al. Cell-specific regulation of gene expression using splicing-dependent frameshifting. Nat. Commun. 13, 5773 (2022).
Stanley, R. F. & Abdel-Wahab, O. Dysregulation and therapeutic targeting of RNA splicing in cancer. Nat. Cancer 3, 536–546 (2022).
Wilkins, O. G. et al. Creation of de novo cryptic splicing for ALS/FTD precision medicine. Preprint at bioRxiv https://www.biorxiv.org/content/10.1101/2023.11.15.565967 (2023). This paper presents SpliceNouveau, which enables computational design of therapeutic transgenes that are regulated by alternative splicing events; they are expressed only upon disease-activated splicing, thus ensuring that gene therapies are activated only in diseased cells and ensuring that the correct dosage is delivered via autoregulation.
Breiman, L. Statistical modeling: the two cultures (with comments and a rejoinder by the author). Stat. Sci. 16, 199–231 (2001).
Sapoval, N. et al. Current progress and open challenges for applying deep learning across the biosciences. Nat. Commun. 13, 1728 (2022).
Cheng, J., Çelik, M. H., Nguyen, T. Y. D., Avsec, Ž. & Gagneur, J. CAGI 5 splicing challenge: improved exon skipping and intron retention predictions with MMSplice. Hum. Mutat. 40, 1243–1251 (2019).
He, S. et al. Ribonanza: deep learning of RNA structure through dual crowdsourcing. Preprint at bioRxiv https://doi.org/10.1101/2024.02.24.581671 (2024).
Barbosa-Morais, N. L. et al. The evolutionary landscape of alternative splicing in vertebrate species. Science 338, 1587–1593 (2012).
Merkin, J., Russell, C., Chen, P. & Burge, C. B. Evolutionary dynamics of gene and isoform regulation in mammalian tissues. Science 338, 1593–1599 (2012). This paper and Barbosa-Morais et al. (ref. 162) calculate alternative splicing measurements across species, finding that alternative splicing is frequently lineage specific, with conservation dependent partly on the tissue in which the exon is most highly included.
Mazin, P. V., Khaitovich, P., Cardoso-Moreira, M. & Kaessmann, H. Alternative splicing during mammalian organ development. Nat. Genet. 53, 925–934 (2021). This paper finds that alternative splicing events that dynamically change during organ development are substantially more conserved than non-dynamic events.
Agarwal, V. & Kelley, D. R. The genetic and biochemical determinants of mRNA degradation rates in mammals. Genome Biol. 23, 245 (2022).
Karollus, A., Avsec, Ž. & Gagneur, J. Predicting mean ribosome load for 5′UTR of any length using deep learning. PLoS Comput. Biol. 17, e1008982 (2021).
Braun, S. et al. Decoding a cancer-relevant splicing decision in the RON proto-oncogene using high-throughput mutagenesis. Nat. Commun. 9, 3315 (2018).
Julien, P., Miñana, B., Baeza-Centurion, P., Valcárcel, J. & Lehner, B. The complete local genotype-phenotype landscape for the alternative splicing of a human exon. Nat. Commun. 7, 11558 (2016).
Baeza-Centurion, P., Miñana, B., Schmiedel, J. M., Valcárcel, J. & Lehner, B. Combinatorial genetics reveals a scaling law for the effects of mutations on splicing. Cell 176, 549–563.e23 (2019).
Ke, S. et al. Saturation mutagenesis reveals manifold determinants of exon definition. Genome Res. 28, 11–24 (2018).
Gergics, P. et al. High-throughput splicing assays identify missense and silent splice-disruptive POU1F1 variants underlying pituitary hormone deficiency. Am. J. Hum. Genet. 108, 1526–1539 (2021).
Smith, C. et al. High-throughput splicing assays identify known and novel WT1 exon 9 variants in nephrotic syndrome. Kidney Int. Rep. 8, 2117–2125 (2023).
Cortés-López, M., Schulz, L. & Enculescu, M. High-throughput mutagenesis identifies mutations and RNA-binding proteins controlling CD19 splicing and CART-19 therapy resistance. Nat. Commun. 13, 5570 (2022).
Soemedi, R. et al. Pathogenic variants that alter protein code often disrupt splicing. Nat. Genet. 49, 848–855 (2017).
Chiang, H.-L. et al. Mechanism and modeling of human disease-associated near-exon intronic variants that perturb RNA splicing. Nat. Struct. Mol. Biol. 29, 1043–1055 (2022).
Adamson, S. I., Zhan, L. & Graveley, B. R. Vex-seq: high-throughput identification of the impact of genetic variation on pre-mRNA splicing efficiency. Genome Biol. 19, 71 (2018).
Cheung, R. et al. A multiplexed assay for exon recognition reveals that an unappreciated fraction of rare genetic variants cause large-effect splicing disruptions. Mol. Cell 73, 183–194.e8 (2019).
Rosenberg, A. B., Patwardhan, R. P., Shendure, J. & Seelig, G. Learning the sequence determinants of alternative splicing from millions of random sequences. Cell 163, 698–711 (2015).
Ke, S. et al. Quantitative evaluation of all hexamers as exonic splicing elements. Genome Res. 21, 1360–1374 (2011).
Mikl, M., Hamburg, A., Pilpel, Y. & Segal, E. Dissecting splicing decisions and cell-to-cell variability with designed sequence libraries. Nat. Commun. 10, 4572 (2019).
Nguyen, E. et al. HyenaDNA: long-range genomic sequence modeling at single nucleotide resolution. Preprint at arXiv https://doi.org/10.48550/arXiv.2306.15794 (2023).
Lucks, J. B. et al. Multiplexed RNA structure characterization with selective 2′-hydroxyl acylation analyzed by primer extension sequencing (SHAPE-Seq). Proc. Natl Acad. Sci. USA 108, 11063–11068 (2011).
Lu, Z., Gong, J. & Zhang, Q. C. PARIS: psoralen analysis of RNA interactions and structures with high throughput and resolution. Methods Mol. Biol. 1649, 59–84 (2018).
Cai, Z. et al. RIC-seq for global in situ profiling of RNA-RNA spatial interactions. Nature 582, 432–437 (2020).
Turunen, J. J., Niemelä, E. H., Verma, B. & Frilander, M. J. The significant other: splicing by the minor spliceosome. Wiley Interdiscip. Rev. RNA 4, 61–76 (2013).
Zarnack, K. et al. Direct competition between hnRNP C and U2AF65 protects the transcriptome from the exonization of Alu elements. Cell 152, 453–466 (2013).
Attig, J. et al. Heteromeric RNP assembly at LINEs controls lineage-specific RNA processing. Cell 174, 1067–1081.e17 (2018).
Ilık, İ. A. et al. Autonomous transposons tune their sequences to ensure somatic suppression. Nature 626, 1116–1124 (2024).
Attig, J. et al. Splicing repression allows the gradual emergence of new Alu-exons in primate evolution. eLife 5, e19545 (2016).
Darman, R. B. et al. Cancer-associated SF3B1 hotspot mutations induce cryptic 3′ splice site selection through use of a different branch point. Cell Rep. 13, 1033–1045 (2015).
Katz, Y., Wang, E. T., Airoldi, E. M. & Burge, C. B. Analysis and design of RNA sequencing experiments for identifying isoform regulation. Nat. Methods 7, 1009–1015 (2010).
Kakaradov, B., Xiong, H. Y., Lee, L. J., Jojic, N. & Frey, B. J. Challenges in estimating percent inclusion of alternatively spliced junctions from RNA-seq data. BMC Bioinformatics 13, S11 (2012).
Venables, J. P. et al. Identification of alternative splicing markers for breast cancer. Cancer Res. 68, 9525–9531 (2008).
Pervouchine, D. D., Knowles, D. G. & Guigó, R. Intron-centric estimation of alternative splicing from RNA-seq data. Bioinformatics 29, 273–274 (2013).
Herzel, L. & Neugebauer, K. M. Quantification of co-transcriptional splicing from RNA-Seq data. Methods 85, 36–43 (2015).
Dent, C. I. et al. Quantifying splice-site usage: a simple yet powerful approach to analyze splicing. NAR Genom. Bioinform. 3, lqab041 (2021).
Jha, A., Gazzara, M. R. & Barash, Y. Integrative deep models for alternative splicing. Bioinformatics 33, i274–i282 (2017).
Wachutka, L., Caizzi, L., Gagneur, J. & Cramer, P. Global donor and acceptor splicing site kinetics in human cells. eLife 8, e45056 (2019).
Wachutka, L. & Gagneur, J. Measures of RNA metabolism rates: toward a definition at the level of single bonds. Transcription 8, 75–80 (2017).
Windhager, L. et al. Ultrashort and progressive 4sU-tagging reveals key characteristics of RNA processing at nucleotide resolution. Genome Res. 22, 2031–2042 (2012).
Schwalb, B. et al. TT-seq maps the human transient transcriptome. Science 352, 1225–1228 (2016).
Herzog, V. A. et al. Thiol-linked alkylation of RNA to assess expression dynamics. Nat. Methods 14, 1198–1204 (2017).
Yuan, J. et al. Genetic modulation of RNA splicing with a CRISPR-guided cytidine deaminase. Mol. Cell 72, 380–394.e7 (2018).
