scASfind – mining alternative splicing patterns in scRNA-seq data


Single-cell RNA sequencing (scRNA-seq) is a powerful tool researchers use to measure gene activity in individual cells. However, most scRNA-seq studies quantify expression at the level of whole genes, missing a finer level of detail called alternative splicing.
What is Alternative Splicing?
Alternative splicing is like a gene’s way of producing different versions of its instructions. Imagine a recipe where you can skip certain steps or swap ingredients to create variations of the same dish. Similarly, genes can include or exclude certain pieces of their instructions (called exons) to produce different proteins. This process is vital because it increases the diversity of proteins a single gene can produce, playing a critical role in how cells function and respond to different conditions.
Introducing scASfind
To better understand alternative splicing, researchers at the Wellcome Sanger Institute have developed a new computational method called scASfind. This innovative tool allows scientists to analyze cell type-specific splicing events using full-length scRNA-seq data. In simpler terms, scASfind helps researchers zoom in on the different ways genes can be spliced in various cell types.
Overview of scASfind

a Schematic of the scASfind workflow. Single-cell full-length transcriptome sequencing data such as Smart-seq2 or VASA-seq are suitable inputs for the scASfind workflow. Cells from the same cell type are pooled with MicroExonator (default 5 cells per pool) to increase the accuracy of splicing event detection. The PSI value for each splicing node is calculated by Whippet to obtain a splice node-by-cell pool PSI matrix, and we then build a scASfind index containing information about splicing events that are differentially spliced in or spliced out in each cell pool. Finally, we query the index to search for cell type-specific differential splicing events, mutually exclusive node pairs, and consecutive nodes that are similarly spliced in, or coordinated splicing events.
b The size of the file saved to disk containing either the raw PSI values and metadata objects or the scASfind index with metadata objects built with a two-bit quantization.
c The elapsed time of searching for all cells with increased inclusion, i.e., a PSI at least 0.2 higher than the dataset mean, in any of five randomly selected splicing nodes. The process is repeated 30 times. The bar in the boxplot shows the arithmetic mean, lower and upper hinges correspond to the first and third quartiles, whiskers extend from the hinge to the largest value no further than 1.5 * interquartile range from the hinge, and outliers beyond this range are plotted as individual data points.
PSI, percent spliced-in.
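The query timed in panel c can be illustrated with a small sketch. This is not the scASfind implementation or its API; it is a plain NumPy approximation that assumes a dense splice node-by-cell pool PSI matrix, and the function name `increased_inclusion` and the parameter `min_delta` are hypothetical names chosen for illustration.

```python
import numpy as np


def increased_inclusion(psi, node_ids, min_delta=0.2):
    """Return indices of cell pools with increased inclusion in ANY selected node.

    psi       : 2-D array, rows = splicing nodes, columns = cell pools;
                NaN marks a node that was not quantified in a pool.
    node_ids  : row indices of the splicing nodes to query.
    min_delta : how far above the node's dataset mean a PSI must be
                (0.2 follows the figure caption; the name is an assumption).
    """
    sub = np.asarray(psi, dtype=float)[list(node_ids), :]
    # Mean PSI of each queried node across all pools, ignoring missing values.
    node_mean = np.nanmean(sub, axis=1, keepdims=True)
    with np.errstate(invalid="ignore"):
        # Comparisons involving NaN evaluate to False, so unquantified
        # entries never count as hits.
        hits = (sub - node_mean) >= min_delta
    # A pool matches if at least one queried node shows increased inclusion.
    return np.flatnonzero(hits.any(axis=0))


# Toy matrix: 3 splicing nodes x 4 cell pools.
psi = np.array([
    [0.1, 0.1, 0.1, 0.9],
    [0.5, 0.5, np.nan, 0.5],
    [0.8, 0.0, 0.0, 0.0],
])
matching_pools = increased_inclusion(psi, [0, 2])
```

A scan like this touches every raw PSI value on each query. The point of the index described in the caption is to avoid that cost by storing compact, quantized information about which events are spliced in or out in each pool.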
How Does scASfind Work?

Efficient Data Storage: scASfind uses a compact way to store information about splicing events. Rather than keeping every raw value, it builds a compressed index from the “percent spliced-in” (PSI) value of each event, which indicates how often a particular exon is included in the final gene product.
Exhaustive Search for Patterns: By organizing the data efficiently, scASfind can quickly search through all the splicing events to find patterns. This means it can identify unique splicing events that occur in specific cell types.
Identifying Key Events: scASfind can spot three important types of splicing events:

Marker Events: These are splicing events unique to a particular cell type, acting as a signature or marker.
Mutually Exclusive Events: These are pairs of exons where including one means excluding the other, creating distinct gene variants.
Large Block Events: These involve runs of consecutive exons being spliced in or out together in a way specific to certain cell types.
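The first two event types can be sketched on a toy example. The code below is an illustrative simplification, not the scASfind algorithm: it assumes the PSI values have already been reduced to a boolean "spliced-in" matrix (splicing nodes by cell pools), scores markers with plain precision and recall, and treats two nodes as mutually exclusive if they are never spliced in within the same pool. The function names are hypothetical.

```python
import numpy as np


def marker_scores(spliced_in, cell_types, target):
    """Precision and recall of each splicing node as a marker of `target`.

    spliced_in : boolean matrix, rows = splicing nodes, columns = cell pools.
    cell_types : cell type label of each pool (one per column).
    Assumes at least one pool carries the `target` label.
    """
    m = np.asarray(spliced_in, dtype=bool)
    in_target = np.asarray(cell_types) == target
    true_pos = (m & in_target).sum(axis=1)  # spliced in AND in target type
    called = m.sum(axis=1)                  # spliced in anywhere
    precision = np.divide(true_pos, called,
                          out=np.zeros(len(called)), where=called > 0)
    recall = true_pos / in_target.sum()
    return precision, recall


def mutually_exclusive_pairs(spliced_in, min_pools=1):
    """Pairs of nodes that are never spliced in within the same pool.

    Each node of a pair must be spliced in in at least `min_pools` pools.
    """
    m = np.asarray(spliced_in, dtype=bool).astype(int)
    co = m @ m.T               # co[i, j] = pools where both i and j are in
    support = np.diag(co)      # pools where each node is spliced in
    n = m.shape[0]
    return [(i, j) for i in range(n) for j in range(i + 1, n)
            if co[i, j] == 0
            and support[i] >= min_pools and support[j] >= min_pools]


# Toy data: 3 nodes x 4 pools, two pools per cell type.
spliced_in = [[1, 1, 0, 0],
              [0, 0, 1, 1],
              [1, 1, 1, 1]]
cell_types = ["A", "A", "B", "B"]

precision, recall = marker_scores(spliced_in, cell_types, "A")
pairs = mutually_exclusive_pairs(spliced_in)
```

In this toy data, node 0 is a perfect marker of cell type A, node 2 is spliced in everywhere and so has low precision, and nodes 0 and 1 form a mutually exclusive pair. A realistic analysis would also restrict candidate pairs to nodes within the same gene, which this sketch leaves out.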

Why is This Important?
Understanding alternative splicing is essential because it can reveal how different cell types function and how they respond to changes. For instance, it can help scientists discover why certain cells become cancerous or how they react to treatments. By providing a detailed map of splicing events, scASfind opens up new possibilities for diagnosing diseases and developing targeted therapies.
Conclusion
scASfind represents a significant step forward in genetics research. By focusing on the intricate process of alternative splicing, this tool provides a deeper understanding of how genes create diversity in cells. With scASfind, researchers can uncover the hidden details of gene regulation, paving the way for new insights into health and disease.

Song Y, Parada G, Lee JTH, Hemberg M. (2024) Mining alternative splicing patterns in scRNA-seq data using scASfind. Genome Biol 25(1):197.
