biVI – biophysical modeling with variational autoencoders for bimodal, single-cell RNA sequencing data


Understanding how genes are expressed in individual cells is key to unlocking the mysteries of cellular functions and disease mechanisms. Single-cell RNA sequencing is a revolutionary technique that lets scientists investigate the activity of genes in separate cells. However, analyzing this vast amount of data can be daunting, especially when trying to understand the complex processes of gene expression.
biVI is a tool developed by researchers at Caltech to simplify and enhance the analysis of single-cell RNA sequencing data. It merges two powerful approaches: the variational autoencoder framework used by scVI, and detailed biophysical models of RNA behavior.
Here’s how biVI makes a difference:

Variational Autoencoder (VAE): This is a type of neural network that compresses high-dimensional data into a low-dimensional representation. In the context of single-cell RNA sequencing, VAEs help to identify and categorize different cell types based on their gene expression profiles. They reduce the complexity of the data to a more manageable form while retaining crucial information.
Biophysical Models: These models focus on the physical and chemical processes involved in RNA production and processing, such as how RNA molecules are synthesized, modified, and broken down. By integrating these models, biVI can provide insights into the dynamic aspects of gene expression, such as how many RNA molecules are produced in each burst of transcription (burst sizes) and how quickly they degrade.
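To make the "compression" step concrete, here is a minimal sketch of a VAE-style encoder in Python with NumPy. It is an illustration only, not biVI's or scVI's actual implementation: the function name, the layer sizes, the log transform, and the random, untrained weights are all assumptions chosen for the example.

```python
import numpy as np

def encode(counts, w_hidden, w_mu, w_logvar, rng):
    """Map a cells-by-genes count matrix to sampled latent coordinates z."""
    x = np.log1p(counts)                  # tame the dynamic range of raw counts
    h = np.maximum(x @ w_hidden, 0.0)     # one hidden layer with ReLU activation
    mu = h @ w_mu                         # mean of the approximate posterior q(z | x)
    logvar = h @ w_logvar                 # log-variance of q(z | x)
    eps = rng.standard_normal(mu.shape)   # standard normal noise
    z = mu + np.exp(0.5 * logvar) * eps   # reparameterization trick
    return z, mu, logvar

# Hypothetical toy data: 100 cells, 50 genes, 2 latent dimensions.
rng = np.random.default_rng(0)
n_cells, n_genes, n_hidden, n_latent = 100, 50, 16, 2
counts = rng.poisson(3.0, size=(n_cells, n_genes))
w_hidden = rng.normal(0.0, 0.1, size=(n_genes, n_hidden))
w_mu = rng.normal(0.0, 0.1, size=(n_hidden, n_latent))
w_logvar = rng.normal(0.0, 0.1, size=(n_hidden, n_latent))

z, mu, logvar = encode(counts, w_hidden, w_mu, w_logvar, rng)
print(z.shape)
```

In a trained model the weights would be learned by maximizing a likelihood of the observed counts, so that cells with similar expression profiles end up close together in z.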

biVI reinterprets and extends scVI to infer biophysical parameters

a, scVI can take in concatenated nascent (N) and mature (M) RNA count matrices, encode each cell with neural networks (NN) to a low-dimensional space z and learn per-cell parameters and per-gene parameters for independent nascent and mature count distributions, which are by default negative binomial distributions (PNB). This is not motivated by a biophysical model. b, The telegraph model of transcription: a gene locus has the on rate k, the off rate koff and the RNA polymerase binding rate kRNAP. Nascent RNA molecules are produced in geometrically distributed bursts with mean b = kRNAP/koff, which are spliced and degraded at rates β and γ, respectively. The model’s steady-state distribution can be approximated by a pretrained neural network F and a set of basis functions. c, biVI takes in nascent and mature count matrices, produces a low-dimensional representation for each cell and outputs per-cell and per-gene parameters for a mechanistically motivated joint distribution of nascent and mature counts.
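The bursty model in panel b can be explored with a small stochastic simulation. The sketch below is a plain Gillespie simulation in Python, not the neural-network approximation F mentioned in the caption and not code from the biVI authors. The function name and the parameter values are hypothetical; k is treated as the rate at which bursts arrive, b as the mean burst size, and β and γ as the splicing and degradation rates. Under these assumptions the long-run averages should approach k·b/β nascent and k·b/γ mature molecules.

```python
import numpy as np

def simulate_bursty(k, b, beta, gamma, t_end, rng):
    """Gillespie simulation of bursty transcription.

    Returns the time-averaged nascent and mature RNA counts over [0, t_end].
    """
    t, n, m = 0.0, 0, 0            # time, nascent count, mature count
    area_n = area_m = 0.0          # time integrals of the two counts
    while True:
        splice_rate = beta * n
        total = k + splice_rate + gamma * m
        dt = rng.exponential(1.0 / total)       # waiting time to the next event
        if t + dt >= t_end:
            area_n += n * (t_end - t)
            area_m += m * (t_end - t)
            break
        area_n += n * dt
        area_m += m * dt
        t += dt
        u = rng.random() * total                # choose which event fires
        if u < k:
            # Burst: geometric size on {0, 1, 2, ...} with mean b.
            n += rng.geometric(1.0 / (1.0 + b)) - 1
        elif u < k + splice_rate:
            n -= 1                              # splicing: nascent -> mature
            m += 1
        else:
            m -= 1                              # degradation of mature RNA
    return area_n / t_end, area_m / t_end

# Hypothetical parameter values, chosen only for illustration.
rng = np.random.default_rng(0)
k, b, beta, gamma = 2.0, 5.0, 1.0, 0.5
mean_n, mean_m = simulate_bursty(k, b, beta, gamma, t_end=2000.0, rng=rng)
print("nascent: simulated", mean_n, "expected", k * b / beta)
print("mature:  simulated", mean_m, "expected", k * b / gamma)
```

Because the simulation is stochastic, the simulated averages only approximate the expected values, and the agreement improves as t_end grows.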
What Makes biVI Special?
biVI stands out because it combines the best features of both approaches. It retains the VAE’s ability to distill complex gene expression data into understandable patterns, while also offering a deeper look into the fundamental processes governing RNA behavior. This dual approach allows researchers to not only identify different cell types but also explore the underlying mechanisms that influence gene expression.
In practical terms, this means that biVI can help scientists understand not just which genes are active in different cells, but also which steps of RNA production and turnover account for those differences. For example, it can reveal how many RNA molecules cells produce per burst and how quickly those molecules break down, providing a more comprehensive picture of cellular functions.
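As a rough illustration of how count data constrain burst size, consider the bursty model at steady state: nascent counts follow a negative binomial distribution whose variance-to-mean ratio equals 1 + b, where b is the mean burst size. The Python sketch below uses that relation for a simple method-of-moments estimate. This is a deliberately simplified stand-in, not biVI's inference procedure (biVI fits these parameters through the variational autoencoder), and the function name and synthetic data are assumptions made for the example.

```python
import numpy as np

def estimate_burst_parameters(nascent_counts):
    """Method-of-moments estimate under a negative binomial (bursty) model.

    Returns (burst size b, burst frequency relative to the splicing rate).
    """
    mean = np.mean(nascent_counts)
    var = np.var(nascent_counts, ddof=1)
    burst_size = var / mean - 1.0       # variance / mean = 1 + b
    rel_burst_freq = mean / burst_size  # mean = (k / beta) * b
    return burst_size, rel_burst_freq

# Synthetic nascent counts for one gene: true b = 5, true k / beta = 2.
rng = np.random.default_rng(0)
true_b, true_rel_freq = 5.0, 2.0
counts = rng.negative_binomial(true_rel_freq, 1.0 / (1.0 + true_b), size=200_000)

b_hat, freq_hat = estimate_burst_parameters(counts)
print("estimated burst size:", b_hat)
print("estimated k / beta:", freq_hat)
```

Note that only ratios of rates are identifiable from a steady-state snapshot, which is why the sketch reports burst frequency relative to the splicing rate rather than in absolute units.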
Applications and Impact
The insights gained from biVI are valuable for various fields of research, from studying development and disease to exploring how cells respond to treatments. By offering a clearer view of RNA dynamics and gene expression, biVI paves the way for new discoveries and more effective therapeutic strategies.
In summary, biVI represents a significant advancement in the analysis of single-cell RNA sequencing data. It enhances our ability to understand the complex world of gene expression and RNA behavior, offering new opportunities for scientific discovery and medical breakthroughs.

Carilli M, Gorin G, Choi Y, Chari T, Pachter L. (2024) Biophysical modeling with variational autoencoders for bimodal, single-cell RNA sequencing data. Nat Methods [Epub ahead of print].
