Biophysically interpretable inference of cell types from multimodal sequencing data

