Interpretable discovery of patterns in tabular data via spatially semantic topographic maps

Shilo, S., Rossman, H. & Segal, E. Axes of a revolution: challenges and promises of big data in healthcare. Nat. Med. 26, 29–38 (2020).
Obermeyer, Z. & Emanuel, E. J. Predicting the future: big data, machine learning, and clinical medicine. N. Engl. J. Med. 375, 1216–1219 (2016).
Marx, V. The big challenges of big data. Nature 498, 255–260 (2013).
Wu, X., Zhu, X., Wu, G.-Q. & Ding, W. Data mining with big data. IEEE Trans. Knowl. Data Eng. 26, 97–107 (2013).
LaValle, S., Lesser, E., Shockley, R., Hopkins, M. S. & Kruschwitz, N. Big data, analytics and the path from insights to value. MIT Sloan Manage. Rev. 52, 21–32 (2011).
Xing, L., Giger, M. L. & Min, J. K. Artificial Intelligence in Medicine: Technical Basis and Clinical Applications (Academic Press, 2020).
Wee-Chung Liew, A., Yan, H. & Yang, M. Pattern recognition techniques for the emerging field of bioinformatics: a review. Pattern Recognit. 38, 2055–2073 (2005).
Tang, B., Pan, Z., Yin, K. & Khateeb, A. Recent advances of deep learning in bioinformatics and computational biology. Front. Genet. 10, 214 (2019).
Karim, M. R. et al. Deep learning-based clustering approaches for bioinformatics. Brief. Bioinform. 22, 393–415 (2021).
Kiselev, V. Y., Andrews, T. S. & Hemberg, M. Challenges in unsupervised clustering of single-cell RNA-seq data. Nat. Rev. Genet. 20, 273–282 (2019).
Nelder, J. A. & Wedderburn, R. W. M. Generalized linear models. J. R. Stat. Soc. A 135, 370–384 (1972).
Tolles, J. & Meurer, W. J. Logistic regression: relating patient characteristics to outcomes. JAMA 316, 533–534 (2016).
Breiman, L. Random forests. Mach. Learn. 45, 5–32 (2001).
Chen, T. & Guestrin, C. Xgboost: a scalable tree boosting system. In Proc. 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining 785–794 (Association for Computing Machinery, 2016).
Alipanahi, B., Delong, A., Weirauch, M. T. & Frey, B. J. Predicting the sequence specificities of DNA- and RNA-binding proteins by deep learning. Nat. Biotechnol. 33, 831–838 (2015).
Ronao, C. A. & Cho, S.-B. Human activity recognition with smartphone sensors using deep learning neural networks. Expert Syst. Appl. 59, 235–244 (2016).
Arik, S. Ö. & Pfister, T. Tabnet: attentive interpretable tabular learning. Proc. AAAI Conf. Artif. Intell. 35, 6679–6687 (2021).
Huang, X., Khetan, A., Cvitkovic, M. & Karnin, Z. Tabtransformer: tabular data modeling using contextual embeddings. Preprint at https://arxiv.org/abs/2012.06678 (2020).
Kadra, A., Lindauer, M., Hutter, F. & Grabocka, J. Well-tuned simple nets excel on tabular datasets. Adv. Neural Inf. Process. Syst. 34, 23928–23941 (2021).
Borisov, V. et al. Deep neural networks and tabular data: a survey. IEEE Trans. Neural Netw. Learn. Syst. 35, 7499–7519 (2022).
Gorishniy, Y., Rubachev, I., Khrulkov, V. & Babenko, A. Revisiting deep learning models for tabular data. Adv. Neural Inf. Process. Syst. 34, 18932–18943 (2021).
Shwartz-Ziv, R. & Armon, A. Tabular data: deep learning is not all you need. Inf. Fusion 81, 84–90 (2022).
Zhu, Y. et al. Converting tabular data into images for deep learning with convolutional neural networks. Sci. Rep. 11, 11325 (2021).
Anguita, D., Ghio, A., Oneto, L., Parra, X. & Reyes-Ortiz, J. L. Human activity recognition on smartphones using a multiclass hardware-friendly support vector machine. In Ambient Assisted Living and Home Care. 4th International Workshop IWAAL 2012 (eds Bravo, J. et al.) 216–223 (Springer, 2012).
Jayaram, N. & Baker, J. W. Correlation model for spatially distributed ground-motion intensities. Earthq. Eng. Struct. Dyn. 38, 1687–1708 (2009).
ElShawi, R., Sherif, Y., Al-Mallah, M. & Sakr, S. Interpretability in healthcare: a comparative study of local machine learning interpretability techniques. Comput. Intell. 37, 1633–1650 (2021).
Tjoa, E. & Guan, C. A survey on explainable artificial intelligence (xai): toward medical xai. IEEE Trans. Neural Netw. Learn. Syst. 32, 4793–4813 (2020).
Yu, K.-H., Beam, A. L. & Kohane, I. S. Artificial intelligence in healthcare. Nat. Biomed. Eng. 2, 719–731 (2018).
Shortliffe, E. H. & Sepúlveda, M. J. Clinical decision support in the era of artificial intelligence. JAMA 320, 2199–2200 (2018).
Sharma, A., Vans, E., Shigemizu, D., Boroevich, K. A. & Tsunoda, T. Deepinsight: a methodology to transform a non-image data to an image for convolution neural network architecture. Sci. Rep. 9, 11399 (2019).
Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 31, 4768–4777 (2017).
Savas, P. et al. Single-cell profiling of breast cancer T cells reveals a tissue-resident memory subset associated with improved prognosis. Nat. Med. 24, 986–993 (2018).
Jia, J., Li, H., Huang, Z., Yu, J. & Cao, B. Comprehensive immune landscape of lung-resident memory CD8+ T cells after influenza infection and reinfection in a mouse model. Front. Microbiol. 14, 1184884 (2023).
Lelliott, E. J. et al. NKG7 enhances CD8+ T cell synapse efficiency to limit inflammation. Front. Immunol. 13, 931630 (2022).
Wen, T. et al. NKG7 is a T-cell–intrinsic therapeutic target for improving antitumor cytotoxicity and cancer immunotherapy. Cancer Immunol. Res. 10, 162–181 (2022).
Ting, D. S. W., Carin, L., Dzau, V. & Wong, T. Y. Digital technology and COVID-19. Nat. Med. 26, 459–461 (2020).
LeCun, Y., Bengio, Y. & Hinton, G. Deep learning. Nature 521, 436–444 (2015).
Bazgir, O. et al. Representation of features as images with neighborhood dependencies for compatibility with convolutional neural networks. Nat. Commun. 11, 4391 (2020).
Shavitt, I. & Segal, E. Regularization learning networks: deep learning for tabular datasets. Adv. Neural Inf. Process. Syst. 31, 1386–1396 (2018).
Kossen, J. et al. Self-attention between datapoints: going beyond individual input–output pairs in deep learning. Adv. Neural Inf. Process. Syst. 34, 28742–28756 (2021).
Zhou, B., Khosla, A., Lapedriza, A., Oliva, A. & Torralba, A. Learning deep features for discriminative localization. In Proc. IEEE Conference on Computer Vision and Pattern Recognition 2921–2929 (IEEE, 2016).
Selvaraju, R. R. et al. Grad-cam: visual explanations from deep networks via gradient-based localization. In Proc. IEEE International Conference on Computer Vision 618–626 (IEEE, 2017).
Ribeiro, M. T., Singh, S. & Guestrin, C. “Why should i trust you?”: explaining the predictions of any classifier. In Proc. 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining 1135–1144 (Association for Computing Machinery, 2016).
Peyré, G. et al. Computational optimal transport: with applications to data science. Found. Trends Mach. Learn. 11, 355–607 (2019).
Moon, I. et al. Machine learning for genetics-based classification and treatment response prediction in cancer of unknown primary. Nat. Med. 29, 2057–2067 (2023).
Peyré, G., Cuturi, M. & Solomon, J. Gromov–Wasserstein averaging of kernel and distance matrices. In International Conference on Machine Learning 2664–2672 (PMLR, 2016).
Cuturi, M. Sinkhorn distances: lightspeed computation of optimal transport. Adv. Neural Inf. Process. Syst. 26, 2292–2300 (2013).
Crouse, D. F. On implementing 2D rectangular assignment algorithms. IEEE Trans. Aerosp. Electron. Syst. 52, 1679–1696 (2016).
Shapley, L. S. in Contributions to the Theory of Games II (eds Kuhn, H. W. & Tucker, A. W.) 307–317 (Princeton Univ. Press, 1953).
Deng, X. & Papadimitriou, C. H. On the complexity of cooperative solution concepts. Math. Oper. Res. 19, 257–266 (1994).
Datta, A., Sen, S. & Zick, Y. Algorithmic transparency via quantitative input influence: theory and experiments with learning systems. In 2016 IEEE Symposium on Security and Privacy (SP) 598–617 (IEEE, 2016).
Štrumbelj, E. & Kononenko, I. Explaining prediction models and individual predictions with feature contributions. Knowl. Inf. Syst. 41, 647–665 (2014).
Shrikumar, A., Greenside, P. & Kundaje, A. Learning important features through propagating activation differences. In International Conference on Machine Learning 3145–3153 (PMLR, 2017).
Sakar, C., Serbes, G., Gunduz, A., Nizam, H. & Sakar, B. Parkinson’s disease classification. UCI Machine Learning Repository https://doi.org/10.24432/C5MS4X (2018).
Mansouri, K., Ringsted, T., Ballabio, D., Todeschini, R. & Consonni, V. QSAR biodegradation. UCI Machine Learning Repository https://doi.org/10.24432/C5H60M (2013).
Reyes-Ortiz, J., Anguita, D., Ghio, A., Oneto, L. & Parra, X. Human activity recognition using smartphones. UCI Machine Learning Repository https://doi.org/10.24432/C54S4K (2012).
Mah, P. & Veyrieras, J.-B. MicroMass. UCI Machine Learning Repository https://doi.org/10.24432/C5T61S (2013).
Guyon, I., Gunn, S., Ben-Hur, A. & Dror, G. Arcene. UCI Machine Learning Repository https://doi.org/10.24432/C58P55 (2008).
Cole, R. & Fanty, M. ISOLET. UCI Machine Learning Repository https://doi.org/10.24432/C51G69 (1994).
Lathrop, R. p53 Mutants. UCI Machine Learning Repository https://doi.org/10.24432/C5T89H (2010).
Wolberg, W., Mangasarian, O., Street, N. & Street, W. Breast cancer Wisconsin (diagnostic). UCI Machine Learning Repository https://doi.org/10.24432/C5DW2B (1995).
Bhattacharjee, A. et al. Classification of human lung carcinomas by mRNA expression profiling reveals distinct adenocarcinoma subclasses. Proc. Natl Acad. Sci. USA 98, 13790–13795 (2001).
Li, J. et al. Feature selection: a data perspective. ACM Comput. Surv. 50, 1–45 (2017).
Li, J. et al. scikit-feature feature selection repository. GitHub https://jundongl.github.io/scikit-feature (2018).
UCI Machine Learning Repository; https://archive.ics.uci.edu
