STGAT – integrating spatial transcriptomics and bulk RNA-seq to predict gene expression with enhanced resolution


In cancer research, understanding how genes are expressed spatially within tumor tissue is crucial for uncovering tumor complexity and identifying potential therapeutic targets. Spatial transcriptomics plays a vital role in this effort, offering detailed insight into the spatial organization of gene activity, which can vary significantly across different regions of a tumor.
Traditionally, large-scale cancer studies have relied heavily on bulk RNA sequencing (RNA-seq) combined with Whole Slide Image (WSI) data to analyze gene expression patterns. While effective, this approach often overlooks spatial nuances within tumors, limiting our ability to fully grasp tumor heterogeneity and identify biomarkers critical for prognosis and treatment.
To bridge this gap, a team led by researchers at the University of Central Florida has developed STGAT (Spatial Transcriptomics Graph Attention Network). STGAT uses Graph Attention Networks (GAT) to model spatial dependencies among the spots within a tissue sample. Once trained on spatial transcriptomics data, STGAT can estimate gene expression profiles at near-cellular resolution, even when only WSI and bulk RNA-seq data are available.
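To make the idea of attention over neighboring spots concrete, here is a minimal sketch of a generic single-head graph attention layer in the standard GAT formulation. It is not STGAT's actual architecture: the 2x2 grid, the adjacency rule, the feature values, and the parameters `W` and `a` are all made up for illustration, and the function names are hypothetical.

```python
import math

def leaky_relu(x, slope=0.2):
    return x if x > 0 else slope * x

def linear(W, h):
    """Multiply weight matrix W (list of rows) by feature vector h."""
    return [sum(w * x for w, x in zip(row, h)) for row in W]

def gat_layer(features, neighbors, W, a):
    """Single-head graph attention over spots (illustrative, not STGAT's code).

    features:  one feature vector per spot
    neighbors: neighbors[i] lists the spots that spot i attends to (itself included)
    W:         shared linear projection, shape (out_dim, in_dim)
    a:         attention vector of length 2 * out_dim
    Returns (updated_features, attention_weights).
    """
    z = [linear(W, h) for h in features]
    updated, attention = [], []
    for i, nbrs in enumerate(neighbors):
        # Unnormalized score for each neighbor j: LeakyReLU(a . [z_i || z_j])
        scores = [leaky_relu(sum(ak * v for ak, v in zip(a, z[i] + z[j])))
                  for j in nbrs]
        # Softmax over the neighborhood (shifted by the max for stability)
        top = max(scores)
        exps = [math.exp(s - top) for s in scores]
        total = sum(exps)
        alphas = [e / total for e in exps]
        # New representation: attention-weighted sum of projected neighbors
        updated.append([sum(al * z[j][k] for al, j in zip(alphas, nbrs))
                        for k in range(len(z[i]))])
        attention.append(alphas)
    return updated, attention

# Toy 2x2 grid of spots; two spots are linked when they are grid-adjacent
# (Manhattan distance <= 1, which also links each spot to itself).
coords = [(0, 0), (0, 1), (1, 0), (1, 1)]
neighbors = [[j for j, (r2, c2) in enumerate(coords)
              if abs(r1 - r2) + abs(c1 - c2) <= 1]
             for (r1, c1) in coords]
features = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5], [1.0, 1.0]]  # made-up spot features
W = [[1.0, 0.0], [0.0, 1.0]]   # identity projection, for readability
a = [0.5, -0.5, 1.0, 0.25]     # arbitrary attention parameters

updated, attention = gat_layer(features, neighbors, W, a)
```

Each row of `attention` sums to 1, so every spot's updated representation is a weighted average of its own projected features and those of its neighbors. In a trained network `W` and `a` are learned, which is what lets the model decide how much each neighboring spot should contribute.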
An overall illustration of the proposed framework

The Spot Embedding Generator (SEG) produces embeddings for spot images extracted from a WSI. The Gene Expression Predictor (GEP) then estimates spot-level gene expression profiles for the WSI from the embeddings generated by SEG and the bulk RNA-seq gene expression data of the corresponding WSI. The Spot Label Predictor (SLP) classifies each spot on the WSI as either tumor or non-tumor. In this study, SEG is trained and evaluated on spatial transcriptomics data, GEP is trained and evaluated on TCGA data, and SLP is trained on spatial transcriptomics data and applied to TCGA data.
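The data flow between the three modules can be sketched as follows. Only the order of the steps and their inputs and outputs follow the description above; the bodies of `seg`, `gep`, and `slp` are toy placeholders, not the authors' models. In particular, the rescaling rule inside `gep`, the choice to feed embeddings to `slp`, and the intensity threshold are assumptions made purely so the example runs.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class SpotPrediction:
    expression: List[float]  # estimated expression per gene for this spot
    is_tumor: bool           # tumor / non-tumor label for this spot

def seg(spot_image: List[float]) -> List[float]:
    """Placeholder for SEG: turn a spot image (flattened pixels) into an embedding."""
    mean = sum(spot_image) / len(spot_image)
    return [mean, max(spot_image) - min(spot_image)]

def gep(embeddings: List[List[float]], bulk: List[float]) -> List[List[float]]:
    """Placeholder for GEP: combine spot embeddings with the slide's bulk profile.

    Toy rule (an assumption): scale the bulk profile by each spot's relative
    embedding intensity, so the mean over spots reproduces the bulk value.
    """
    weights = [e[0] for e in embeddings]
    mean_weight = sum(weights) / len(weights)
    return [[g * w / mean_weight for g in bulk] for w in weights]

def slp(embeddings: List[List[float]], threshold: float = 0.5) -> List[bool]:
    """Placeholder for SLP: call a spot tumor when its intensity exceeds a threshold."""
    return [e[0] > threshold for e in embeddings]

def run_pipeline(spot_images: List[List[float]], bulk: List[float]) -> List[SpotPrediction]:
    embeddings = [seg(img) for img in spot_images]  # SEG: spot image -> embedding
    expression = gep(embeddings, bulk)              # GEP: embeddings + bulk -> spot expression
    labels = slp(embeddings)                        # SLP: tumor vs non-tumor per spot
    return [SpotPrediction(e, t) for e, t in zip(expression, labels)]

spot_images = [[0.2, 0.4], [0.6, 0.8], [0.9, 0.7]]  # three made-up spots, two pixels each
bulk_profile = [10.0, 4.0]                          # made-up bulk expression for two genes
predictions = run_pipeline(spot_images, bulk_profile)
```

The point of the sketch is the interface: a slide enters as a set of spot images plus one bulk profile, and leaves as one expression vector and one tumor label per spot.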
The significance of STGAT lies in its ability to enhance the resolution of gene expression analysis in tumor tissues. It not only predicts whether a spot represents tumor or non-tumor tissue but also identifies subtle variations in gene activity that may be crucial for understanding tumor progression and response to treatment.
Recent studies using STGAT on breast cancer datasets have demonstrated its superiority over existing methods in accurately predicting gene expression profiles. By focusing on the tumor-only spots identified by STGAT, the researchers uncovered more precise molecular signatures that correlate with breast cancer subtypes and tumor stages. These insights have implications for improving patient outcomes, including survival rates and disease-free intervals.
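As a rough illustration of what focusing on tumor-only spots can mean in practice, the sketch below averages spot-level expression over the spots labeled as tumor and compares it with the average over all spots. Averaging is an assumption chosen for simplicity, not necessarily the aggregation used in the study, and the function name and numbers are hypothetical.

```python
from typing import List, Sequence

def tumor_only_profile(spot_expression: Sequence[Sequence[float]],
                       is_tumor: Sequence[bool]) -> List[float]:
    """Average per-gene expression over the spots labeled as tumor."""
    tumor_spots = [expr for expr, flag in zip(spot_expression, is_tumor) if flag]
    if not tumor_spots:
        raise ValueError("no tumor spots to aggregate")
    n_genes = len(tumor_spots[0])
    return [sum(spot[g] for spot in tumor_spots) / len(tumor_spots)
            for g in range(n_genes)]

# Made-up spot-level estimates for three spots and two genes.
spot_expression = [[2.0, 8.0], [6.0, 1.0], [4.0, 3.0]]
is_tumor = [False, True, True]

all_spots = tumor_only_profile(spot_expression, [True] * 3)  # every spot included
tumor_only = tumor_only_profile(spot_expression, is_tumor)   # non-tumor spot dropped
```

In this toy case the non-tumor spot dominates the second gene, so the all-spot average and the tumor-only average differ noticeably. That is the kind of dilution a tumor-only signature is meant to avoid.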
Moving forward, the integration of spatial transcriptomics with advanced computational tools like STGAT promises to revolutionize cancer research. It opens avenues for reanalyzing existing cohort studies to discover novel biomarkers and therapeutic targets that were previously obscured. Ultimately, this approach holds the potential to personalize cancer treatment strategies, offering hope for more effective therapies tailored to individual patients’ molecular profiles.
Availability: Code is available at https://github.com/compbiolabucf/STGAT.

Baul S, Tanvir Ahmed K, Jiang Q, Wang G, Li Q, Yong J, Zhang W. (2024) Integrating spatial transcriptomics and bulk RNA-seq: predicting gene expression with enhanced resolution through graph attention networks. Brief Bioinform 25(4):bbae316.
