Nanodynamo quantifies subcellular RNA dynamics revealing extensive coupling between steps of the RNA life cycle

Nanodynamo: model formulation, parameters identifiability and inferenceWe modelled the RNA life cycle with a set of deterministic Ordinary Differential Equations (ODEs) describing the temporal modulation of RNA species in various RNA pools, including cellular compartments and polysomes. More in detail, premature RNA associated with chromatin is transcribed at rate k1 and is either co-transcriptionally processed into its mature form or detached from chromatin to become premature nucleoplasmic RNA with rates k2 and k4, respectively. Premature nucleoplasmic RNA is then post-transcriptionally spliced into mature nucleoplasmic RNA with rate k5. The latter is also produced from the detachment of mature RNA from chromatin with rate k3. Mature nucleoplasmic RNA is then exported to cytoplasm at rate k6 to become cytoplasmic RNA, which can be finally either directly degraded at rate k7 or associated with actively translating polysomes at rate k8 and subsequently degraded at rate k9 (Fig. 1A, B).Fig. 1: Nanodynamo: model formulation, parameters identifiability and inference validation.A The nuclear and cytoplasmic steps of the RNA life cycle considered by Nanodynamo, with the corresponding kinetic rates (squared brackets). Coloured ellipses represent the profiled RNA pools, including cellular and polysomal fractions. B Mathematical formulation in terms of Ordinary Differential Equations of the model presented in A. Ch, N, C, and P denote chromatin associated, nucleoplasmic, cytoplasmic, and polysomal RNA, respectively, while p and m additionally specify premature or mature forms. C Scheme of the experimental workflow, from cell fractionation to reads classification. D Spearman correlation coefficients between modelled expression levels and their experimental counterparts for a simulated dataset composed of 1000 genes generated with CVs retrieved from untreated experimental data, 2 replicates, and 1 labelling time of 20 min; abbreviations of the legend as in B. E Smooth density scatterplots between inferred kinetic rates and their expected counterparts for the same dataset described in D. For each plot, Spearman correlation coefficient (Cor.), identity line (red) and loess (green) lines are reported; red dots represent outliers (i.e., data points in the bottom and top 2.5% of the distribution). For D, E, source data are provided as a Source Data file.Each parameter of the model, except for the time (t), is a rate which represents the efficiency of a specific step of the RNA life cycle. In particular, k2-9 are pure rates expressed as h−1, while k1 is the net amount of newly transcribed RNA per hour per million of cells. The rates inference relies as input data on the quantification of RNAs bound to chromatin, present in the nucleoplasm, in the cytoplasm, and associated with actively translating ribosomes. Transcripts associated with different RNA pools are obtained by extracting polyA+ RNAs following the fractionation of the cells into chromatin, nucleoplasm and cytoplasm. Transcripts undergoing translation are obtained by retrieving RNAs associated with more than three ribosomes following polysome fractionation. Direct Nanopore sequencing of native RNA (dRNA-seq) is then performed for each of the four pools. Thus, k7 refers to the rate of degradation of cytoplasmatic transcripts not associated with polysomes, while k9 refers to the rate of degradation of transcripts associated with polysomes. From here on we will refer to k7 as cytoplasmatic degradation, and to k9 as polysomal degradation.For the quantification of premature and mature RNA species within each pool, we relied on the in silico classification of the reads based on the detection of intronic signal, as widely adopted in the field16,17,23. To validate our ability to classify reads associated with premature transcripts, we reanalysed publicly available dRNA-seq generated by nano-COP on K562 cells (GEO Sample ID GSM4663623)25. Nano-COP relies on the profiling of nascent RNA associated with chromatin, and Nanodynamo confirmed the previously reported enrichment in premature RNA (83% of the reads).To overcome the undetermined nature of the resulting algebraic system we opted for the inclusion of nascent RNA profiling, similarly to what has been previously proposed by us and others17. For the identification of newly synthetized transcripts, we adopted metabolic labelling with 4-Thiouridine (4sU) for fixed amounts of time. This modified nucleotide can be detected on dRNA-seq data using the nano-ID tool26, which relies on the impact of 4sU on Nanopore ionic current for the supervised classification of the reads (Median gene accuracy ~0.80; Supplementary Fig. 2). This additional piece of data discloses the temporal evolution of the system and guarantees global structural identifiability for all the parameters of the model, i.e., the existence of a unique set of rates for every model output (see the “Parameters identifiability” methods section).Following the verification of the theoretical feasibility of kinetic rates inference, we developed an R framework to identify the optimal set of rates given a vector of gene expression levels. This framework relies on the minimization of a cost function (sum of absolute logarithmic fold changes) regularised with the L2 norm of the rates and provides the numerical value of the rates at the single gene level.Evaluation of Nanodynamo through simulated dataThe analysis of parameters global identifiability disregards the impact of noise and the numerical issues which may affect the rates inference. Therefore, we relied on simulated data to test the performance of our approach accounting for these additional factors.Given a set of kinetic rates, the numerical solution of the ODEs system returns pre-existing and nascent RNA expression levels for all the species involved in the model at any time of interest. We bound each rate of the model to the median of the rates of synthesis, processing and degradation previously obtained with INSPEcT for 3T9 mouse fibroblast cells27. In particular, Nanodynamo rates from k2 to k6 were linked to the rate of processing while rates from k7 to k9 to the rate of degradation, in order to maintain a biologically reasonable ranking of the efficiency of the processes (see methods for details). These values were used as means for a set of gaussian distributions from which the rates were independently sampled for 1000 simulated genes. The variation coefficient (CV) of each gaussian distribution was set to 5 in order to explore a broad parameter space covering several orders of magnitude for each rate. The expression level of each RNA species resulting from these rates is exact. To mimic experimental noise, we used this value as the mean of a normal distribution from which we sampled n simulated replicates. Distributions variation coefficients were determined from experimental expression data taking the median CV across all genes for each RNA species (see “Quantification of gene expression levels” and “Data simulation” methods sections).The temporal design of metabolic labelling is an important experimental aspect which could strongly impact the inference performance26,28. The optimal labelling time(s) should allow capturing the sharp increase of nascent transcription, while accommodating the slower dynamics of nascent RNA accumulation for steps further downstream the RNA life cycle. The chosen labelling time(s) should also allow producing non-negligible amounts of labelled RNA in the various RNA pools. By means of mathematical simulations (see methods), we identified a labelling time of 20 min as a good trade-off for all the RNA species where the time derivatives of their nascent RNA saturation curves are significantly higher than zero, i.e., far from the steady state, but also distant from the initial exponential transition (Supplementary Fig. 3). Altogether, we simulated three datasets (CVs assigned as previously explained, 1000 genes, 2 replicates) with an increasing number of labelling pulses of 20, 60 and 120 min. We ran the inference pipeline, and we compared the resulting rates against the real counterparts to quantify the goodness of fit. The increase in performance gained with multiple labelling times was minor (maximum increase of 15%; Supplementary Fig. 4A).We adopted a similar approach to evaluate the optimal number of replicates. We simulated five datasets (CVs assigned as previously explained, 1000 genes, 20 min labelling) with an increasing number of replicates. The gain in correlation between inferred and real rates at an increasing number of experiments suggested that two samples are sufficient to have a performance remarkably close to the optimal one (maximum increase of 20%; Supplementary Fig. 4B).The abundances of RNA species, inferred with two replicates and 20 min labelling pulse, were highly correlated with their expected expression levels (median Spearman correlations >0.98; Fig. 1D and Supplementary Fig. 5). The inferred kinetic rates were well correlated with the expected counterparts (Spearman coefficients in the 0.68–0.96 range; Fig. 1E). The variability in rates correlations for the last steps of the RNA life cycle likely derived from the increased complexity of the model moving away from the RNA synthesis step. In fact, the determination of k7-9 involved upstream rates in defining cytoplasmic and polysomal RNA species, complicating the inferences based on these data. Instead, k4 and k7 were affected by the presence of branching points in the model. Indeed, the simplification of the equations disregarding certain steps of the RNA life cycle improved the correlation coefficients of the remaining ones (see Supplemental Material). Nevertheless, we considered this performance a reasonable compromise between inference quality, experimental workload, and sequencing cost.Nanodynamo unravels complex RNA dynamics of SUM159 triple-negative breast cancer cellsWe used Nanodynamo to profile the RNA dynamics of SUM159 triple-negative breast cancer cells. We profiled transcription by dRNA-seq with two replicates for each RNA pool: chromatin-associated RNA, nucleoplasmic RNA, cytoplasmic RNA and transcripts associated with actively translating polysomes (Figs. 1C, 2A, and Supplementary Fig. 6A). See the methods for details on reads number and statistics for the sequencing runs.Fig. 2: Nanodynamo unravels complex RNA dynamics of SUM159 triple-negative breast cancer cells.A Typical polysome trace for untreated SUM159 cells. B Schematic representation of the Nanodynamo framework, from input data to kinetic rates interpretation, for the gene GSTP1. Left, RNA species expression levels (bars and dots report mean and single replicate values respectively). Centre, kinetic rates coloured according to their relative values; the synthesis rate was reported in grey because of its different unit of measurement. Right, cartoon depicting the different efficiencies of the RNA life cycle steps. C Kinetic rates distributions in log10 scale. D Heatmap reporting for each modelled gene (rows) RNA species abundance and kinetic rates magnitude (columns). For each column, values are saturated between the 1st and 99th percentiles and normalised against the latter. The left bar indicates groups of genes identified by k-means clustering according to normalised abundances and kinetic rates. E Gene-level median polyA tail length distributions across RNA pools. F RNA species and kinetic rates for the simplified models (rows) implemented in Nanodynamo, compared to the complete model (1st row); input data required for each model are indicated in blue (not required in yellow) while the provided kinetic rates are indicated in grey (missing rates in red). In the figure, Ch, N, C, and P denote chromatin associated, nucleoplasmic, cytoplasmic, and polysomal RNA, respectively, while p and m additionally specify premature or mature forms. Analyses in C, D were performed on 1914 genes processed with the complete model while the analysis in E was performed on all the genes with at least one read with profiled polyA tail after replicates pooling (11774, 11957, 11760, 12739 genes for the chromatin, nucleoplasmic, cytoplasmic, and polysomal fractions respectively). For boxplots, the horizontal line represents the median value, the box edges represent the 25th (Q1) and 75th (Q3) percentiles, and the whiskers show the range of data excluding outliers (observations lower that Q1 − 1.5 * interquartile range or larger than Q3 + 1.5 * interquartile range). Source data are provided as a Source Data file.To assess the reproducibility of Nanodynamo we separately analysed the two replicated datasets. In order to compare the results, we had to focus on the genes that could be analysed in both replicates (397 genes), the bottleneck being requiring the quantification of premature RNA species in all nuclear RNA pools. The Spearman correlation between the abundance of nascent, pre-existing and total RNA species between the two replicates ranged from 0.60 to 0.99 (median 0.95; Supplementary Fig. 7), with the smallest values associated with the lowest expressed species, while the correlation between the kinetic rates ranged from 0.37 to 0.95 (maximum p < 2.2e-16; median 0.67; Supplementary Fig. 8).Jointly modelling the information coming from the two replicates, we were able to model 1914 genes. These were the genes that fulfilled the requirement of having at least one read for all the RNA species included in the model in at least one replicate. The minimum and median Spearman correlations between modelled expression levels and their experimental counterparts (replicates means) for these genes were 0.43 and 0.95 respectively, supporting the goodness of models’ fits (Supplementary Fig. 9). The median proportion of reads classified as nascent, upon 20’ 4sU metabolic labelling, ranged from 24% to 45% and followed the expected decreasing trend from chromatin to cytoplasmic RNA (Supplementary Fig. 10). The median proportion of reads classified as premature ranged from 7% to 15%, with the highest value detected in chromatin associated RNA (Supplementary Fig. 10), as expected. The dynamics of RNA metabolism for GSTP1, a representative gene involved in triple-negative breast cancer cells metabolism and pathogenicity29, are shown in Fig. 2B.Among the nine inferred kinetic rates, RNA synthesis (k1) is the only one that is expressed in terms of polyA+ RNA produced per million cells, thus its absolute value (median of 15 pg Mcells−1 h−1) cannot be compared with the pure rates of the other RNA life-cycle steps (Fig. 2C). We extrapolated that this value is compatible with the expected yield of nascent RNA following a 20’ pulse30 (see the “Nascent RNA yield” methods section). Among the pure rates (k2-8), the fastest step was the association to actively translating polysomes, which had a median of 32.23 h−1 and spanned various orders of magnitude, suggesting that it could significantly shape gene expression programs. The remaining rates were slower with medians ranging from 0.25 h−1 (cytoplasmic decay) to 7.73 h−1 (polysomal decay). In terms of sequence features, the rate of RNA synthesis negatively associated with the size of transcriptional units and 5′ and 3′UTRs (Unpaired two-sided Wilcoxon test p < 1e-4), and genes with the fastest rates were involved in translation and focal adhesion (Unpaired two-sided Wilcoxon test p < 1e-10). All cytoplasmic rates (k7-9) were positively associated with the size of transcriptional units and 3′UTRs, and negatively associated with 5′UTR CG content (Supplementary Fig. 11; Unpaired two-sided Wilcoxon test p < 1e-4).The rates involved in co-transcriptional splicing (k2-3) and those involved in post-transcriptional splicing (k4-5) had peculiar bimodal distributions (Fig. 2C). The genes characterised by low and high rates were conserved when analysing individual replicates (accuracy between 0.80 and 0.83, maximum two-sided exact Fisher test p < 0.0023; Supplementary Fig. 8). We excluded the numerical origin of these bimodal distributions which were not present in the simulated data (Supplementary Fig. 12A), nor did they depend on the initial conditions used for inference (see methods for details, Supplementary Fig. 12B). The clustering of genes based on the abundance of RNA species and magnitude of the kinetic rates revealed that 65% of the genes adopted co-transcriptional processing, in agreement with31, with a preference for genes with sustained transcriptional activity (Fig. 2D). Notably, distinct sets of genes relied on either co- or post-transcriptional processing pathways (i.e., genes with high k2-3 had low k4-5, and vice-versa). This was confirmed by the Spearman correlations between kinetic rates which were positive (>0.30) between synthesis and co-transcriptional processing rates, and negative (<−0.62) between co- and post-transcriptional processing rates. The same correlative analysis performed on simulated data provided 0.03 and −0.15 respectively (most extreme correlations), confirming that the structure of the model was not sufficient to generate similar results. Finally, these analyses also revealed that high expression levels were mainly mediated by high rates of synthesis, that genes efficiently translated tended to be efficiently degraded (Spearman correlations 0.28 and 0.26 against cytoplasmic and polysomal degradation, respectively), and that the rate of polysomal degradation was faster than the rate of cytoplasmic degradation (Unpaired two-sided Wilcoxon test p < 2.2e-16).We took advantage that dRNA-seq data offer the possibility of quantifying the length of polyA tails. The median length of the tails for transcripts associated with chromatin was 152nt, which reduced to 105nt for transcripts retrieved from other RNA pools (Fig. 2E). This is in agreement with the recently reported rapid nuclear deadenylation of polyA tails occurring after transcription32. In addition, we found that transcripts with high rates of synthesis (Fig. 2D, cluster A) had particularly short tails (median 110 nt, Supplementary Fig. 13), derived from compact transcriptional units accounting for short 3′UTRs and a low number and size of exons and introns (Supplementary Fig. 14), and were often involved in translation (Hypergeometric test p < 8.85e-9).Gathering all the expected data could be complicated due to various reasons. We thus developed alternative simplified models accommodating the lack of one or more RNA species (Fig. 2F, Supplemental Material and Supplementary Figs. 42–56). For instance, premature RNA might not be found in the nucleoplasm for genes that do not go through post-transcriptional processing, or due to insufficient sequencing depth, which could prevent the quantification of these low abundant species. We thus developed an alternative model in which RNA processing is only co-transcriptional. More in general, processing might be not applicable at all for certain genes like intron-less transcriptional units (~13% of the UCSC annotated genes). Modelling RNA processing could also be complicated for very compact genomes and short introns, such as yeast and A. thaliana. To accommodate these scenarios, we developed an alternative model in which transcripts are synthetized directly into their mature form. Finally, to deal with non-coding genes, we implemented a framework lacking the step of association with polysomes. These models could be useful also when technical reasons prevent the acquisition of polysomal RNA which typically requires dedicated instruments and specific expertise. Supplemental Material reports the reanalysis of untreated SUM159 cells with all these simplified models, and their comparison with the complete model.Finally, we took advantage of the simplified models to compare RNA synthesis and cytoplasmic degradation rates returned by Nanodynamo against those determined with INSPEcT, which requires only metabolically labelled and total RNA sequencing data. Spearman correlations for the rates of RNA synthesis and degradation are 0.96 and 0.21 respectively. Noticeably, the modest score for RNA degradation likely reflects the differences in modelling between INSPEcT and Nanodynamo for this step of the RNA life cycle (Supplementary Fig. 1). We also extended the analysis including the rates estimations from nano-ID (Spearman correlation of 0.75 and 0.92 for synthesis and degradation, respectively).Altogether, the application of Nanodynamo for analysing the dynamics of RNA metabolism in an untreated cell line revealed sets of genes characterised by substantially different combinations of kinetic rates and suggested the coordinated regulation of various steps of the RNA life cycle. We therefore set out to characterise how RNA dynamics are shaped by perturbations directed against specific steps of the RNA life cycle.Blocking the spliceosomal machinery leads to a switch from co- to post-transcriptional processingAfter characterising the transcriptional programs of untreated SUM159 cells, we investigated the response of the same cell system to splicing perturbation mediated by Pladienolide B. This drug targets the splicing factor SF3B1, resulting in an increase of intronic signal33. We confirmed the expected accumulation of intronic signal by RT-PCR and by inspecting various genes following dRNA-seq (Fig. 3A, B and Supplementary Figs. 15, 16). Globally, the proportion of premature reads increased by at least 4-fold for 30% of genes, compared to untreated cells (Fig. 3C). We also observed a significant reduction of nascent RNA for all nuclear fractions (between 17% and 27%, two-sided Kolmogorov-Smirnov test p = 0; Supplementary Fig. 17) suggesting a reduction in RNA synthesis. Finally, polysome fractionation revealed a marked reduction in the presence of polysome-bound RNAs (Fig. 3D, Supplementary Fig. 6B), suggesting a reduced association with the polysome.Fig. 3: Blocking the spliceosomal machinery with Pladienolide B leads to a switch from co- to post-transcriptional processing.A Coverage for IRX2 in untreated (blue) and treated (grey) cells. B Expression, relative to RPLPO, of CNBP spliced (black) and unspliced (blue) reads over time after treatment. Control is set to 1 (dashed line), 3 technical replicates and their mean values are reported. C Scatterplot comparing, for each gene, the proportion of intron-containing reads in untreated and treated cells (identity line in black, 2 and 4 folds increases lines compared to untreated cells in red and green). D Typical polysome trace for treated cells. E Distributions of log10 kinetic rates for untreated (purple) and treated (light blue) cells. Distributions medians and interquartile ranges as horizontal and vertical black lines respectively. F Heatmap reporting for each modelled gene (rows) RNA species abundance and kinetic rates magnitude (columns). For each column, values are saturated between the 1st and 99th percentiles and normalised against the latter. The left bar indicates gene sets similar in abundances and kinetic rates (k-means clustering). G Heatmap depicting abundances and kinetic rates modulations in response to treatment (log2 fold changes saturated between −2 and +2). The left bar indicates gene sets similar in abundances and kinetic rates modulations (k-means clustering). H Distributions of log10 synthesis and processing rates in untreated cells for genes clusters defined in G. I Gene-level median polyA tail length distributions across RNA pools. In the figure, Ch, N, C, and P denote chromatin associated, nucleoplasmic, cytoplasmic, and polysomal RNA while p and m specify premature and mature forms. Analyses reported in E, F were performed on 1950 genes processed with the complete model in the treated condition. Analyses reported in C, G, H were performed on 996 genes processed with the complete model also in untreated condition. The analysis reported in I was performed on genes with at least one profiled polyA tail after replicates pooling (11319, 11385, 11262, 11168 genes for Ch, N, C, and P). For boxplots, the horizontal line represents the median value, the box edges represent the 25th and 75th percentiles, and the whiskers show the range of data excluding outliers. Source data are provided as a Source Data file.Following the same approach presented for the untreated condition, we checked the reproducibility of our results with the independent analysis of the two replicates (Supplementary Figs. 18 and 19). We were able to model 1950 genes with Nanodynamo following the treatment of SUM159 cells with Pladienolide B (Fig. 3E; minimum and median Spearman correlations between modelled and experimental data 0.51 and 0.92, respectively; Supplementary Fig. 20). Pladienolide B treatment led to a significant albeit mild reduction in RNA synthesis (Fig. 3F, Unpaired one-sided Wilcoxon test p < 5.51e-25). The rates that presented bimodal distribution in the untreated cells confirmed their bimodality following the drug treatment (Fig. 3F). We observed the reduction in co-transcriptional splicing and detachment of mature RNA from chromatin (Unpaired one-sided Wilcoxon test p < 8.6e-5), matching an increase in premature RNA associated with chromatin (Fig. 3F, G). The slowdown in co-transcriptional RNA processing is an expected consequence of Pladienolide B treatment and reassures about the rates modelled with Nanodynamo. In addition, the reduction in the co-transcriptional steps were accompanied by a slight increase in the detachment of premature RNA from chromatin and in the processing of premature nucleoplasmic transcripts. These data suggested a switch from co- to post-transcriptional processing pathways. Indeed, for 26% of the 966 genes that could be modelled in both the untreated and Pladienolide B treated cells, the drug treatment led to reduced co-transcriptional rates (k2-3) and increased nuclear post-transcriptional rates (k4-5), compared to untreated cells. Rather, 15% of genes switched in the opposite direction. Eventually, the reduction in co-transcriptional processing prevailed, since the net result of these opposite modulations resulted in a marked reduction of mature RNA in the nucleoplasm (Fig. 3G). Noticeably, genes repressed in co-transcriptional processing (Fig. 3G cluster C) were the least expressed among the most efficiently co-transcriptionally spliced (low k1 and high k2, Fig. 3H). Similarly, genes repressed in post-transcriptional processing (Fig. 3G cluster A) were the most efficiently post-transcriptionally spliced and were also particularly low expressed (high k5 and low k1, Fig. 3H).We also observed an increase in nucleoplasmic RNA export following Pladienolide B treatment, which might represent an attempt to compensate for the aforementioned shortage of nucleoplasmic transcripts (Fig. 3F, G). However, in the cytoplasm, the rate of association with actively translating polysomes decreased, as a consequence of the strong reduction in the yield of the polysomal transcripts (Fig. 3G). This was accompanied by a marked increase in polysomal degradation, suggesting the need of removing transcripts that could not be properly translated, potentially through the nonsense-mediated decay pathway34. Transcripts containing Terminal Oligo Pyrimidine motifs at their 5′ end (5′ TOP) encode proteins that are essential for protein synthesis, whose translation is decreased under stress conditions. The vast majority of 5′ TOP factors that we could assess had indeed a markedly reduced rate of polysomal association, suggesting their involvement in the broad reduction of this RNA life cycle step (Supplementary Fig. 21A)35.Finally, we found that the length of polyA tails was reduced following Pladienolide B treatment and that their shortening after transcription was less pronounced, compared to untreated cells (Fig. 3I). Changes in polyA tails length were positively correlated with changes in the rate of RNA export (Spearman correlation 0.25, p < 8.29e-15), suggesting that the tails length could impact export efficiency36. Rather, polyA tails length for genes in cluster E are more similar to those of untreated cells (Supplementary Fig. 22). Genes in this cluster are characterised by a marked increase in RNA export, and a mild reduction in RNA synthesis, potentially as an attempt to compensate for the drug effects.Altogether, Nanodynamo revealed that treating cells with a drug directed against the splicing machinery has broad consequences that go beyond the expected repression of RNA processing, involving a switch from co- to post-transcriptional RNA processing and impacting the export and translational machineries.Perturbation of the export machinery leads to downstream alterations of translation and degradation machineriesFor the block of RNA export, we opted for treating SUM159 cells for 16 h with Leptomycin B, an inhibitor of CRM1, a major receptor for the export of RNA and proteins to cytoplasm37. Polysome fractionation revealed a marked reduction in the presence of polysome-bound RNAs (Fig. 4A, Supplementary Fig. 6D), suggesting a reduced rate of association with the polysome. After a check on the reproducibility of the modelling across two independent replicates (Supplementary Figs. 23 and 24), we performed the inference of the complete model as described above resulting in 1371 genes (Fig. 4B).Fig. 4: Perturbation of the export machinery with Leptomycin B leads to downstream alterations of translation and degradation machineries.A Typical polysome trace for treated cells. B Heatmap reporting for each modelled gene (rows) RNA species abundance and kinetic rates magnitude (columns). For each column, values are saturated between the 1st and 99th percentiles and normalised against the latter. The left bar indicates gene sets similar in abundances and kinetic rates (k-means clustering). C Heatmap depicting genes (rows) abundances and kinetic rates modulations (columns – log2 fold changes saturated between −2 and +2) in response to Leptomycin B treatment for genes reduced or induced in export (k6 log2 fold change < −1.5 or > 1.5 respectively). D Nuclear vs cytoplasmic RNA expression, relative to RPLPO, for the indicated genes in untreated cells (purple) and after 16 h of Leptomycin B treatment (light blue); 3 technical replicates and their mean values are reported. E Kinetic rates distributions in log10 scale for untreated (purple) and treated (light blue) cells; horizontal and vertical black lines represent medians and interquartile ranges of the distributions, respectively. F As in C for 1371 genes modelled both in untreated and treated cells. The left bar indicates gene sets similar in abundances and kinetic rates modulations (k-means clustering). G Gene-level median polyA tail length distributions across RNA pools. In the figure, Ch, N, C, and P denote chromatin associated, nucleoplasmic, cytoplasmic, and polysomal RNA while p and m specify premature and mature forms. Analyses reported in B, E were performed on 1371 genes processed with the complete model in the treated condition. The analysis reported in F was performed on 778 genes processed with the complete model in both untreated and treated conditions. The analysis reported in G was performed on genes with at least one profiled polyA tail after replicates pooling (12620, 11369, 10138, 12028 genes for Ch, N, C, and P). For boxplots, the horizontal line represents the median value, the box edges represent the 25th (Q1) and 75th (Q3) percentiles, and the whiskers show the range of data excluding outliers (observations lower that Q1 − 1.5 * interquartile range or larger than Q3 + 1.5 * interquartile range). Source data are provided as a Source Data file.According to a recent report, overnight Leptomycin B treatment impacted RNA export only for selected genes. Consistently, we found 40 genes that were markedly impacted in their export rates (k6 ratio > 2.5 compared to untreated cells), which were reduced for 32 of them (Fig. 4C). RT-PCR validation in an independent experiment confirmed the altered ratio of nuclear vs cytoplasmic RNA for 6 genes previously reported to be affected by Leptomycin B. Specifically, the transcripts for 4 genes (LGALS, PRKAG2, POLE4, ADARB1) were accumulated in the nucleus and depleted in the cytoplasm, while RNAs for 2 genes (CYREN, DOTIL) were accumulated in the cytoplasm and depleted in the nucleus, as previously described38. We additionally validated 2 genes that were not previously reported being altered in RNA export (DOTIL and LGALS) (Fig. 4D) and two genes that we identified as not impacted in their RNA export (Supplementary Fig. 25). Genes up-regulated in k6 were enriched in transcriptional units down-regulated in nuclear mature RNA (Two-sided exact Fisher test p < 4.0e-4) and vice versa for genes down-regulated in k6 (Two-sided exact Fisher test p < 1.5e-3—see the “Differential RNA species” methods section for the definition of differentially expressed genes. Notably, genes modulated in opposite directions in RNA export were differently affected in other RNA life cycle steps. In particular, RNAs whose export was reduced were more likely to be reduced in RNA synthesis and co-transcriptional events (while being promoted in post-transcriptional nuclear events), whilst those that were promoted in the export were more likely to be increased in cytoplasmic degradation and association with polysomes.Comparing all the modelled genes to untreated cells revealed that 48% of these were polarised in terms of co- or post-transcriptional processing (Fig. 4E, F). Indeed, these genes were either promoted in co-transcriptional processing and detachment of mature RNA from chromatin, while being hampered in the detachment of premature RNA from chromatin and post-transcriptional processing (Fig. 4F cluster D), or subjected to the opposite modulation (Fig. 4F clusters A-C). Even more strikingly and consistently, most of the modelled genes had altered rates of polysomal association and degradation (Fig. 4E, F). In particular, we observed a marked increase in polysomal degradation (98% of the genes). Similarly to Pladienolide B, the polysomal association of transcripts encoding for 5’ TOP factors was markedly reduced, suggesting their involvement in the broad reduction of this RNA life cycle step (Supplementary Fig. 21B)35. Analysis of sequence features for the modelled genes indicated that transcripts repressed in RNA export (Fig. 4C top) had shorter 5′ UTRs and fewer, shorter exons compared to genes unaffected in RNA export and, even more prominently, compared to genes with increased RNA export (Supplementary Fig. 26).Finally, the analysis of polyA tails length following Leptomycin B treatment revealed that, while they were shortened following the detachment from chromatin as discussed for the untreated cells, they had a marked increase in length for the transcripts associated with chromatin (Fig. 4G). Similarly to Pladienolide B, changes in polyA tails length were positively correlated with changes in the rate of RNA export (Spearman correlation 0.11, p < 0.0023).Eventually, given the length of the Leptomycin B treatment and the impact on the export of transcripts encoding for proteins involved in RNA decay and translation (Fig. 4C), it is possible that the observed major alterations in polysomal association and decay could be attributed to indirect downstream effects of the alteration of those Leptomycin B targets. Altogether, Nanodynamo revealed that treating the cells with a commonly used drug against RNA and protein export has consequences that are broader and more complex than expected, possibly due to nonspecific effects following the prolonged drug treatment.Blocking the translational machinery hampers RNA export and cytoplasmatic degradationThe block of translation was obtained by treating SUM159 cells for 1 h with Harringtonine, an inhibitor of translation initiation. The net result of this treatment was a clear block of translation, as determined through Polysome fractionation profiles (Fig. 5A, Supplementary Fig. 6C). As a result, we could not isolate enough RNA associated with polysomes for the sequencing (Fig. 5B). For this reason, we applied Nanodynamo based on a simplified model which neglects RNA translation and has only one step of cytoplasmic degradation (Supplementary Fig. 27). After a check on the reproducibility of the modelling across two independent replicates (Supplementary Figs. 28 and 29), we succeeded in modelling 1616 genes (Fig. 5C). The block of translation had positive consequences on RNA synthesis for 17% of the modelled genes, and negative consequences for the remaining ones (Fig. 5D, E). 40% of the genes switched from co- to post-transcriptional processing or vice-versa. These changes were accompanied by a reduction in RNA export (71% of the genes). The slowdown in RNA export was also associated with a marked reduction in cytoplasmic degradation (77% of the genes), leading to an accumulation of transcripts in the cytoplasm.Fig. 5: Blocking the translational machinery hampers RNA export and cytoplasmatic degradation.A Typical polysome trace for Harringtonine treated cells. B Yield of polysomal RNA in untreated and Harringtonine treated cells; two biological replicates per condition reported. C Heatmap reporting for each of the modelled genes (rows) RNA species abundance and kinetic rates magnitude (columns). For each column, values are saturated between the 1st and 99th percentiles and normalised against the latter. The left bar indicates gene sets similar in abundances and kinetic rates (k-means clustering). D Heatmap depicting genes (rows) abundances and kinetic rates modulations (columns) in response to Harringtonine treatment (log2 fold changes saturated between −2 and +2). The left bar indicates gene sets similar in abundances and kinetic rates modulations (k-means clustering). E Kinetic rates distributions in log10 scale for untreated (purple) and Harringtonine treated (light blue) cells; horizontal and vertical black lines represent medians and interquartile ranges of the distributions, respectively. F Gene-level median polyA tail length distributions across RNA pools. In the figure, Ch, N, and C denote chromatin associated, nucleoplasmic, and cytoplasmic RNA while p and m specify premature and mature forms. Analyses performed in C, E were performed on 1616 genes processed with a model accounting for all the steps of the RNA life cycle except for association with polysomes and polysomal degradation in the Harringtonine treated condition. The analysis reported in D was performed on 896 genes processed with the same model used for C, E in both untreated and Harringtonine treated conditions. The analysis reported in F was performed on genes with at least one profiled polyA tail after replicates pooling (11832, 12242, 11646 genes for Ch, N, and C). For boxplots, the horizontal line represents the median value, the box edges represent the 25th (Q1) and 75th (Q3) percentiles, and the whiskers show the range of data excluding outliers (observations lower that Q1 − 1.5 * interquartile range or larger than Q3 + 1.5 * interquartile range). Source data are provided as a Source Data file.Finally, similarly to the Leptomycin B treatment, also in the case of Harringtonine treatment the length of polyA tails showed a marked increase for the transcripts associated with chromatin (Fig. 5F).Altogether, Nanodynamo revealed that blocking RNA translation has major consequences on all steps of the RNA life cycle, leading to an accumulation of transcripts in both the nucleus and the cytoplasm.Coupling of RNA life cycle steps markedly influence the coordinated response of RNA metabolism to perturbationsThe comprehensive characterisation of RNA dynamics in untreated cells and how they are impacted by a set of perturbations is well suited for studying the coupling between steps of the RNA life cycle. We reasoned that we could study the coordination of the corresponding machineries by determining whether changes in the kinetic rates following the described drug treatments are correlated. We then determined, for each pair of kinetic rates and for each drug treatment, the correlation between kinetic rates log2 Fold Changes to the untreated condition. Significant correlations (p < 1e-4) upon treatment with Pladienolide B were displayed as edges in a graph, whose width and colour identify strength and direction of coupling, respectively. The same analysis was repeated upon perturbation with Leptomycin B and Harringtonine (Fig. 6A). Importantly, the latter is only partially informative to this regard, as it derives from the implementation of a simplified model lacking RNA life cycle steps associated with polysome. In addition, while the readout of the Leptomycin B treatment could be complicated by potentially prevalent indirect effects, this was not relevant for the identification of couplings. Indeed, what mattered for these analyses is that whichever the perturbation was, we could determine how RNA metabolism adapted to it. Eventually, 92% of the couplings identified upon Pladienolide B treatment were shared with and had the same direction of those identified upon Leptomycin B treatment. Similarly, the couplings were shared and had consistent direction with those reported for the Harringtonine treatment (Fig. 6A).Fig. 6: Coupling of RNA life cycle steps markedly influence the coordinated response of RNA metabolism to perturbations.A Networks depicting, for each treatment, coupling among steps of the RNA life cycle identified through Spearman correlation of rates modulations in treated versus untreated cells (log2 fold changes). Nodes and edges represent steps of the RNA life cycle and significant correlations among them respectively (p < 1e-4 – significance estimated with the AS89 algorithm of the cor.test R function). Edges thicknesses and colours represent magnitude and sign of correlations respectively (negative and positive values in blue and red respectively). B As in A for couplings conserved between Pladienolide B and Leptomycin B treated cells (edges thicknesses given by correlations mean across treatments). C Heatmap depicting, for the Pladienolide B treatment, the couplings (columns—see panel A) exploited by each gene (rows). Colours represent specific regulations: beige indicates that at least one of the two rates for a given coupling is not regulated, yellow denotes their coordinated up-regulation, green denotes their coordinated down-regulation, and blue indicates their opposite regulation. A rate is considered modulated given an absolute log2 fold change larger than 0.5. D Bar plot reporting the 5 most significant proteins (RNA Binding Proteins—RBPs—in light blue and Transcription Factors—TF—in grey), according to a GSEA analysis, for each coupling identified in Pladienolide B treated cells (see panel A). For a pair of rates defining a coupling, the GSEA ranking was defined according to the product of their modulations in Pladienolide B treated versus untreated cells (log2 fold changes) times the sign of their Spearman correlation. GSEA adjusted p-values (Benjamini-Hochberg) are reported above each bar. Analyses in (A – “Pladienolide B”) and C, D were performed on 996 genes processed with the complete model in both untreated and Pladienolide B treated conditions. Analysis in (A – “Leptomycin B”) was performed on 778 genes processed with the complete model in both untreated and Leptomycin B treated conditions. Analysis in (A – “Harringtonine”) was performed on 896 genes processed with a model accounting for all RNA life cycle steps except for association with polysomes and polysomal degradation in both untreated and Harringtonine treated conditions. Source data are provided as a Source Data file.The final network of couplings shared between Pladienolide B and Leptomycin B revealed couplings linking processes within and between nucleus and cytoplasm (Fig. 6B). RNA synthesis (k1) turned out to be positively coordinated with co-transcriptional processing and detachment of mature RNA from chromatin (k2-3). Detachment of premature transcripts and post-transcriptional processing (k4-5) were also positively associated. Rather, transcription and co-transcriptional processing steps were anti-correlated with the post-transcriptional processing counterparts. Additionally, the rate of synthesis (k1) was positively coupled with RNA export (k6) and polysomal degradation (k9), while both k1 and k9 were negatively coupled with polysomal association (k8). k1 and k9 were globally down- and up- regulated in response to the Pladienolide B (Fig. 3G). Consequently, their positive coupling denoted that strong regulations of the former were associated with weak down-regulations of the latter, and vice-versa (Supplementary Fig. 30). Finally, these analyses revealed that co-transcriptional processing steps were positively coupled with several downstream steps, including export and both degradation rates. The same holds true for post-transcriptional processing steps, while this occurred through negative couplings. These observations suggested the relevance of RNA processing in shaping gene expression responses.We next wondered how the response of individual genes is shaped by the reported couplings. For each of the identified global couplings, and for each modelled gene we determined whether that gene exploited it, and whether the coupling involved the coordinated increase or decrease of the kinetic rates or their opposite modulation (Fig. 6C for Pladienolide B response and Supplementary Fig. 31 for the other drug treatments).Genes responding to Pladienolide B exploited a remarkable fraction of the possible mechanisms (14 couplings for 50% of the genes) typically involving the coordination between rates in different cellular compartments. Half of the genes (clusters E-L) were pervasively regulated by couplings, involving most of the interactions reported for Pladienolide B (Fig. 6C). Among these, only those in cluster F differed since they did not exploit couplings stemming from the synthesis step, while they relied on the coordinated response between co- and post-transcriptional processing steps (k2-5) and between these and cytoplasmic steps. We characterized the clusters of genes in terms of structural features, CG content, and polyA tail length. Genes within cluster G, which were down-regulated in synthesis and in both co- and post- transcriptional processing rates, were the only that had longer tails compared to untreated cells in contrast with the trend observed for the other gene sets (compare Supplementary Fig. 32 with Figs. 2E and 3I). In terms of size and genomic complexity, genes in clusters I, which did not exploit the coupling to polysomal degradation, were characterized by the longest transcriptional units. Rather, genes in the aforementioned cluster F, which did not exploit the coupling with RNA synthesis, were characterized by the shortest transcriptional units (Supplementary Fig. 33). These results were largely confirmed by the analysis of Leptomycin B response.We used the Hamming distance to compare the coupling profiles of the genes modelled upon both Pladienolide B and Leptomycin B treatments. Using as a reference a null distribution of Hamming distances, obtained through the shuffling of the heatmaps columns (Supplementary Fig. 34—see methods section “Couplings” for details), we identified 97 genes whose coupling profiles were consistent following the two treatments. This analysis indicated the conservation of gene-level coupling for a sizable fraction of the analysed genes, involving genes with either a high or a low number of couplings (Supplementary Fig. 35).Finally, we sought to identify regulatory factors potentially responsible for each of the detected couplings. To this end, we took advantage of ENCODE data to search for RNA Binding Proteins (RBPs) and Transcription Factors (TFs) targeting genes supported by a given coupling. Specifically, we computed the product of log2 Fold Changes for each pair of coupled rates, and we used this quantity to perform Gene Set Enrichment Analyses (see methods section “Couplings” for further details). Overall, we identified 100 and 114 factors for Pladienolide B and Leptomycin B respectively (GSEA adjusted p < 0.05 for at least one edge), the vast majority deriving from the k1,9 edge. 93 of these factors were shared further supporting the conservation of coupling mechanisms. The 5 most significant factors for each coupling were reported in Fig. 6D for the Pladienolide B treatment, and in Supplementary Fig. 36 for the Leptomycin B one. Three proteins involved in gene expression regulation emerged as top candidates for implementing the couplings between RNA synthesis and processing (k3-5) in both the treatments: NIPBL, APEX1, and PABPN1. Interestingly, the latter takes part in RNA polyadenylation which might suggest the involvement of this regulatory layer in mediating transcriptional couplings39,40.

Hot Topics

Related Articles