Complex transcriptional regulations of a hyperparasitic quadripartite system in giant viruses infecting protists

A. castellanii host (co-)infections by megavirus chilensis giant virus, zamilon vitis virophage and megavirus vitis transpoviron

To study the transcriptional impact of the different players of this hyperparasitic system, we used 4 different infection setups (Fig. 1A). A. castellanii cells were infected with megavirus chilensis, a GV devoid of an associated Vp or Tpv [16]. The results were compared to 3 additional coinfection experiments, adding the Vp or the Tpv, or both (Fig. 1A). All 4 conditions were followed during a complete GV infectious cycle, with RNA samples collected at the same timepoints from 30 min to 12 h post infection (pi), resulting in 6 samples per condition (Fig. 1B). In addition, mock-infected cells were included as controls using heat-inactivated giant viruses and associated players (see “Methods”). All experiments were carried out in three biological replicates.

Fig. 1: Schematic diagram of (co-)infection experiments. A Schematic representation of the 4 infection experiment conditions (in columns) for which RNA-seq was performed. Four partners are involved: A. castellanii (C: purple), megavirus chilensis (GV: green), zamilon vitis (Vp: orange) and megavirus vitis transpoviron (Tpv: blue). The players involved in each condition are indicated at the top of each column. B The timeline summarizes the time points at which RNA samples were collected, with the time post-infection indicated below each collection point. Prior to RNA extraction, some samples were pooled, as indicated by braces, and will be subsequently referred to by the names shown at the top (from T1 to T6, and mock). Icons representing partners were created with BioRender.com.

A total of 82 polyA-enriched RNA samples were successfully sequenced, resulting in an average of 15.8 million read pairs per sample, of which 98.5% passed quality control (Supplementary Data 2A). These were mapped on the reference genomes of A. castellanii (C), zamilon vitis (Vp), megavirus vitis transpoviron (Tpv) and megavirus chilensis (GV). Of note, we reassembled the genomic sequence of the latter (see “Methods”), which added 19,414 bp terminal inverted repeats (TIRs) to the previously reported sequence [27]. The genomic structure is similar to other viruses from the same genus [28] (Figure S1).

Most of the read pairs (mean = 85.4%, sd = 1.7%, Supplementary Data 2B) were successfully aligned to a combined reference gene set of all partners present. In each condition, all present players were detected, with a maximum proportion of mapped reads of 100%, 98.3%, 18.3% and 1%, for C, GV, Vp and Tpv, respectively (Figure S2). After filtration for low expression (see “Methods”), we also found that the vast majority of the annotated genes were expressed, with 82.7% for C (12845/15532), 98.2% for GV (1134/1155), and 100% for Vp and Tpv (with 20/20 and 7/7, respectively). Additionally, saturation curves show that sequencing depth was adequate to capture the gene expression dynamics of each individual partner (Figure S3).

The A. castellanii host transcriptome is reprogrammed by megavirus chilensis infection

We first explored the transcriptional response of the A. castellanii host to its infection with GV alone. As shown by principal component analysis (PCA) of normalized host gene expression values (in transcripts per million, TPM), the biological replicates are robust as they cluster together at each timepoint (Figure S4A). Moreover, infection time is the main source of variance since timepoint-specific segregation of the samples was observed. Importantly, mock and T1 timepoints do not overlap, demonstrating that the host transcriptome is disrupted as early as 30 min pi.
By contrast, the proximity of T2 and T3 timepoints, as well as T5 and T6, indicates that there is no global shift in host transcription between 1h15min and 4 h pi, and from 7 h pi to the end of the infectious cycle.

We first analyzed the host transcriptome at the beginning of the infectious cycle by comparing mock and productive GV infection (T1). This revealed 347 differentially expressed genes (FDR Pvalues < 0.01 and |log2(Fold Change)| ≥ 1.5). Among them, the 162 whose expression increases at T1 are mainly involved in signal transduction (Pvalue = 1.3 × 10^−4, Supplementary Data 3A) and are enriched in Rho family small GTPases (Supplementary Data 4A). These proteins are potentially linked to cytoskeletal remodeling at the initial stages of infection [29]. We also found 6 kinesins among the 185 genes significantly underexpressed at T1, supporting the microtubule-based movement gene ontology (GO) term enrichment (Pvalue = 6.1 × 10^−6, Supplementary Data 4A). Finally, nucleosome assembly is also potentially disrupted by infection (GO term Pvalue = 1.5 × 10^−3), as 3 cellular histones (1 H1 and 2 H4, Supplementary Data 4A) are also underexpressed at T1 compared to mock infection.

We next expanded the analysis by comparing all timepoints (T1 to T6) to mock infection and found that from 2.7% (T1 vs mock) to 26.8% (T4 vs mock) of the host’s expressed genes were differentially expressed. Taken together, we identified a total of 5859 genes differentially expressed between at least one timepoint and mock, which corresponds to 45.6% of the expressed host genes. Hybrid hierarchical k-means (hkmeans) clustering of those genes (k = 4) shows specific expression patterns and functions (Fig. 2 and Supplementary Data 3B). In addition, we scrutinized the A. castellanii promoter sequences to identify enriched motifs around transcription start sites (TSS) in relation to expression timing (Figure S5).

Fig. 2: Transcriptional patterns of differentially expressed A. castellanii genes during megavirus chilensis infectious cycle. Hkmeans clustering of the 5859 cellular genes differentially expressed between mock and GV infection. The leftmost part of the figure shows cluster names and the total number of genes associated with each cluster. Average normalized expression (Z-score of log2-transformed TPM values) over time post-infection is then shown for each cluster (solid lines), as well as the corresponding standard deviation (colored areas). The heatmaps show the expression patterns (Z-scores of log2-transformed TPM values) of each gene (rows) along the different timepoints (columns), averaged over the replicates. Colored histograms display one-sided Fisher Exact test Pvalues of the biological processes (GO terms) significantly enriched in each cluster (Pvalues ≤ 0.005). The numbers on the right indicate the number of genes with a given GO term (in cluster vs all differentially expressed genes). Functional annotations are displayed on the rightmost part of the figure. Source data are provided as a Source Data file.

The first cluster (C-1), with gradually decreased expression over the time-course, is enriched in carbohydrate metabolism, specifically galactose metabolism (Fig. 2 and Supplementary Data 4B). Galactose is a major component of the A. castellanii cyst wall [30]. Here we found that genes associated with encystment, such as CSP21 (BAESF_04785) and encystation-mediating serine proteinase (BAESF_02870), are either weakly expressed or exhibit reduced transcriptional levels over time (Supplementary Data 5A). Thus, like mimivirus, megavirus likely represses encystment-mediating genes that would prevent viral infection [31]. The C-1 cluster also contains the majority of licensing factors involved in DNA replication initiation (5/6) and a cell division control protein (CDC45-like), suggesting an arrest of the cell cycle upon infection.
Promoters of genes found in this cluster are enriched in 4 motifs that resemble known transcription factor (TF) binding sites (Figure S5). One corresponds to the E2F transcription factor and two others to HAP2, which recognizes CCAAT-box motifs.

The second cluster (C-2) shows a similar profile but with stabilized expression from T1 to T3. Thus, viral infection induces steady expression of cellular genes that are mostly involved in cell redox homeostasis (including numerous glutathione and thioredoxin reductases) and in lipid metabolism (Supplementary Data 4B).

In the third cluster (C-3), genes are first slightly repressed by the infection, then upregulated in T2 and T3 (1h15min to 4 h pi) to recover their basal expression level, before steadily decreasing again. They span various cellular functions that include 6 genes involved in autophagy (Supplementary Data 4B), a cellular process frequently deployed to restrict viral infections [32]. Numerous genes involved in protein modifications such as phosphorylation/dephosphorylation and ubiquitination/deubiquitination are also activated (Fig. 2). The manipulation of the host ubiquitin system to promote viral replication is widespread in viruses, in particular giant viruses from the Nucleocytoviricota phylum [33,34]. Although giant viruses strikingly encode translation-related genes [35], they remain dependent on the cellular host ribosomes for protein synthesis. Accordingly, we found cellular translation-related genes, several of which are involved in ribosome biogenesis and maturation, also enriched in this cluster. Surprisingly, we also found 87 transcription-related genes enriched in this cluster, including several subunits of RNA polymerases I, II, and III, as well as transcription factors (RFX).

Finally, the last cluster (C-4), showing strong activation at T2 and T3 (from 1h15min to 4 h pi), also contains transcription factors in addition to chaperone proteins (DnaJ and HSP90).
The latter may be part of the cellular stress response induced by the infection, or may be specifically activated to support viral protein folding [36]. In this cluster, gene promoters are enriched in two motifs, one of which corresponds to STAT transcription factor binding sites (Figure S5).

Taken together, these patterns show that the A. castellanii transcriptome is strongly reshaped by megavirus infection. Although all cellular genes are relatively underexpressed by the end of the infectious cycle, numerous functions, accounting for a third of the expressed genes (4355/12845), are either maintained (cluster C-2) or activated (clusters C-3 and C-4) to various degrees. Several forces are probably at play here. The host likely responds to viral infection by activating general stress factors and more specific immune mechanisms [37]. But we also observed specific functions triggered to support viral replication, in line with the concept of the cell transforming into a virocell [38,39].

A similar transcriptional reprogramming of the host has been observed in Acanthamoeba polyphaga, a related amoeba from the same genus, infected by mimivirus [40]. Specifically, A. polyphaga genes involved in DNA replication and cytoskeletal remodeling are underexpressed during the mimivirus replication cycle. Similarly, Acanthamoeba genes involved in transcription, translation regulation, and proteasome are activated in both A. castellanii/megavirus and A. polyphaga/mimivirus infections. However, several cellular genes associated with ribosome maturation, autophagy and protein folding are exclusively activated in the present study. This is likely due to an increased sequencing depth and temporal resolution of our host transcriptome analysis.

Temporal dynamics of megavirus chilensis gene expression

We next analyzed the expression dynamics of the GV genes.
As expected, almost all (n = 1098, Supplementary Data 5B) GV genes exhibit no expression in the mock sample, with the exception of a few genes (n = 36) that have non-zero TPM values (maximum = 0.29 TPM). This likely corresponds to traces of mRNA loaded in a few particles not fully inactivated by heat treatment, although no sign of infection was detected 24 h pi.

As for the host, PCA of the GV genes showed a tight clustering of the replicates, and a strong segregation of the different timepoints with the exception of T5 and T6 (Figure S4B). This suggests that GV gene expression is highly dynamic from 30 min to 6 h pi and remains stable from 7 h pi until the end of the infectious cycle. Furthermore, as previously noticed for mimivirus [25], the virocell mRNA population is dominated by viral transcripts by the end of the replication cycle, with 97.5% of the mapped RNA-seq reads originating from viral genes at T6 (Figure S2).

The clustering of viral genes by hkmeans (k = 5) revealed distinct patterns with significant enrichment of specific functions (Fig. 3A), and correlation with the presence of previously identified motifs in their promoters (Supplementary Data 1). No new motifs were identified using MEME-suite [41] and Homer [42]. The first cluster (GV-1) shows a robust expression from the beginning of the infection (30 min pi), followed by a gradual decline over time. The majority of its genes (60%, 148/243), a significant proportion (Pvalue = 5.7 × 10^−8, Supplementary Data 3C), have no known function, highlighting that most of the viral functions involved in the early stages of cell takeover are unknown. The rest are genes coding for Sel1 repeat-containing proteins, which are potentially involved in protein-protein and host interactions [43]. The second cluster (GV-2), with peak expression between 1h15min and 4 h pi, is also enriched in genes probably involved in protein-protein interactions (Ankyrin and FNIP repeats).

Fig. 3: Transcriptional patterns of megavirus chilensis genes during the infectious cycle. A Hkmeans clustering of the GV genes with cluster names and number of genes in each cluster indicated on the left. The heatmaps show the expression patterns (Z-scores of log2-transformed TPM values) of each gene (rows) along the different timepoints (columns), averaged over the replicates. Normalized iBAQ values of the proteins identified by MS-based proteomics in viral particles are also indicated. Colored histograms display one-sided hypergeometric test Pvalues of the functional categories (see “Methods”) significantly enriched in each cluster (Pvalues ≤ 0.005). The numbers on the right indicate the number of genes with a given functional annotation (in cluster vs all expressed genes). B Relative genomic localization of all the GV genes assigned to each cluster. C Histogram of the proportion of GV genes according to their gene ancestry within the Imitervirales order (Figure S7). Gene age is measured by relative evolutionary divergence (RED) as defined by ref. 94 (see “Methods”). The oldest genes have a RED score of 0 and the most recent a RED score of 1.

The third cluster (GV-3) contains all the genes involved in DNA replication and repair, such as the DNA polymerase, the PCNA sliding clamp and several copies of the small replication factor C (Supplementary Data 5B). It also reveals a strong activation of the virally-encoded transcription-related genes. Indeed, all of the 8 DNA-directed RNA polymerase subunits (RPB1-2, RPB5-7 and RPB9-11) are found within this cluster, as well as the mRNA capping enzyme, the poly(A) polymerase, the TATA-box binding protein (TBP) and 3 transcription factors (including TFIIB and TFIIS). Interestingly, mass spectrometry (MS)-based proteomics of purified GV particles (see “Methods”) revealed that RNA polymerase subunits are packaged in virions (Supplementary Data 6A).
The same applies to the early TF (mchi_571), expressed late during the infectious cycle (cluster GV-5, see further), as in poxviruses [44]. This is in line with the discovery that RNA polymerase subunits and early TF proteins are present in the protein-shielded genomic fiber of mimivirus [45]. Such preloading allows for rapid initiation of transcription at subsequent infection [45]. The fourth cluster (GV-4), mostly expressed in T3-T4 (from 3h15min to 6 h pi), contains many zinc-finger domain proteins, as well as the VLTF3-like late TF (mchi_455), a core Nucleocytoviricota gene.

The largest and late-expressed cluster (GV-5, from 5 h pi to the end of the infectious cycle) contains all the genes coding for the morphogenesis proteins, which comprise structural capsid proteins and the packaging ATPase [46]. As expected, this cluster strongly correlates with the proteins detected in GV virions by MS-based proteomics (Supplementary Data 6A and Fig. 3A). Genes coding for transmembrane-domain proteins are also enriched. The corresponding proteins are probably linked to the inner membrane layer found in viral particles (Fig. 3A) as they are predicted to localize at the endoplasmic reticulum by DeepLoc [47] (Figure S6), membranes of which are the source of the Megamimivirinae virion inner membrane [48]. Genes coding for collagen proteins are also enriched in this cluster, in agreement with their localization at the surface of viral particles [49]. In addition, Megamimivirinae virions are surrounded by heavily glycosylated fibrils [50]. Accordingly, we found cluster GV-5 to be strongly enriched in genes involved in carbohydrate metabolism, which include 6 out of the 8 encoded glycosyltransferases. Finally, the most expressed gene in this cluster is a long non-coding RNA gene (mchi_663) homologous to R549b in mimivirus [25].

Overall, megavirus chilensis exhibits expression profiles of key functions that are similar to mimivirus during the replication cycle, whether infecting A. castellanii [25] or A. polyphaga [40]. After 3 h pi, viral genes involved in DNA replication and transcription are highly expressed, and at the end of the infectious cycle, genes associated with sugar metabolism, collagen, and capsid production are expressed in both viruses.

We next examined the distribution of the GV genes along the genome as a function of their expression timing. As shown in Fig. 3B, GV genes are not uniformly distributed, with gene density gradually shifting from a high concentration of early-expressed genes at genomic extremities to a high concentration of late-expressed ones at the center of the genome. In addition, gene age is not equal between clusters. By computing the relative evolutionary divergence (RED) of megavirus chilensis genes based on their conservation within the Imitervirales order (see “Methods”), we found that the proportion of recently acquired genes is higher in early-expressed clusters and, conversely, that ancient genes are more frequent among late-expressed ones (Fig. 3C). Schematically, our data support a model in which more recently acquired genes involved in virus-host interactions are expressed first from the extremities of the genome, and older ones, especially those involved in virion morphogenesis, are subsequently expressed from the center of the genome. Similar trends of unequal distribution of ancient and recently acquired genes have been observed in several different families of GVs [43,51,52,53], including pandoraviruses [54], suggesting a common constraint in genome evolution.

As previously described, the majority of host transcripts exhibit decreased expression levels during the late stages of infection. This includes genes with viral homologs, such as those involved in transcription, which are differentially expressed in both the host (Fig. 2) and the GV (Fig. 3A).
Focusing on shared transcription-related genes, we found that their expression levels usually overlap towards T2-T3, but while host gene expression drastically drops right after, the expression of virally-encoded homologs is generally maintained until the end of the infectious cycle (Figure S8). Assuming viral homologs preserve cellular functions, like the poxvirus-encoded DNA-dependent RNA polymerase [44], the transcriptional capacity of the virocell might be maintained by GV compensation. Nevertheless, there are numerous examples in giant viruses of virally-encoded homologs that evolved distinct functions from their cellular counterparts [33,55]. Further studies on the megavirus-encoded transcriptional machinery components will thus be required to explore their role within the virocell during infection.

Megavirus vitis transpoviron has no effect on the virocell transcriptome

In addition to the infection of A. castellanii cells with megavirus chilensis, we performed a similar experiment with megavirus chilensis associated with megavirus vitis transpoviron (from [16]) (C + GV + Tpv, Fig. 1A). The aim was to reveal the transcriptional program of Tpv genes, as well as its potential impact on the virocell transcriptome (C + GV, Fig. 1A).

The mapping of RNA-seq reads on the Tpv genome first confirmed that all predicted Tpv genes are transcribed, with some as early as T2 (1h15min to 2h30min pi) (Fig. 4A). Tpv transcription then drastically increases at T4 (5–6 h pi) until the end of the GV infectious cycle (Fig. 4B). Interestingly, the weakly expressed mvtv_1 gene, with a maximal expression of 2.7 TPM compared to the other Tpv genes (minimum = 12.1, maximum = 798), has an opposite expression profile with strong repression from T4 onwards (Fig. 4B). Examination of this genomic locus shows transcription from the opposite strand, possibly originating from the downstream neighboring gene (mvtv_2, Fig. 4A), suggesting antisense transcriptional interference.
It is also the sole Tpv gene with an early regulatory motif [56] in its promoter region, located 54 nt upstream of the start codon (Fig. 4A and Supplementary Data 1).

Fig. 4: Transcriptional patterns of megavirus vitis transpoviron genes during the infectious cycle. A Coverage plot of RNA-seq reads (from one replicate) mapped using STAR [85] on the Tpv genomic sequence during GV infection. Coverage of reads from the forward strand is shown in red and reads from the reverse strand in blue. Tpv genome annotation is shown at the top with genes from the forward strand in red, reverse strand in blue, and TIRs in gray. Presence of motifs in promoter regions is depicted using purple and green arrows for early and late motifs, respectively. B Hkmeans clustering (k = 2) of the Tpv genes with associated dendrogram on the left. The heatmap shows the expression (Z-score of log2-transformed TPM values) of each gene (rows) along the different timepoints (columns), averaged over the replicates. Gene names and functional annotations are displayed on the right.

Unexpectedly, we also observed a transcriptional signal originating from Tpv TIRs that is as strong as in the predicted genes (Fig. 4A). TIRs are devoid of annotated protein-coding genes, but 3 short open reading frames (ORFs) of 33 to 51 amino acids were identified (Supplementary Data 7). No peptide from previously published MS-based proteomics data [16] of Vp-infected GV virions and purified Vp virions could be assigned to these ORFs. This suggests that TIR regions encompass highly expressed unidentified small proteins, or, most likely, ncRNAs of unknown function.

Comparison of the virocell with and without Tpv (C + GV + Tpv vs C + GV, Fig. 1A) revealed that only 4 (out of 12845) cellular genes are differentially expressed, with one weakly expressed and no particular function standing out (Supplementary Data 5A and Supplementary Data 8A). In addition, none of the GV genes are differentially expressed.
In other words, Tpv has no significant impact on the virocell transcriptome.

To investigate potential Tpv and Vp integration into the GV genome, we used Nanopore long reads to sequence the genomic DNA of megavirus vitis, a closely related GV strain (97.9% average nucleotide identity with megavirus chilensis) from which zamilon vitis and megavirus vitis transpoviron were isolated [16]. We identified 12 megavirus vitis chimeric reads aligning to Tpv (Figure S9A), and 2 to Vp (Figure S9B), suggesting potential Tpv and Vp insertions within the megavirus genome. These insertions appear uniformly distributed throughout the genome (K–S test against uniform distribution Pvalue = 0.621), similar to observations in mimivirus [23]. These findings suggest potential GV diversification resulting from Tpv and Vp insertions. However, the low number of chimeric reads and their occurrence within essential genes (e.g., major capsid protein 3, mRNA capping enzyme, Figure S9A) indicate rare events that probably often lead to evolutionary dead ends.

Zamilon vitis virophage transiently modifies the megavirus chilensis transcriptome

To further explore the transcriptome of this hyperparasitic system, we introduced the Vp by coinfecting A. castellanii cells with megavirus chilensis and zamilon vitis (C + GV + Vp, Fig. 1A). The experiment first revealed that all Vp genes are transcribed and fall into 4 clusters (Fig. 5A). For genes in the first cluster (Vp-1), a weak transcription signal can be observed at T2, peaking at T3 and gradually decreasing onwards (Fig. 5A, Supplementary Data 5C). Among genes from this cluster is the DNA primase, probably involved in Vp DNA replication. Genes from cluster Vp-2 show steady expression from T3 to T6 and notably include the Vp-encoded integrase. The third and largest cluster (Vp-3) contains genes whose expression is delayed, peaking at T4.
It includes all members of the morphogenesis module (minor and major capsid proteins, and the packaging ATPase), as well as 3 proteins sharing a similar fold (za3_1, za3_19 and za3_20) that are suspected to form spikes at the surface of the virophage capsid [57].

Fig. 5: Transcriptional patterns of zamilon vitis virophage genes during the infectious cycle. A Hkmeans clustering (k = 4) of the Vp genes with cluster names, number of genes in each cluster and associated dendrograms indicated on the left. The heatmap shows the expression patterns (Z-score of log2-transformed TPM values) of each gene (rows) along the different timepoints (columns), averaged over the replicates. B Volcano plot of GV gene expression in the presence/absence of the Vp (C + GV + Vp vs C + GV). Two-sided Wald test adjusted (Benjamini-Hochberg) FDR Pvalues and fold change metrics were calculated using DESeq2 [86]. Differentially expressed genes were identified using DESeq2 and edgeR [87] with FDR Pvalue < 0.01 and |log2(FC)| ≥ 1.5. GV genes that passed the FDR Pvalue and FC thresholds are shown in red, while those that only passed the FDR Pvalue threshold are in orange. Source data are provided as a Source Data file.

Finally, the latest expressed gene (za3_7, Fig. 5A), sole member of the Vp-4 cluster, encodes a transmembrane domain protein that is predicted to localize at the cell membrane and lysosome/vacuoles by DeepLoc (Supplementary Data 5C). Interestingly, according to our MS-based proteomics data (reprocessed from [16]), the protein is absent from purified Vp particles (Supplementary Data 9). By contrast, it is the most abundant Vp protein in GV particles when GV is infected by Vp (Supplementary Data 6B).
Thus, this Vp-encoded protein is not associated with Vp virions, which lack internal membranes [58], but probably binds to inner membranes of GV virions.

Vp genes are expressed late during the GV infectious cycle, when the viral factory (VF) is operational, and are mainly controlled by GV-like late regulatory motifs (9 out of 20 genes, Supplementary Data 1). However, akin to GV, the Vp exhibits an organized gene expression pattern, with genes involved in DNA replication expressed first, followed by those involved in virion morphogenesis. This indicates that a hidden level of temporal gene regulation remains to be characterized.

To determine the Vp’s impact on the virocell transcriptome, we next compared our transcriptomic data in the presence and absence of Vp (C + GV + Vp vs C + GV, Fig. 1A). Our analysis revealed a negligible impact of Vp on the host transcriptome, with only 6 cellular genes differentially expressed, 4 of which were weakly expressed (average expression < 5 TPM, Supplementary Data 5A and Supplementary Data 8A).

In contrast, Vp strongly disrupted GV gene expression, significantly altering the expression of 23% (263/1134) of its genes (Supplementary Data 8B). This substantial effect could be attributed to a bias arising from the introduction of a new partner with a finite pool of sequenced reads. As a control, we performed the same analysis excluding the Vp genome sequence from the mapping, i.e., only C and GV reference sequences were included. After confirming sufficient read coverage (Figure S3), we still found that 22% (254/1134) of GV genes were differentially expressed. Thus, the observed differential expression of GV genes is indeed a result of its interaction with Vp, and not due to a bias in the proportions of mappable reads.

The effect of Vp on GV gene expression is mainly negative, as most differentially expressed GV genes (238/263) are underexpressed in its presence. This mainly occurs at T4 (Fig. 5B), at the same time as peak expression for most Vp genes (Fig. 5A).
Competition for the transcription machinery might thus occur between the two viruses (GV and Vp) at this time point. This is supported by the fact that Vp genes are globally more efficiently expressed than GV genes (Figure S10).

Since most of the underexpressed GV genes are expressed late (with 80% from cluster GV-5, Fig. 3A) and include important genes from the morphogenesis module, such as the major capsid protein (mchi_457, Fig. 5B and Supplementary Data 8B), one could expect that Vp coinfection alters the protein composition of GV particles. We thus performed MS-based analyses of GV virions in the presence and absence of Vp coinfection (see “Methods”). As shown in Supplementary Data 6C, none of the virion-associated GV proteins exhibit differential abundance between the two conditions (with FDR Pvalues < 0.05 and |log2(FC)| ≥ 1.5 thresholds). These data correlate well with the fact that all of the underexpressed GV genes (with the exception of mchi_399) recover normal expression strength by the end of the infectious cycle (T6, Fig. 5B). Taken together, these data show that although Vp has a strong repressive effect on the GV transcriptome, it is only transitory and does not alter mature virion protein composition. Regardless, such transient changes might still have consequences on the speed of GV virion formation, and thus extend the period of time for mature Vp virions to be generated prior to host cell lysis. Further experiments will be needed to address this hypothesis. It is also possible that the transcriptional level of GV genes is sufficiently high that Vp-induced downregulation has no phenotypic effect on GV.

Not all differentially expressed GV genes are repressed in the presence of the Vp: 25 of them are upregulated at T5 (Fig. 5B and Supplementary Data 8B).
Among them, 6 strikingly colocalize in GV TIRs, with 3 adjacent genes present identically on each TIR: the mchi_0/mchi_1133 ncRNAs, the Bro-N domain-containing mchi_1/mchi_1132, and mchi_2/mchi_1131, which are homologous to za3_9 in Vp [12]. Other activated functions include protein folding with two chaperones (the DnaJ-like mchi_351 and the HSP70 mchi_493), and DNA interaction with the mchi_396 topoisomerase 2 and the MC1 domain-containing mchi_339. Interestingly, the latter is the most abundant GV protein in purified Vp particles (Supplementary Data 9). This suggests that this GV protein, recently proposed to be involved in mimivirus DNA compaction and packaging [59], has a similar function not only in megavirus GV but potentially in zamilon Vp as well.

Finally, the most upregulated gene in the context of virophage coinfection (log2(FC) = 2.95, FDR Pvalue = 2.5 × 10^−11, Supplementary Data 8B and Fig. 5B) is the mchi_336 transcription initiation factor (TFIIB). It is worth mentioning that this gene may be essential for GV replication. Indeed, knock-out (KO) by homologous recombination with a selection marker [60] resulted in a mixture of wild-type and mutant viral particles. While mutants were rapidly outcompeted by wild-type viruses in the absence of selection, a complete loss of mutants was also observed with an increased number of passages under selection, indicating that mchi_336 KO is associated with a high fitness cost. In addition to mchi_336, the GV-encoded mchi_455 late TF is also significantly upregulated by Vp coinfection (Fig. 5B). Altogether, this indicates that the Vp transiently activates key GV-encoded functions, likely to support its own gene expression and replication.

Zamilon vitis virophage induces megavirus vitis transpoviron late gene overexpression

Our previous transcriptomic comparisons highlighted the effects of Vp, and lack of effect of Tpv, on the virocell transcriptome. We next explored the reciprocal impact of Vp and Tpv on each other.
To this end, we first compared the complete coinfection experiment (C + GV + Vp + Tpv, Fig. 1A) to the one excluding Tpv (C + GV + Vp, Fig. 1A), in order to reveal the potential effects of Tpv on the Vp transcriptome. None of the Vp genes passed the differential expression thresholds in this comparison (Supplementary Data 8C). Thus, not only does Tpv have no effect on the virocell transcriptome, it has no effect on Vp gene expression either.

Reciprocally, we compared the full system (C + GV + Vp + Tpv, Fig. 1A) to the one without Vp (C + GV + Tpv, Fig. 1A), to decipher the effect of Vp on Tpv gene expression. We first observed a delay in Tpv transcription in the presence of Vp, with 3 Tpv genes (mvtv_2, mvtv_4 and mvtv_6) significantly underexpressed at T2 and/or T3 (Fig. 6 and Supplementary Data 8D). Importantly, in the C + GV + Tpv condition, Tpv is carried by the GV, while in the C + GV + Vp + Tpv condition it is brought along by the Vp virions (see “Methods” and Fig. 1A). The delay is therefore probably due to a difference in the accessibility of Tpv DNA for transcription, either because of delayed access to the transcription machinery or, most likely, because of a later opening of Vp virions.

Fig. 6: Expression of megavirus vitis transpoviron genes in the presence or absence of the zamilon vitis virophage. Shown is the normalized expression (TPM values) of all Tpv protein-coding genes across the GV infectious cycle in the presence (red) or absence (blue) of Vp. Stars indicate significant differential expression between the two conditions at a given timepoint with FDR Pvalue < 0.01 and |log2(FC)| ≥ 1.5 using DESeq2 and edgeR. Two-sided Wald test adjusted (Benjamini-Hochberg) FDR Pvalues are noted above each star. The numbers (1 and 2) indicate the two different effects detailed in the main text, with (1) for the delayed expression and (2) for the late overexpression of Tpv genes when Vp is present. Source data are provided as a Source Data file. Icons representing partners were created with BioRender.com.

Secondly and more importantly, all Tpv genes, except mvtv_1, are overexpressed in the presence of Vp at late timepoints (T5 and/or T6, Fig. 6). Thus, Vp induces a global increase of Tpv late gene expression. Since Vp depends on the GV transcription machinery [11,25] and does not encode TFs, we hypothesize that this upregulation of Tpv genes is not directly induced by the Vp, but rather indirectly through GV interaction. Indeed, as previously shown, the Vp upregulates the mchi_336 TFIIB and the mchi_455 late TF (Fig. 5B). The strong late global increase of Tpv gene expression might thus result from the transient Vp-induced upregulation of these GV-encoded TFs.

Together, these comparisons highlight an asymmetrical relationship between the two entities, with no effect of Tpv on the Vp transcriptional program, but a strong global increase of Tpv expression indirectly induced by Vp via GV. Interactions between GV, Vp, and Tpv are therefore highly intricate at the transcriptional level.
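
The differential expression criterion applied throughout these comparisons (FDR Pvalue < 0.01 and |log2(FC)| ≥ 1.5) can be illustrated with a minimal Python sketch. This is not the pipeline used in the study, which relies on DESeq2 and edgeR; it assumes a single precomputed table of fold changes and adjusted Pvalues. The names GeneStat and classify are hypothetical, and every entry other than mchi_336 is made up for illustration.

```python
from typing import NamedTuple

FDR_MAX = 0.01     # FDR Pvalue threshold quoted in the text
LOG2FC_MIN = 1.5   # |log2(FC)| threshold quoted in the text


class GeneStat(NamedTuple):
    """One row of a hypothetical differential expression table."""
    gene: str
    log2_fc: float
    fdr: float


def classify(stat: GeneStat) -> str:
    """Return 'up', 'down', 'fdr_only' or 'ns' for one gene."""
    if stat.fdr >= FDR_MAX:
        return "ns"            # not significant
    if stat.log2_fc >= LOG2FC_MIN:
        return "up"
    if stat.log2_fc <= -LOG2FC_MIN:
        return "down"
    return "fdr_only"          # significant, but fold change below threshold


stats = [
    GeneStat("mchi_336", 2.95, 2.5e-11),        # values reported in the text
    GeneStat("hypothetical_a", -2.10, 3.0e-4),  # made-up illustration
    GeneStat("hypothetical_b", 0.80, 1.0e-3),   # made-up illustration
    GeneStat("hypothetical_c", 3.00, 0.20),     # made-up illustration
]

calls = {s.gene: classify(s) for s in stats}
for gene, call in calls.items():
    print(gene, call)
```

In this sketch, genes labelled fdr_only play the role of those shown in orange in Fig. 5B, which pass the FDR threshold but not the fold change threshold.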
