The evolutionary features and roles of single nucleotide variants and charged amino acid mutations in influenza outbreaks during NPI period

The B/Victoria lineage dates from the 1988–1989 season, when two distinct antigenic variants of influenza B virus co-circulated: the B/Victoria and B/Yamagata lineages (Bv/By), with the reference strains B/Victoria/2/87 and B/Yamagata/16/88, respectively. The evolutionary dynamics of influenza B virus are complex and have been characterized by nucleotide insertions and deletions (indels) in the hemagglutinin (HA) gene and by extensive reassortment events within and between the Bv and By lineages [23]. In this study, only strain GD/1557/2019, which carries the insertion 529AAAAACGAC537 in the HA gene, resembled the vaccine strain B/Bri/60/08 and differed from the other strains. Relative to the vaccine strain Aus/1359417/21, the Bv strains circulating in 2022 had the highest HA gene homology and differed significantly from the strains of other years (99.23 ± 0.22, F2022/Others = 74.78, P < 0.001). Some of the influenza strains in the present study were isolated at the beginning of NPI (2020) and were in fact a continuation of the 2019 epidemic and outbreaks. Moreover, from NPI (2020) to the end of April 2022, only influenza Bv outbreaks (no H1N1 and no H3N2) occurred in southern China.
This study analyzed nucleotides (molecular cluster, transition/transversion, evolutionary rate), amino acids (AA substitution, entropy, evolutionary selection, epitope), genes (HA/NA) and prevalence (epidemic/outbreak, different dates) of Bv outbreaks, together with the relationships among them. Although SNVs occur at random, the results are informative about the direction of biological evolution. In the present study, the mutations were highly biased toward specific amino acids; for example, the probability of GC transversion was one in 200,000 (observed only once) since 2008 (reference strain), whereas the probability of AG transition was 1–2 per thousand, far higher than that of GC transversion.
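The transition/transversion distinction used above can be illustrated with a minimal sketch. It is not the pipeline used in this study: the function names and the two toy sequences are hypothetical, and the sequences are assumed to be already aligned to the reference and of equal length.

```python
from collections import Counter

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def classify(ref_base, alt_base):
    """Return 'transition' (purine<->purine or pyrimidine<->pyrimidine)
    or 'transversion' (purine<->pyrimidine) for a single-base change."""
    pair = {ref_base, alt_base}
    same_class = pair <= PURINES or pair <= PYRIMIDINES
    return "transition" if same_class else "transversion"


def substitution_spectrum(reference, sample):
    """Count directional substitutions (e.g. 'G>A') between two aligned
    sequences; gaps and ambiguous bases are skipped."""
    spectrum = Counter()
    for ref_base, alt_base in zip(reference.upper(), sample.upper()):
        if ref_base != alt_base and ref_base in "ACGT" and alt_base in "ACGT":
            spectrum[f"{ref_base}>{alt_base}"] += 1
    return spectrum


# Toy sequences (hypothetical, not taken from the study data).
reference = "ATGGCCAAGT"
sample = "ATAGCTAACC"

spectrum = substitution_spectrum(reference, sample)

# Collapse the directional spectrum into transition/transversion totals.
kinds = Counter()
for change, count in spectrum.items():
    kinds[classify(change[0], change[2])] += count

print(dict(spectrum))  # {'G>A': 1, 'C>T': 1, 'G>C': 1, 'T>C': 1}
print(dict(kinds))     # {'transition': 3, 'transversion': 1}
```

Dividing each directional count by the number of sites at risk and the sampling time would give a per-direction rate; that normalization is left out here because it depends on choices not stated in the text.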
Ranked from highest to lowest, the evolutionary rates in this study were G → A, A → G, C → T, T → C, A → T, T → A, A → C, G → T, T → G and C → G, with the highest rate 10,000 times the lowest. By comparison, in a study of SARS-CoV-2 spread during the first months of the pandemic, the frequencies of both G → U and C → U substitutions increased, which suggested that the substitution spectrum of SARS-CoV-2 was determined by an interplay of factors, including intrinsic biases of the replication process, avoidance of CpG dinucleotides and other constraints exerted by the new host [24].
In this study, the epitope domain mutations, including those in epitope A (120 loop, 137/142/144/199), B (150 loop, 165) and D (190 helix, 212/214), had high evolutionary rates, partially consistent with previous research [23]. The epidemic and outbreaks in southern China resulted from mutations on HA genes, which were 1.59 times (2.15/1.36) faster than those on NA genes. More specifically, the outbreaks here were associated with mutations of HA gene epitopes A, B and D. For comparison, in the epidemic in Germany during 2016–2020 [25], a total of 13 substitutions were fixed over time (numbering in HA1 of Bri/60/08), including five in the 120-loop (R116H, I117V, N121T, K129N/D, K136E), two in the domain surrounding the 120-loop (K48E, N75K), one in the 150-loop (V146I), two in the 160-loop (E164D, N165K) and one in the 190-helix (S197N).
Amino acids have been extensively studied as components of epitopes, and epitopes in infectious diseases bear on epidemics, treatment, vaccines and so on [26]. The ionizing properties of amino acids are associated with their charge capacity and, in turn, with pathogen adhesion and entry and with the molecular interaction between antigen and antibody; this interaction involves the multiply charged ion signals of amino acids [27].
Within the epitope domain, three polar amino acid mutations (P137B/P199A/P212A) occurred from 2019–2020 to 2021–2022 (P < 0.01) and affected the antigenicity of the epitope regions.
Evaluating selectively evolving sites from dS/dN substitution in codons is subject to certain errors. Sites 214 and 563 in the HA genes and site 73 in the NA genes were the positively selected ones in this study, as evaluated by both approaches (FUBAR/MEME; P < 0.10) [28]. This suggested that site 214 in the HA genes, an AA in epitope D (H214P), triggered the Bv outbreaks and was also a positive selection site under enormous external evolutionary pressure.
Entropy is also commonly used to evaluate evolution [29]; here, evolution was evaluated by both ER and SV, which were significantly correlated. The estimated rates of nucleotide substitution of HA and NA in the Bv lineage were 2.05 × 10⁻³ s/s/y and 2.01 × 10⁻³ s/s/y, respectively [23], while R_HA in this study was less than R_NA (R_HA = 0.690, R_NA = 0.711), which suggested that, compared with NA, the amino acid variations on HA were more active than the nucleotide variations. At the same time, both D_HA (−1.7668) and D_NA (−2.1355) in this study showed that HA genes (especially the five epitopes in the HA1 region) were prone to variation; in other words, NA genes were more likely to undergo synonymous than nonsynonymous evolution.
The key role of charged amino acids has been widely studied in infectious diseases [30]. The present setting provided a good model of the evolution of infectious disease pathogens (NPI/Bv outbreak only/Less distant transmission).
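As an illustration of the entropy-based evaluation mentioned above, the following minimal sketch computes per-site Shannon entropy over an amino acid alignment. It is an assumption-laden example rather than the method of this study: the function names and the toy alignment are hypothetical, and the logarithm base (base 2, giving bits) is chosen for illustration because the text does not state which base was used.

```python
import math
from collections import Counter


def site_entropy(column):
    """Shannon entropy (in bits) of one alignment column.
    Gap ('-') and unknown ('X') characters are ignored."""
    residues = [r for r in column if r not in "-X"]
    if not residues:
        return 0.0
    total = len(residues)
    counts = Counter(residues)
    # H = sum over residues of p * log2(1 / p)
    return sum((n / total) * math.log2(total / n) for n in counts.values())


def alignment_entropy(sequences):
    """Per-site entropy for a list of equal-length aligned sequences."""
    return [site_entropy(column) for column in zip(*sequences)]


# Toy alignment (hypothetical, not study data): four sequences, three sites.
alignment = ["NKG", "NKE", "NEG", "NKG"]
entropies = alignment_entropy(alignment)

for position, value in enumerate(entropies, start=1):
    print(f"site {position}: {value:.3f} bits")
```

A fully conserved site scores 0, and a site split evenly between two residues scores 1 bit, so higher values flag the more variable positions.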
In this study, the transition from the first to the second stage of the Bv outbreak involved three polar AAs (N165K, P → B; G199E/K, P → A/B; N212E, P → A), substituted from polar AAs into basic, acidic/basic and acidic AAs, respectively; the second half of 2021 in the second stage (Cluster 3, Table S6) involved two substitutions into polar AAs (H137Q/K/N, B → P; A142T, H → P), from basic and hydrophobic AAs, respectively. This suggested that the charge/pH preference of amino acid mutations is closely related to (consistent with) the development trend of the outbreak. There are some similar reports, but with different research perspectives [31]. SNV adaptation is thus likely to have been associated with the diversification of influenza viruses across the outer environment and to have promoted their survival in extreme environments [32]. A genetic approach combined with potential epidemiological linkage enabled us to match data with previous reports on outbreaks or transmission chains, which may benefit public health actions [33].
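The charge-class labels attached to the substitutions above (P, polar; B, basic; A, acidic; H, hydrophobic) can be derived mechanically, as in the sketch below. The grouping table is an assumption made for illustration: classification schemes differ, and glycine is placed with the polar residues here only because the text treats G199 as polar. The function name is hypothetical.

```python
import re

# Assumed residue grouping (illustrative; other schemes classify some
# residues, notably G, C, Y and H, differently).
GROUPS = (
    ("DE", "A"),        # acidic
    ("KRH", "B"),       # basic
    ("STNQCYG", "P"),   # polar, uncharged (glycine included by assumption)
    ("AVLIMFWP", "H"),  # hydrophobic
)
CLASS_OF = {residue: label for residues, label in GROUPS for residue in residues}


def label_substitution(substitution):
    """Turn a substitution such as 'G199E/K' into a class change 'P -> A/B'."""
    match = re.fullmatch(r"([A-Z])(\d+)([A-Z](?:/[A-Z])*)", substitution)
    if match is None:
        raise ValueError(f"unrecognised substitution: {substitution!r}")
    old_residue, _position, new_residues = match.groups()
    new_classes = []
    for residue in new_residues.split("/"):
        residue_class = CLASS_OF[residue]
        if residue_class not in new_classes:  # keep order, drop repeats
            new_classes.append(residue_class)
    return f"{CLASS_OF[old_residue]} -> {'/'.join(new_classes)}"


for substitution in ("N165K", "G199E/K", "N212E", "A142T"):
    print(substitution, label_substitution(substitution))
# N165K P -> B
# G199E/K P -> A/B
# N212E P -> A
# A142T H -> P
```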
