Upper bounds on overshoot in SIR models with nonlinear incidence

Effect of nonlinear incidence on overshoot in SIR ODE modelsWe first examine the effect of nonlinear incidence on overshoot for ODE models, where the computations can be made analytical. Consider the following SIR ODE model as follows with generic incidence term βf(S)g(I), where f(S) and g(I) are functions of S and I respectively to be specified.$$\frac{dS}{dt}=-\beta f(S)g(I)$$
(4)
$$\frac{dI}{dt}=\beta f(S)g(I)-\gamma I$$
(5)
$$\frac{dR}{dt}=\gamma I$$
(6)
where S, I, R ∈ [0, 1] are the fractions of the population that are susceptible, infected, and recovered, respectively, $\beta ,\gamma \in {{\mathbb{R}}}_{\ > \ 0}$ are positive-definite parameters for transmission and recovery rate respectively.For the SIR model, the overshoot is given by the following equation:$$Overshoot={S}_{{t}^{* }}-{S}_{\infty }$$
(7)
where ${S}_{{t}^{* }}$ is the fraction of susceptibles at the time of the prevalence peak (i.e., when I is maximal in value), t*, and S∞ is the fraction of susceptibles at the end of the epidemic. To solve this equation, the easiest approach is to derive an equation for ${S}_{{t}^{* }}$ in terms of only S∞ and parameters. We do this by first setting (5) equal to 0 and solving for the critical susceptible fraction ${S}_{{t}^{* }}$.$$\frac{dI}{dt}=0=\beta f({S}_{{t}^{* }})g({I}_{{t}^{* }})-\gamma {I}_{{t}^{* }}$$
(8)
By using the usual definition for the basic reproduction number, ${R}_{0}\equiv \frac{\beta }{\gamma }$, we obtain the following equation for ${S}_{{t}^{* }}$.$${S}_{{t}^{* }}={f}^{-1}\left(\frac{{I}_{{t}^{* }}}{g({I}_{{t}^{* }})}\frac{1}{{R}_{0}}\right)$$We can see from this equation that ${S}_{{t}^{* }}$ will have I dependence unless $g({I}_{{t}^{* }})={I}_{{t}^{* }}$. Thus to make what follows analytically tractable, let us assume $g({I}_{{t}^{* }})={I}_{{t}^{* }}$. We will provide even stronger justification why g(I) must take this form later in the results. This assumption of $g({I}_{{t}^{* }})={I}_{{t}^{* }}$ reduces the above equation to the following.$${S}_{{t}^{* }}={f}^{-1}\left(\frac{1}{{R}_{0}}\right)$$
(9)
Taking this equation for ${S}_{{t}^{* }}$(9) and the overshoot formula (7), we obtain:$$Overshoot={f}^{-1}\left(\frac{1}{{R}_{0}}\right)-{S}_{\infty }$$
(10)
Thus the main challenge now becomes a problem of finding an equation for R0 and the inverse function f−1. Building on previous results35, the following outlines the general steps for calculating the maximal overshoot for a SIR model:

Take the ratio of $\frac{dI}{dt}$ and $\frac{dS}{dt}$. Integrate the resulting ratio. The indefinite integral requires a constant of integration, which is a conserved quantity that applies at every time point along the system’s trajectory in time.

Evaluate the equation for the conserved quantity at the beginning of the epidemic (t = 0) and the end of the epidemic (t = ∞) using initial conditions and asymptotic values. Then, rearrange the resulting equation for $\frac{1}{{R}_{0}}$.

Find the form for the inverse function, f−1.

Combine the equations for $\frac{1}{{R}_{0}}$ and f−1 with the overshoot equation.

Maximize the resulting overshoot equation by taking the derivative of the equation with respect to S∞ and setting the equation to 0 to find the extremal point ${S}_{\infty }^{* }$. This step usually leads to a transcendental equation for ${S}_{\infty }^{* }$, which can then be solved numerically.

Use the maximizing ${S}_{\infty }^{* }$ value in the overshoot equation to calculate the corresponding maximal overshoot.

Calculate the corresponding ${R}_{0}^{* }$ using ${S}_{\infty }^{* }$ and the $\frac{1}{{R}_{0}}$ equation.

Thus, the analytical exploration of nonlinear incidence terms of the type βf(S)g(I) is reduced to exploring different forms of f(S).Restrictions on g(I)The first step is to rule out what forms for the incidence term will not work with the procedure outlined above.We now show the principle reason why we require g(I) = I. We can see from calculating Step A in (11) that any incidence term that does not take the form $g(I)=aI,a\in {\mathbb{R}}$, where a is a real scalar, retains I dependence upon simplification.$$\,{{\mbox{Step A}}}\,:\frac{\frac{dI}{dt}}{\frac{dS}{dt}}=\frac{\beta f(S)g(I)-\gamma I}{-\beta f(S)g(I)}=-1+\frac{I}{{R}_{0}f(S)g(I)}$$
(11)
Any deviation from the form g(I) = aI results in I in the numerator and the denominator not completely canceling out, which will result in having to integrate I with respect to S, which we will not be able to do analytically. Therefore, I must enter linearly into the incidence term. Since a can be absorbed into the β parameter, all possible incidence terms for the purpose of calculating overshoot analytically will take the form β ⋅ f(S) ⋅ I.Restrictions on f(S)We now turn to what restrictions there are in the form of f(S). We start first with two conditions. First, we must enforce that when there are no susceptibles (S = 0), then the incidence rate must go to zero (βf(S)I = 0). Otherwise, since I do not have such a restriction, violating this condition would leave open the unrealistic possibility that the model can generate infected people when there are no susceptibles available. To ensure this condition is met, we need the function on S to output zero if the input is zero.For the second condition, for an outbreak to occur in a SIR model, we must have a minimum value for the basic reproduction number, R0. Another way to view this condition is by inspecting (2) and recognizing that the incidence term (βf(S)I) needs to be greater than the recovery term (γI). Otherwise, the epidemic cannot grow in size. Comparing the two terms leaves an inequality (βf(S)I > γI) which can be rewritten as:$${R}_{0} > \frac{1}{f(S)}$$
(13)
Beyond these conditions, an obvious requirement is that f(S) should be a continuous function. In order to be able to calculate the maximal overshoot analytically, the function should be integratable with respect to S and should also have a closed-form inverse f−1. As we will demonstrate, non-monotonic functions for f(S) are possible.For f(S), the following examples are constructed using basic functions that satisfy the above criteria:

$\exp (S)-1$

Invertible polynomials of S

$\sin (aS)$

Conversely, there are many examples of functions that would not work. An example that satisfies the conditions above but that does not have a closed-form inverse is $f(S)=\log (S+1)$. While similar to examples listed that work, the following violate one of the conditions: $\exp (S),\log (S),\cos (S)$. Examples that violate conditions of continuity include step functions of S or f(S) with cusps.Deriving maximal overshoot for various f(S)As an example, we will now look at a model with nonlinear incidence and apply the whole procedure previously outlined for finding the maximal overshoot.Example: $f(S)=\exp (S)-1$
Let us consider an incidence rate that takes on an exponential form. This produces an incidence term that grows slightly faster than the original bilinear incidence term, and so might be relevant in situations where there are network effects. The phenomenon of superspreading occurs when some individuals infect many more people than the typical infected person would50. Superspreading allows for explosive outbreaks that exceed the growth rate of infections allowed by simple bilinear incidence and is enabled by inherent heterogeneity in contacts amongst the population, which makes it natural to occur in situations that are approximated by a network. Different models can be formulated to account for superspreading51,52. Here we have chosen the exponential as a qualitatively simple example that more closely resembles the growth rate on a heterogeneous network than standard bilinear incidence.We start at Step A by solving for the rate of change of I as a function of S by taking the ratio of $\frac{dI}{dt}$ and $\frac{dS}{dt}$.$$\frac{dI}{dS}=-1+\frac{1}{{R}_{0}({e}^{S}-1)}$$
(14)
from which it follows on integration using the substitution u = eS and partial fractions $\left(\frac{1}{u-1}\right)$ and $\left(\frac{1}{u}\right)$ that $I+S+\frac{S-\,{{\mbox{ln}}}\,| {e}^{S}-1| }{{R}_{0}}$ is constant along all trajectories.For Step B, consider the conserved quantity at both the beginning (t = 0) and end (t = ∞) of the epidemic.$${I}_{0}+{S}_{0}+\frac{{S}_{0}-\,{{\mbox{ln}}}| {e}^{{S}_{0}}-1| }{{R}_{0}}={I}_{\infty }+{S}_{\infty }+\frac{{S}_{\infty }-{{\mbox{ln}}}\,| {e}^{{S}_{\infty }}-1| }{{R}_{0}}$$hence$$\frac{1}{{R}_{0}}=\frac{{I}_{\infty }+{S}_{\infty }-{I}_{0}-{S}_{0}}{{S}_{0}-{S}_{\infty }+\,{{\mbox{ln}}}\,\left(\frac{| {e}^{{S}_{\infty }}-1| }{| | {e}^{{S}_{0}}-1| }\right)}$$
(15)
We use the initial conditions: S0 = 1 − ϵ and I0 = ϵ, where ϵ is the (infinitesimally small) fraction of initially infected individuals. We use the standard asymptotic of the SIR model that there are no infected individuals at the end of a SIR epidemic: I∞ = 0. This yields:$$\frac{1}{{R}_{0}}=\frac{{S}_{\infty }-1}{1-{S}_{\infty }+\,{{\mbox{ln}}}\,\left(\frac{| {e}^{{S}_{\infty }}-1| }{e-1}\right)}$$
(16)
For Step C, we find the inverse of f.$$f(x)={e}^{S}-1\ \Rightarrow \ {f}^{-1}(x)=\,{{\mbox{ln}}}\,(x+1)$$
(17)
For Step D, we substitute the expression for $\frac{1}{{R}_{0}}$ (16) and f−1 (17) into the overshoot equation (10).$$Overshoot=\,{{\mbox{ln}}}\,\left(\frac{\,{{\mbox{ln}}}\,\left(\frac{| {e}^{{S}_{\infty }}-1| }{e-1}\right)}{1-{S}_{\infty }+\,{{\mbox{ln}}}\,\left(\frac{| {e}^{{S}_{\infty }}-1| }{e-1}\right)}\right)-{S}_{\infty }$$
(18)
For Step E, differentiation of both sides with respect to S∞ and setting the equation to zero to solve for the critical ${S}_{\infty }^{* }$ yields:$$0={\left(\frac{{{\mbox{ln}}}\left(\frac{| {e}^{{S}_{\infty }^{* }}-1| }{e-1}\right)}{1-{S}_{\infty }^{* }+{{\mbox{ln}}}\left(\frac{| {e}^{{S}_{\infty }^{* }}-1| }{e-1}\right)}\right)}^{-1}\left(\frac{\left(1-{S}_{\infty }^{* }+\,{{\mbox{ln}}}\,\left(\frac{| {e}^{{S}_{\infty }^{* }}-1| }{e-1}\right)\right)\cdot \left({\left(\frac{| {e}^{{S}_{\infty }^{* }}-1| }{e-1}\right)}^{-1}\frac{{e}^{{S}_{\infty }^{* }}}{e-1}\right)-\,{{\mbox{ln}}}\,\left.\left(\frac{| {e}^{{S}_{\infty }^{* }}-1| }{e-1}\right)\right)\cdot \left(-1+{\left(\frac{| {e}^{{S}_{\infty }^{* }}-1| }{e-1}\right)}^{-1}\frac{{e}^{{S}_{\infty }^{* }}}{e-1}\right)}{{\left(1-{S}_{\infty }^{* }+{{\mbox{ln}}}\left(\frac{| {e}^{{S}_{\infty }^{* }}-1| }{e-1}\right)\right)}^{2}}\right)-1$$Since eS − 1 is positive semi-definite over the unit interval for S, dropping the absolute value symbols and simplifying yields:$$\,{{\mbox{ln}}}\,\left(\frac{{e}^{{S}_{\infty }^{* }}-1}{e-1}\right)\left(\,{{\mbox{ln}}}\,\left(\frac{{e}^{{S}_{\infty }^{* }}-1}{e-1}\right)-{S}_{\infty }^{* }\right)=(1-{S}_{\infty }^{* })\left(\frac{{e}^{{S}_{\infty }^{* }}}{{e}^{{S}_{\infty }^{* }}-1}\right)$$which admits both a trivial solution (${S}_{\infty }^{* }=1$) and the solution ${S}_{\infty }^{* }=0.1663…$.For Step F, using the non-trivial solution ${S}_{\infty }^{* }$ in the overshoot equation (18) to obtain the value of the maximal overshoot for this model, $Overshoo{t}^{* }{| }_{\beta ({e}^{S}-1)I}$ yields:$$Overshoo{t}^{* }{| }_{\beta ({e}^{S}-1)I}=0.2963…$$
(19)
Thus, the maximal overshoot for incidence functions of the form β(eS − 1)I is 0.296. . . .For Step G, we can calculate the corresponding ${R}_{0}^{* }$ using ${S}_{\infty }^{* }$ and (16).$${R}_{0}^{* }{| }_{\beta ({e}^{S}-1)I}=1.7$$
(20)
This result is verified numerically in Fig. 2. Compared to the overshoot in the Kermack–McKendrick model with bilinear incidence35,49, we see that the maximal overshoot is also near 30%. However, we see that overshoot for exponential incidence has a much broader distribution over the domain of R0, suggesting that exponential incidence rates pose a public-health hazard for a greater range of communicable diseases over their bilinear counterparts.Fig. 2: The overshoot in a model with exponential incidence.The overshoot as a function of R0 for an SIR model with nonlinear incidence term of β(eS − 1)I. The horizontal line for Overshoot* in dark blue and the vertical line in red given by ${R}_{0}^{* }=1.7$ are the theoretical predictions given by the calculations in the text. The curve is obtained from numerical simulations.Additional examples for incidence rates of a different mathematical form can be found in the Supplemental Information. One example considers when the form of f(S) is given by a polynomial, which can be used to model higher-order effects beyond a simple two-person interaction. Another example considers the consequence of having f(S) that is non-monotonic over the domain of S, which might reflect real scenarios where there are tipping points or seasonality in behavior.Nonlinear incidence generated from dynamics on networksIn the previous sections, we focused on the nonlinear incidence that was generated from the introduction of nonlinearity in an ODE model. Importantly, this fundamentally assumes homogeneity in the transmission, in that all infected individuals are identical in their ability to spread the disease further. In contrast, network models allow for heterogeneous spreading, which depends on the local connectivity of each infected individual. This provides an entirely different mechanism through which nonlinear incidence can be generated compared to the ODE models. As many real-world complexities and details can be more easily captured and explored in a network model, the consequences of those heterogeneities on the overshoot become more transparent through the targeted and fully fleshed-out explorations that can be done through numerical experimentation in network simulations9,53,54,55,56. However, a trade-off for the increased realism is that it becomes more difficult to perform analytical calculations for a general network model, so here we conduct a numerical exploration of the behavior of the overshoot in network models across a range of network structures.The space of all possible network configurations is immense, so we must restrict ourselves to analyzing a particular subset of possible networks. Here we explore what happens to the overshoot when the contact structure of the population is given by a network graph that is roughly one giant component. While it is possible to construct pathological graphs that produce very complex dynamics, we consider more classical graphs here. Using a parameterization of heterogeneity (σ) given by Ozbay et al.57 and the configuration model of Newman58 to randomly construct networks (see Methods for details), we simulated epidemics on a spectrum of networks with structures ranging from the homogeneous limit (well-mixed, complete graph) to a heterogeneous limit (heavy-tailed degree distributions). The spectrum of distribution shapes across the space of σ used to generate the degree distributions can be seen in the supplement. As the dynamics of the epidemic on a network are stochastic and occur in discrete time here, they are not parameterized by R0 as in the ODE models. Instead, we use an analogous parameter that we will denote as a basic reproduction number for networks (R0,network) and is defined as follows:$${R}_{0,network}\equiv \frac{\tau \cdot \langle k\rangle }{\rho }$$
(21)
The transmission probability (τ) and recovery probability (ρ) are directly analogous to their counterparts (β and γ, respectively) in R0 in that they correspond to transmission and recovery parameters, albeit for a stochastic model. The inclusion of the mean network degree (〈k〉) as a scalar makes intuitive sense as that represents the average number of potential neighbors an infected node could potentially infect at the beginning of the epidemic, which is analogous to the interpretation of R0 as the number of secondary infections. We observed what happens to the overshoot on these different graphs as we changed R0,network.On a network model, where contact structure is made explicit, the homogeneous limit is a complete graph, which recapitulates the well-mixed assumption of the Kermack–McKendrick model. It is not surprising then that the overshoot in the homogeneous graphs (σ ≈ 0) peaks also around 0.3 (Fig. 3, purple and blue), which coincides with the analytical upper bound previously found in the ODE Kermack–McKendrick model35. However, while the homogeneous graphs shown here are highly regular (i.e., symmetric in the average number of contacts each node has), the network’s connectivity is not close to complete as the mean degree is significantly less than the network size. Thus, the average overshoot is lower than would be the case for a complete graph.Fig. 3: Comparing the overshoot in the context of networks with different amounts of contact heterogeneity.The overshoot for SIR epidemic simulations on networks with varying levels of heterogeneity (σ) as a function of R0,network. Each color shows simulations for networks of different contact heterogeneity as parameterized by σ. Larger σ corresponds to a graph with greater graph heterogeneity (i.e. increasingly heavy-tailed degree distribution). The solid lines represent the mean value of 100 simulation runs for a given R0,network and σ. The shaded areas indicate the 25th and 75th quartiles for those 100 simulations. The other simulation parameters are number of nodes (N) = 200, 〈k〉 = 7, and recovery probability (ρ) = 0.2.We also see that increasing contact heterogeneity (i.e., σ→1) qualitatively suppresses the overshoot peak both in terms of the overshoot value and the corresponding R0,network. Furthermore, increased heterogeneity also flattens out the overshoot curve as a function of R0,network. We see that for more heterogeneous graphs, the overshoot is larger at very low R0,network. For some intermediate values of contact heterogeneity (Fig. 3, yellow and orange), the overshoot shows larger overshoot than the homogeneous case for large R0,network. This would suggest a larger public health hazard for a larger range of communicable diseases for networks with structure in this regime. The intuition for the larger overshoot at high R0,network in more heterogeneous networks is that both the peak occurs earlier and that a significant number of cases can occur in the periphery of a network (Supplemental Materials). The probability of being connected to a high-degree node increases as the heterogeneity of the network increases, which drives the peak of infections to occur sooner. In the second phase of the epidemic, as the epidemic expands out to the periphery, the epidemic burns more slowly as individuals in the periphery have fewer neighbors.However, this trend of an overshoot distribution with a big right tail does not monotonically increase with the contact heterogeneity. We see that at very high amounts of contact heterogeneity, the overall area under this overshoot curve decreases. This can be partially explained by the fact that for very heterogeneous networks, a significant fraction of the population has no neighbors at all. This limits the number of people that can potentially be infected and caps the outbreak size and, subsequently, overshoot size.

Upper bounds on overshoot in SIR models with nonlinear incidence

Turbocharging protein binding site prediction with geometric attention, inter-resolution transfer learning, and homology-based augmentation | BMC Bioinformatics

Zero-shot transfer of protein sequence likelihood models to thermostability prediction

Poisoning medical knowledge using large language models

Semi-supervised recognition for artificial intelligence assisted pathology image diagnosis

Chemistry wordoku #062 | Puzzle

Hot Topics

Turbocharging protein binding site prediction with geometric attention, inter-resolution transfer learning, and homology-based augmentation | BMC Bioinformatics

Zero-shot transfer of protein sequence likelihood models to thermostability prediction

Poisoning medical knowledge using large language models

Related Articles

Balancing Act: Pregnancy and Bipolar Disorder

Cohesion at the cellular level: flexible yet stable

Gut bacteria influence responses to immunotherapy in patients with asbestos related cancer

Quick Links

Must Read

Turbocharging protein binding site prediction with geometric attention, inter-resolution transfer learning, and homology-based augmentation | BMC Bioinformatics

Zero-shot transfer of protein sequence likelihood models to thermostability prediction

Poisoning medical knowledge using large language models

Semi-supervised recognition for artificial intelligence assisted pathology image diagnosis

Popular Articles

Turbocharging protein binding site prediction with geometric attention, inter-resolution transfer learning, and homology-based augmentation | BMC Bioinformatics

Zero-shot transfer of protein sequence likelihood models to thermostability prediction

Poisoning medical knowledge using large language models