Absence of effects of widespread badger culling on tuberculosis in cattle

Statistical models examining the effect of proactive badger cullingThe most important result from the 2006 paper is presented in Table 1.
Table 1 Log-linear regression model fitted to ‘confirmed’ incidents (breakdowns) in whole trial areas (inner and outer regions combined), since the initial proactive cull, based on VETNET location data.Here, the parameter treatment is highly significant and is interpreted as a reduction in the herd incidence of 18.7% in areas where badgers had been culled. What is a little unusual in this analysis is that the parameter Log of baseline herds was not fixed at 1. The rate is normally the count divided by some unit of exposure, in this case the exposure would be the number of baseline herds.If exposure was the same across all triplets, then the log linear Poisson regression would be$$log{\mu }_{x}={\beta }_{0}+{\beta }_{1}x+…$$To control for baseline Bx, as there is varying levels of exposure, this then becomes:$$log\frac{{\mu }_{x}}{{B}_{x}}={\beta }_{0}+{\beta }_{1}x+…$$ = > $$log{\mu }_{x}={\beta }_{0}+{\beta }_{1}x+{logB}_{x}+…$$The usual practice in Poisson regression with a variable exposure across groups would be to fix the parameters for logBx at 1, and code it as an offset variable. This then would assume that the incidence in each triplet area is directly proportional to the number of baseline herds, that is if the number of baseline herds is doubled, there is a doubling of the incidence when other parameters are held constant. However, in the 2006 paper, the parameter for logBx was not fixed at 1, but was analysed as an explanatory variable and hence estimated by the model. This could be a reasonable approach if there was some a priori reason why the number of herd breakdowns was not linearly associated with the number of herds in the baseline. All areas in each triplet had an area of approximately 100 km2. But with a relatively high variation in the number of herds in each area, there was considerable variation in herd density. Standard theory on infectious diseases does indicate that transmission is dependent on density (i.e. contact rate)19 so this might be a justification for not fixing the logBx parameter at 1. Models suggest that disease transmission increases as the density or numbers of cattle and badgers increases20. Hence, with a fixed area for each triplet, the parameter of log(baseline) or log(herd years at risk) might be expected to be above 1. However, the parameter value that is estimated is close to, and not significantly different, from zero (Table 1). This is not consistent with a density dependent increase in transmission. Furthermore, the interpretation for this is that the number of bTB herd incidents is independent of the number of herds in a triplet area. That is, if the number of baseline herds is doubled, the incidence does not change. This does not seem to be credible.Alternatively, the logBx parameter could be fixed at 1 as an offset term. Doing this, and keeping the same variables in the regression and correcting for overdispersion (as reported in the 2006 paper), the result reported in Table 2 is obtained. This is model 2:
Table 2 Log-linear regression model fitted to ‘confirmed’ incidents (breakdowns) in whole trial areas (inner and outer regions combined), since the initial proactive cull, based on VETNET location data.Model 2 suggests that badger culling had no effect on confirmed incidents. However, the two models are not comparable by model selection techniques such as AIC or AICc as Model 1 is a Poisson model and model 2 is a quasipoisson model. Therefore, competing models were analysed using the generalized Poisson model which allowed direct comparison of over- or under-dispersed Poisson models. It is also possible to compare models with and without the explanatory variables presented in the 2006 paper. Since there are 20 data points, but up to 14 variables (9 for Triplet, baseline herds, historic incidence, treatment and over-dispersion or variance parameter), the small sample size AIC (AICc) was used for model selection and comparison. Model 3 is the full model with log(Baseline) as an explanatory variable. With model 3, the most parsimonious model was without triplet as an explanatory variable. Statistical parsimony is when a simpler model with fewer parameters is favored over more complex models with more parameters, provided the models fit the data similarly well. Models with the lowest AIC(AICc) are generally selected as the most parsimonious. With model 3, treatment (culling) had no effect on ‘confirmed’ incidents. Model averaging also indicated that there was no evidence that treatment had an effect. All models where triplet was included as an explanatory variable had a higher AICc than the null model with intercept only. The null model has no explanatory (independent) variables. This strongly indicates a degree of overfitting in models that included these parameters.Supplementary information provided with the 2006 paper demonstrates that each triplet had a variable time at risk, ranging from 2.72 years (triplet D) to 6.73 years (triplet B) (Table 3). Thus, the effect of triplet in the 2006 paper is likely to be due to the variation in exposure time in addition to number of baseline herds. Time at risk was not explicitly included in the analysis of the data in the 2006 paper. There was an independent statistical auditor to oversee the analysis of the data generated by the RBCT. In the first report of the statistical auditor it was stated that, “to some extent, the number of triplets and the years of observation are interchangeable”21. This may have motivated the investigators in the 2006 paper to use the 9 categorical variables of triplet as an alternative to the single continuous variable of time at risk. Standard Poisson regression techniques would have time at risk and baseline herds included in the offset variables to fully control for exposure. Multiplying time at risk by the number of baseline herds achieves this, to give herd years at risk. This then can be included in the regression models as either an explanatory variable, or preferably, as an offset variable. It is not possible to use the categorical variables of triplet as offset variables. Model 4 repeats the analyses of models 2 to 3, but in this case using herd years at risk as a replacement for baseline.
Table 3 Incidence, baseline herds and incidence rates by triplet and treatment for trial areas and neighbouring areas. Figures are using the VETNET database and are for ‘confirmed’ incidents from initial cull.As when using baseline herds in the analysis, the only models that suggest an effect of treatment are models which included all the 9 explanatory variables of triplet. However, these models are not supported by AICc, with the most parsimonious model only including historical incidence and herd years at risk as explanatory variables. Likewise model averaging failed to support an effect of treatment on ‘confirmed’ bTB herd incidence rate.Linear modelThe linear model. where the dependent variable was incidence divided by herd years at risk, is arguably more easily understood by the non-specialist. This was examined using model 5. Examination of the residuals indicated this was a suitable model for examination of the data. As with the Poisson models, there was insufficient statistical evidence to conclude treatment influenced ‘confirmed’ bTB herd incidents. The most parsimonious model did not include treatment as an explanatory variable.bTB Perturbation effect hypothesisThe 2006 paper reported an increase in herd incidence in the unculled areas neighbouring the areas where badgers had been culled, leading to the proposal that badgers might transmit bTB to cattle at enhanced rates after culling has commenced. However, using the same modelling approach (model 6) as for the badger cull areas, there is insufficient evidence to support the perturbation hypothesis. The most parsimonious model (AICc) does not include an effect of treatment.Latent breakdownsThe analysis confirms the result of the RBCT. That is the total herd incidents (confirmed and unconfirmed) have a similar rate of occurrence in areas where badgers were culled compared to control areas. This was analysed using a (generalized) Poisson model as described above (model 7). Again, there was insufficient evidence to conclude an effect of badger culling on the incidence of bTB at the herd level.Bayesian analysisThe model in the 2006 paper was also compared to the most parsimonious Poisson model and linear model using an alternative Bayesian analysis. Minimally informative priors were used. The same priors for the parameters were used in the model comparisons. As expected, a Bayesian analysis of the model from the 2006 paper gave almost identical parameter estimates. However, in terms of Bayes factor, the most parsimonious Poisson model was favoured over the model in the 2006 paper by a factor of 7.2 × 1010. There was no evidence of an effect of proactive culling on herd incidence. Indeed, the marginal likelihood indicated that models where the treatment covariate was removed completely from the model indicated better support. The posterior probability distributions of the parameters in the (overdispersed) Poisson models are given in Fig. 4.All frequentist models, are summarized in Table 4. The R code and the R output for all model runs are given in the supplementary file 5.
Table 4 Statistical models analysed.OverdispersionThe 2006 paper specifically indicated that RBCT results were corrected for overdispersion. However, the model indicated by the results in Table 1 (from the 2006 paper), had no evidence of overdispersion and was a Poisson regression model (model 1). Other statistical models that were explored usually had evidence of overdispersion. This was corrected by modelling as a generalized Poisson model. It was then possible to extract AIC and AICc for model comparison. Only the full model, which included triplet and treatment suggested a significant effect of treatment, and there was some evidence that there was underdispersion in this model. However, using AICc for model selection, it was clear that the most parsimonious model only had log(herd years at risk) and log (historical three year incidence) as significant explanatory variables. Model averaging confirmed the absence of an effect of treatment (i.e. badger culling). AICc is now recommended for model selection rather than AIC, although the two information criteria converge with large sample sizes. An important aspect is that competing models should have normally distributed residuals. Details of the theory and assumptions can be found in Burnham and Anderson13, which includes examples of selection of Poisson models using AICc. For the model from the 2006 paper, there is evidence from the qqplot that the residuals are not normally distributed (Fig. 1), and this may be because there was under-dispersion and hence model misspecification. Better specification can be achieved with the generalized Poisson model. This also confirmed that, using the generalized Poisson model, the residuals were normally distributed and hence, the use of AICc for model selection is valid, at least for this set of models.Figure 1Analysis of residuals from the model reported in the 2006 paper.Posthoc analysisAnalysis of the residuals generated from the model in 2006 paper indicates there may be misspecification issues for model 1 (Fig. 1). The most parsimonious model, in terms of AICc, analyzed was model 4d. In this model, log(baseline) and Triplet were replaced by log(herd years at risk). Treatment was removed as an explanatory variable as there was no significant effect. This model had no misspecification issues (Fig. 2). The linear model 5a also had no misspecification issues (Fig. 3).Figure 2Analysis of residuals from the most parsimonious (over dispersed) (generalized) poisson model (model 4d).Figure 3Analysis of residuals from the most parsimonious linear model without triplet as an explanatory variable. Treatment was removed as an explanatory variable as there was no significant effect.Appropriate statistical modellingOur analysis of the original RBCT data in the 2006 paper strongly suggests that there is insufficient evidence to conclude that badger culling reduced the rate of ‘confirmed’ herd incidents compared to control areas with no badger culling. The only statistical models that suggested an unequivocal effect was the model used in the 2006 paper and in the 2006 model where log(baseline) was replaced by log(herd years at risk) (model 4c). However, the interpretation of the results of these models, is that herd incidence of bTB was independent of number of herds studied (2006 model), or indeed the time at risk. Hence this reduces credibility from an epidemiology perspective. In addition, post hoc analysis of the statistical model used strongly suggests a degree of misspecification of the 2006 model. Although it could be argued, that in terms of AIC, this model produced the best fit of the count models, in terms of AICc it was statistically far more inferior to the better specified models (Fig. 4).Figure 4Density of the parameters following the Bayesian analysis of the most parsimonious frequentist Poisson model (top) with the log(herd years at risk) as an explanatory variable. The is analogous to the analysis in the 2006 paper but including time at risk. The 2006 paper had log(baseline herds) as an explanatory parameter. The parameter for proactive treatment has considerable parts of its density on either side of zero. Removing this parameter completely, improves the marginal likelihood and hence gives better support for the model. Below is the same model with log(herd years at risk) set as an offset.Examination of over-dispersed models, using the generalized Poisson model also suggested the best fitting model was one where treatment had no effect. Furthermore, the most parsimonious linear model, which analysed incidence rates, had no issues with misspecification and was also unable to detect an effect of culling badgers. Finally, a Bayesian analysis strongly suggested that there was little evidence for an effect of a proactive cull.It seems likely that the statistical model reported in the 2006 paper was heavily overfitted. There were 13 free parameters, yet only 20 data points. As a result, the model is useful in reference only to its initial data set, which would include the specific idiosyncrasies of the data within each triplet, but it would have little predictive power.Models with herd years at risk either as a covariate or, preferably, as an offset variable are more logical, they give more credible results and have the lowest AICc of competing models. A simple linear model of incidence rate as the dependent variable offered no evidence of misspecification and failed to show an effect of culling.The aim of badger culling is to prevent bTB herd incidents. Concerns regarding undetected herd infections were raised in 2015, prior to the roll out of mass badger culling in England22. Since then, policy has altered to recognise ‘all SICCT reactors’ as potentially bTB infected for risk management purposes. Analysing the RBCT data to model total herd incidents, including undetected (i.e. latent) incidents, also indicated that culling had no effect on the herd incidence of bTB. When the statistical model included either baseline number of herds, or herd years at risk as a covariate rather than an offset, the parameter values were credible, indicating the number of herd bTB incidents increases both with years at risk and number of baseline herds, which is epidemiologically credible.With regards to the veracity of the bTB perturbation effect hypothesis23, the present analysis demonstrates that any such effect is most likely to be an artefact of unsuitable analysis. Taken together, this considerably weakens any view that badgers are responsible for a substantial proportion of the transmission of bTB between cattle herds claimed by subsequent studies based upon the RBCT results. It also suggests that the data, rather than the statistical model, from the RBCT is consistent with the recently published paper by Langton et al.6, which failed to demonstrate a reduction in bTB incidence and prevalence associated with the culling of badgers.It is also worth noting the data in Table 3, in the present report is taken directly from the supplementary material of the 2006 paper. In 6 of the proactive comparison areas, after treatment the incidence rates were higher in the survey only areas, but in 4 of the triplets the incidence rates were higher in the proactive area. There was also considerable heterogeneity ranging from a reduction in the proactive areas of 66% (Triplet E) to an increase of 236% (Triplet H). This, together with very marginal differences in some triplets (e.g. Triplet J and Triplet D), illustrates why there was likely to be no significant effect of culling in the present analysis.ImplicationsThe model in the 2006 paper had a credibility issue because of the parameter value for baseline herds and evidence of misspecification in its post hoc analysis. Furthermore, models where there was no effect of treatment also had a lower AICc. A linear model assuming normal approximations had no evidence of misspecification and failed to demonstrate an effect of culling.The present analysis has important implications. Much subsequent investment in research has assumed that the coefficient value (i.e. -0.207) in the RBCT results for the proactive culling parameter in the log linear Poisson model is a true reflection of the expected effect size for reduction in incidence in response to culling. There are many studies and reports which rely on this20,24,25,26,27,28,29,30. The view that the RBCT analysis represented a ‘strong evidence base involving experimental studies or field data collection on bTB with appropriate detailed statistical or other quantitative analysis.’31 must now be considered unsafe. Re-examination of data from the 2006 paper, suggests otherwise and hence undermines the badger culling policy in England and Ireland. Other recent papers have suggested that badger culling has an effect on bTB incidence in cattle32,33. Both reports suffer the same statistical issue as the 2006 paper, that is the misuse of the exposure variable. In both these papers the authors report their results as incidence rate ratios (IRR), when in fact they report incidence ratios (IR), that is they did not correctly control for exposure by using an offset. Furthermore, these reports have other limitations such as sample size and short duration6. The present study is also strongly supported by recent results from Ireland, which suggests that the main drivers of the bTB epidemic there is cattle to cattle transmission with a minor role (if any) played by badgers34.Vaccinating badgers as a possible alternative to culling for bTB control in cattle is also being trialled as an alternative to culling badgers to reduce transmission (for example Gormley et al.35). However, if culling badgers has no effect on the incidence of bTB in cattle, then vaccination of badgers is unlikely to reduce transmission to cattle. This is supported by the results of a recent trial in Ireland, with test, vaccinate and removal of badgers (TVR). Whilst this reduced the prevalence of bTB in badgers there was no effect on the incidence of bTB in cattle36. Of note is that this TVR study used an offset variable for exposure in their analysis, which is consistent with our analysis and in contrast to the 2006 paper.The study suggests that the relationship between badgers, badger culling and bTB herd incidence is likely to be lower than has been reported and the relationship may be minimal or extremely infrequent. Our findings suggest that any method of controlling the disease in badgers, such as culling or badger vaccination are most probably equally likely to have a negligible effect in controlling tuberculosis in cattle.

Absence of effects of widespread badger culling on tuberculosis in cattle

Zero-shot transfer of protein sequence likelihood models to thermostability prediction

Poisoning medical knowledge using large language models

Semi-supervised recognition for artificial intelligence assisted pathology image diagnosis

Chemistry wordoku #062 | Puzzle

Multi-output prediction of dose–response curves enables drug repositioning and biomarker discovery

Hot Topics

Zero-shot transfer of protein sequence likelihood models to thermostability prediction

Poisoning medical knowledge using large language models

Semi-supervised recognition for artificial intelligence assisted pathology image diagnosis

Related Articles

Balancing Act: Pregnancy and Bipolar Disorder

Cohesion at the cellular level: flexible yet stable

Gut bacteria influence responses to immunotherapy in patients with asbestos related cancer

Quick Links

Must Read

Zero-shot transfer of protein sequence likelihood models to thermostability prediction

Poisoning medical knowledge using large language models

Semi-supervised recognition for artificial intelligence assisted pathology image diagnosis

Chemistry wordoku #062 | Puzzle

Popular Articles

Zero-shot transfer of protein sequence likelihood models to thermostability prediction

Poisoning medical knowledge using large language models

Semi-supervised recognition for artificial intelligence assisted pathology image diagnosis