The potential of decision trees as a tool to simplify broiler chicken welfare assessments

Welfare assessment protocols are exhaustive in covering on-farm animal welfare dimensions, principles, and criteria, but this comes at the cost of a high time investment for assessors, technical staff, or farmers. Regular monitoring of flock welfare would allow preventive control of welfare problems before they become severe and difficult to solve. Nevertheless, full protocols may take too long to apply11,26 for widespread on-field adoption and regular use among farmers, even those requiring less time, such as the transect-based assessment protocols10,17. In addition, adequate measurement and interpretation of some indicators require previous training and experience, such as the bedding quality assessment based on a 5-point scale27. These are added difficulties that may deter farmers from using simple assessment methods to improve flock welfare.

In this study, we used a decision tree as a supervised ML approach to identify iceberg indicators allowing for a simpler, quicker classification of the welfare status of commercial broiler flocks. Of the 18 indicators required to complete the transect-based welfare assessment protocol, 6 were resource-based and 12 were animal-based (Table 1). From all these indicators, the decision tree selected 4 animal-based indicators to predict the outcome of the welfare assessment benchmark with relatively high certainty, considering the small sample size available. Subjectively, this number of indicators, and the selected indicators themselves, appear to be a good compromise between protocol simplification and a reasonable number and diversity of indicators that respond to the multifactorial nature of animal welfare28. Conversely, complementary ML analysis yielded high prediction accuracy at the cost of a higher number of indicators, which would result in a more complex assessment protocol than that proposed by the decision tree.
Overall, this suggests the potential for an automatic flock classification according to the assessment benchmark outcomes.

Preliminary, unsupervised analysis showed that observations grouped well according to the flock assessment benchmark outcome (3D principal component analysis), although some overlap existed between sub-groups. Nevertheless, cluster analysis was inconclusive, as the 3 clusters contained variable proportions of observations passing and failing the welfare assessment benchmark. Still, it is remarkable that the smallest cluster captured more than half of the flocks failing the assessment benchmark (6 out of 11 flocks).

The resulting decision tree included cumulative mortality and the prevalence of lame birds, immobile birds, and birds with back wounds, reflecting the relevance of mobility problems and skin lesions, the most common critical on-farm welfare indicators10. These are all crude welfare indicators; indicators of positive welfare (e.g. exploring or dust bathing) were not contemplated in the AWIN protocol, and thus they were not assessed. According to Fig. 4, the welfare assessment of fast-growth broiler flocks will most likely be successful when cumulative mortality is < 4.0257%, birds with back wounds are < 0.013% and lame birds are < 0.506%.

Cumulative mortality on the assessment day was the first partitioning variable of the decision tree and had the highest relative importance (Table 3). This, and its position closer to the tree root, suggested that cumulative mortality had a higher discriminant power and was more critical for welfare than other indicators. Average cumulative mortality, all flocks considered, was 3.09 ± 0.17% (Table 1), although values differed between flocks passing the assessment benchmark (2.77 ± 0.14%) and those failing it (4.39 ± 0.49%).
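The pass branch quoted above from Fig. 4 can be written as a simple rule. The following is a minimal sketch, not the fitted model: it encodes only the three thresholds stated in the text, the function and constant names are hypothetical, and the remaining splits of the tree (including the one on immobile birds) are not reproduced, so a result of False means only that a flock is off this branch. The example inputs are the group means reported in this Discussion, used purely for illustration.

```python
# Thresholds (in % of the flock) quoted in the text for the branch on which
# a flock "will most likely" pass the assessment benchmark (Fig. 4).
MAX_CUMULATIVE_MORTALITY_PCT = 4.0257
MAX_BACK_WOUNDS_PCT = 0.013
MAX_LAME_PCT = 0.506


def on_likely_pass_branch(cum_mortality_pct, back_wounds_pct, lame_pct):
    """Return True when a flock meets all three quoted conditions.

    False only means the flock is off this branch; the other splits of
    the tree are not reproduced here (assumption of this sketch).
    """
    return (
        cum_mortality_pct < MAX_CUMULATIVE_MORTALITY_PCT
        and back_wounds_pct < MAX_BACK_WOUNDS_PCT
        and lame_pct < MAX_LAME_PCT
    )


# Illustrative inputs: group means reported in the text, not individual flocks.
print(on_likely_pass_branch(2.77, 0.003, 0.25))  # means of flocks passing
print(on_likely_pass_branch(4.39, 0.015, 0.49))  # means of flocks failing
```

With these inputs the first call satisfies all three conditions, while the second fails the mortality and back-wound conditions.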
Cumulative mortality fell within the range of values reported in other studies29,30,31,32,33 (data not used in the present study) and was lower than the 5% threshold proposed by the EFSA AHAW Panel34. It has been affirmed that mortality is a particularly relevant iceberg indicator for broilers that should be included in any welfare scheme35. Nevertheless, a practical disadvantage of its use is the impossibility of a timely correction of mortality causes for the assessed flock once elevated values are detected36. Unusually high cumulative mortality likely reflects the joint action of different health and/or welfare problems, but cumulative mortality per se does not identify the ultimate causes of welfare degradation. In any case, flock cumulative mortality on the assessment day alone did not allow for a precise estimation of the chances of failing an AWIN welfare assessment, and 3 additional variables were also selected by the algorithm (lameness, immobility, and back wounds).

An important practical implication of the decision tree is that the welfare status of fast-growth broiler flocks could be quite accurately predicted by simply doing transect walks and checking mortality records on the assessment day. This would still require some training for a precise application of the definitions of animal-based indicators, but less than the training required by other assessment protocols. The prevalences of immobile birds, lame birds, and birds with back wounds were assessed through the transect method following the definitions on which the protocol is based32, which have been used in different studies37,38. It is well known that fast-growth strains are prone to bone problems, resulting in higher levels of lameness and mobility problems39,40,41 that are reflected in higher gait scores when compared to slow-growing strains42.
Both immobility and lameness have also been considered relevant iceberg indicators by the EFSA AHAW Panel35. Although the presence of immobile birds might to some extent have been affected by farm management, prevalences of immobility and lameness differed substantially between flocks passing and failing the assessment benchmark (0.16 ± 0.02% and 0.25 ± 0.02%, respectively, for flocks passing; 0.24 ± 0.05% and 0.49 ± 0.09%, respectively, for flocks failing).

Flocks with different assessment benchmark outcomes also differed regarding the prevalence of back wounds, another relevant iceberg indicator according to the EFSA AHAW Panel35. Values were 0.003 ± 0.001% for flocks passing and 0.015 ± 0.006% for flocks failing the assessment benchmark. Back wounds are usually caused by sudden stress sources or excitement that alter the flock's behaviour and trigger a flight response or excited jumping, for example, due to sudden changes in light intensity in houses with natural light. This may result in a higher prevalence of skin lesions caused by the claws when birds land on other birds' backs. Back wound values were similar to those reported previously10,32.

We expected initial stocking density to play a crucial role in the decision tree. On the contrary, density was one of the variables discarded by the algorithm. Density is a determinant of broiler welfare, with high stocking densities generally having a negative impact on birds43, although their effect also depends on other housing conditions, specifically those associated with thermal control and air quality44. Reducing stocking density will likely improve welfare and performance45 as environmental conditions become easier to control. The lack of relevance of density found in this study might be attributed to the low variability in the density range of the studied flocks.
Alternatively, it is also possible that within commercial rearing, flocks housed at higher densities require top management practices to reach profitability, and so the resulting welfare levels may be equivalent to those of birds housed at lower stocking densities within the 'normal' commercial range44. It might be speculated that if the assessment benchmark had been set at a higher value, an effect of stocking density would emerge. We believe that, even in that case, the small variability would prevent stocking density from affecting assessment outcomes. Nevertheless, if the assessment protocol had included animal indicators of positive welfare46, an effect of stocking density might be expected.

Broiler strains selected for fast growth have associated metabolic and locomotor disorders that may ultimately affect performance47,48. This led us to assume that body weight during assessment would be a relevant indicator, and it was therefore included in the analysis despite not being required by the AWIN protocol and despite its correlation with age. However, body weight, like stocking density, was discarded by the algorithm, which might be attributed to the homogeneity of weight achieved by the most common fast-growth strains used in Spain (either Ross 308 or Cobb 500). Bedding quality was not included in the final decision tree either. Its condition was on average good, with the score being lower (and therefore better) than the average score of 2.3 reported for European flocks11.
This is not surprising given the lower densities normally used in Spain and the drier litter conditions due to the country's climate.

Regarding the performance of the resulting decision tree, confusion matrices (Table 2) show that the tenfold cross-validated decision tree was good at correctly predicting the assessment results (80.70% of cases), remaining strong at correctly predicting farms passing the assessment benchmark (error rate = 0.1304; 6 out of the 46 farms passing the assessment benchmark were misclassified). Model performance was not so good at correctly predicting farms failing the assessment benchmark (error rate = 0.4545; 5 out of the 11 farms failing the assessment benchmark were misclassified). This may be attributable to the characteristics of the sample, as the number of flocks that failed the assessment benchmark (19.3% of total) was lower than the number passing, likely influencing model performance when predicting failed assessments. Model fit indicators would support this, as the specificity of the tenfold cross-validated model (the prediction accuracy for flocks that passed the assessment) was particularly good (0.8696), while sensitivity (the prediction accuracy for flocks that failed the assessment) was only average (0.5455). The AUC value obtained with the fitted model tested on training data (0.9980) indicates that the model was excellent at discriminating between flocks that pass or fail the assessment benchmark, while the likelihood of misclassification of a flock was low according to the Gini index (0.0234).

The complementary supervised ML analyses on the complete set of indicators yielded a more complex and accurate model for describing our observations, but one probably of little relevance for describing intensive broiler farms beyond those of our study. Nevertheless, they all contained the 4 variables selected by the decision tree. A strong correlation was only detected between bird weight and age (r = 0.91; Fig. 5), but bird weight was not among the 10 variables yielding the most accurate model, and bird age was the least important of them (Fig. 8). This confirms the relative unimportance of bird weight and age compared with other indicators and for our modelling purposes.

The relative importance of the 4 indicators automatically selected by the decision tree algorithm was high considering the complete set of welfare indicators (Fig. 6), particularly in the case of cumulative mortality and lame birds. Other indicators not included in the decision tree gained importance, such as dirty, small, and sick birds. Although they are relevant broiler welfare indicators38, their importance emerged when the model included 10 welfare indicators. As previously discussed, it may well be expected that in a simpler model, such as that proposed by the decision tree, which in our view has more biological relevance and higher potential for generalization, the relative importance of variables will be different.

A limitation of this study is that the indicators automatically selected by the decision tree algorithm only apply to the AWIN broiler protocol and must not be extrapolated to other protocols. However, this study was intended to be a proof of concept, so that the same or similar analytical approaches could be applied to any broiler welfare assessment protocol, and indeed to protocols for any other farmed species, as long as an acceptability threshold for welfare indicators is considered. This is extremely relevant, as the industry needs objective information about the welfare status of production animals to satisfy current market demands. If welfare assessments were adopted as an on-farm routine practice, the industry would likely be willing to apply simpler, shorter protocols that maximize objectiveness and scientific rigour.
One way to achieve this is to base welfare assessments on iceberg indicators, i.e., indicators providing an overall welfare assessment49. Thus, the design of refined protocols based on identifying a sub-set of iceberg indicators appears to be an adequate approach to tackle the problem. Nevertheless, the refinement process should be carefully designed, as animal welfare is multidimensional and all dimensions are equally critical for animals. Furthermore, all welfare dimensions are affected by environmental factors, and their interrelations are complex44. Thus, the information value or detail that may be lost by using refined protocols must be considered and weighed against potential benefits such as their usefulness. One way to overcome this would be to use decision trees on each set of variables covering each animal welfare principle, which would ensure that all of them are covered. If refined and more practical protocols became widespread among farmers as a risk assessment tool, the chances of controlling welfare problems would be higher.

On the other hand, decision tree modelling could be the basis of a two-step welfare assessment. A first step would include the use of the reduced protocol for a quick risk assessment of compromised animal welfare, followed by a more in-depth assessment of those flocks for which downgraded welfare is detected. This might indeed be a more efficient, flexible, and scalable approach to improving on-farm animal welfare, compared to current approaches where the application of full protocols is always required. Beyond this, if welfare problems and associated animal-based measures were assessed as binary variables (occurrence/non-occurrence), as already done for the risk of pig tail biting outbreaks50, then the prospects for the use of decision trees broaden even further, to the possibility of assessing the risks of occurrence of these problems.
For instance, in poultry production, the application of decision trees might help to clarify the factors contributing most to the occurrence of leg problems and immobility in fast-growing broilers, or feather pecking in slow-growing broilers and laying hens. Beyond poultry, the practical implications of this approach could extend to all animal production species, their specificities, and associated welfare problems.

Overall model performance confirms that there is room for protocol refinement using analytical tools such as decision trees. Welfare assessments based on a reduced number of iceberg indicators selected using data modelling can offer similar results to those obtained with more complex, time-consuming assessment protocols, at least for a first risk assessment. Previous efforts to simplify broiler welfare assessments have focused on high correlations between on-farm indicators, and between values of the same indicator measured on-farm and at slaughter11, but the protocol reductions achieved were still limited. Reduction to four welfare indicators translates into an important reduction of the time necessary to discriminate between flocks that will pass a welfare assessment benchmark and those that will not. In our experience, completion of the AWIN protocol takes about 1.5 h per flock; if the refined protocol were to be implemented, assessment would only be based on carrying out transect walks to assess lame and immobile birds and birds with back wounds, and on obtaining flock cumulative mortality from written farm records. A time reduction of about 45 min per flock might therefore well be expected. Although the assessments used in this study were all carried out in Spanish commercial farms, modern intensive broiler production is standard across countries; therefore, we also believe that this decision tree has broad application.
There is room for model readjustment and improvement regarding its accuracy in predicting failed assessments, which will be achieved once predictions can be based on a larger sample size.

In conclusion, this study, taking the broiler AWIN protocol as an example, shows that decision trees can be an effective tool to refine welfare assessment protocols. Animal welfare benchmark assessments based on reduced protocols including a sub-set of iceberg indicators are possible and provide similar results to those obtained with complete protocols. Based on this welfare assessment protocol and available data, unsupervised analyses showed potential for applying ML classification algorithms. The resulting decision tree model for fast-growth broilers automatically selected % of cumulative mortality, % of lame and immobile birds as indicators of mobility problems, and % of birds with back wounds for skin lesions, all acknowledged as critical on-farm welfare indicators. Cumulative mortality appeared as the most important partitioning variable, probably because it summarizes the actions and consequences of different sources of health and welfare degradation. The other three indicators selected by the algorithm were assessed using the transect method. Complementary ML analyses yielded models that included more welfare indicators and were more accurate, but at the cost of higher complexity that limited the possibilities of a broad application beyond this dataset. Thus, in light of the results, it is suggested that a preliminary flock welfare risk assessment could be performed by doing transect walks to collect these three animal-based indicators and checking mortality records on the assessment day. The approach is promising, although model readjustment and improvement of prediction capabilities for flocks failing the assessment benchmark are still possible, and indeed necessary, as new data become available.
Nevertheless, from a practical standpoint, this decision tree approach may be a more suitable way of implementing welfare assessments given the limited time available to cover all farm demands.
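The performance figures quoted in the Discussion for the tenfold cross-validated tree can be cross-checked from the counts given in the text (46 flocks passing, of which 6 were misclassified; 11 flocks failing, of which 5 were misclassified). The sketch below is only an arithmetic check under the convention stated in the text, where sensitivity refers to flocks failing the benchmark and specificity to flocks passing it; the function and variable names are hypothetical and nothing here refits the model.

```python
# Counts quoted in the text for the tenfold cross-validated tree (Table 2).
PASSING_TOTAL, PASSING_MISCLASSIFIED = 46, 6
FAILING_TOTAL, FAILING_MISCLASSIFIED = 11, 5


def summarise(passing_total, passing_wrong, failing_total, failing_wrong):
    """Derive the rates discussed in the text from the four counts."""
    passing_right = passing_total - passing_wrong
    failing_right = failing_total - failing_wrong
    total = passing_total + failing_total
    return {
        "accuracy": (passing_right + failing_right) / total,
        # Specificity: share of passing flocks predicted as passing.
        "specificity": passing_right / passing_total,
        # Sensitivity: share of failing flocks predicted as failing.
        "sensitivity": failing_right / failing_total,
        "error_rate_passing": passing_wrong / passing_total,
        "error_rate_failing": failing_wrong / failing_total,
        "share_failing": failing_total / total,
    }


metrics = summarise(
    PASSING_TOTAL, PASSING_MISCLASSIFIED, FAILING_TOTAL, FAILING_MISCLASSIFIED
)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

Rounded to four decimals, these ratios match the values reported in the text: accuracy 0.8070, specificity 0.8696, sensitivity 0.5455, and error rates 0.1304 and 0.4545. The share of failing flocks, 11 of 57, corresponds to the 19.3% quoted.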
