Foraging distance distributions reveal how honeybee waggle dance recruitment varies with landscape

OverviewTo predict the distribution of forage site distances reported by bees employing individual search (scouts) or following dances (recruits), we built a theoretical model to describe the distribution of distances between the hive and forage sites for any honeybee colony. This model contained “scout” trips- described by an exponential distribution, and “recruit” trips, described by a Rayleigh distribution. Both distributions were validated during model development by means of a simulation, described below. Using a pre-existing dataset of foraging distributions extracted from videos of waggle dances in 20 hives, for each hive we compared the fit of this collective model with a model in which foraging is entirely independent. Finally, for each hive within the agri-rural landscape (n = 10 hives), we calculated a multivariate land-use profile to describe the surrounding landscape and performed a Partial Least Squares analysis to establish how the proportion of bees acting as recruits (“waggle dance use”, estimated from the collective model) varied with land use.SimulationA circular foraging environment was created with radius \(r=2.5\). The number of resource patches in the environment was generated as a random Poisson variable with a mean of 1/5000 multiplied by the area of the environment. Resource patches are located at polar coordinates with a uniformly selected angle, \(\theta\), between 0 and 2\(\pi\) and a radial value, \(\rho\), between 0 and \(r\), determined by the square root of the uniform position values multiplied by \(r\). Each location was assigned to an instance of a resource object, where resource quality was randomly allocated between 0 and 10. Resource profitability is a product of this quality and the distance of the resource to the centrally located hive. Resources were periodically replaced with new resource of a random quality, on average once or twice per simulation.One hundred honeybee objects forage in this environment. Foraging bees follow independent flight segments with random lengths and direction41. In our simulation scouts follow a random path through the environment generated by sampling a uniform random step length and angle. The number of paths the scout draws when searching is also determined as a uniform random number. Each straight line in the random path is converted to a rectangle with length equal to the path section length and a constant width of 0.01 to represent an area the scout searches along that path. Of all the resources contained in the boxes drawn from the scout’s path, the one closest to the colony is selected as the resource patch that the scout will report if the quality of the resource exceeds a minimum threshold. Communication is simulated by pooling all the resource patches found. If no resources are contained in the scout’s path, no resources will be added to the scout pool.Recruits are honeybees that do not perform individual searches for forage, but instead sample from the pool of resource objects reported by scouts. The probability of sampling a resource is skewed by its profitability, mimicking the profitability bias known to occur in the dances of real-world scouts6. Recruits then visit these resources, and in the next iteration will add their resource to the pool of scout dances. Consequently, the pool of dances represents resources discovered by scouts and resources exploited by recruits. When a resource is depleted, it is removed from the environment and so any foragers that had been visiting it must select a different resource from the dance floor.The simulation was run for 100 iterations, in which all distances reported by scouts and recruits were recorded and combined every 5 time steps. We fitted an exponential and minimum of an exponential distribution to the distribution of foraging distances reported by the (Fig. 1a) scout and (Fig. 1b) recruit objects, by deriving the maximum likelihood estimate for each model fit on each data source through their analytical solutions: \(\hat{\lambda }=1/\bar{x}\), minimum of the exponential with a minimum foraging distance: \(\hat{\lambda }=1/\left(\pi \bar{{x}^{2}}\right)\). As the exponential assumes that distributions start from 0, the data were transformed by subtracting the minimum foraging distance from all foraging distances (\(x=x-\min (x)\)) before fitting. All simulation code was written in Python version 3.9 and uses the pandas42 and SciPy43 packages.Theoretical modelThe duration of the waggle run of a dance circuit represents the distance flown by the bee, and the two are linearly related44,45. To describe the distribution of waggle run durations on the dance floor we formulated a generic statistical model for the duration of waggle dances, in which it is assumed that resource patches are randomly placed in the environment. Foragers scout for these patches. Upon visiting a resource patch, foragers translate the profitability of a resource into the number of repeats of the dance, also called “dance circuits” (Fig. 2). The number of dance circuits is a function of quality and distance. Recruits sample random dances and report the location of successful visits to resource patches on the dance floor. Through the feedback and over-representation of profitable resources on the dance floor, recruits will converge to visiting the most profitable resource in vicinity of the hive. The distribution of waggle run durations is the superposition of scouting and recruiting trips.The distance after which a resource is first discovered by a scout is assumed to follow an exponential distribution (given by \(\lambda {e}^{-\lambda x}\)), which is the distribution of the distances to the first object encountered over a linear path when objects are randomly placed. Through the feedback mechanism that the dance floor provides, the colony can, collectively, locate the most profitable resource in its environment. For randomly placed resources in a two-dimensional environment the distance to the nearest point is distributed according to a Rayleigh distribution (given by \(2\lambda \pi x{e}^{-\pi \lambda {x}^{2}}\))46. Our simulation model (see above) shows that this indeed describes the distances at which recruits visit resources well. Knowing the distance distributions of scout and recruit trips we then assume that the proportion \(p\) of all trips are scout trips. With this information we can specify the distributions of distances on the dance floor (see Supplementary Methods for full details).We implemented this in a full model that describes the distance distribution of an environment that has \(n\) different resource types (See Supplementary Methods). In the full model the number of parameters increases with \(2n\) (each resource needs a parameter for the scout and recruit distribution). Even if the number of resources is low, the model tends to overfit. To facilitate estimation of the parameter \(p\) we therefore used a simplified model to estimate the fraction of scout trips, where \(m\) represents the lowest duration considered, and here a minimum waggle run duration in the data set.In the simplified the model we assumed that the number of dances depends weakly on distance and there is a sizable quality differences between resources of a non-negligible size and that there is a sizable intensity of the high-quality resource (See Supplementary Methods for detailed derivation). Foragers on scouting trips are more likely to report larger distances than foragers on recruiting trips. By linearising the function that translates the profitability into the number of waggle dance run for the largest waggle run duration and normalising, we arrive at simplified distribution for waggle run durations for scouting trips:$${f}_{s}(x)={a}_{s}\left({M}_{s}^{-1}{b}_{s}{e}^{-{b}_{s}{a}_{s}(x-m)}{\left[1-{a}_{s}x\right]}_{+}\right)$$
(1)
where we used the shorthand \({x}_{+}=\max (x,0)\) The parameter m is the minimum recorded duration, \({a}_{s}^{-1}\) is the maximum waggle run duration by scouts, \({a}_{s}{b}_{s}\) is the intensity of resources found by scouts and \({M}_{s}=(1-{a}_{s}m)-{b}_{s}^{-1}\left(1-{e}^{-{b}_{s}(1-{a}_{s}m)}\right)\) is the factor that normalises the distribution.Recruit trips will be predominantly to high-quality resources. Only if the nearest high-quality patch is very far away will there be a more profitable patch of lesser quality available, and this happens only rarely if the intensity of the best quality resource is sizable. After linearising the function that translates the profitability into the number of waggle dance runs for short waggle run durations and normalising the distribution of waggle run durations reported from recruit trips in the simplified model is:$${f}_{r}(x)={M}_{r}^{-1}2\pi {a}_{r}^{2}{b}_{r}x{e}^{-\pi {b}_{r}{({a}_{r}x)}^{2}}{\left[1-{a}_{r}x\right]}_{+}$$
(2)
where$${M}_{r}=(1-{a}_{r}m){e}^{-\pi {br}{({a}_{r}m)}^{2}}+\frac{{{\rm{erf}}}({a}_{r}\sqrt{\pi {b}_{r}m})-\left.{{\rm{erf}}}(\sqrt{\pi {b}_{r}})\right)}{2\sqrt{{b}_{r}}}$$is the normalisation factor, the parameter \({a}_{r}\) is the rate with which dances repeats depends on distance for recruit trips and \({a}_{r}^{2}{b}_{r}\) is the intensity of high-quality resources reported by recruited foragers.The simplified distribution function is$$P(\underline{x}=x)=p{f}_{s}(x)+(1-p){f}_{r}(x),$$
(3)
where \({f}_{s}(x)\) Is the distribution of distances that reported from scout trips and \({f}_{r}(x)\) the distribution of distances reported from recruit trips. The parameter \(p\) is the fraction of scout trips, and consequently, 1-p the fraction of recruit trips. The likelihood of the parameters given the data of observed n reported distances (x1, …xn) is$$L={\prod }_{i=1}^{n}p{f}_{s}\left({x}_{i}\right)+\left(1-p\right){f}_{r}\left({x}_{i}\right).$$We determined the maximum likelihood numerically for model fitting and parameter estimation.Data collectionDetails of data collection, waggle dance decoding and classification of land-use types can be found in full in the Materials and Methods section of Samuelson et al.24. Observation hives were located at apiaries in either central London (UK) or the surrounding agricultural land and were each located at least 2 km apart. Visits took place every fortnight between April and September 2017. On each visit, two hours of continuous waggle dance data was recorded by training a camcorder onto the dance floor. The footage of the dances was decoded manually following methods in ref. 44. Up to 40 dances were decoded per video.Statistics and reproducibilityFor each real-world hive from the observed dataset, we fitted the distances indicated in the waggle runs to two versions of our model: an “individual” model, where all forage sites are found through scouting, and a “collective” model, where the proportion of scout trips (\(p\)) can take on any value between 0 and 1.All models were fit using maximum likelihood estimation25 by summation over the logarithm of the simplified distribution function outlined in the methods section: model using uninformative priors. The numerical optimisation routine is written in C++ and uses the Nelder-Mead simplex algorithm47 implemented in the ‘NLopt’ library48 and interfaced to R using ‘Rcpp’49.The most parsimonious model was identified using the Akaike information criterion and Akaike weights25,50. Goodness of fit was assessed using the two-sample Kolmogorov-Smirnov (KS) test27 and implemented in R using the ks.boot function of the package ‘Matching’ in R51.All analysis code was written in R52.To determine whether there is variation in the use of the waggle dance within a landscape type we fitted the model to the data from all hives using the individual and collective models and selected the model with the lowest AIC. We also pooled the data for all hives for a landscape type, fitted the individual and collective model to the pooled data and selected the model with the lowest AIC. The sum of the AIC values of the best models for the individual hives in the agri-rural landscape was 2944, for the pooled data it was 3359. For the urban landscape the summed AIC of the best model for the hives was 2059, for the pooled data it was 2394. The difference in was ΔAIC = 415 and ΔAIC = 335 for, respectively, agri-rural and urban landscapes. Corresponding to Akaike weights of 1 for the model for individual hives and 0.000 for the model for with the pooled data, which had essentially no support25. See Supplementary Table 1 for details on AIC values.For each hive, our modelling process resulted in an estimated proportion of recruit trips,1-p, henceforth termed “waggle dance use”. This proportion was highly variable within the agri-rural category, and so within this group, we sought to identify land-use variables that covary with waggle dance use. For each site, we performed Partial Least Squares analysis based on proportional coverage within 10 land-use categories within a 2.5 km radius around each hive (Supplementary Table 2, see ref. 24 for full classification methods). Prior to conducing the PLS we removed any sites in for which both individual and collective model fit was poor (n = 1 site).As our estimates of waggle dance use are continuous on the interval \([{\mathrm{0,1}}]\) we used the R package plsRbeta53 to conduct the PLS and performed a beta regression on the results using the R package betareg54. As the betareg package only works on the open interval \(({\mathrm{0,1}})\) the data, \(x\), was transformed using the following equation: \(\left(x\left(n-1\right)+0.5\right)/n\) as outlined in the betareg package documentation. After analysis the data was back-transformed to the original values for the plots in Fig. 5.To test robustness, we performed jack-knifed resampling by removing each site in turn before re-rerunning the PLS analysis, recoding the loadings for each iteration (see Supplementary Fig. 2 for loadings with each site removed). The PLS loadings for each land-use type are plotted as a box plot in Fig. 5 to show the spread of these variable types. A loading was determined to be significantly correlating with the first principal component if contributed more than its expected variance.Note that significant correlations with land-use types that represent very small proportions of total land cover were not interpreted further (unmanaged green space; Supplementary Table 2; Fig. 5c. Unmanaged green space was also not supported by jackknife analysis).Reporting summaryFurther information on research design is available in the Nature Portfolio Reporting Summary linked to this article.

Hot Topics

Related Articles