A Reinforcement Learning approach to study climbing plant behaviour

Notations for the mechanics of a planar elastic rodLet \(e_1,e_2,e_3\) a basis of orthonormal vectors for \(\mathbb {R}^3\). We assume that the searcher shoot behaves like an inextensible and unshearable elastic rod34 confined in the plane spanned by the vectors \(e_1\) and \(e_2\). The centerline of this rod lies on the curve \(\Gamma \subset \text {span}\{e_1,e_2\}\). We parametrize the curve \(\Gamma\) with its arc length s, so that if the length of the curve is L, we have \(s \in [0,L]\). We denote with \(\underline{\Gamma }(s)\) the position in the plane of the point on \(\Gamma\) whose arc length is s. With this parametrisation, \(\partial _s\Gamma (s)\) represents the normal tangent vector to \(\Gamma\) at the point \(\underline{\Gamma }(s)\). We denote with \(\theta (s)\) the angle between \(\partial _s\Gamma (s)\) and the vector \(e_2\). Consequently, \(\theta ‘(s)\) is the curvature of \(\Gamma\) at the point \(\underline{\Gamma }(s)\).We refer to \(\Gamma\) as the current configuration, which corresponds to the actual shape of the rod when subject to external physical forces. To study the effects of gravity acting on that rod, we need to consider the intrinsic configuration, denoted as \(\hat{\Gamma }\), which corresponds to the geometric curve assumed by the rod when there are no external forces acting on it. Since the rod is inextensible, the arc length parameter of \(\hat{\Gamma }\) and \(\Gamma\) is the same \(s \in [0,L]\).The difference between the curvature \(\partial _s\hat{\theta }(s)\) of the intrinsic configuration and the curvature \(\partial _s \theta (s)\) of the current configuration is proportional to the resultant moment (of force) m(s) acting at the point \(\underline{\Gamma }(s)\) (and directed along \(\underline{e}_3\)) through the Euler-Bernoulli formula:$$\begin{aligned} m(s) = – E(s)I(s)(\partial _s\theta (s) – \partial _s\hat{\theta }(s)). \end{aligned}$$
(1)
This relation holds because we are considering unshearable rods. With E we are denoting the Young’s modulus, which expresses the stiffness of the material, and with I the second moment of area of the cross-section (with respect to \(e_3\)). We assume that the rod is in elastic equilibrium, that is, all the internal forces and moments are in balance with the external forces and moments. In this framework, considering equation (1) and the gravity force as the only external force acting on the rod, we can write the following differential equation (see for instance35)$$\begin{aligned} \partial _s(EI(\partial _s\hat{\theta } – \partial _s\theta ))(s) = \sin (\theta (s))g\int _s^L \rho _3(s’) A(s’) ds’. \end{aligned}$$
(2)
In this equation, g represents the gravity acceleration constant, \(\rho _3(s)\) the volume density of the shoot/rod at the point \(\underline{\Gamma }(s)\) and A(s) the cross-section area at that point.We are now interested in computing the internal bending stresses. We know that the internal moment m is generated by the deflection from the intrinsic configuration \(\hat{\Gamma }\). Indeed, this deflection generates an internal pressure called stress, that we denote with \(\sigma\), and a deformation \(\varepsilon\) of each element of the rod, called strain. Stresses and strains vary according to the position \(\underline{\Gamma }(s)\) on the rod, and depend also on the position on the cross-section. Since the rod is ushearable, the cross-section is always orthogonal to the tangent vector \(\partial _s \Gamma\). To describe the position of a generic point on the cross-section at \(\underline{\Gamma }(s)\), we name$$\begin{aligned} \beta (s) = \frac{\partial ^2_s\Gamma (s)}{|\partial ^2_s\Gamma (s)|} \in \text {span} \{ e_1, e_2\} \end{aligned}$$the normal vector, which is orthogonal to \(\partial _s\Gamma (s)\), and we consider the binormal vector$$\begin{aligned} \tau (s) = \frac{\partial _s \Gamma \times \beta (s)}{|\partial _s \Gamma (s) \times \beta (s)|}, \end{aligned}$$Then, the cross-section at the point \(\underline{\Gamma }(s)\) is a subset C(s) of the plane \(\text {span} \{ \beta (s), \tau (s) \}\) with the origin on the centerline. We define$$\begin{aligned} C(s,z) = \{ w \in \mathbb {R} : \, \underline{\Gamma }(s) + z\beta (s) + w \tau \in C(s) \}. \end{aligned}$$In this framework, the maximal bending stress at \(\Gamma (s)\) results to be36$$\begin{aligned} \begin{aligned} \sigma _m(s)&= \max \{|w| : \, C(s,w) \ne \emptyset \} \cdot \frac{m(s)}{I(s)} \\&= \max \{|w| : \, C(s,w) \ne \emptyset \} \cdot E(s) |\partial _s\theta (s) – \partial _s \hat{\theta }(s) |. \end{aligned} \end{aligned}$$
(3)
Formulation of the modelsWe assume that the searcher shoot has a circular cross-section with radius r and that the Young’s modulus E is constant all along the shoot. So, we have$$\begin{aligned} \begin{aligned} A(s)&= \pi r^2(s) \\ I(s)&= \frac{\pi }{4}r^4(s) \\ r(s)&= \max \{|w| : \, C(s,w) \ne \emptyset \} \end{aligned} \end{aligned}$$and we name$$\begin{aligned} u(s) = r^2(s), \, \psi (s) = u^2(s)(\partial _s \hat{\theta }(s) – \partial _s \theta (s)), \, \mu (s) = \int _s^L \pi \rho _3(s’)u(s’) ds’ \end{aligned}$$So, equation (2) can be rewritten as$$\begin{aligned} {\left\{ \begin{array}{ll} \partial _s \psi (s) = c_1 \sin (\theta (s)) \mu (s) \\ \partial _s \theta (s) = \partial _s \hat{\theta }(s) – \frac{\psi (s)}{u^2(s)} \\ \partial _s \mu = – \pi \rho _3(s) u(s) \end{array}\right. } \end{aligned}$$
(4)
with$$\begin{aligned} c_1 = \frac{4g}{\pi E}. \end{aligned}$$These equations hold for a.e. \(s \in [0,L]\). At the boundary of this domain, we assume (1) to know the angle at the base of the shoot \(\theta (0) = \theta _0\); (2) that at the tip of the shoot, there is not any external weight so that the intrinsic curvature equals the current curvature. Using the functions defined above, this means \(\psi (L) = 0\) and \(\mu (L) = 0\); (3) the mass M of the whole shoot is known, so that \(\mu (0) = M\).The effectiveness of the self-sustaining behavior of the shoot can be quantitatively evaluated considering a threshold for the maximal internal stresses. In other words, we would like to find a distribution of the mass such that the maximal stress at each point of the shoot \(\sigma _m(s)\) does not cross some fixed value \(\bar{\sigma }\). Employing equation (Eq. 3), this means that any solution of system (Eq. 4) which satisfies the boundary conditions above discussed, has also to satisfy the condition$$\begin{aligned} |\psi (s)| \le c_2 u^{3/2}(s) \text { for every } s \in [0,L] \end{aligned}$$
(5)
with$$\begin{aligned} c_2 = \frac{\bar{\sigma }}{E}. \end{aligned}$$We group all these considerations into the following problems.Problem of shoot growth with Mechanics (Me). We want to find the maximal length L of the shoot for which there exists at each point a radius r(s) such that the following system is satisfied$$\begin{aligned} {\left\{ \begin{array}{ll} \partial _s \psi (s) = c_1 \sin (\theta (s)) \mu (s) \\ \partial _s \theta (s) = \partial _s \hat{\theta }(s) – \frac{\psi (s)}{u^2(s)} \\ \partial _s \mu = – \pi \rho _3(s) u(s) \\ \psi (L) = 0 \\ \theta (0) = \theta _0 \\ \mu (0) = M, \, \mu (L) = 0 \\ |\psi (s)| \le c_2 u^{3/2}(s) \end{array}\right. }. \end{aligned}$$
(Me)
Problem of shoot growth with Mechanics and Leaves (MeLe). In problem (Me) we consider just the weight of the main stem. However, in most of climbing plant species, a relevant part of the total biomass is due to the leaves. We assume that the leaves are not uniformly distributed along the shoot. On the contrary, we assume that they are located at intervals equally spaced. Moreover, we assume that the mass \(m_{\text {lm}}\) of a single leaf at the point \(\underline{\Gamma }(s)\) depends just on the shoot radius r(s) at \(\underline{\Gamma }(s)\) (so \(m_{\text {lm}} = m_{\text {lm}}(r(s))\)). Let \(d_{\text {lm}}\) the distance between two leaves locations and \(n_{\text {lm}}\) the (fixed) number of leaves at each location. Then, we name$$\begin{aligned} s_i = i \times d_{\text {lm}} \text { for } i = 1,…, \, N_{\text {lm}}, \end{aligned}$$where \(N_{\text {lm}}\) is the total number of leaves locations. Therefore, \(s_i\) denotes the arc length of the i-th leaves location. Now, we want to compute how the weight of the leaves affects the shoot at the point \(\underline{\Gamma }(s)\). To achieve this, we subtract from the total leaves mass \(M_{\text {lm}}\) the mass of the leaves in the shoot portion between the base and \(\underline{\Gamma }(s)\). We define$$\begin{aligned} q_{\text {lm}}(s) = \left\lfloor \frac{s}{d_{\text {lm}}} \right\rfloor \end{aligned}$$and$$\begin{aligned} \text {RES}_{\text {lm}}(s) = M_{\text {lm}} – n_{\text {lm}} \sum _{i = 0}^{q_{\text {lm}}(s)}m_{\text {lm}}(r(s_i)), \end{aligned}$$where the operation \(\lfloor x \rfloor\) is the greatest integer lower or equal than x. Then, to take the leaves into account, we compute the gravity force acting at the point \(\underline{\Gamma }(s)\) :$$\begin{aligned} -g \left[ \int _s^L \rho _3(s’) A(s’) ds’ + \text {RES}_{\text {lm}}(s) \right] e_2. \end{aligned}$$This leads to another problem of length maximization. Like for the (Me) case, we want to find the maximal length L of the shoot for which there exists at each point a radius r(s) such that$$\begin{aligned} {\left\{ \begin{array}{ll} \partial _s \psi (s) = c_1 \sin (\theta (s)) \left( \mu (s) + \text {RES}_{\text {lm}}(s) \right) \\ \partial _s \theta (s) = \partial _s \hat{\theta }(s) – \frac{\psi (s)}{u^2(s)} \\ \partial _s \mu = – \pi \rho _3(s) u(s) \\ \psi (L) = 0 \\ \theta (0) = \theta _0 \\ \mu (0) = M, \, \mu (L) = 0 \\ |\psi (s)| \le c_2 u^{3/2}(s) \end{array}\right. }. \end{aligned}$$
(MeLe)
Reinforcement Learning frameworkWe define the Searcher-Shoot environment as a Markov decision process problem (see Supplementary Material). Here, the agent is the liana’s searcher shoot, a structure that has presumably been selected during evolution to have the greatest reach to colonize its environment by adjusting on its mechanical properties and the taper of the shoot’s diameter. Specifically, the agent learns how to complete the task (i.e., the mass distribution) in the highest number of steps, choosing radius values does not generate internal stresses over a fixed threshold.The fundamental elements of the framework are:

State and observations. At each step, the agent, in the current state, can observe the mass, the radius, and the curvature before the next move. The choice of such a state is due to algorithmic and biological reasons. Indeed, in our algorithm, the agent controls the distribution of the mass by choosing the radius at each step. To this aim, the agent is able to observe the (remaining) mass and the radius. Regarding the curvature, it is a stop criterion and an index of efficiency. Moreover, plants are able to sense their own curvature through specialized fiber cells37. Plants are also able to sense their own inclination38, but we prefer to keep the number of observable variables as low as possible in order to reduce the computational effort of the algorithm.

Actions. In this framework, the action space is discrete. In the (Me) and (MeLe) models, at each time step, the agent can select one action among eleven options: it can leave the radius value unchanged or it can increase (or decrease) the radius of a certain quantity (\(\pm 1 \times 10^{-5}\), \(\pm 2 \times 10^{-5}\), \(\pm 3 \times 10^{-5}\), \(\pm 4 \times 10^{-5}\), \(\pm 5 \times 10^{-5}\)). Of course, this selection will influence the mass distribution: intuitively, the larger the radius value, the larger the mass allocated in the next step.

Reward. Every time the agent moves to the next step, it will receive positive feedback equal to \(+ 1\). Whether the move ends with the total mass equivalent to or less than 0, or the condition on the curvature (Eq. 5) is violated, the reward is 0 and the algorithm stops.

Episode and Reset. The episode does not have a fixed term. Instead, it ends whether the mass becomes zero or negative or the picked radius causes the curvature to violate condition (Eq. 5), as displayed in Algorithm 1 and 2. Then, we set the system parameters and the observation space to their initial values.

Algorithm 1Algorithm of Shoot growth with mechanics.Algorithm 2Algorithm of Shoot growth with mechanics and leaves.Models implementation and parametersTo begin with, we implement the System of equations (Me) in the Searcher-Shoot environment. In such a system, we consider stress and strain as factors responsible for its shaping, together with gravity, which acts as an external force on the plant’s structure, affecting its curvature. Moreover, the material density is a function of the radius r, defined as follows:$$\begin{aligned} \rho _3 = c_{\text {vd}} + b_{\text {vd}} \cdot r + a_{\text {vd}} \cdot r^2. \end{aligned}$$
(6)
We use Algorithm 1 to clarify the approach we implemented. Starting from an initial configuration of curvature, angle and mass \((\psi _0, -\pi /2,M)\), and given an initial radius \(R_0\), at each step of the algorithm the agent chooses how much to increase or decrease the current radius. Consequently, the total mass decreases and curvature and angle change accordingly to System (Me). This process is repeated until the mass vanishes or the curvature constraint is violated. The values of the constants \(c_2\) and \(\psi _0\) are estimated in two ways, leading to two groups of simulations. In the first group, \(c_2\) and \(\psi _0\) are the same for all the samples, while in the second group, they are estimated specifically for each sample by utilizing the method described in33.Successively, we add the leaves’ mass contribution, which affects the plant’s weight remarkably. We implement the System of equations (MeLe), where we can notice that the effects of the leaves are visible on the curvature and, then, in the formulation of the equation of \(\psi\). As for the material density, the mass of a single leaf depends on the radius r according to the following relation:$$\begin{aligned} m_{\text {ml}} = c_{\text {lm}} + b_{\text {lm}} \cdot r + a_{\text {lm}} \cdot r^2. \end{aligned}$$
(7)
In Algorithm 2 we clarify how we implement our model in the RL context. The Algorithm iterations are similar to Algorithm 1, and in addition, we consider the mass of the leaves and check whether at the position of the agent there is a group of leaves or not. In Table 4, we include all the parameters used in the models.
Table 4 Parameters of the models. In the case of the parameters resulting from the fitting procedure, we report the source of the data on which functions (Eqs. 6, 7) are fitted.Statement on research methodologyThe research methodology employed in this study adheres strictly to ethical and legal considerations associated with experimental research and field studies on plants. We affirm our compliance with relevant institutional, national, and international guidelines and legislation governing such research endeavors. Additionally, we emphasize our commitment to following the IUCN Policy Statement on Research Involving Species at Risk of Extinction and the Convention on the Trade in Endangered Species of Wild Fauna and Flora. This ensures the responsible and ethical conduct of our research with due consideration for the well-being of the studied plant species and the broader ecosystem.

Hot Topics

Related Articles