Constructing neural networks with pre-specified dynamics

We will introduce an algorithm that constructs a neural network that follows a given user-defined dynamics. If this is not possible, the algorithm finds a new equivalent dynamics that preserves the encoded information. We will first describe the neural network model, how the dynamics is specified, and the conditions necessary for the dynamics to be realisable by a neural network (see Table 1 in “Methods” for a quick reference to symbols and their definitions).

Network model and consistency conditions

We will consider recurrent neural networks composed of binary neurons, whose activations can take values 0 or 1 (Fig. 1a). Neurons receive inputs from sensory neurons and from all other neurons in the network. The network state at iteration k is dictated by the map:

$$\begin{aligned} \textbf{u}(k)&=\textbf{W}_{y}\textbf{y}(k)+\textbf{W}_{r}\textbf{z}(k-1) \end{aligned}$$
(1)
$$\begin{aligned} \textbf{z}(k)&=\mathcal{H}(\textbf{u}(k)) \end{aligned}$$
(2)
where \(\textbf{u}\) is the \(N_{neu}\)-dimensional vector of network preactivations, \(\textbf{y}\) is the \(N_{s}\)-dimensional vector encoding the presence of \(N_{s}\) different stimuli in a one-hot fashion, \(\textbf{z}\) is the \(N_{neu}\)-dimensional vector of network activations, and \(\mathcal{H}\) is the Heaviside function. The ith row vector in matrix \(\textbf{W}_{y}\) collects the synaptic weights of the connections from the sensory neurons to the ith neuron in the recurrent network, while the ith row vector in matrix \(\textbf{W}_{r}\) collects the synaptic weights of the connections the ith neuron receives from all neurons in the network (Fig. 1b).

Figure 1. Recurrent neural networks that follow a pre-specified transition graph. (a) Recurrent networks of binary neurons receive connections from neurons that encode stimuli in a one-hot fashion. (b) Example of synaptic weight matrices \(\textbf{W}_{y}\) and \(\textbf{W}_{r}\). The network is composed of 3 sensory neurons and 12 recurrently connected neurons. (c) Example of a transition graph that defines the desired network dynamics. Each node represents a different network activation state. Directed arcs depict transitions from source to target states that are triggered by different stimuli, encoded by the sensory neurons and shown with arrows of different colours. (d) Example matrices \(\textbf{Y}\), \(\textbf{Z}_{s}\), and \(\textbf{Z}_{t}\).

We want to find matrices \(\textbf{W}_{y}\), \(\textbf{W}_{r}\) and a set of network activation vectors \(Z=\{\textbf{z}_{v}\}\) such that stimuli trigger transitions between these network activations in accord with a desired, user-specified target trajectory in state space. We will specify a target trajectory as a multidigraph, in which each node v in the graph represents a network activation state \(\textbf{z}_{v}\), and the arcs represent transitions triggered by stimuli. Formally, we define a labelled multidigraph \(G=(V,K,S,f_{s},f_{t},l_{G})\), where V is the set of \(N_{v}\) nodes \(v_{1},v_{2},\ldots,v_{N_{v}}\), \(K\subset \mathbb {N}\) is the set of specified transitions (the arcs in the graph), \(S\subset \mathbb {N}\) is the set of stimuli (the arcs’ labels) that act as the network’s inputs, \(f_{s}\) is the function that takes an arc and gives its source node, \(f_{t}\) is the function that takes an arc and gives its target node, and \(l_{G}\) is the function that takes an arc and gives its label (the stimulus that triggers that transition) (Fig. 1c). Equivalently, a matrix \(\textbf{G}\) can be constructed, such that row k in \(\textbf{G}\) is of the form \((l_{G}(k),f_{s}(k),f_{t}(k))\), specifying the kth transition. Each node \(v\in V\) will be encoded by its own unique activation vector \(\textbf{z}_{v}\).

We define matrices \(\textbf{Y}\), \(\textbf{Z}_{s}\) and \(\textbf{Z}_{t}\) as the matrices that collect in each of their columns the activations of the sensory neurons (\(\textbf{Y}\)), and the activations of the recurrent network in a source state (\(\textbf{Z}_{s}\)) and in a target state (\(\textbf{Z}_{t}\)), for each transition/column k (i.e. \(\textbf{Y}(:,k)=\textbf{y}_{\textbf{G}(k,1)}\), \(\textbf{Z}_{s}(:,k)=\textbf{z}_{\textbf{G}(k,2)}\) and \(\textbf{Z}_{t}(:,k)=\textbf{z}_{\textbf{G}(k,3)}\), with \(\textbf{y}_{i}\) being the input when the network is receiving stimulus i, and \(\textbf{z}_{i}\) being the activation state when the network state is \(v_{i}\)) (Fig. 1d).
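As a concrete illustration of Eqs. (1)–(2) and of how \(\textbf{Y}\), \(\textbf{Z}_{s}\) and \(\textbf{Z}_{t}\) are assembled from matrix \(\textbf{G}\), a minimal NumPy sketch follows (illustrative only; the variable names, the 0-based indices, and the convention that the Heaviside function fires for strictly positive preactivations are assumptions of the sketch):

```python
import numpy as np

def step(W_y, W_r, y, z_prev):
    """One network update, Eqs. (1)-(2): u = W_y y + W_r z(k-1), z = H(u)."""
    u = W_y @ y + W_r @ z_prev
    return (u > 0).astype(int)          # assumed convention: H(u) = 1 iff u > 0

def build_YZ(G, z_codes, n_stimuli):
    """Assemble Y, Z_s, Z_t (one column per transition) from the transition matrix G.

    G: list of rows (stimulus, source node, target node), 0-based indices assumed.
    z_codes: dict mapping each node to its binary activation vector z_v.
    """
    Y   = np.stack([np.eye(n_stimuli, dtype=int)[s] for s, _, _ in G], axis=1)  # one-hot stimuli
    Z_s = np.stack([z_codes[src] for _, src, _ in G], axis=1)                    # source states
    Z_t = np.stack([z_codes[tgt] for _, _, tgt in G], axis=1)                    # target states
    return Y, Z_s, Z_t
```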
A network that instantiates graph G must satisfy:

$$\begin{aligned} \textbf{W}\,\textbf{C}=\textbf{U},\quad \text{s.t.}\quad \mathcal{H}(\textbf{U})=\textbf{Z}_{t}, \end{aligned}$$
(3)
with

$$\begin{aligned} \textbf{C}&=\begin{pmatrix}\textbf{Y}\\ \textbf{Z}_{s} \end{pmatrix}\end{aligned}$$
(4)
$$\begin{aligned} \textbf{W}&=\begin{pmatrix}\textbf{W}_{y},\textbf{W}_{r}\end{pmatrix} \end{aligned}$$
(5)
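Equation (3) decomposes into one linear-threshold problem per neuron: row n of \(\textbf{W}\) must give every column of \(\textbf{C}\) a preactivation whose sign matches \(\textbf{Z}_{t}(n,k)\). A short check of this condition (a sketch, with an assumed threshold convention):

```python
import numpy as np

def instantiates_graph(W, Y, Z_s, Z_t):
    """Check Eq. (3): H(W C) == Z_t, with C = [Y; Z_s] stacked row-wise."""
    C = np.vstack([Y, Z_s])              # one column per transition
    Z_pred = (W @ C > 0).astype(int)     # assumed Heaviside convention: fires iff u > 0
    return np.array_equal(Z_pred, Z_t)   # row n of W is an independent perceptron problem
```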
This is a problem of linear separability. Matrix \(\textbf{W}\) exists iff two conditions are met: (1) \(rank(\textbf{C})=rank((\textbf{C}^{T},\textbf{U}^{T}))\), i.e. every row of \(\textbf{U}\) lies in the row space of \(\textbf{C}\), or equivalently, any linear dependence among the columns of \(\textbf{C}\) also holds among the corresponding columns of \(\textbf{U}\); and (2) the signs of the entries of \(\textbf{U}\) are consistent with \(\textbf{Z}_{t}\). If \(\textbf{Z}_{s}\) and \(\textbf{Z}_{t}\) are chosen properly, then a \(\textbf{W}\) that solves Eq. (3) exists and can be found with recently proposed accelerated versions of the perceptron algorithm\(^{30}\).

Finding suitable \(\textbf{Z}_{s}\) and \(\textbf{Z}_{t}\) matrices is a hard combinatorial problem. However, we can leverage some regularities to reduce its complexity. For a start, we must decide whether such \(\textbf{Z}_{s}\), \(\textbf{Z}_{t}\) exist in the first place, since for some transition graphs the interdependencies between source and target nodes preclude the graph’s instantiation by any neural network. In this sense, we must find out whether the graph is realisable, and if it is not, we must find a way to turn it into a realisable graph without losing the critical aspects of the target dynamics.

There is a set of linear combinations that stems directly from the transition graph, and that can be detected and corrected if needed. Let’s consider two transitions that start from the same source node v but are triggered by different stimuli:

$$\begin{aligned} \begin{pmatrix}\textbf{W}_{y},\textbf{W}_{r}\end{pmatrix}\begin{pmatrix}\textbf{y}_{s_{i}}\\ \textbf{z}_{v} \end{pmatrix}=\textbf{u}_{s_{i},v} \end{aligned}$$
(6)
$$\begin{aligned} \begin{pmatrix}\textbf{W}_{y},\textbf{W}_{r}\end{pmatrix}\begin{pmatrix}\textbf{y}_{s_{j}}\\ \textbf{z}_{v} \end{pmatrix}=\textbf{u}_{s_{j},v} \end{aligned}$$
(7)
where \(\textbf{u}_{s_{i},v}\), \(\textbf{u}_{s_{j},v}\) are the preactivations after the network receives stimulus \(s_{i}\) or \(s_{j}\), starting from the same node v. Subtracting Eq. (7) from Eq. (6) we get:

$$\begin{aligned} \begin{pmatrix}\textbf{W}_{y},\textbf{W}_{r}\end{pmatrix}\begin{pmatrix}\textbf{y}_{s_{i}}-\textbf{y}_{s_{j}}\\ \textbf{z}_{v}-\textbf{z}_{v} \end{pmatrix}&=\begin{pmatrix}\textbf{u}_{s_{i},v}-\textbf{u}_{s_{j},v}\end{pmatrix} \end{aligned}$$
(8)
$$\begin{aligned} \textbf{W}_{y}\left( \textbf{y}_{s_{i}}-\textbf{y}_{s_{j}}\right)&=\begin{pmatrix}\textbf{u}_{s_{i},v}-\textbf{u}_{s_{j},v}\end{pmatrix} \end{aligned}$$
(9)
Since both transitions start from the same node, the recurrent inputs that each neuron receives from the other neurons in the network cancel, and the difference in \(\textbf{u}\) depends only on the input from the stimuli. Vector \(\varvec{\delta }_{i,j}:=\textbf{W}_{y}(\textbf{y}_{s_{i}}-\textbf{y}_{s_{j}})\) is a constant that does not depend on the source node v. Hence, vector \(\varvec{\delta }\textbf{u}_{i,j,v}:=\textbf{u}_{s_{i},v}-\textbf{u}_{s_{j},v}\) does not depend on v either, since it must be equal to \(\varvec{\delta }_{i,j}\), regardless of v. For a given neuron n:

$$\begin{aligned} (\textbf{z}_{s_{i},v}(n),\textbf{z}_{s_{j},v}(n))\in \{(1,0),(0,0),(1,1)\}\Leftrightarrow \textbf{z}_{s_{i},v}(n)\ge \textbf{z}_{s_{j},v}(n)&\Leftrightarrow \textbf{u}_{s_{i},v}(n)-\textbf{u}_{s_{j},v}(n)\ge 0 \end{aligned}$$
(10)
$$\begin{aligned} (\textbf{z}_{s_{i},v}(n),\textbf{z}_{s_{j},v}(n))\in \{(0,1),(0,0),(1,1)\}\Leftrightarrow \textbf{z}_{s_{i},v}(n)\le \textbf{z}_{s_{j},v}(n)&\Leftrightarrow \textbf{u}_{s_{i},v}(n)-\textbf{u}_{s_{j},v}(n)<0 \end{aligned}$$
(11)
Relations (10, 11) show that the activation states of target nodes reached from the same source node determine one delta value for each pair of stimuli and for each neuron. In other words, each neuron has one delta value for each pair of stimuli, and this value is a constant that must be satisfied by the activation states of target nodes that share a common source node. For instance, a pair \((\textbf{z}_{s_{i},v_{1}}(n),\textbf{z}_{s_{j},v_{1}}(n))\) adopting values (1, 0) implies that \(\delta _{i,j}\ge 0\) for neuron n, meaning that assigning (0, 1) to the pair \((\textbf{z}_{s_{i},v_{2}}(n),\textbf{z}_{s_{j},v_{2}}(n))\) results in an inconsistency, regardless of the identity of nodes \(v_{1},v_{2}\). Ignoring this fact will lead to \(\textbf{Z}_{s}\) and \(\textbf{Z}_{t}\) that are not realisable in any neural network.

Moreover, it could be the case that the graph itself does not allow any \(\textbf{Z}_{s}\), \(\textbf{Z}_{t}\) to be realisable. Consider for example the upper graph in Fig. 2a (and its associated matrix \(\textbf{G}\)). If a neuron differentiates \(v_{1}\) from \(v_{2}\), by firing when the network is in state \(v_{1}\) (\(z_{v_{1}}=1\)) and not firing in state \(v_{2}\) (\(z_{v_{2}}=0\)), then \(\delta _{1,2}>0\). This in turn means that the neuron must not fire in state \(v_{3}\) (\(z_{v_{3}}=0\)), because if it fired then \(\delta _{1,2}<0\), contradicting the value of delta defined by transitions 1 and 2. We say that the value of zero assigned to \(v_{2}\) “propagates” to \(v_{3}\). This value propagates from \(v_{3}\) to \(v_{1}\) too (\(z_{v_{3}}=z_{v_{1}}=0\)), where we started. But this contradicts the original value of \(z_{v_{1}}\). The inconsistency is circumvented only when \(v_{1}\) and \(v_{2}\) are not differentiated by any neuron, but in that case there would be only two network states instead of three. Therefore, the graph is unrealisable as it is.

However, the graph can be modified to make it realisable. Let’s consider the modified graph in Fig. 2b. We have replaced \(v_{1}\) in transition 6 with a new node (\(v_{4}\)), and added transitions with this new node as source, equal to the transitions of the node it replaced (the transitions with \(v_{1}\) as source node). The new node encodes the same information as the replaced node, i.e. the occurrence of stimulus \(s_{2}\) starting from \(v_{3}\). It also leads to the same transitions (going to \(v_{1}\) through \(s_{1}\), and to \(v_{2}\) through \(s_{2}\)). Therefore, the new graph (Fig. 2b) is realisable, and all the information encoded in the network states is preserved. It is important to note that the contradiction in delta values in the graph of Fig. 2a occurs because delta propagates, and because node \(v_{1}\) appears twice as a target, once triggered by \(s_{1}\) and once by \(s_{2}\). On the other hand, since the new node \(v_{4}\) appears only once as a target, its addition cannot lead to a new contradiction. The new node is a twin of the node it replaces, in the sense that it leads to the same targets through the same stimuli.

We will say that a transition graph is delta consistent if there are \(\textbf{Z}_{s}\), \(\textbf{Z}_{t}\) and \(\textbf{U}\) such that relations (10, 11) are satisfied. If this is not the case (because differentiating some nodes leads to a conflict with relations (10, 11)), then the graph is termed delta inconsistent.
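The constraint expressed by relations (10, 11) can be checked mechanically: every pair of transitions leaving a common source node under stimuli \(s_{i}\), \(s_{j}\) fixes, for each neuron, the sign of \(\delta _{i,j}\), and two source nodes that fix opposite signs reveal a delta inconsistency. A small sketch of this check (illustrative data layout, not the gFTP implementation):

```python
from collections import defaultdict

def delta_consistent(G, z_codes):
    """Check relations (10, 11) for candidate activations.

    G: list of (stimulus, source, target); z_codes: dict node -> binary vector.
    Returns True if no pair of stimuli is forced to opposite delta signs for any neuron.
    """
    by_source = defaultdict(list)
    for stim, src, tgt in G:
        by_source[src].append((stim, tgt))

    sign = {}  # (neuron n, stimulus i, stimulus j) -> required sign of delta_{i,j}(n)
    for outgoing in by_source.values():
        for a in range(len(outgoing)):
            for b in range(a + 1, len(outgoing)):
                (si, ti), (sj, tj) = outgoing[a], outgoing[b]
                if si == sj:
                    continue
                if si > sj:                               # canonical order: delta_{i,j} = -delta_{j,i}
                    (si, ti), (sj, tj) = (sj, tj), (si, ti)
                for n, (zi, zj) in enumerate(zip(z_codes[ti], z_codes[tj])):
                    if zi == zj:
                        continue                          # (0,0) and (1,1) do not fix the sign
                    s = 1 if zi > zj else -1              # relation (10) vs relation (11)
                    if sign.setdefault((n, si, sj), s) != s:
                        return False                      # two sources demand opposite signs
    return True
```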
We will call “expansion” the action of replacing a target node with a new node at a set of conflicting transitions, as we did with the graph of Fig. 2a. Each expansion removes one inconsistency.

Figure 2. Example graphs G (upper left of each panel), matrix \(\textbf{G}\) (upper right of each panel, with columns showing stimulus, source node and target node), and their associated graph D (lower left of each panel). (a) Graph D forming a cycle of 3 nodes. If a neuron is active in one node and not active in the other node, we arrive at an inconsistency. (b) The inconsistency in (a) is resolved by adding node \(v_{4}\) with the same targets as \(v_{1}\) (node \(v_{1}\) is expanded). The associated graph D has no cycles. (c) A graph with 3 nodes and 3 stimuli (dashes stand for unassigned target nodes). Graph D has no cycles because it is not possible to flow from a green arc to a cyan arc. (d) Green and cyan arcs in graph D are superimposed, putting them in the same traversable set. Hence, there is a cycle. (e) Green and cyan arcs in graph D are superimposed but have opposite directions. These deltas are linked, but arcs labelled with one of these deltas must be inverted. (f) Same graph D as in (e), but cyan arcs have been inverted. All arcs belong to the same traversable set, but no cycles are present. Graph G is consistent. (g) The green and cyan arcs are not directly superimposed but they are linked: defining \(\delta _{1,2}\) or \(\delta _{2,3}\) also defines \(\delta _{1,3}\). (h) Composed superposition appears any time two nodes are reachable by more than one path (regardless of arc direction).

Detecting inconsistencies by searching for cycles in auxiliary graph D
In the graph of Fig. 2a we see that the inconsistency appears because we differentiate \(v_{1}\) from \(v_{2}\), defining \(\delta _{i,j}\) in the process. Assigned values “propagate” to other nodes, reaching back to one of the nodes we already defined. Therefore, the inconsistency occurred because we tried to differentiate two nodes that are part of a certain kind of cycle. To make this intuition more precise we define a labelled multidigraph \(D=(V,A,C,g_{s},g_{t},l_{D})\), where V is the set of nodes, \(A\subset \mathbb {N}\) is the set of arcs, \(C=\{\delta _{i,j}:(i,j)\in S\times S\}\) is the set of labels for the arcs, \(g_{s}\) is the function that takes an arc and gives its source node, \(g_{t}\) is the function that takes an arc and gives its target node, and \(l_{D}\) is the function that takes an arc and gives its label (lower graph of each panel in Fig. 2). An arc a from source node \(v_{i}\) to target node \(v_{j}\) exists in D if \(v_{i}\not =v_{j}\), and there exist arcs \(a_{1},a_{2}\) and a source node \(v_{s}\) in G such that \(f_{t}(a_{1})=v_{i}\), \(f_{t}(a_{2})=v_{j}\) and \(f_{s}(a_{1})=f_{s}(a_{2})=v_{s}\). This arc will have label \(l_{D}(a)=\delta _{m,n}\), where m and n are the stimuli through which \(v_{i}\) and \(v_{j}\) are reached, respectively. In other words, two nodes are connected by an arc in D if in graph G these nodes are target nodes of the same source node. The label of the arc is \(\delta _{m,n}\), the delta associated with the stimuli that lead to each of the connected nodes. The direction of the arc is given by the sign of \(\delta _{m,n}\). Self loops (\(g_{s}(a)=g_{t}(a)\)) are not included in D because they represent cases in which \(\delta =0\), which does not constrain assignments to z in any way.

Now we can understand inconsistencies in terms of graph D: if it is possible to go from one node of graph D to another node, travelling arcs of the same label/delta, and reach the node from where we started, then graph G has at least one inconsistency. This is because nodes in D are connected if they are linked by deltas, and the direction of the arc is defined by the delta sign, which means that assigning different z values to any pair of nodes within this kind of cycle will necessarily lead to a contradiction of deltas, as relations (10, 11) will not hold. Therefore, we can decide whether an inconsistency exists by exploring D in search of cycles formed by arcs of the same label (simply referred to as cycles from now on). If any such cycle exists, then there is at least one pair of nodes that cannot be differentiated without producing a delta inconsistency. To ensure delta consistency, we must break all these cycles by breaking any of their arcs. Since an arc exists when two nodes are targets of the same source node, the arc can be broken by expanding one of the nodes of the arc, at all transitions in G that allow the arc to exist. The expansion breaks the cycle. We repeat the process of cycle search and expansion until no cycles are left. The final graph will be delta consistent, and will hold all the information encoded by the original graph.
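Building graph D and looking for same-label cycles can be sketched directly from these definitions. The sketch below is a simplification for illustration: it represents the label \(\delta _{m,n}\) as the ordered pair (m, n) and ignores label superposition, which is addressed in the next subsection:

```python
from collections import defaultdict

def build_D(G):
    """Arcs of graph D from the transition list G = [(stimulus, source, target), ...].
    Each arc (v_i, v_j) carries label (m, n): v_i reached via m, v_j via n from a common source."""
    by_source = defaultdict(list)
    for stim, src, tgt in G:
        by_source[src].append((stim, tgt))
    arcs = set()
    for outgoing in by_source.values():
        for (m, vi) in outgoing:
            for (n, vj) in outgoing:
                if m != n and vi != vj:          # no self loops; delta = 0 is unconstraining
                    arcs.add((vi, vj, (m, n)))
    return arcs

def has_same_label_cycle(arcs):
    """Depth-first search for a directed cycle whose arcs all share one label."""
    by_label = defaultdict(lambda: defaultdict(list))
    for vi, vj, lab in arcs:
        by_label[lab][vi].append(vj)
    for adj in by_label.values():
        state = {}                                # node -> 'open' (on stack) or 'done'
        def dfs(v):
            state[v] = 'open'
            for w in adj[v]:
                if state.get(w) == 'open':        # back edge closes a same-label cycle
                    return True
                if w not in state and dfs(w):
                    return True
            state[v] = 'done'
            return False
        if any(v not in state and dfs(v) for v in list(adj)):
            return True
    return False
```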
Detecting equivalent labels

So far we have considered the example of Fig. 2a, in which there are only 2 stimuli and hence one delta. Let’s now consider a graph with 3 stimuli and 2 deltas. In Fig. 2c we can see a subset of a transition graph (some target nodes are left unassigned) which shows no cycles; hence nodes \(v_{1}\), \(v_{2}\) and \(v_{3}\) can be differentiated. However, in the graph of Fig. 2d, nodes \(v_{1}\) and \(v_{2}\) are connected by two arcs with different labels. Differentiating these two nodes will define two deltas, and they will have the same sign. We call this arc label superposition: when two or more deltas are linked in this way, defining one delta necessarily defines the other. Arcs with superimposed labels are equivalent in the sense that they can be traversed as if they had the same label, and should be treated as such while searching for cycles.

Arc superposition can be parallel, as in the example above (superimposed arcs have the same direction), but it can also be antiparallel (arcs in opposite directions). In this case, differentiating \(v_{1}\) from \(v_{2}\) defines two deltas, but the graph cannot be traversed in the same way as when the superposition goes in the same direction. Consider the graph in Fig. 2e, which is identical to the previous graph except for the last two transitions, where the target nodes were switched. In this case, it is easy to check that no inconsistency occurs. Yet, a cycle would be detected if the algorithm described so far were executed. This issue is solved by noting that an antiparallel superposition implies that the deltas have opposite signs, meaning that the graph should be traversed in opposite directions for arcs of superimposed labels. Hence, for each case of antiparallel superposition we flip the direction of all arcs that share the label with one of the antiparallel superimposed arcs (Fig. 2f), with the precaution of only flipping arcs that were not already flipped.

Finally, the parallel and antiparallel superpositions shown so far are particular examples of a broader definition of superposition. Consider the graph in Fig. 2g. In this case, differentiating \(v_{1}\) from \(v_{2}\) defines \(\delta _{1,2}\) but also \(\delta _{1,3}\). In the graph of Fig. 2h, differentiating \(v_{1}\) from \(v_{2}\) defines \(\delta _{1,2}\) but also one of the other two deltas: \(\delta _{4,2}\) (parallel to \(\delta _{1,2}\)) or \(\delta _{4,1}\) (antiparallel to \(\delta _{1,2}\)). It is clear that superposition can occur between two arcs that do not share their source and target nodes. We call this a composed superposition. With this in mind, we can propose a more general way of detecting arc label superposition. Superimposed deltas can be represented as a partition P of the set of labels C (i.e. \(P=\{\mathcal {C}_{i}\}_{i}\), with \(\mathcal {C}_{i}\cap \mathcal {C}_{j}=\emptyset \;\forall i\ne j\) and \(\cup _{i}\mathcal {C}_{i}=C\)). Then, for each element of the partition, there is one traversable set of arcs \(V_{trav}(\mathcal {C}_{i})=\{(v_{s},v_{t}):l_{D}(a)\in \mathcal {C}_{i},v_{s}=g_{s}(a),v_{t}=g_{t}(a),a\in A\}\). We initialise the partition as the finest possible, i.e. \(P=\{\{c_i\}\}_i\) with \(c_i \in C\). Then, to detect a composed superposition we first partition graph D into disjoint complete paths. Each complete path is a path within a traversable set such that the first node in the path is never a target node in the traversable set, and the last node is never a source node in the traversable set. We select one complete path within a given traversable set, taking arc direction into consideration (path 1).
Then, we search for a second path (path 2) that connects two nodes of path 1, with the conditions that (1) we do not take arc direction into consideration for this connection, and (2) the arc labels in path 2 belong to any traversable set except the traversable set of path 1. If such a path 2 exists, then the label of the arcs in path 1 is superimposed with exactly one of the labels of the arcs in path 2 (chosen at random from all labels encountered in path 2). Since for path 2 we did not take arc direction into consideration, the superimposed arcs in this path could be traversed in two possible directions. If the direction matches the direction of the arcs in path 1, then we have a parallel superposition. If the direction is opposite to the direction in path 1, then we have an antiparallel superposition, and hence the direction of all arcs in graph D labelled with the superimposed label found in path 2 must be inverted. Once the superposition is detected, elements of the partition P and the associated traversable sets are merged to account for the new superimposed deltas. Note that this way of detecting composed superposition also detects simple parallel and antiparallel superpositions such as the ones in the examples of Fig. 2d,e.

In summary, the process of constructing a consistent transition graph from a given target graph starts by initialising partition P as stated above and computing its associated traversable sets. Then, we partition each traversable set into disjoint complete paths and detect the composed superpositions. Partition P and the traversable sets are modified according to the presence of parallel and antiparallel superpositions. Antiparallel arcs are inverted, but arcs that were flipped once are not flipped again. We merge elements of P and traversable sets according to the superpositions encountered. Next, we find cycles in graph D by choosing a set \(\mathcal {C}\) from P, and then a source node from \(V_{trav}(\mathcal {C})\). We perform depth first search (DFS), traversing the graph through arcs that are members of \(V_{trav}(\mathcal {C})\). When no cycles are encountered within one traversable set, a new \(\mathcal {C}\) is chosen and its associated \(V_{trav}(\mathcal {C})\) explored. When a cycle is encountered, a node from the cycle is expanded, graphs G and D are updated, and the process is restarted from the superposition detection step. If all traversable sets have been explored and no cycles were found, the process stops and returns the final consistent graph \(G_{cons}\) (see the Supplementary Information for pseudocode of the gFTP algorithm and its subfunctions).
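The summary above is essentially a fixed-point loop over graphs G and D. A schematic sketch of that loop, with the four callables standing in for the subroutines described in the text and detailed in the Supplementary pseudocode (their names and signatures are placeholders, not the actual interface):

```python
def make_consistent(G, build_D, detect_superpositions, find_same_label_cycle, expand_node):
    """Schematic outer loop of the consistency stage.

    G: list of (stimulus, source, target) transitions. The four callables are
    placeholders for the subroutines described in the text and in the
    Supplementary pseudocode; their exact signatures are illustrative.
    """
    while True:
        D = build_D(G)                                     # auxiliary graph D from current G
        partition, traversable = detect_superpositions(D)  # merge labels, flip antiparallel arcs
        cycle = find_same_label_cycle(traversable)         # DFS within each traversable set
        if cycle is None:
            return G                                       # delta consistent: this is G_cons
        G = expand_node(G, cycle)                          # replace one target node in the cycle
                                                           # with a twin node, then start over
```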
Construction of \(\textbf{Z}_{s}\) and \(\textbf{Z}_{t}\)

To construct matrices \(\textbf{Z}_{s}\), \(\textbf{Z}_{t}\) we must find the activation values of each neuron in the network at each node of the consistent graph, in such a way that each node has its own unique associated activation vector that encodes the node (these constitute the column vectors of \(\textbf{Z}_{s}\), \(\textbf{Z}_{t}\)), and the vector of activation states across transitions satisfies relations (10, 11) for each neuron (each of the row vectors of \(\textbf{Z}_{s}\), \(\textbf{Z}_{t}\)). Finding such activation states is a constraint satisfaction problem, which we solve with a backtracking algorithm in which there is one binary decision per node and the constraint to satisfy is delta consistency. Values 1 or 0 are assigned to nodes, and if delta consistency is maintained, we search for nodes that are forced to take a certain value, either 1 or 0, so that delta consistency is not violated; i.e. assigned values are “propagated” to other nodes, according to the deltas defined so far (for example, if \(\delta _{ij}=u_{1}-u_{2}>0\) and \(z_{1}=0\), then \(z_{2}\) must be 0 in order to keep delta consistency). Value propagation using delta consistency helps reduce the number of combinations to try. To reduce it further, nodes are explored in decreasing order of their in-degree in graph \(G_{cons}\). This follows from the assumption that nodes that appear more often as targets in G have more instances to satisfy a given delta. This should make their assignment more difficult and more critical for subsequent assignments, and thus they should be assigned first. Details of our backtracking algorithm for constructing \(\textbf{Z}_{s}\) and \(\textbf{Z}_{t}\) can be found in the “Methods” section.
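The propagation step can be made concrete: once some nodes have a value for the current neuron, every delta whose sign is already fixed forces values on further nodes, and any clash tells the backtracking algorithm to revise an earlier choice. A sketch of one propagation pass, reusing the (v_i, v_j, (m, n)) arc encoding from the earlier sketch (a simplification, not the actual backtracking code):

```python
def propagate(arcs, z, delta_sign):
    """One propagation pass for a single neuron.

    arcs: set of (v_i, v_j, (m, n)) from graph D (v_i reached via m, v_j via n
          from a common source). z: dict node -> 0/1, partial assignment
          (mutated in place). delta_sign: dict (m, n) -> +1/-1, signs fixed so
          far (mutated in place). Returns False on contradiction, True otherwise.
    """
    changed = True
    while changed:
        changed = False
        for vi, vj, lab in arcs:
            zi, zj = z.get(vi), z.get(vj)
            # A differentiated pair fixes the sign of its delta (relations 10, 11).
            if zi is not None and zj is not None and zi != zj:
                s = 1 if zi > zj else -1
                if delta_sign.setdefault(lab, s) != s:
                    return False                           # delta inconsistency
            # A fixed sign forces values on all arcs carrying that label.
            if delta_sign.get(lab) == 1:                   # requires z(v_i) >= z(v_j)
                if zj == 1 and zi is None:
                    z[vi] = 1; changed = True
                if zi == 0 and zj is None:
                    z[vj] = 0; changed = True
                if zi == 0 and zj == 1:
                    return False
            elif delta_sign.get(lab) == -1:                # requires z(v_i) <= z(v_j)
                if zi == 1 and zj is None:
                    z[vj] = 1; changed = True
                if zj == 0 and zi is None:
                    z[vi] = 0; changed = True
                if zi == 1 and zj == 0:
                    return False
    return True
```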

Overview of the gFTP algorithm

1. Graph D is constructed from graph G.
2. Arc superpositions are detected and traversable sets are updated.
3. Graph D is traversed through DFS.
4. If a cycle is found, one of its nodes is expanded.
5. Steps 1 to 4 are repeated until no cycles are present.
6. Matrices \(\textbf{Z}_{s}\) and \(\textbf{Z}_{t}\) are constructed.
7. Matrices \(\textbf{W}_{y}\) and \(\textbf{W}_{r}\) are obtained with the accelerated perceptron algorithm.
8. If the perceptron learning error is not zero, new matrices \(\textbf{Z}_{s}\) and \(\textbf{Z}_{t}\) are constructed and concatenated to the previous ones.
9. Steps 7 and 8 are repeated until step 7 achieves zero error.

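Read end to end, steps 1–9 amount to an outer consistency loop followed by a train-and-augment loop. A schematic driver, with every helper a placeholder for the corresponding step (signatures are illustrative, not the actual implementation):

```python
def gftp(G, helpers, max_rounds=100):
    """Schematic gFTP driver (steps 1-9). `helpers` is any namespace bundling
    placeholder callables for the steps above; names are illustrative."""
    # Steps 1-5: make the transition graph delta consistent.
    G_cons = helpers.make_consistent(G)
    # Step 6: build source/target activation matrices for the consistent graph.
    Y, Z_s, Z_t = helpers.build_codes(G_cons)
    for _ in range(max_rounds):
        # Step 7: one perceptron per neuron; samples are the columns of [Y; Z_s].
        W_y, W_r, error = helpers.train_perceptrons(Y, Z_s, Z_t)
        if error == 0:
            return G_cons, Z_s, Z_t, W_y, W_r
        # Step 8: append freshly generated, delta-consistent rows to Z_s and Z_t
        # (i.e. add neurons), then retrain (step 9).
        Z_s, Z_t = helpers.augment_codes(G_cons, Z_s, Z_t)
    raise RuntimeError("no zero-error solution found within max_rounds")
```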
Figure 3. Graphic depiction of the main stages of gFTP. (a) Main steps to attain a consistent transition graph. gFTP takes a user-defined transition graph as input. An auxiliary graph D is constructed, which contains information about possible inconsistencies. Arc superpositions are detected (in this case, a superposition between a complete path connecting nodes 1, 2, and 3, and the arc between nodes 1 and 3; it is an antiparallel superposition, since we can go from 1 to 3 through green arcs, and from 3 to 1 through a cyan arc). Detected superpositions are applied to the traversable sets (all arcs in D are now green because they must be treated as if they had the same label; cyan arcs were inverted because the superposition was antiparallel). Depth First Search (DFS) is conducted to search for cycles. The absence of cycles indicates that graph G is consistent. The presence of a cycle requires the expansion of one node of G that composes the cycle in D after applying superpositions (node 2 was expanded in this case, but expanding node 1 was also possible). A new cycle of the algorithm is executed starting from the expanded graph. (b) Main steps to obtain matrices \(\textbf{Z}_{s}\), \(\textbf{Z}_{t}\), \(\textbf{W}_{y}\), \(\textbf{W}_{r}\) from \(G_{cons}\). A row vector of ones and zeros is constructed in each iteration (the vector’s elements are one neuron’s output in each transition). This vector must differentiate at least two nodes of \(G_{cons}\) not already differentiated by previous row vectors in \(\textbf{Z}_{t}\). It must also be delta consistent. These row vectors are generated and concatenated, forming \(\textbf{Z}_{t}\), until all nodes are differentiated. Matrix \(\textbf{Z}_{s}\) is constructed with the same vectors, sorted according to the transitions in \(G_{cons}\). An accelerated perceptron is trained, where each column of [\(\textbf{Y}\);\(\textbf{Z}_{s}\)] is a sample to classify, and each row of \(\textbf{Z}_{t}\) holds the corresponding class labels. The algorithm stops if the perceptron training error reaches zero in fewer than max_iter iterations, and outputs all matrices. If the error is above zero after max_iter iterations, a new round of vectors that differentiate all nodes is generated and concatenated to \(\textbf{Z}_{t}\), \(\textbf{Z}_{s}\).

See Fig. 3 for a graphical representation of the main steps taken by gFTP when constructing a consistent graph (Fig. 3a) and when constructing matrices \(\textbf{Z}_{s}\), \(\textbf{Z}_{t}\), \(\textbf{W}_{y}\) and \(\textbf{W}_{r}\) (Fig. 3b).
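For the training stage in Fig. 3b, each neuron defines an independent binary classification problem: the columns of \(\textbf{C}=[\textbf{Y};\textbf{Z}_{s}]\) are the samples and the corresponding row of \(\textbf{Z}_{t}\) holds the labels. The sketch below uses the classic batch perceptron rule purely to make this setup explicit; the accelerated variant of ref. 30 and the actual stopping logic are not reproduced:

```python
import numpy as np

def train_readout(Y, Z_s, Z_t, max_iter=1000, seed=0):
    """Train one linear-threshold unit per neuron so that H(W C) == Z_t
    (classic batch perceptron rule; the paper's accelerated variant differs)."""
    rng = np.random.default_rng(seed)
    C = np.vstack([Y, Z_s]).astype(float)        # (N_s + N_neu) x N_transitions
    n_neu = Z_t.shape[0]
    W = rng.normal(scale=0.1, size=(n_neu, C.shape[0]))
    for _ in range(max_iter):
        pred = (W @ C > 0).astype(int)           # assumed Heaviside convention
        errors = Z_t - pred                      # +1: should fire but does not; -1: the opposite
        if not errors.any():
            break
        W += errors @ C.T                        # perceptron update, one row per neuron
    mistakes = int(np.abs(Z_t - (W @ C > 0)).sum())
    N_s = Y.shape[0]
    return W[:, :N_s], W[:, N_s:], mistakes      # W_y, W_r, residual error
```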
