Evolving Systems (2011) 2:189–198 DOI 10.1007/s12530-011-9031-4
ORIGINAL PAPER
Quantifying the role of complexity in a system's performance
Lorenzo Riano · T. M. McGinnity
Received: 27 November 2010 / Accepted: 7 April 2011 / Published online: 22 April 2011 © Springer-Verlag 2011
Abstract In this work we studied the relationship between a system's complexity and its performance in solving a given task. Although complexity is generally assumed to play a key role in an agent's performance, its influence has not been deeply investigated in the past. To this aim we analysed a predator–prey scenario where a prey had to develop several strategies to counter an increasingly skilled predator. The predator has several advantages over the prey, thus requiring the prey to develop more and more complex strategies. The prey is driven by a fully recurrent neural network trained using genetic algorithms. We conducted several experiments measuring the prey's complexity using Kolmogorov algorithmic complexity. Our finding is that, in accordance with what was believed in the literature, complexity is indeed necessary to solve non-trivial tasks. The main contribution of this work lies in having proved the necessity of complexity to solve non-trivial tasks. This has been made possible by blending together a goal-oriented system with a complex one. An experiment is provided to distinguish between the complexity of a chaotic system and the complexity of a random one.

Keywords Complexity · Genetic algorithms · Neural networks · Chaotic systems
L. Riano (✉) · T. M. McGinnity
Intelligent Systems Research Centre, School of Computing and Intelligent Systems, University of Ulster, Magee Campus, Londonderry BT48 7JL, UK
e-mail: l.riano@ulster.ac.uk
T. M. McGinnity
e-mail: tm.mcginnity@ulster.ac.uk
1 Introduction

Complexity is known to be a necessary aspect in the development of strategies to solve non-trivial tasks (Lenski et al. 2003). However it is not clear how complexity can be quantified and how its impact on a system's performance can be measured. Key questions arise: for example, is a complex structure necessary to exhibit complex behaviours, or can behavioural complexity emerge even in simple structures?

Artificial life and evolutionary algorithms are a natural test bed to explore these questions. Complexity can be obtained by evolving an artificial agent to solve tasks of incremental difficulty. In Lenski et al. (2003) populations of digital organisms evolved the ability to perform complex logic functions by learning to perform simpler ones and by combining them. A related approach is called scaffolding (Bongard 2008), where a simulated robot learns to pick up an object by incrementally learning all the intermediate steps that lead to the final desired behaviour. A major drawback of scaffolding-like approaches is the need for hand-crafted scenarios of increasing difficulty. This might constrain the solutions an agent is able to find. This problem is partially addressed by studying the evolution of several agents simultaneously competing or cooperating to solve a task (Gomez and Miikkulainen 1997). In particular, the study of co-evolution is likely to provide insights into how complexity autonomously emerges as a necessity. Nolfi and Floreano (1998) studied co-evolution in a predator–prey system. They argued that an "arms race" does not necessarily arise in a competitive scenario, as oscillation of strategies and rediscovery of old solutions are a likely outcome of the predator–prey problem. This poses a limit on the amount of complexity an evolved
system will exhibit, as simple "ad-hoc" solutions often proved to be better than general complex ones. Stanley and Miikkulainen (2004) explored a different competitive scenario. Two agents battled to obtain food and the related energy so that they could overcome each other. Artificial neural networks were incrementally built via genetic algorithms to drive each agent. The authors linked the agents' success rate to the complexity of the evolved underlying network, which in turn they informally linked to the number of neurons and the number of connections in the network. In this paper we propose to study and quantify the complexity of a network's behaviour, as opposed to its structure.

The link between the complexity of a neural network and the emergence of novel behaviours has also been investigated in Riano and McGinnity (2010). Several neural networks of increasing complexity were trained to drive a robot towards people. The authors showed that, when the complexity is sufficiently high, the network generates behaviours that were neither programmed nor anticipated. The network's complexity was measured using Kolmogorov's algorithmic theory (Kolmogorov 1965), which we use here to characterise the complexity of a behaviour.

We are, in alignment with the current literature, convinced that complexity is mandatory to solve challenging tasks. However we feel that in previous work the focus has been mainly on solving a problem, rather than investigating what allows a problem to be solved. To the best of our knowledge there are no studies that investigate "how much" complexity is necessary to solve a given task. Answering this question requires one to (i) quantify the amount of complexity in a system and (ii) analyse the performance of a system (related to a task) with varying levels of complexity. Our goal in this work is therefore to numerically investigate a system's complexity and its role in the system's performance.

Measuring the complexity of a system is still an open problem, and it is often domain-dependent. A survey of "complexity measures" is beyond the scope of this paper, and we refer the reader to existing surveys in the literature (see for example Kuusela et al. 2002; Daw et al. 2003). Two measures seem to have gained wide acceptance: Kolmogorov complexity (Kolmogorov 1965) and Grassberger's effective measure complexity (Grassberger 1989). However, while the former is uncomputable, the latter is defined only for stochastic processes. Moreover Kolmogorov's measure tends to be high for random processes, an undesirable property when comparing the complexity of different systems. Other measures, like Gell-Mann's total information (Gell-Mann and Lloyd 1996), are subjective and more philosophical, thus lacking a proper scientific use.
In Sect. 2.4 we will show that it is possible to obtain a close approximation of Kolmogorov complexity, thus allowing us to effectively use it as a measure. Moreover, as we will illustrate in Sect. 2, we will be calculating the complexity of recurrent neural networks, which by definition do not contain any random elements. This means that we will not face the problem of high complexity being assigned to random processes.

Once provided with a way to measure complexity, we can investigate how it influences a system's performance. However this is non-trivial, as complexity and the system generating it are deeply interleaved: although it is possible to measure the complexity after a system is defined, it is hard to generate a system with a given complexity. We addressed this problem by using a "complexity generator", that is, a neural network trained to produce an output whose sole purpose is to be complex. This network can be combined with a task-oriented one to "inject" complexity into the system.

The test bed we will use is a predator–prey scenario. This scenario has been widely studied in the literature (see for example Nolfi and Floreano 1998; Stanley and Miikkulainen 2004). In contrast with previous work the predator is coded by us, while the prey is evolved using genetic algorithms. This allows us to study the dynamics the prey evolves without degenerating into oscillations or greedy solutions (Nolfi and Floreano 1998). It also gives us the possibility to accurately control the ability of the predator, as we can simply vary its driving algorithm to generate more complex behaviours. The prey is controlled by a fully recurrent neural network, as described in Sect. 2.1. We will illustrate in several experiments the solutions developed by the prey to overcome a more and more sophisticated predator. Ultimately the prey must evolve complexity so that it will win against the best predator. Throughout the paper we will use the term "agents" to refer to both the prey and the predator.

This paper is organised as follows: in Sect. 2 and its subsections we describe the techniques we will be using, including the neural network equations, the genetic algorithms and the Kolmogorov complexity. In Sect. 3 we describe the simulation model and dynamics, while in Sect. 4 we describe the experimental results. These results are discussed in Sect. 5 and conclusions are drawn in Sect. 6.
2 Employed techniques

2.1 Fully recurrent neural network

The goal of the prey is to develop an evasive behaviour that will allow it to escape a predator.
Fig. 1 A diagram of a fully recurrent neural network. All the nodes, apart from the input ones, are fully connected to all the others. The input neurons do not have incoming connections
In this work we decided to employ a fully recurrent neural network (RNN, Fig. 1) (Beer and Gallagher 1992; Hülse et al. 2004) to drive the prey's dynamics. The reasons behind this choice are:

– This model, with minor modifications, has been widely used and studied in the neuroevolution literature (see among others Paine and Tani 2004; Urzelai and Floreano 2001; Izquierdo et al. 2008; Hülse et al. 2004).
– An RNN is Turing-equivalent, i.e. it can approximate any arbitrary recursive function.
– An RNN can exhibit rich and often chaotic dynamics (Dauce et al. 1998).
An RNN is composed of input, output and hidden neurons. The only difference between an RNN and a classic feedforward neural network is that all the neurons, apart from the input ones, are fully interconnected, including self-connections. The matrix $W_{ij}$ defines the connection strength between neuron $j$ and neuron $i$, while the vector $b_i$ defines the bias of every neuron $i$. Every neuron $i = 1, \ldots, N$ has an associated output value $x_i$ such that $0 < x_i < 1$, which depends on the activation value $u_i$. The network dynamics are discrete-time and deterministic, and are governed by the equations in (1), where $f$ is the standard sigmoidal function that limits the neuron value between 0 and 1:

$$x_i(t+1) = f(u_i(t)), \qquad u_i(t) = \sum_{j=0}^{N} W_{ij} x_j(t) + b_i, \qquad f(u) = \frac{1}{1 + e^{-u}} \quad (1)$$

Although an RNN could be trained using backpropagation through time (Werbos 1990), this approach requires a training signal which, given the experiments we will describe later, we cannot provide. We therefore trained the RNNs with unsupervised genetic algorithms, as described in the next subsection.
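For concreteness, the update in Eq. 1 can be written in a few lines of NumPy. This is a minimal sketch under our own naming conventions; the class layout and the weight initialisation range (borrowed from the clamping bounds of Sect. 3.3) are illustrative assumptions, not the authors' code.

```python
import numpy as np

def sigmoid(u):
    """Standard logistic function, limiting neuron outputs to (0, 1)."""
    return 1.0 / (1.0 + np.exp(-u))

class FullyRecurrentNN:
    """Discrete-time fully recurrent network implementing Eq. 1.

    Every non-input neuron receives connections from all neurons,
    itself included; input neurons have no incoming connections.
    """

    def __init__(self, n_inputs, n_non_inputs, rng=np.random):
        n = n_inputs + n_non_inputs
        self.n_inputs = n_inputs
        # Weights and biases live in [-3, 3], the clamping range of Sect. 3.3.
        self.W = rng.uniform(-3, 3, (n_non_inputs, n))
        self.b = rng.uniform(-3, 3, n_non_inputs)
        self.x = np.zeros(n)  # current output value of every neuron

    def step(self, inputs):
        """Synchronous update: x_i(t+1) = f(sum_j W_ij x_j(t) + b_i)."""
        self.x[:self.n_inputs] = inputs
        u = self.W @ self.x + self.b
        self.x[self.n_inputs:] = sigmoid(u)
        return self.x[self.n_inputs:]
```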
2.2 Genetic algorithms

Genetic algorithms (GA) have long been used in optimisation problems, including robotics control (Harvey et al. 2005) and neural network training (Floreano et al. 2008). The network structure and weights are encoded in a gene array, and conventional mutation and crossover operators are used. Several strategies have been employed in the past to train a neural network using a GA (Nelson et al. 2009; Harvey et al. 2005; Walker et al. 2003). The goal of this work is to prove that complexity plays a key role in the survival skills of a prey; as such, it would be out of scope to provide a comparison of several genetic algorithms. Therefore in this paper we decided to use the most common basic approach to neuroevolution (Beer and Gallagher 1992; Floreano et al. 2008): the network weights $W_{ij}$ and biases $b_i$ are encoded in a single genome vector; mutation is applied by adding a Gaussian random value to the genome; and we perform single-point crossover. In all the experiments we used PyEvolve (Perone 2009) as the main GA engine.
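A minimal sketch of the two operators as described, independent of PyEvolve's actual API; the function names, and applying the Gaussian perturbation to the whole genome with the stated probability, are our assumptions:

```python
import numpy as np

rng = np.random.default_rng()

def mutate(genome, p_mut=0.8, sigma=1.0):
    """With probability p_mut, add zero-mean Gaussian noise to the genome,
    clamping weights and biases to [-3, 3] (values from Sect. 3.3)."""
    if rng.random() < p_mut:
        genome = np.clip(genome + rng.normal(0.0, sigma, genome.shape), -3, 3)
    return genome

def crossover(parent_a, parent_b, p_cross=0.1):
    """Single-point crossover, applied with probability p_cross."""
    if rng.random() < p_cross:
        point = int(rng.integers(1, len(parent_a)))
        return np.concatenate([parent_a[:point], parent_b[point:]])
    return parent_a.copy()
```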
2.3 Least squares approximation of the prey's trajectory

If we denote with $(x(t), y(t))$ the prey's position at time $t$, then the goal of the predator is to identify two functions $f_x$ and $f_y$ so that:

$$|f_x(t + \Delta t) - x(t + \Delta t)| \le \epsilon, \qquad |f_y(t + \Delta t) - y(t + \Delta t)| \le \epsilon \quad (2)$$
where $\epsilon$ is an arbitrarily small value. Equation 2 states that both $f_x$ and $f_y$ should be able to predict the future position (within $\Delta t$ steps) of the prey with an error as small as possible. This is accomplished by approximating the prey's trajectory with time-dependent polynomials. A generic $m$-degree polynomial $p(t)$ is written as:

$$p(t) = \sum_{i=0}^{m} b_i t^i \quad (3)$$
where $b_i$ are unknown coefficients. If we have a set of $s$ previous data points $x_{t-s}, \ldots, x_t$ up to time $t$, then the $b_i$ are the solution to the overdetermined linear system in Eq. 4:

$$\begin{bmatrix} 1 & t-s & \cdots & (t-s)^m \\ 1 & t-s+1 & \cdots & (t-s+1)^m \\ \vdots & \vdots & \ddots & \vdots \\ 1 & t & \cdots & t^m \end{bmatrix} \begin{bmatrix} b_0 \\ b_1 \\ \vdots \\ b_m \end{bmatrix} = \begin{bmatrix} x_{t-s} \\ x_{t-s+1} \\ \vdots \\ x_t \end{bmatrix} \quad (4)$$

This system's solution is given in Eq. 5, as a linear least squares (LSQ) solution.
$$b = (T^{\top} T)^{-1} T^{\top} x \quad (5)$$

where $T$ is the left matrix in Eq. 4 and $T^{\top}$ denotes its transpose. The prey's trajectory $(x(t), y(t))$ can therefore be approximated by two polynomials $f_x$ and $f_y$. At every time step $t$ the predator calculates the coefficients $b_{x,i}$, $b_{y,i}$ that approximate $(x(t), y(t))$ using $s$ past points. The predator can then predict the future prey position, as in Eq. 6:

$$x(t + \Delta t) = \sum_{i=0}^{m} b_{x,i} (t + \Delta t)^i, \qquad y(t + \Delta t) = \sum_{i=0}^{m} b_{y,i} (t + \Delta t)^i \quad (6)$$
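The predictor of Eqs. 4–6 amounts to a polynomial least-squares fit followed by extrapolation. A sketch using NumPy's standard solver; the variable names and the noisy straight-line data in the example are ours:

```python
import numpy as np

def predict_position(xs, ts, m, dt_ahead):
    """Fit an m-degree polynomial to s past positions xs observed at times
    ts (Eqs. 4-5) and extrapolate dt_ahead time units ahead (Eq. 6)."""
    T = np.vander(ts, m + 1, increasing=True)   # rows [1, t, t^2, ..., t^m]
    b, *_ = np.linalg.lstsq(T, xs, rcond=None)  # b = (T'T)^-1 T'x
    t_future = ts[-1] + dt_ahead
    return sum(b[i] * t_future ** i for i in range(m + 1))

# Example: the linear predator of Sect. 4.2 (m=1, s=15 past points).
ts = np.arange(15) * 0.1                        # time stamps, dt = 0.1 s
xs = 0.4 * ts + 0.05 * np.random.randn(15)      # noisy straight-line motion
x_hat = predict_position(xs, ts, m=1, dt_ahead=1.0)
```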
2.4 Kolmogorov complexity
The notion of Kolmogorov complexity (KC) (Kolmogorov 1965) is a well established measure of the complexity of a string in computational theory. Given a binary string $s$ of length $l$ and a machine $M$, the bit length of the shortest program $K_M(s)$ able to produce $s$ and stop afterwards is the Kolmogorov complexity of $s$. In this work we are concerned with the complexity per symbol $C$ defined in Eq. 7 and measured in bits per symbol (bps):

$$C = \lim_{l \to \infty} \frac{K(s)}{l} \quad (7)$$
From Eq. 7 it can be seen that $0 \le C \le 1$, where 0 represents a fully deterministic string, while 1 represents a completely random string. The Kolmogorov complexity has been shown to be incomputable. However, it is possible to obtain an upper bound of it (which in practice is a very good approximation) by using a universal compression algorithm, such as Lempel–Ziv (Kaspar and Schuster 1987). In Falcioni et al. (2003) the connections between entropy, chaos and algorithmic complexity are studied. These connections can be summarised by saying that a complex system generates incompressible strings (outputs) and is unpredictable. The incompressibility of a string is therefore a clear indicator of complexity in a system. A similar approach is used in Khalatur et al. (2003). In this paper we will be concerned with the compressibility of sequences of numbers between 0 and 1. However, a sequence of floating point numbers may be highly incompressible, given the nature of its representation in a machine, even if the numbers are very close to each other. A solution to this problem is to discretise a real-number string into a series of symbols (Daw et al. 2003). The degree of discretisation has been empirically determined. A fine discretisation (more than 70 symbols)
leads to high complexities that do not reflect the real nature of the underlying process. On the other hand, a coarse discretisation leads to constant low complexities, removing any informative content from this approach. We found, however, that discretisations between 10 and 50 symbols did not lead to any substantial change in the evaluated complexities. Therefore we discretised the range 0–1 into 10 different symbols, and calculated the complexity of the resulting string. In this way a series of numbers must evenly and chaotically cover the entire 0–1 range to obtain a high complexity value. The procedure to calculate the KC of a string $s$ of real numbers is therefore as follows:

1. Map $s$ into a string $d$ of symbols drawn from an alphabet of 10 elements.
2. Compress $d$ using the Lempel–Ziv algorithm (Kaspar and Schuster 1987), thus obtaining a new shorter string $d'$.
3. Calculate the Kolmogorov complexity as the ratio between the length of $d'$ and the length of $d$.
Given that the length of $d'$ can never be longer than the length of $d$, the KC per symbol of a string is always between 0 and 1.
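A minimal sketch of the three-step procedure above. We use zlib's DEFLATE as a stand-in for the Lempel–Ziv implementation of Kaspar and Schuster (1987), so absolute values will differ somewhat from those reported in the paper:

```python
import zlib

import numpy as np

def kc_per_symbol(values, n_symbols=10):
    """Approximate Kolmogorov complexity per symbol of a sequence of
    real numbers in [0, 1] via discretisation and compression."""
    # 1. Map the sequence onto an alphabet of n_symbols symbols.
    symbols = np.minimum((np.asarray(values) * n_symbols).astype(int),
                         n_symbols - 1)
    d = bytes(symbols.tolist())
    # 2. Compress the symbol string (DEFLATE as an LZ-family stand-in).
    d_compressed = zlib.compress(d, 9)
    # 3. Ratio of compressed length to original length.
    return len(d_compressed) / len(d)

# A constant signal compresses almost completely; uniform noise over a
# 10-symbol alphabet approaches log2(10)/8, roughly 0.42 of its byte length.
print(kc_per_symbol([0.5] * 10_000))
print(kc_per_symbol(np.random.rand(10_000)))
```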
3 Experimental setup

3.1 Experiments overview

The goal of the following experiments is to measure the performance of a prey with respect to its complexity. To this aim we ran several experiments where the prey was challenged each time by a more skilled predator. The predator was not only "naturally" better equipped than the prey (see next subsection), but it was also able to approximate the prey's trajectory using the approach illustrated in Sect. 2.3. This means that the difficulty of the task the prey has to face increases across the experiments in Sects. 4.1, 4.2 and 4.3. While the prey managed to evolve good survival strategies in two out of three scenarios, in Sect. 4.3 we show that the GA could not find a practical solution. This prompted the blending of a goal-oriented behaviour with complexity illustrated in Sect. 4.5.

3.2 World and agent dynamics

The simulated environment is an empty 2D space. This allowed us to focus more on the predator–prey dynamics than on the interactions between the agents and the environment. An agent's state is identified by its position $(x, y)$, its orientation $\theta$ and its linear and angular velocities
$(v, \omega)$. Both the predator and the prey have differential dynamics, according to Eq. 8:

$$x_{t+1} = x_t + v_t \cos(\theta_t)\, dt, \qquad y_{t+1} = y_t + v_t \sin(\theta_t)\, dt, \qquad \theta_{t+1} = \theta_t + \omega_t\, dt \quad (8)$$

All the experiments have been performed using the Euler approximation of the agents' dynamics with a fixed time step $dt = 0.1$ s. The prey's speed is assumed to be 0.4 m/s. To avoid trivial solutions the predator's speed is slightly higher, 0.45 m/s. The predator has complete knowledge of its own position and the prey's position, regardless of their distance. The prey's sensors are far more limited, as it can sense the predator only when it is less than 4 m away. Moreover the prey does not know exactly the angle between itself and the predator, but only an imprecise direction. To simulate this we divided the whole 360° area around the prey into 6 sectors. The prey therefore knows only the distance from the predator and the sector from which it is approaching. This is analogous to having an array of 6 sonars around the prey. The RNN described in Sect. 2.1 controls the prey. It takes as input a vector of 6 real numbers indicating the distance from the predator in one of the sectors, and it has as its only output the angular speed $\omega$ of the prey. Both agents' linear velocities are kept constant at their maximum.

3.3 Evolving the prey

In Sect. 2.2 we described the network representation and the basic operators used in the GA. For all the experiments below we used a mutation probability of 0.8. Mutation is performed by adding to a single gene a random value drawn from a Gaussian distribution with 0 mean and 1 variance. Weights and biases are clamped between -3 and 3. Single-point crossover is performed with a 0.1 probability. The fitness function for the prey is its survival time, measured in seconds. We assume that if the distance between the prey and the predator is less than 0.5 m the predator has successfully captured the prey. The fitness function is averaged over four separate trials, with the prey being located in each of the four possible sectors of the Cartesian space and the predator being on the opposite corner at a fixed distance.¹ The initial angles are always such that the predator is facing the prey and the latter is facing
¹ A first approach had been to generate several random starting locations for both the prey and the predator, and then average the survival time of the prey. This however led the GA to favour preys that started far from the predator, instead of preys with good avoidance skills.
the opposite direction. The trial is over if the predator catches the prey or if the latter survives for at least 150 s. After training, when evaluating the prey's performance, both the prey's and the predator's initial positions and orientations are randomly initialised. In this way we tested the generality of the prey's strategy. On the other hand, there might be starting positions where the predator is highly favoured and will catch the prey despite its best efforts. Therefore in the following experiments we will never see an average survival time of 150 s (as opposed to training), but always smaller values.
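The simulation loop reduces to the Euler update of Eq. 8 plus the sector-based sensing of Sect. 3.2. A sketch with our own names; encoding the distance in the occupied sector and zero elsewhere is our reading of the description:

```python
import numpy as np

DT = 0.1           # fixed Euler time step (s)
N_SECTORS = 6      # sectors around the prey
SENSE_RANGE = 4.0  # prey sensing range (m)

def step_agent(x, y, theta, v, omega):
    """One Euler step of the differential dynamics in Eq. 8."""
    return (x + v * np.cos(theta) * DT,
            y + v * np.sin(theta) * DT,
            theta + omega * DT)

def prey_sensors(prey_pose, predator_pos):
    """6-element input vector: distance to the predator placed in the
    sector it occupies, zero elsewhere (a ring of 6 sonars)."""
    px, py, ptheta = prey_pose
    dx, dy = predator_pos[0] - px, predator_pos[1] - py
    dist = float(np.hypot(dx, dy))
    readings = np.zeros(N_SECTORS)
    if dist < SENSE_RANGE:
        bearing = (np.arctan2(dy, dx) - ptheta) % (2 * np.pi)
        readings[int(bearing // (2 * np.pi / N_SECTORS))] = dist
    return readings
```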
4 Experiments and results

4.1 Greedy strategy

In the first experiment the predator used a simple greedy strategy, that is, it always steered towards the current location of the prey without any approximation of its trajectory. The prey's RNN is composed of 6 input neurons, 1 output neuron and no hidden neurons. Training has been performed with a population of 100 individuals. After only 20 iterations the prey developed a simple strategy of going around in circles, as shown in Fig. 2. This way the predator always stayed behind, in spite of its superior velocity. The overall prey behaviour is very simple, with a Kolmogorov complexity of 0.04 bps, as is the predator's. Testing the prey with 300 random starting locations yielded a mean survival time of 137.96 s; that is, the prey almost always survives, apart from a few unfortunate cases when it started too close to the predator and facing the wrong direction.
Fig. 2 The strategy evolved by the prey to counter a predator’s greedy strategy. Going around in circles was sufficient to allow the prey to survive for more than 150 s. The red continuous line is the prey’s trajectory, the blue dashed line is the predator’s (colour figure online)
4.2 Linear strategy
In this experiment the predator approximates the prey's trajectory using the linear least squares approach described in Sect. 2.3 with a first-degree polynomial. Using this approach the predator was always able to catch the prey trained in the previous section. We therefore used the GA to obtain a new generation of preys able to survive the improved predator. Given the failure of the previous generation, we used a RNN with 3 hidden neurons. The predator used $s = 15$ previous prey positions to solve the linear system in Eq. 4, and it used a prediction $\Delta t = 10$ s ahead in time. A new winning prey emerged after a few GA iterations. The strategy it developed was again to move in circular trajectories. The difference from the previous strategy lies in the different eccentricity of the circular trajectories, as shown in Fig. 3. To understand this new strategy, consider that the predator approximates the prey's trajectory using 15 steps. As every step lasts 0.1 s and the prey moves at 0.4 m/s, the length of the path used by the predator to approximate the prey's trajectory is 0.6 m. A close look at Fig. 3 reveals that the length of the upper ellipses' major diagonal is about 0.6 m. In other words, the prey that had to withstand the improved predator learnt to fool it by abruptly changing its trajectory once it had been estimated. Although this strategy is ingenious, it is very simple. The changes of trajectory are infrequent and easily predictable by a smarter predator, as we will show next. The Kolmogorov complexity of the prey's behaviour supports this idea, as it is only 0.02 bps, while the predator's is 0.06. Out of 300 random starting locations, the average prey survival time was 133 s.
4.3 Quadratic strategy

In the previous experiments we found that the prey always developed dynamics that ended in a circular trajectory. We therefore crafted a third version of the predator that uses a second-degree polynomial to approximate the prey's trajectory. The predator used $s = 10$ previous steps to build the system in Eq. 4 and it used a prediction $\Delta t = 10$ s ahead in time. Again the improved predator outperformed the previously evolved prey model. This time, however, we could not evolve a prey able to survive this newly deployed predator. We again used a GA to create a prey that could overcome the new predator. We trained networks with several different numbers of hidden nodes, but none of the associated preys found a winning strategy. Figure 4 shows the trajectory of one of the most successful preys, with 5 hidden nodes. The solution it developed was to circle around until the predator is seen by the sensors, then move away without any radical change of direction. The predator's higher speed doomed this strategy to fail. Out of 300 random starting locations, the average prey survival time was 30.78 s. The Kolmogorov complexity did not change significantly from the previous experiments, with a value of 0.06 bps for the prey, and a value of 0.04 for the predator.
Fig. 3 A comparison between the prey's trajectory when facing a greedy predator (lower red line) and the prey's trajectory when facing a predator which predicts the prey's next position using a linear model (upper green line). The prey employs elliptical trajectories to "confuse" the predator, which is always left behind (colour figure online)
Fig. 4 The trajectory followed by the prey (continuous red line) against a predator which predicts the prey’s next position using a quadratic model (dashed blue line). The predator’s strategy proved to be very effective, as the prey could not develop a successful strategy to counter the predator (colour figure online)
4.4 Evolving for complexity

The predator's strategy is to approximate the prey's trajectory, predict where it is going and precede it. We already saw in Sect. 4.2 that the prey developed a behaviour that exploited a weakness of the linear approximation. Our hypothesis was therefore that the same concept could be used to overcome the predator's quadratic approximation,
Fig. 5 The trajectory followed by the prey (continuous red line) trained to be complex. The predator (dashed blue line) is still able to catch it. The inner plot shows a snapshot of the prey's angular speed, measured in radians. It can be seen that the speed is chaotic, although a macro-inspection of it shows a pattern, as highlighted by the trajectory followed by the prey before being caught (colour figure online)
even if the GA had not found this kind of solution in the previous experiment. The approach we adopted was to create a network whose only goal is to produce a complex output. The complex network had the same number of inputs and outputs as the prey's network, and 5 hidden nodes. The fitness function was the complexity of the network's output when fed with inputs coming from a previous predator–prey experiment. The resulting network was characterised by a Kolmogorov complexity of 0.34 bps, while the predator showed a complexity of 0.19. When used to drive a prey the complex network performed badly, with an average survival time of 14.75 s over 300 random locations. Figure 5 shows the trajectory followed by the prey. Although it did not explicitly try to evade the predator, its frequent changes in direction made the trajectory approximation non-trivial. This property, when combined with a predator-evading strategy, leads to a successful prey, as shown in the next experiment.

4.5 Selector network

The final experiment shows that blending predator-aware networks with "blind" complex networks leads to a very successful prey strategy. To this aim we created a Selector RNN (SRNN), i.e. a network that dynamically selects a linear combination of the outputs of several subnetworks. A SRNN is composed of two or more subnetworks, a fully connected hidden layer and two or more outputs normalised so that they add up to one. The SRNN's inputs are the same inputs each individual RNN receives plus the outputs of the subnetworks, while the outputs are the weights for two or more linear combinations of the
Fig. 6 Flow of data in the selector network. The input is fed into both the two subnetworks and the selector network. The selector then produces two weights that are used to combine the outputs of the subnetworks. Compare with the diagram in Fig. 1
outputs of the subnetworks. The SRNN therefore uses the input and the subnetwork outputs to decide to what extent each subnetwork will contribute to the final outputs. Figure 6 shows the flow of data from the inputs to the final outputs, which are linear combinations of the sub-RNNs' outputs. The SRNN for the prey has 8 inputs (6 for the sector distances, as described in Sect. 3.2, and 2 for the outputs of the two subnetworks) and 5 hidden nodes. Moreover the SRNN has 2 output weights whose sum is one, indicating the weight of each single subnetwork output. The final speed of the prey is obtained by multiplying each subnetwork output by the respective SRNN output. An SRNN is trained using the GA in the same way as an ordinary RNN. We therefore used the networks previously described in Sects. 4.3 and 4.4 as subnetworks for a SRNN. That is, we used a network trained to avoid a predator with a quadratic approximation of its trajectory and a network trained to produce a complex output. The resulting trajectory is shown in Fig. 7. The SRNN learnt to use the best of both networks, as it mixes circular trajectories with frequent changes of dynamics. The similarity with the previous model can be seen by comparing Figs. 7 and 4. In both cases at the beginning of the trial the prey did not know the predator's position until it was closer than 4 m, so it moved around a circle, waiting for the predator to approach before fleeing in the opposite direction. The subsequent trajectory is a mixture of circular and piecewise linear trajectories. Even when following circular trajectories, small variations generate circles of different diameter or position every time. The prey's average survival time was 101 s out of 300 random starting locations.
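A sketch of the blending step, reusing the FullyRecurrentNN sketch from Sect. 2.1. Treating the selector's first two non-input neurons as the weight outputs, and each subnetwork's first non-input neuron as its output, are our assumptions:

```python
import numpy as np

class SelectorRNN:
    """Blend two subnetworks' outputs with weights from a selector RNN."""

    def __init__(self, task_net, complex_net, selector_net):
        self.task_net = task_net        # evolved against the quadratic predator (Sect. 4.3)
        self.complex_net = complex_net  # evolved only to be complex (Sect. 4.4)
        self.selector = selector_net    # 8 inputs, 5 hidden nodes, 2 weight outputs

    def step(self, sector_distances):
        # Each subnetwork sees the 6 sector distances and proposes an
        # angular speed (assumed to be its first non-input neuron).
        o1 = self.task_net.step(sector_distances)[0]
        o2 = self.complex_net.step(sector_distances)[0]
        # The selector sees the same inputs plus both proposals (8 inputs)
        # and emits two weights, normalised so that they sum to one.
        raw = self.selector.step(np.concatenate([sector_distances, [o1, o2]]))
        w = raw[:2] / raw[:2].sum()
        # Final command: a dynamic linear combination of the two proposals.
        return w[0] * o1 + w[1] * o2
```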
Fig. 7 The trajectory followed by the prey driven by a SRNN (continuous red line) against a predator which predicts the prey's next position using a quadratic model (dashed blue line). The prey managed to survive by alternating circular motions with abrupt changes of direction, each time ruining the predator's efforts to estimate the prey's trajectory. For this particular figure we let the simulation run for 200 s to show several changes of direction (colour figure online)
The network output's Kolmogorov complexity is 0.29 bps. The predator's complexity was measured to be 0.16.

4.6 Chaos is not random

The velocity profile of Fig. 7 might be mistaken for randomly generated. This might lead to the conclusion that by simply generating random movements a prey would be able to escape the predator. To disprove this, we substituted the complex network described in the previous section with a random number generator. The SRNN therefore used as input the output of the network trained to avoid the predator and a uniformly random number between 0 and 1. The prey's performance dropped considerably, with an average survival time of 21 s out of 300 random starting locations. The Kolmogorov complexity however increased to 0.46 bps, while the predator's stayed at 0.18. Figure 8 shows a sample trajectory of the prey.
5 Discussion of the results

Table 1 shows a summary of the results provided in the previous sections. It can be seen that the sequence of experiments represents a controlled arms race between the prey and the predator. At the beginning the prey quickly overcame both the greedy and the linear strategies without the need to develop complex solutions. However, when the predator employed a quadratic strategy, the simple GA we used could not find a solution to improve the prey's survival time. We hypothesised that more complexity, quantified according to Kolmogorov's theory, was
Fig. 8 The trajectory followed by the prey (continuous red line) when driven by a SRNN with a random number generator. In contrast with Fig. 7, the predator (dashed blue line) always won (colour figure online)
necessary for the prey to win the arms race. The experiment in Sect. 4.5 proved this. One could be tempted to measure the difficulty of the task the prey is facing by looking at the predator's complexity. However, this measure provides only part of the picture. The reasons behind the predator's successes with the quadratic strategy lie not only in the strategy's complexity itself, but also in the predator's increased speed and better sensing. This can hardly be observed by looking at the complexity alone, as the first three entries in Table 1 show. Of particular interest, however, is the jump in the predator's KC from the experiment in Sect. 4.3 to the following ones. Although the predator's algorithm did not change, its complexity rose when trying to cope with a more skilled prey. Given the number of experiments we conducted and the results provided in Sect. 4.5, we can postulate that 0.19 bps is the maximum complexity the quadratic strategy can exhibit. Table 1 also shows that, as the predator's skills increase, the average survival time of the prey decreases. In other words, finding a successful strategy against a more challenging opponent becomes harder and harder. Nonetheless in all of the scenarios the prey managed to find a "sound" strategy, i.e. elusive manoeuvres that, apart from a few unfortunate starting positions, lead the prey to survive. If we were to investigate a long-term arms race between the prey and the predator, we hypothesise that eventually they would find a balance where no strategy is guaranteed to be successful every time, in accordance with Nolfi and Floreano (1998). We strongly believe that, provided with enough time and a large population, the GA would eventually find a RNN for the prey able to overcome the predator with the quadratic strategy of Sect. 4.3. A similar solution could have been found by using evolutionary algorithms that incrementally build a RNN structure (see for example Stanley and Miikkulainen 2004),
Table 1 A summary of the results of all the previous experiments

Predator  | Prey [hidden nodes] | Surv. time (s) | Prey complexity | Predator complexity
Greedy    | RNN-0               | 137.96         | 0.04            | 0.04
Linear    | RNN-3               | 133.00         | 0.02            | 0.06
Quadratic | RNN-5               | 30.78          | 0.06            | 0.04
Quadratic | RNN-5 (Complex)     | 14.75          | 0.34            | 0.19
Quadratic | SRNN-5              | 101.00         | 0.29            | 0.16
Quadratic | SRNN-5 (Random)     | 21.00          | 0.46            | 0.18

Surv. time refers to the survival time of the prey averaged over 300 trials and is measured in seconds. The complexity is measured in bits per symbol
or by exposing the prey to tasks of increasing difficulty. In other words, employing a structure like a SRNN is not necessary to ensure the prey's survival. However, having separated the complexity from the goal-oriented agent allowed us to highlight the significant role that complexity plays in a system's performance. Complexity alone could not increase the prey's survival chances, as shown in Sect. 4.4. This becomes evident by looking at the last three rows of Table 1. A prey with only a complex network or a random number generator could not survive, in spite of a complexity higher than that of the selector illustrated in Sect. 4.5. A SRNN developed the right tools to combine complexity with a goal-oriented strategy, tools that led to the prey's final good performance. We therefore proved that complexity is necessary but not sufficient to increase a system's performance. This idea was already well grounded in the scientific community, but to the best of our knowledge it lacked an experimental proof. As introduced in Sect. 2.4, complexity can be observed in chaotic systems. An analogous complexity can also be observed in a random number generator, which does not require any evolutionary strategy to be created. The last experiment in Sect. 4.6 however marked a clear separation between chaos and randomness. The SRNN learnt to exploit hidden rules, typical of deterministic chaos, in the chaotic dynamics of the subnetwork to drive the prey to success. On the other hand a random number generator does not follow any rule, therefore its presence did not improve the prey's performance. As the scope of this work was to explore the role of complexity underlying the dynamics of a successful prey, we did not investigate thoroughly the choice of parameters for the GA. The literature on this topic provides a vast selection of algorithms and solutions for neural network training using GA (see Floreano et al. 2008 for a recent survey). We did, however, investigate how the structure of the RNN influences the prey's performance. Our finding, confirmed by the experiments, is that full connectivity and a relatively small number of hidden neurons suffice to produce a successful prey. The network evolved in Sect. 4.1 had no
hidden neurons, but the recurrent connections in the output neurons generated the dynamics necessary to evade the predator. In all the subsequent experiments 3 or 5 hidden neurons proved to be sufficient to generate winning strategies. Our approach was to conduct experiments with an increasing number of hidden neurons until we no longer observed any increase in performance. This is similar to manually running the NEAT algorithm (Stanley and Miikkulainen 2004).

5.1 Relation to Ashby's law of requisite variety

Ashby's (1956) law of requisite variety states that the larger the variety of actions available to a control system, the larger the variety of perturbations it is able to compensate for. Ashby (1958) defined variety as the total number of states available to a system. The law of requisite variety states that a controller (or regulator), in order to be successful in its task, must be capable of reacting to all the signals (or perturbations) that the system to be controlled produces. In the scenario proposed here, the prey must be able to cope with all the strategies the predator employs to capture it. If we draw a parallel between the variety of a system and its complexity, we find that this work provides an empirical proof of Ashby's law: in order to overcome the predator's superiority in speed and sensing (which is, as we stated before, not entirely captured by its numerical complexity), the prey had to increase its variety. This becomes evident by comparing Figs. 2 and 4 with Fig. 7, where the prey exhibits a far richer trajectory. We will not provide a quantification of the variety of the predator and the prey here, as it is outside the scope of this paper.
6 Conclusions and future work

The main points and results of this paper are:
– Genetic algorithms evolve networks of increasing complexity when facing more difficult tasks.
– We uncoupled the algorithm solving a specific task from its complexity.
– Complexity is necessary but not sufficient to solve several tasks.
– Chaotic-like complexity follows rules that cannot be characterised as random.
The key "tool" we used in this work is Kolmogorov's theory to measure the effect of complexity, numerically comparing different architectures in different scenarios. This allowed us to provide a proof of what was strongly believed in the scientific community but never really proven, namely that complexity, defined according to Kolmogorov, is indeed necessary to solve a non-trivial task. A question that naturally arises at this point is: "what is the right amount of complexity necessary and sufficient to solve a given task?". We believe that answering this question will be hard, if not impossible, in all but trivial tasks. Even if we could find such a "minimally complex" system, its generalisation capabilities would be very limited. As we showed in this paper, a more demanding task requires a more complex solution. To avoid having to work with ever more complex systems, we are currently investigating how learning can spontaneously produce the necessary amount of complexity without necessarily requiring a complex underlying system. This will feed into the study of learning and evolution (Nolfi and Floreano 1999).

Acknowledgments Dr. Riano is supported by InvestNI and the Northern Ireland Integrated Development Fund under the Centre of Excellence in Intelligent Systems project.
References

Ashby W (1956) An introduction to cybernetics. University paperbacks. Wiley, New York
Ashby W (1958) Requisite variety and its implications for the control of complex systems. Cybernetica 1(2):83–99
Beer RD, Gallagher JC (1992) Evolving dynamical neural networks for adaptive behavior. Adapt Behav 1(1):91–122
Bongard J (2008) Behavior chaining: incremental behavior integration for evolutionary robotics. Artif Life 11:64
Dauce E, Quoy M, Cessac B, Doyon B, Samuelides M (1998) Self-organization and dynamics reduction in recurrent networks: stimulus presentation and learning. Neural Netw 11(3):521–533
Daw C, Finney C, Tracy E (2003) A review of symbolic analysis of experimental data. Rev Sci Instrum 74:915
Falcioni M, Loreto V, Vulpiani A (2003) Kolmogorov's legacy about entropy, chaos, and complexity. In: The Kolmogorov legacy in physics, pp 85–108
Floreano D, Dürr P, Mattiussi C (2008) Neuroevolution: from architectures to learning. Evol Intell 1(1):47–62
Gell-Mann M, Lloyd S (1996) Information measures, effective complexity, and total information. Complexity 2(1):44–52
Gomez F, Miikkulainen R (1997) Incremental evolution of complex general behavior. Adapt Behav 5(3–4):317
Grassberger P (1989) Problems in quantifying self-generated complexity. Helv Phys Acta 62(5):489–511
Harvey I, Di Paolo E, Wood R, Quinn M, Tuci E (2005) Evolutionary robotics: a new scientific tool for studying cognition. Artif Life 11(1–2):79–98
Hülse M, Wischmann S, Pasemann F (2004) Structure and function of evolved neuro-controllers for autonomous robots. Connect Sci 16(4):249–266
Izquierdo E, Harvey I, Beer RD (2008) Associative learning on a continuum in evolved dynamical neural networks. Adapt Behav 16(6):361–384
Kaspar F, Schuster HG (1987) Easily calculable measure for the complexity of spatiotemporal patterns. Phys Rev A 36(2):842–848
Khalatur P, Novikov V, Khokhlov A (2003) Conformation-dependent evolution of copolymer sequences. Phys Rev E 67(5):51901
Kolmogorov AN (1965) Three approaches to the concept of the amount of information. Prob Info Trans 1(1):1–7
Kuusela T, Jartti T, Tahvanainen K, Kaila T (2002) Nonlinear methods of biosignal analysis in assessing terbutaline-induced heart rate and blood pressure changes. Am J Physiol Heart Circ Physiol 282(2):H773
Lenski RE, Ofria C, Pennock RT, Adami C (2003) The evolutionary origin of complex features. Nature 423(6936):139–144. doi:10.1038/nature01568
Nelson AL, Barlow GJ, Doitsidis L (2009) Fitness functions in evolutionary robotics: a survey and analysis. Robot Auton Syst 57(4):345–370
Nolfi S, Floreano D (1998) Coevolving predator and prey robots: do "arms races" arise in artificial evolution? Artif Life 4(4):311–335
Nolfi S, Floreano D (1999) Learning and evolution. Auton Robots 7(1):89–113
Paine RW, Tani J (2004) Motor primitive and sequence self-organization in a hierarchical recurrent neural network. Neural Netw 17(8–9):1291–1309
Perone CS (2009) Pyevolve: a Python open-source framework for genetic algorithms. SIGEVOlution 4(1):12–20
Riano L, McGinnity TM (2010) On the emergence of novel behaviours from complex non linear systems. In: Proceedings of BICA 2010, international conference on biologically inspired cognitive architectures. IOS Press
Stanley KO, Miikkulainen R (2004) Competitive coevolution through evolutionary complexification. J Artif Intell Res 21(1):63–100
Urzelai J, Floreano D (2001) Evolution of adaptive synapses: robots with fast adaptive behavior in new environments. Evol Comput 9(4):495–524
Walker J, Garrett S, Wilson M (2003) Evolving controllers for real robots: a survey of the literature. Adapt Behav 11:179–203
Werbos P (1990) Backpropagation through time: what it does and how to do it. Proc IEEE 78(10):1550–1560