Neurally Driven Evolutionary Autonomous Agents


Abstract
In this project, autonomous agents controlled by feed-forward neural networks were trained using evolutionary algorithms to perform simple tasks such as navigation and foraging. The emphasis was on understanding the behavior of the evolved agents and relating this behavior to the structure of their neural networks.

 

Background
It has already been shown that autonomous agents controlled by a neural network can be trained (using evolutionary algorithms or other training methods) to perform tasks such as navigating, searching for food, evading predators, and seeking prey and mating partners. In these software agents the neural network acts as a “brain”, processing information from sensory inputs and controlling motor outputs. Some studies have even used evolved agents as a neuroscience tool, treating them as models of real animal behavior. Still, an important question that remains unanswered is how to interpret an evolved neural network and understand exactly how it works. This is a general question about neural networks, not one specific to a particular experiment or training algorithm.

 

Our approach
Motivated by the book “Vehicles” by V. Braitenberg, we tried an approach that the book describes as the “law of uphill analysis and downhill invention”. We assume that the evolutionary process contains information that can be useful for understanding how the resulting agents operate. Therefore, we build and evolve our agents in a way that will help us understand them later. For this we used the NEAT genetic algorithm (Stanley and Miikkulainen, 2003), which evolves network structure in parallel with network weights, using historical markings to identify corresponding network parts in different networks. We use these markings for our own purposes: to help us understand how the evolved agents work.
The agents, the simulation, the environment and the tasks

Our agents are two-wheeled vehicles with multiple light sensors located at different parts of their body. Each agent is controlled by a feed-forward neural network: the inputs are the sensor readings, and two outputs determine the velocity of each wheel.
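To make the setup concrete, here is a minimal sketch (in C++, the project's simulation language) of one control-and-motion step for such an agent. The single-layer network, the tanh output squashing, and the simple Euler integration are our assumptions for illustration; the networks the project actually evolved with NEAT could be more complex.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Agent pose in the plane: position plus heading angle (radians).
struct Pose { double x, y, heading; };

// Single-layer feed-forward map from sensor readings to wheel speeds:
// one weight row per output neuron, tanh keeps each speed bounded.
std::vector<double> wheelSpeeds(const std::vector<double>& sensors,
                                const std::vector<std::vector<double>>& w) {
    std::vector<double> out;
    for (const std::vector<double>& row : w) {
        double sum = 0.0;
        for (std::size_t i = 0; i < sensors.size(); ++i)
            sum += row[i] * sensors[i];
        out.push_back(std::tanh(sum));
    }
    return out;
}

// Differential-drive kinematics: the average wheel speed moves the agent
// forward, the speed difference turns it (wheelBase = wheel separation).
Pose step(Pose p, double vLeft, double vRight, double wheelBase, double dt) {
    double v = 0.5 * (vLeft + vRight);
    double omega = (vRight - vLeft) / wheelBase;
    p.x += v * std::cos(p.heading) * dt;
    p.y += v * std::sin(p.heading) * dt;
    p.heading += omega * dt;
    return p;
}
```

With equal wheel speeds the agent drives straight; any imbalance between the two outputs steers it, which is all the network needs to control in order to, for example, turn towards a light.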

 


The environment is boundless and contains static objects, lights, whose intensity is inversely proportional to the distance from them and is picked up by the light sensors (each sensor senses only a specific light). An example of an agent's task is driving towards a specific light, or staying at a constant distance from it. In a more complex task the agent had limited energy and had to drive past another light for “refueling”.
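A minimal sketch of the sensor model implied above, assuming the simple 1/distance intensity law; the clamp near zero distance is our own guard against division by zero, not necessarily the project's choice:

```cpp
#include <algorithm>
#include <cmath>

// Intensity a sensor reads from its matching light: inversely
// proportional to the Euclidean distance between light and sensor.
double lightIntensity(double lightX, double lightY,
                      double sensorX, double sensorY) {
    double d = std::hypot(lightX - sensorX, lightY - sensorY);
    return 1.0 / std::max(d, 1e-6);   // clamp avoids division by zero
}
```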


The Solution
Evolution

In order to train the agents to perform a task, an evolutionary algorithm was used. The basic scheme of the algorithm is:
1. Start with an initial population of random agents (with minimal networks and random weights).
2. Evaluate each agent's fitness by simulating the environment and applying a performance criterion. The starting conditions are randomized to increase robustness, but are identical for all agents in the same generation.
3. Pick the best agents and use them to create a new population, using network-weight crossover and mutations of both weights and structure.
4. Continue until the agents reach a specified fitness limit, or until fitness stagnates.
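The steps above can be sketched as a runnable toy loop. Here the genome is a single weight, and the fitness function, crossover, and mutation are deliberately simplified stand-ins for the NEAT machinery (NEAT also evolves structure, not just weights):

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <random>
#include <vector>

struct Genome { double weight; double fitness = 0.0; };

// Toy fitness: closer to the (arbitrary) target 0.7 is better.
double evaluate(const Genome& g) { return -std::fabs(g.weight - 0.7); }

std::vector<Genome> evolve(std::vector<Genome> pop, int generations,
                           double fitnessLimit, std::mt19937& rng) {
    std::normal_distribution<double> noise(0.0, 0.05);
    for (int gen = 0; gen < generations; ++gen) {
        for (auto& g : pop) g.fitness = evaluate(g);          // step 2
        std::sort(pop.begin(), pop.end(),
                  [](const Genome& a, const Genome& b) {
                      return a.fitness > b.fitness;
                  });
        if (pop.front().fitness >= fitnessLimit) break;       // step 4
        std::size_t elite = pop.size() / 4;                   // parents: best quarter
        std::uniform_int_distribution<std::size_t> pick(0, elite - 1);
        std::vector<Genome> next;                             // step 3
        while (next.size() < pop.size()) {
            const Genome& a = pop[pick(rng)];
            const Genome& b = pop[pick(rng)];
            Genome child{0.5 * (a.weight + b.weight)};        // crossover
            child.weight += noise(rng);                       // mutation
            next.push_back(child);
        }
        pop = std::move(next);
    }
    for (auto& g : pop) g.fitness = evaluate(g);
    std::sort(pop.begin(), pop.end(),
              [](const Genome& a, const Genome& b) {
                  return a.fitness > b.fitness;
              });
    return pop;
}
```

In the real project, step 2 is a full physics simulation of the agent in its environment, which is why the simulator had to be written in fast C++ (see Tools below).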

 

Understanding Behavior
In order to understand the evolved agents' behavior we developed three tools:
1. “Fitness leaps”: some evolutionary runs (usually those with less random starting conditions) showed large fitness “leaps” in certain generations. It is reasonable to assume that in these generations the agents developed a feature that was critical to their resulting behavior. Therefore, by inspecting the differences between agents across these generations, important network structures can be identified.

2. Primitives: certain network structures are essential to the agents' operation, and should therefore be common to different evolutionary runs. A simple example is the “protractor” primitive, which the agent uses to determine the relative angle between itself and a specific light source.
3. “Age” of network parts: the NEAT algorithm assigns each part of the network structure a unique identifier that records the time (generation) at which it was added to the population. We can therefore separate the network into smaller structures according to their generation of appearance, and compare networks based on the “age” of their parts. In addition, assuming that older parts play a more significant role in the behavior, this “age” helps identify the more important parts of the network.
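Two of these tools lend themselves to short sketches. The function and field names below are our assumptions; only the ideas, detecting fitness leaps and grouping genes by the generation recorded via their NEAT innovation number, come from the text:

```cpp
#include <cstddef>
#include <map>
#include <vector>

// Tool 1, "fitness leaps": return the generations in which best fitness
// jumped by more than `threshold` relative to the previous generation.
std::vector<std::size_t> fitnessLeaps(const std::vector<double>& bestFitness,
                                      double threshold) {
    std::vector<std::size_t> leaps;
    for (std::size_t g = 1; g < bestFitness.size(); ++g)
        if (bestFitness[g] - bestFitness[g - 1] > threshold)
            leaps.push_back(g);
    return leaps;
}

// Tool 3, "age" of network parts: a NEAT connection gene carries an
// innovation number stamped when the connection first appeared.
struct Gene { int innovation; int from, to; double weight; };

// Group a genome's genes by their generation of appearance (oldest first,
// thanks to std::map's ordering), given the innovation-number-to-generation
// record kept during evolution.
std::map<int, std::vector<Gene>> groupByAge(
        const std::vector<Gene>& genome,
        const std::map<int, int>& generationOfInnovation) {
    std::map<int, std::vector<Gene>> byAge;
    for (const Gene& g : genome)
        byAge[generationOfInnovation.at(g.innovation)].push_back(g);
    return byAge;
}
```

Once the genes are bucketed this way, the oldest buckets are natural candidates for the network's core, and two networks from different runs can be compared bucket by bucket.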

 

Results
We will briefly review the result we like most. The agents were given the task of reaching a specific light while starting with insufficient energy; in order to reach that light they had to drive past a “refueling light” on the way. The agents had two sensors for each light and a sensor for their remaining energy. The light positions were random, so the agents could not simply memorize a path. The evolved agents solved the task, as can be seen from their paths:

[Figure: paths of the evolved agents]

This is the neural network controlling these agents:

[Figure: the evolved neural network]

Using the techniques described above, we were able to reduce the network to its most important parts and fully understand the vehicles' behavior:

[Figure: the reduced network]

The information gathered from the evolution process was very helpful in understanding this behavior.

 

Tools
All simulations and genetic algorithms were written in C++ (for speed). Analysis of the agents and all graphics were done in MATLAB.

 

Conclusions
A great deal of information can be gathered from the evolutionary process. This information can help us understand the neural network's behavior and simplify its structure. Evolutionary algorithms exploit every aspect of the simulation and the environment, and often solve problems in unexpected ways.

 

Acknowledgment
We are grateful to our project supervisor, Prof. Ron Meir, for his help and guidance throughout this work.
We are also grateful to the Ollendorf Minerva Center Fund for supporting this project.