Neural networks are often overlooked when considering game AI, because they once received a lot of hype that did not amount to much. However, neural networks are still an area of intense research, and numerous learning algorithms have been developed for each of the three basic types of learning: supervised, unsupervised, and reinforcement learning.
Reinforcement learning is the class of learning algorithms that allows an agent to learn from its environment and improve on its own, and it is the class we will focus on in this article. The article discusses the use of genetic algorithms as well as an algorithm the author has researched for single-agent reinforcement learning. Throughout, the neural networks are assumed to be simple integrate-and-fire, non-spiking networks with sigmoidal activation functions.
Genetic Algorithms
The concept
Genetic algorithms are one of the simplest but also one of the most effective reinforcement learning methods. They do have one key limitation, though: they have to operate on multiple agents (AIs) at once. Nevertheless, genetic algorithms can be a great tool for creating neural networks via a process of evolution.
Genetic algorithms belong to the broader family of evolutionary algorithms. Their basic operation proceeds as follows (a minimal code sketch follows the list):
1. Initialize a set of genes
2. Evaluate the fitnesses of all genes
3. Mate genes based on how well they performed (performing crossover and mutation)
4. Replace old genes with the new children
5. Repeat steps 2 - 4 until a termination criterion is met
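To make these steps concrete, here is a minimal, self-contained sketch of one generation of a generic genetic algorithm. The fixed-length float genomes, tournament selection, and the EvaluateFitness placeholder are illustrative assumptions only; they are not the NEAT encoding used by the accompanying package.

#include <random>
#include <utility>
#include <vector>

struct Genome {
    std::vector<float> genes; // fixed-length float genome (illustrative, not NEAT)
    float fitness = 0.0f;
};

// Placeholder fitness: a real application would run the agent in the game and
// measure its performance. Here, genomes closer to zero simply score higher.
float EvaluateFitness(const Genome &g) {
    float score = 0.0f;
    for (float gene : g.genes)
        score -= gene * gene;
    return score;
}

// Step 3a: crossover, picking each gene from one of the two parents at random.
Genome Crossover(const Genome &a, const Genome &b, std::mt19937 &rng) {
    std::uniform_int_distribution<int> coin(0, 1);
    Genome child;
    child.genes.resize(a.genes.size());
    for (std::size_t i = 0; i < a.genes.size(); i++)
        child.genes[i] = coin(rng) ? a.genes[i] : b.genes[i];
    return child;
}

// Step 3b: mutation, perturbing a small fraction of the genes.
void Mutate(Genome &g, float rate, std::mt19937 &rng) {
    std::uniform_real_distribution<float> chance(0.0f, 1.0f);
    std::normal_distribution<float> perturbation(0.0f, 0.1f);
    for (float &gene : g.genes)
        if (chance(rng) < rate)
            gene += perturbation(rng);
}

// Steps 2 - 4: evaluate, mate based on fitness, and replace the old population.
void RunGeneration(std::vector<Genome> &population, std::mt19937 &rng) {
    for (Genome &g : population)
        g.fitness = EvaluateFitness(g);

    // Simple tournament selection: the fitter of two random genomes becomes a parent.
    auto pickParent = [&]() -> const Genome & {
        std::uniform_int_distribution<std::size_t> pick(0, population.size() - 1);
        const Genome &a = population[pick(rng)];
        const Genome &b = population[pick(rng)];
        return a.fitness > b.fitness ? a : b;
    };

    std::vector<Genome> children;
    children.reserve(population.size());
    for (std::size_t i = 0; i < population.size(); i++) {
        Genome child = Crossover(pickParent(), pickParent(), rng);
        Mutate(child, 0.05f, rng);
        children.push_back(child);
    }
    population = std::move(children);
}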
NEAT
The accompanying code implements the genetic algorithm following the NEAT (NeuroEvolution of Augmenting Topologies) methodology created by Kenneth Stanley. As a result, the neural network genes are encoded by storing the connections between neurons as index pairs (indices into the neuron array) along with their associated weights, as well as a bias for each neuron.
This is all the information needed to construct a completely functional neural network from the genes. In addition, both the neuron bias genes and the connection genes store a special "innovation number". These numbers are unique: a counter is incremented each time an innovation number is assigned. That way, when network genes are being mated, we can tell whether two genes share a heritage by checking whether their innovation numbers match. Matching genes can be crossed over directly, while genes without a matching innovation number are assigned randomly to the child neural networks.
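As a rough illustration, a NEAT-style genome with innovation numbers might be laid out like the sketch below. The names are hypothetical and only mirror the description above; the accompanying package may structure its genes differently.

#include <cstddef>
#include <cstdint>
#include <vector>

struct NeuronGene {
    float bias;
    std::uint64_t innovation; // unique ID assigned from a global counter
};

struct ConnectionGene {
    std::size_t fromIndex;    // index into the neuron array
    std::size_t toIndex;      // index into the neuron array
    float weight;
    std::uint64_t innovation; // matching innovation numbers indicate shared heritage
};

struct NetworkGenome {
    std::vector<NeuronGene> neurons;
    std::vector<ConnectionGene> connections;
};

// Incremented every time a new bias or connection gene is created, so that
// structurally related genes in different genomes can be matched during mating.
std::uint64_t AllocateInnovationNumber() {
    static std::uint64_t counter = 0;
    return counter++;
}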
This description omits many details; it is only intended to provide an overview of how the genetic algorithm included in the software package works.
While this genetic algorithm works very well for many problems, it requires that many agents be simulated at once rather than a single agent learning on its own. So, we will briefly cover another method of neural network training.
Local Dopamine Weight Update Rule with Output Traces
The concept
This method has quite possibly been invented before, but I have not yet been able to find a paper describing it. It governs how neuron weights are updated when learning in a single-agent scenario and is entirely separate from network topology selection. As a result, the included software package uses a genetic algorithm to evolve a topology for use with the single-agent reinforcement learning system. Of course, one could also simply grow a neural network by randomly attaching new neurons over time.
I discovered this technique after a lot of trial and error while trying to find a weight update rule that operates using only information available at the neuron/synapse level, which means it could be biologically plausible. The method uses a reward signal, dopamine, to determine when the network should be punished. That's right: this network only feels pain, not pleasure; its pleasure is the absence of pain. To make this method work, one needs to add an output trace (a floating-point variable) to each neuron. Other than that, one only needs the reward signal, dopamine, which ranges from 0 (utter failure) to 1 (complete success). With this information, all one needs to do is update the neural network weights after every update cycle using the following code:
float outputSigned = 2.0f * m_output - 1.0f;
m_outputTrace += -traceDecay * m_outputTrace + outputSigned;

// Weight update
for (size_t i = 0; i < numInputs; i++)
    m_inputs[i].m_weight += -Sign(m_outputTrace) * Sign(m_inputs[i].m_pInput->m_outputTrace) * (1.0f - dopamine);

// Bias update
m_bias += -Sign(m_outputTrace) * (1.0f - dopamine);
Here, m_output is the output of the neuron, traceDecay is a value in the range [0, 1] that defines how quickly the network forgets, and m_inputs is an array of incoming connections.
This code works as follows:
The output trace is simply an average of the neuron's output over time, which decays if left untouched. Whenever dopamine is less than 1 (the network does not yet have perfect fitness), the weight update moves each weight in the direction that makes the neuron produce its average output less often.
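For context, here is a minimal sketch of the neuron and connection structures the snippet above assumes, with the update wrapped in a member function. The Sign helper and the exact member layout are assumptions made for illustration and may differ from the actual package.

#include <cstddef>
#include <vector>

// Assumed sign helper: returns -1, 0, or 1.
inline float Sign(float x) {
    return (x > 0.0f) ? 1.0f : ((x < 0.0f) ? -1.0f : 0.0f);
}

struct Neuron;

struct Connection {
    Neuron *m_pInput; // presynaptic (input) neuron
    float m_weight;
};

struct Neuron {
    float m_output = 0.0f;      // last sigmoidal output, in [0, 1]
    float m_outputTrace = 0.0f; // decaying average of the signed output
    float m_bias = 0.0f;
    std::vector<Connection> m_inputs;

    // The dopamine-modulated update from the snippet above, run after each network cycle.
    void UpdateWeights(float dopamine, float traceDecay) {
        float outputSigned = 2.0f * m_output - 1.0f;
        m_outputTrace += -traceDecay * m_outputTrace + outputSigned;

        for (std::size_t i = 0; i < m_inputs.size(); i++)
            m_inputs[i].m_weight += -Sign(m_outputTrace) *
                Sign(m_inputs[i].m_pInput->m_outputTrace) * (1.0f - dopamine);

        m_bias += -Sign(m_outputTrace) * (1.0f - dopamine);
    }
};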
This method solves the XOR problem with considerable ease; the network quickly learns to act as an XOR gate.
Use in Games?
These methods may seem like total overkill for games, but they can do things that traditional methods cannot. For instance, with the genetic algorithm you can create a physics-based character controller like the one shown in the accompanying video.
The animation was not made by an animator; rather, the AI learned how to walk by itself. This results in animation that can react to the environment directly. The AI in the video was created using the same software package this article revolves around (linked below).
The second technique discussed can be used to have game characters or enemies learn from experience. For instance, enemies can be assigned a reward based on how close they get to the player, so that they try to get as close as possible given a few sensory inputs. The method could also be used in virtual-pet-style games, where you reward or punish a pet to achieve the desired behavior.
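As a hypothetical example of the enemy scenario, the dopamine signal could be derived from the enemy's distance to the player, so that getting closer is rewarded. The function and parameter names below are illustrative only and are not part of the package.

#include <algorithm>

// Maps the distance to the player into a dopamine value in [0, 1]:
// 1 when touching the player, falling to 0 at the edge of the sensing range.
float DistanceToDopamine(float distanceToPlayer, float maxSensedDistance) {
    float normalized = std::min(distanceToPlayer / maxSensedDistance, 1.0f);
    return 1.0f - normalized;
}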
Using the Code
The software package accompanying this article contains a manual on how to use the code.
The software package can be found at: https://sourceforge.net/projects/neatvisualizers/
Conclusion
I hope this article has provided some insight and inspiration for the use of reinforcement learning neural network AI in games. Now get out there and make some cool game AI!
Article Update Log
2 July 2013: Initial release