Neural networks are often overlooked when considering game AI, because they once received a lot of hype that did not amount to much. However, neural networks are still an area of intense research, and numerous learning algorithms have been developed for each of the three basic types of learning: supervised, unsupervised, and reinforcement learning.
Reinforcement learning is the class of learning algorithms that allows an agent to learn from its environment and improve on its own, and it is the class we will focus on in this article. The article discusses the use of genetic algorithms as well as an algorithm the author has researched for single-agent reinforcement learning. Throughout, the neural networks are assumed to be simple integrate-and-fire, non-spiking networks with sigmoidal activation functions.
Genetic Algorithms
The concept
Genetic algorithms are one of the simplest but also one of the most effective reinforcement learning methods. They do have one key limitation, though: they have to operate on multiple agents (AIs) at once. Nevertheless, genetic algorithms can be a great tool for creating neural networks via a process of evolution.
Genetic algorithms belong to the broader family of evolutionary algorithms. Their basic operation proceeds as follows (a minimal code sketch follows the list):
1. Initialize a set of genes
2. Evaluate the fitnesses of all genes
3. Mate genes based on how well they performed (performing crossover and mutation)
4. Replace old genes with the new children
5. Repeat steps 2 - 4 until a termination criterion is met
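To make these steps concrete, here is a minimal, self-contained sketch of one generation of a generic genetic algorithm. The fixed-length float genomes, tournament selection, and the EvaluateFitness placeholder are illustrative assumptions only; they are not the NEAT encoding used by the accompanying package.

#include <random>
#include <utility>
#include <vector>

struct Genome {
    std::vector<float> genes; // fixed-length float genome (illustrative, not NEAT)
    float fitness = 0.0f;
};

// Placeholder fitness: a real application would run the agent in the game and
// measure its performance. Here, genomes closer to zero simply score higher.
float EvaluateFitness(const Genome &g) {
    float score = 0.0f;
    for (float gene : g.genes)
        score -= gene * gene;
    return score;
}

// Step 3a: crossover, picking each gene from one of the two parents at random.
Genome Crossover(const Genome &a, const Genome &b, std::mt19937 &rng) {
    std::uniform_int_distribution<int> coin(0, 1);
    Genome child;
    child.genes.resize(a.genes.size());
    for (std::size_t i = 0; i < a.genes.size(); i++)
        child.genes[i] = coin(rng) ? a.genes[i] : b.genes[i];
    return child;
}

// Step 3b: mutation, perturbing a small fraction of the genes.
void Mutate(Genome &g, float rate, std::mt19937 &rng) {
    std::uniform_real_distribution<float> chance(0.0f, 1.0f);
    std::normal_distribution<float> perturbation(0.0f, 0.1f);
    for (float &gene : g.genes)
        if (chance(rng) < rate)
            gene += perturbation(rng);
}

// Steps 2 - 4: evaluate, mate based on fitness, and replace the old population.
void RunGeneration(std::vector<Genome> &population, std::mt19937 &rng) {
    for (Genome &g : population)
        g.fitness = EvaluateFitness(g);

    // Simple tournament selection: the fitter of two random genomes becomes a parent.
    auto pickParent = [&]() -> const Genome & {
        std::uniform_int_distribution<std::size_t> pick(0, population.size() - 1);
        const Genome &a = population[pick(rng)];
        const Genome &b = population[pick(rng)];
        return a.fitness > b.fitness ? a : b;
    };

    std::vector<Genome> children;
    children.reserve(population.size());
    for (std::size_t i = 0; i < population.size(); i++) {
        Genome child = Crossover(pickParent(), pickParent(), rng);
        Mutate(child, 0.05f, rng);
        children.push_back(child);
    }
    population = std::move(children);
}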
NEAT
The accompanying code implements the genetic algorithm following the NEAT (NeuroEvolution of Augmenting Topologies) methodology created by Kenneth Stanley. As a result, the neural network genes are encoded by storing the connections between neurons as index pairs (indices into the neuron array) along with their associated weights, as well as a bias for each neuron.
This is all the information needed to construct a completely functional neural network from the genes. In addition, both the neuron bias genes and the connection genes store a special "innovation number". These numbers are unique: a counter is incremented each time an innovation number is assigned. That way, when network genes are being mated, we can tell whether two genes share a heritage by checking whether their innovation numbers match. Matching genes can be crossed over directly, while genes without a matching innovation number are assigned randomly to the child neural networks.
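As a rough illustration, a NEAT-style genome with innovation numbers might be laid out like the sketch below. The names are hypothetical and only mirror the description above; the accompanying package may structure its genes differently.

#include <cstddef>
#include <cstdint>
#include <vector>

struct NeuronGene {
    float bias;
    std::uint64_t innovation; // unique ID assigned from a global counter
};

struct ConnectionGene {
    std::size_t fromIndex;    // index into the neuron array
    std::size_t toIndex;      // index into the neuron array
    float weight;
    std::uint64_t innovation; // matching innovation numbers indicate shared heritage
};

struct NetworkGenome {
    std::vector<NeuronGene> neurons;
    std::vector<ConnectionGene> connections;
};

// Incremented every time a new bias or connection gene is created, so that
// structurally related genes in different genomes can be matched during mating.
std::uint64_t AllocateInnovationNumber() {
    static std::uint64_t counter = 0;
    return counter++;
}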
This description omits many details; it is only intended to provide an overview of how the genetic algorithm included in the software package works.
While this genetic algorithm works very well for many problems, it requires that many agents be simulated at once rather than a single agent learning on its own. So, we will briefly cover another method of neural network training.
Local Dopamine Weight Update Rule with Output Traces
The concept
This method has quite possibly been invented before, but I have not yet been able to find a paper describing it. It governs how neuron weights are updated when learning in a single-agent scenario and is entirely separate from network topology selection. As a result, the included software package uses a genetic algorithm to evolve a topology for use with the single-agent reinforcement learning system. Of course, one could also simply grow a neural network by randomly attaching new neurons over time.
I discovered this technique after a lot of trial and error while trying to find a weight update rule that operates using only information available at the neuron/synapse level, which means it could be biologically plausible. The method uses a reward signal, dopamine, to determine when the network should be punished. That's right: this network only feels pain, not pleasure; its pleasure is the absence of pain. To make this method work, one needs to add an output trace (a floating-point variable) to each neuron. Other than that, one only needs the reward signal, dopamine, which ranges from 0 (utter failure) to 1 (complete success). With this information, all one needs to do is update the neural network weights after every update cycle using the following code:
float outputSigned = 2.0f * m_output - 1.0f;
m_outputTrace += -traceDecay * m_outputTrace + outputSigned;

// Weight update
for (size_t i = 0; i < numInputs; i++)
    m_inputs[i].m_weight += -Sign(m_outputTrace) * Sign(m_inputs[i].m_pInput->m_outputTrace) * (1.0f - dopamine);

// Bias update
m_bias += -Sign(m_outputTrace) * (1.0f - dopamine);
Here, m_output is the output of the neuron, traceDecay is a value in the range [0, 1] that defines how quickly the network forgets, and m_inputs is an array of incoming connections.
This code works as follows:
The output trace is simply an average of the neuron's output over time, which decays if left untouched. Whenever dopamine is less than 1 (the network does not yet have perfect fitness), the weight update moves each weight in the direction that makes the neuron produce its average output less often.
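For context, here is a minimal sketch of the neuron and connection structures the snippet above assumes, with the update wrapped in a member function. The Sign helper and the exact member layout are assumptions made for illustration and may differ from the actual package.

#include <cstddef>
#include <vector>

// Assumed sign helper: returns -1, 0, or 1.
inline float Sign(float x) {
    return (x > 0.0f) ? 1.0f : ((x < 0.0f) ? -1.0f : 0.0f);
}

struct Neuron;

struct Connection {
    Neuron *m_pInput; // presynaptic (input) neuron
    float m_weight;
};

struct Neuron {
    float m_output = 0.0f;      // last sigmoidal output, in [0, 1]
    float m_outputTrace = 0.0f; // decaying average of the signed output
    float m_bias = 0.0f;
    std::vector<Connection> m_inputs;

    // The dopamine-modulated update from the snippet above, run after each network cycle.
    void UpdateWeights(float dopamine, float traceDecay) {
        float outputSigned = 2.0f * m_output - 1.0f;
        m_outputTrace += -traceDecay * m_outputTrace + outputSigned;

        for (std::size_t i = 0; i < m_inputs.size(); i++)
            m_inputs[i].m_weight += -Sign(m_outputTrace) *
                Sign(m_inputs[i].m_pInput->m_outputTrace) * (1.0f - dopamine);

        m_bias += -Sign(m_outputTrace) * (1.0f - dopamine);
    }
};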
This method solves the XOR problem with considerable ease; the network quickly learns to act as an XOR gate.
Use in Games?
These methods may seem like total overkill for games, but they can do things that traditional methods cannot. For instance, with the genetic algorithm you can create a physics-based character controller like the one shown in the accompanying video.
The animation was not made by an animator; rather, the AI learned how to walk by itself. This results in animation that can react to the environment directly. The AI in the video was created using the same software package this article revolves around (linked below).
The second technique discussed can be used to have game characters or enemies learn from experience. For instance, enemies can be assigned a reward based on how close they get to the player, so that they try to get as close as possible given a few sensory inputs. The method could also be used in virtual-pet-style games, where you reward or punish a pet to achieve the desired behavior.
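As a hypothetical example of the enemy scenario, the dopamine signal could be derived from the enemy's distance to the player, so that getting closer is rewarded. The function and parameter names below are illustrative only and are not part of the package.

#include <algorithm>

// Maps the distance to the player into a dopamine value in [0, 1]:
// 1 when touching the player, falling to 0 at the edge of the sensing range.
float DistanceToDopamine(float distanceToPlayer, float maxSensedDistance) {
    float normalized = std::min(distanceToPlayer / maxSensedDistance, 1.0f);
    return 1.0f - normalized;
}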
Using the Code
The software package accompanying this article contains a manual on how to use the code.
The software package can be found at: https://sourceforge.net/projects/neatvisualizers/
Conclusion
I hope this article has provided some insight and inspiration for the use of reinforcement learning neural network AI in games. Now get out there and make some cool game AI!
Article Update Log
2 July 2013: Initial release