Home » From Genes to Neural Networks: Understanding and Building NEAT (Neuro-Evolution of Augmenting Topologies) from Scratch

From Genes to Neural Networks: Understanding and Building NEAT (Neuro-Evolution of Augmenting Topologies) from Scratch

Introduction

Topologies (NEAT) is a powerful algorithm introduced in 2002 by Kenneth O. Stanley and Risto Miikkulainen from the University of Texas at Austin in this paper. NEAT algorithm introduced a new idea to the standard neuro-evolution techniques that evolved fixed-topology networks, by dynamically increasing the complexity of the networks over the generations.

In this article, I will walk through the NEAT algorithm and its implementation from scratch in Python, focusing on the algorithm’s design decisions and the intricacies that make NEAT both elegant and challenging to reproduce. This article is intended for a technical audience, familiar with neural networks and the basics of evolutionary computation.

Evolutionary Algorithms: A General Overview

Before jumping into NEAT, let’s review the evolutionary algorithm’s fundaments. Inspired by genetics and natural selection, evolutionary algorithms are a type of optimization algorithms that solve complex problems by iteratively improving a population of candidate solutions.

The core idea is to imitate the process of biological evolution:

  1. Initialization: The process starts by generating an initial population of candidate solutions. This initial solutions are generated randomly, and each solution is typically represented as a “genome” or “chromosome”, which encodes its features and characteristics.
  2. Evaluation: Each individual in the population is evaluated according to how well it solves the given problem. This is done using a fitness function which assigns a numerical score to each solution. The higher the fitness, the better the solution.
  3. Selection: Based on their fitness, a subset of the population is selected to become the “parents” of the next generation. Individuals with higher fitness scores have a higher probability of being selected.
  4. Reproduction: The selected individuals then reproduce to create “offspring” for the next generation using genetic operators. There are two main processes in this phase:
    • Crossover: Two or more parent genomes are combined to produce new offspring genomes, merging advantageous traits.
    • Mutation: Small changes are randomly introduced into the offspring genomes. This introduces novelty and helps explore the search space, preventing the algorithm from getting stuck in a local optima.
  5. Replacement: The newly generated offspring replaces the current population (either entirely or partially), forming the new generation.
  6. Termination: Steps 2–5 are repeated for a fixed number of generations, until a certain fitness threshold is reached, or until a satisfactory solution to the problem is found.

What Makes NEAT Special?

NEAT stands out because it goes further than regular evolutionary algorithms, which only evolve the weights of the network. NEAT also evolves the topology of the networks, making them increasingly complex.

Before NEAT, there were two main challenges that didn’t allow for regular evolutionary algorithms to be used to dinamically adjust the architecture of the networks. NEAT solves these two challenges:

  • How to perform crossover between topologically diverse networks: In traditional genetic algorithms, combining two genomes with vastly different structures often leads to non-functional or malformed offspring. Imagine trying to combine two different neural networks, where one has three hidden layers and another has only one, simply averaging weights or randomly merging connections will likely break the network’s functionality.
  • How to maintain diversity in a population with vastly different topologies: When networks can grow in complexity (adding new nodes and connections), these structural mutations often lead to a temporary decrease in their fitness. For example, a new connection may interfere with the existing network’s functionality before its parameters are properly tuned. As a result, recently mutated networks can be prematurely outcompeted by simpler, more optimized ones, even if the newer variant have the potential to evolve into a superior solution given more time. This premature convergence to local optima is a common problem.

How NEAT Solves These Challenges

NEAT tackles these two fundamental problems through two ingenious mechanisms: historical markings (also called innovation numbers) and speciation.

Historical Markings

Source: Evolving Neural Networks through Augmenting Topologies

To address the challenge of performing crossover between topologically diverse networks, NEAT introduces the concept of innovation numbers. When a new connection or node is added to a network during mutation, it is assigned a globally unique innovation number. This number acts as a historical marking, indicating the historical origin of that particular genetic feature.

Let’s look at the example in the image above, where we have two networks, “Parent 1” and “Parent 2” undergoing crossover. We can see that both networks have a connection from node 2 to node 5, with innovation number 4. This tells us that this connection must have been inherited by both networks from a common ancestor at some point. However, we can also see that “Parent 1” has a connection (from node 1 to node 5) with the innovation number 8, but “Parent 2” doesn’t have this connection. This shows that “Parent 1” evolved this specific connection independently.

During crossover, NEAT aligns the genes (nodes and connections) of the two parents based on their innovation numbers.

  • Matching genes (those with the same innovation number) are inherited randomly from either parent.
  • Disjoint genes (present in one parent but not the other, and with innovation numbers within the range of the other parent’s genes) are typically inherited from the fitter parent.
  • Excess genes (present in one parent but not the other, and with innovation numbers outside the range of the other parent’s genes, meaning they appeared later in evolutionary history) are also typically inherited from the fitter parent.

This whole process ensures that functionally similar parts of the networks are combined correctly, even if their overall structures differ significantly.

Speciation

To maintain diversity in a population with vastly different topologies, and prevent premature convergence, NEAT employs a technique called speciation. Through speciation the population is divided into different “species” based on topological similarity. Networks within the same species are more likely to share common ancestral traits and thus, more similar structures.

The similarity between two networks (genomes) is calculated using a compatibility distance function. This function considers three components:

  • The number of disjoint genes (genes present in one genome but not the other, but within the shared historical range).
  • The number of excess genes (genes present in one genome but not the other, and outside the shared historical range).
  • The average weight difference of matching genes.

If the compatibility distance between two networks falls below a certain threshold, they are considered to belong to the same species.

By speciating the population, NEAT ensures that:

  • Competition occurs only within species: This protects novel or less complex structures from being immediately outcompeted by highly complex but perhaps currently sub-optimal designs.
  • Each species has a chance to innovate and improve: The best individuals from each species are allowed to reproduce (elitism), promoting the evolution of unique solutions within different topological niches.
  • Species can die out if they are not successful: If a species consistently performs poorly, it will shrink and eventually disappear, making room for more promising genetic lines.

Core Components of NEAT

There are four core components in the NEAT algorithm:

Node Genes

Each node gene represents a neuron of the neural network. Each node has:

  • An ID (unique identifier)
  • A layer: it could be input, hidden, or output
  • An activation fuction
  • A bias value
class NodeGene:
    def __init__(self, id, layer, activation, bias):
        self.id = id
        self.layer = layer  # The layer to which the node belongs
        self.activation = activation    # Activation function
        self.bias = bias

Connection Genes

The connection genes (synapses) represent the connections between the neurons of the network. Each connection gene has:

  • Input node ID
  • Ouput node ID
  • Weight
  • Enabled flag (indicates whether the connection is enabled)
  • Innovation number (unique identifier assigned when the connection is first created)
class ConnectionGene:
    def __init__(self, in_node_id: int, out_node_id: int, weight: float,  innov: int, enabled: bool = True):
        self.in_node_id = in_node_id
        self.out_node_id = out_node_id
        self.weight = weight
        self.enabled = enabled  # Whether the connection is enabled or not
        self.innov = innov  # Innovation number described in the paper

Genomes

The genome is the “genetic blueprint” of a single neural network within the NEAT algorithm. It’s essentially a collection of all the node and connection genes that define the network’s structure and parameters. With the Genome we can later construct the actual runnable network (which we’ll call Phenotype). Each genome represents one individual in the population.

class Genome:
    def __init__(self, nodes, connections):
        self.nodes = {node.id: node for node in nodes if node is not None}
        self.connections = [c.copy() for c in connections]
        self.fitness = 0

Innovation Tracker

A critical component in any NEAT implementation is an Innovation Tracker. This is my custom implementation of this mechanism that is responsible for assigning and keeping track of unique innovation numbers for every new connection and node created throughout the process. This ensures that historical markers are consistent across the entire population, which is fundamental for correctly aligning genes during crossover.

class InnovationTracker:
    def __init__(self):
        self.current_innovation = 0
        self.connection_innovations = {}  # (in_node_id, out_node_id) -> innovation_number
        self.node_innovations = {}        # connection_innovation -> (node_innovation, conn1_innovation, conn2_innovation)
        self.node_id_counter = 0
    
    def get_connection_innovation(self, in_node_id, out_node_id):
        """Get innovation number for a connection, creating new if needed"""
        key = (in_node_id, out_node_id)
        if key not in self.connection_innovations:
            self.connection_innovations[key] = self.current_innovation
            self.current_innovation += 1
        return self.connection_innovations[key]
    
    def get_node_innovation(self, connection_innovation):
        """Get innovation numbers for node insertion, creating new if needed"""
        if connection_innovation not in self.node_innovations:
            # Create new node and two connections
            node_id = self.node_id_counter
            self.node_id_counter += 1
            
            conn1_innovation = self.current_innovation
            self.current_innovation += 1
            conn2_innovation = self.current_innovation
            self.current_innovation += 1
            
            self.node_innovations[connection_innovation] = (node_id, conn1_innovation, conn2_innovation)
        return self.node_innovations[connection_innovation]

NEAT Algorithm Flow

With the core components understood, now we can piece them together to understand how NEAT works. In the example shown, the algorithm is trying to solve the XOR problem.

1. Initialization

The very first generation of genomes is always created with a very simple and fixed structure. This approach aligns with NEAT’s philosophy of “starting simple and growing complexity”, ensuring it explores the simplest solutions first, gradually increasing complexity.

In this code example, we initialize the networks with the minimal structure: a fully-connected network with no hidden layers.

def create_initial_genome(num_inputs, num_outputs, innov: InnovationTracker):
    input_nodes = []
    output_nodes = []
    connections = []
    
    # Create input nodes
    for i in range(num_inputs):
        node = NodeGene(i, "input", lambda x: x, 0)
        input_nodes.append(node)
    
    # Create output nodes
    for i in range(num_outputs):
        node_id = num_inputs + i    # We start the ids where we left in the previous loop
        node = NodeGene(node_id, "output", sigmoid, random.uniform(-1, 1))
        output_nodes.append(node)
    
    # Update the innov tracker's node id
    innov.node_id_counter = num_inputs + num_outputs

    # Create connections
    for i in range(num_inputs):
        for j in range(num_outputs):
            in_node_id = i
            out_node_id = j + num_inputs
            innov_num = innov.get_connection_innovation(in_node_id, out_node_id)
            weight = random.uniform(-1, 1)
            conn = ConnectionGene(in_node_id, out_node_id, weight, innov_num, True)
            connections.append(conn)
    
    all_nodes = input_nodes + output_nodes
    return Genome(all_nodes, connections)
def create_initial_population(size, num_inputs, num_outputs, innov):
    population = []
    for _ in range(size):
        genome = create_initial_genome(num_inputs, num_outputs, innov)
        population.append(genome)
    return population

2. Evaluate Population Fitness

After the initial population is set up, the NEAT algorithm enters the main evolutionary loop. This loop repeats for a pre-defined number of generations, or until one of its solutions reaches a fitness threshold. Each generation undergoes a series of critical steps: evaluation, speciation, fitness adjustment, and reproduction. The first step is to evaluate the fitness of each individual in the population. To do this first we have to go through the following steps:

  1. Phenotype Expression: For each Genome we first have to express it into its phenotype (a runnable neural network). This involves constructing the actual neural network from the nodes and connections lists within the Genome.
  2. Forward Pass: Once the network is constructed, we perform the forward pass with the given inputs to produce the outputs.
  3. Fitness Calculation: Given the network’s input and output we can now calculate it’s fitness using the fitness function. The fitness function is problem-specific and is designed to return a numerical score indicating how well the network achieved its goal.
def topological_sort(edges):
  """ Helper function to sort the network's nodes """
        visited = set()
        order = []

        def visit(n):
            if n in visited:
                return
            visited.add(n)
            for m in edges[n]:
                visit(m)
            order.append(n)

        for node in edges:
            visit(node)

        return order[::-1]

class Genome:
    ... # The rest of the methods would go here

    def evaluate(self, input_values: list[float]):
      """ 
          Method of the Genome class.
          Performs the phenotype expression and forward pass
      """
        node_values = {}
        node_inputs = {n: [] for n in self.nodes}
        
        input_nodes = [n for n in self.nodes.values() if n.layer == "input"]
        output_nodes = [n for n in self.nodes.values() if n.layer == "output"]
        
        # Verify that the number of input values matches the number of input nodes
        if len(input_nodes) != len(input_values):
            raise ValueError(f"Number of inputs doesn't match number of input nodes. Input={len(input_nodes)}, num_in_val={len(input_values)}")
        
        # Assign input values
        for node, val in zip(input_nodes, input_values):
            node_values[node.id] = val
        
        edges = {}
        for n in self.nodes.values():
            edges[n] = []
        
        for conn in self.connections:  # Only construct enabled connections
            if conn.enabled:
                in_node = self.get_node(conn.in_node_id)
                out_node = self.get_node(conn.out_node_id)
                edges[in_node].append(out_node)
                node_inputs[conn.out_node_id].append(conn)
        
        sorted_nodes = topological_sort(edges)
        
        for node in sorted_nodes:
            if node.id in node_values:
                continue
            
            incoming = node_inputs[node.id]
            total_input = sum(
                node_values[c.in_node_id] * c.weight for c in incoming
            ) + node.bias
        
            node_values[node.id] = node.activation(total_input)
        
        return [node_values.get(n.id, 0) for n in output_nodes]
def fitness_xor(genome):
    """Calculate fitness for XOR problem"""
    # XOR Problem data
    X = [[0, 0], [0, 1], [1, 0], [1, 1]]
    y = [0, 1, 1, 0]
    total_error = 0
    for i in range(len(X)):
        try:
            output = genome.evaluate(X[i])
            # print(f"Output: {output}")
            if output:
                error = abs(output[0] - y[i])
                total_error += error
            else:
                error = y[i]
                total_error += error
        except Exception as e:
            print(f"Error: {e}")
            return 0  # Return 0 fitness if evaluation fails
    
    fitness = 4 - total_error
    return max(0, fitness)

3. Speciation

Instead of letting all individuals compete globally, NEAT divides the population into species, putting together those genomes that are topologically similar. This approach prevents new topological innovations from being immediatly outcompeted by larger, more mature species, and allows them to mature.

The process of speciation in each generation involves:

  1. Measuring Compatibility: We use a compatibility distance function to measure how different two Genomes are. The shorter the distance between them, the more similar two genomes are. The following code implementation uses the formula proposed in the original paper, with the proposed parameters.
def distance(genome1: Genome, genome2: Genome, c1=1.0, c2=1.0, c3=0.4):
    genes1 = {g.innov: g for g in genome1.connections}
    genes2 = {g.innov: g for g in genome2.connections}

    innovations1 = set(genes1.keys())
    innovations2 = set(genes2.keys())

    matching = innovations1 & innovations2
    disjoint = (innovations1 ^ innovations2)
    excess = set()

    max_innov1 = max(innovations1) if innovations1 else 0
    max_innov2 = max(innovations2) if innovations2 else 0
    max_innov = min(max_innov1, max_innov2)

    # Separate excess from disjoint
    for innov in disjoint.copy():
        if innov > max_innov:
            excess.add(innov)
            disjoint.remove(innov)

    # Weight difference of matching genes
    if matching:
        weight_diff = sum(
            abs(genes1[i].weight - genes2[i].weight) for i in matching
        )
        avg_weight_diff = weight_diff / len(matching)
    else:
        avg_weight_diff = 0

    # Normalize by N
    N = max(len(genome1.connections), len(genome2.connections))
    if N < 20:  # NEAT typically uses 1 if N < 20
        N = 1

    delta = (c1 * len(excess)) / N + (c2 * len(disjoint)) / N + c3 * avg_weight_diff
    return delta

2. Grouping into Species: At the beggining of each generation, the Speciator is responsible for categorizing all genomes into existing or new species. Each species has one representative genome, that serves as the benchmark against which every individual of the population is compared to determine if belongs to that species.

class Species:
    def __init__(self, representative: Genome):
        self.representative = representative
        self.members = [representative]
        self.adjusted_fitness = 0
        self.best_fitness = -float('inf')
        self.stagnant_generations = 0
    
    def add_member(self, genome: Genome):
        self.members.append(genome)
    
    def clear_members(self):
        self.members = []
    
    def update_fitness_stats(self):
        if not self.members:
            self.adjusted_fitness = 0
            return

        current_best_fitness = max(member.fitness for member in self.members)
        
        # Check for improvement and update stagnation
        if current_best_fitness > self.best_fitness:
            self.best_fitness = current_best_fitness
            self.stagnant_generations = 0
        else:
            self.stagnant_generations += 1

        self.adjusted_fitness = sum(member.fitness for member in self.members) / len(self.members)
class Speciator:
    def __init__(self, compatibility_threshold=3.0):
        self.species = []
        self.compatibility_threshold = compatibility_threshold
    
    def speciate(self, population: list[Genome]):
        """ Group genomes into species based on distance """
        # Clear all species for the new generation
        for s in self.species:
            s.clear_members()
        
        for genome in population:
            found_species = False
            for species in self.species:
                if distance(genome, species.representative) < self.compatibility_threshold:
                    species.add_member(genome)
                    found_species = True
                    break
            
            if not found_species:
                new_species = Species(representative=genome)
                self.species.append(new_species)

        # Remove empty species
        self.species = [s for s in self.species if s.members]

        # Recompute adjusted fitness
        for species in self.species:
            species.update_fitness_stats()
            # Update representative to be the best member
            species.representative = max(species.members, key=lambda g: g.fitness)

    def get_species(self):
        return self.species

4. Adjusting Fitness

Even when genomes are grouped into species, a raw fitness value isn’t enough to allow for fair reproduction. Larger species would naturally produce more offspring, potentially overwhelming smaller species that might hold promising, but still nascent, innovations. To counter this, NEAT employs the adjusted fitness, and adjusts the fitness based on the species performance.

To adjust the fitness of an individual, its fitness is divided between the number of individuals in its species. This mechanism is implemented in the update_fitness_stats method inside the Species class.

5. Reproduction

After speciating and adjusting the fitnesses, the algorithm moves to the reproduction phase, where the next generation of genomes is created through a combination of selection, crossover, and mutation.

  1. Selection: In this implementation the selection is done through elitism in the main evolutionary loop.

2. Crossover: Some key aspects of this implementation are:

  • Node inheritance: Input and output nodes are explicitly ensured to be passed down to the offspring. This is done to ensure the functionality of the network doesn’t break.
  • Matching genes: When both parents have a gene with the same innovation number, one is chosen randomly. If the gene was disabled in either parent, there is a 75% chance of the gene being disabled in the offspring.
  • Excess genes: Excess genes from the less fit parent are not inherited.
def crossover(parent1: Genome, parent2: Genome) -> Genome:
    """ Crossover assuming parent1 is the fittest parent """
    offspring_connections = []
    offspring_nodes = set()
    all_nodes = {}  # Collect all nodes from both parents
    
    for node in parent1.nodes.values():
        all_nodes[node.id] = node.copy()
        if node.layer in ("input", "output"):
            offspring_nodes.add(all_nodes[node.id]) # Ensure the input and output nodes are included
    for node in parent2.nodes.values():
        if node.id not in all_nodes:
            all_nodes[node.id] = node.copy()        

    # Build maps of genes keyed by innovation number
    genes1 = {g.innov: g for g in parent1.connections}
    genes2 = {g.innov: g for g in parent2.connections}

    # Combine all innovation numbers
    all_innovs = set(genes1.keys()) | set(genes2.keys())

    for innov in sorted(all_innovs):
        gene1 = genes1.get(innov)
        gene2 = genes2.get(innov)
        
        if gene1 and gene2:  # Matching genes
            selected = random.choice([gene1, gene2])
            gene_copy = selected.copy()

            if not gene1.enabled or not gene2.enabled:  # 75% chance of the offsprign gene being disabled
                if random.random() < 0.75:
                    gene_copy.enabled = False

        elif gene1 and not gene2:   # Disjoint gene (from the fittest parent)
            gene_copy = gene1.copy()
        
        else:   # Not taking disjoint genes from less fit parent
            continue
        
        # get nodes
        in_node = all_nodes.get(gene_copy.in_node_id)
        out_node = all_nodes.get(gene_copy.out_node_id)
        
        if in_node and out_node:
            offspring_connections.append(gene_copy)
            offspring_nodes.add(in_node)
            offspring_nodes.add(out_node)
    
    offspring_nodes = list(offspring_nodes) # Remove the duplicates
    
    return Genome(offspring_nodes, offspring_connections)

3. Mutation: After crossover, mutation is applied to the offspring. A key aspect of this implementation is that we avoid forming cycles when adding connections.

class Genome:
    ... # The rest of the methods would go here

    def _path_exists(self, start_node_id, end_node_id, checked_nodes=None):
        """ Recursive function to check whether a apth between two nodes exists."""
        if checked_nodes is None:
            checked_nodes = set()
        
        if start_node_id == end_node_id:
            return True
        
        checked_nodes.add(start_node_id)
        for conn in self.connections:
            if conn.enabled and conn.in_node_id == start_node_id:
                if conn.out_node_id not in checked_nodes:
                    if self._path_exists(conn.out_node_id, end_node_id, checked_nodes):
                        return True
        return False

    def get_node(self, node_id):
        return self.nodes.get(node_id, None)
    
    def mutate_add_connection(self, innov: InnovationTracker):
        node_list = list(self.nodes.values())

        # Try max 10 times
        max_tries = 10
        found_appropiate_nodes = False

        for _ in range(max_tries):
            node1, node2 = random.sample(node_list, 2)
            
            if (node1.layer == "output" or (node1.layer == "hid" and node2.layer == "input")):
                node1, node2 = node2, node1 # Swap them
            # Check if it's creating a loop to the same node
            if node1 == node2:
                continue
            # Check if it's creating a connection between two nodes on the same layer
            if node1.layer == node2.layer:
                continue
            if node1.layer == "output" or node2.layer == "input":
                continue
        
            # Check whether the connection already exists
            conn_exists=False
            for c in self.connections:
                if (c.in_node_id == node1.id and c.out_node_id == node2.id) or
                   (c.in_node_id == node2.id and c.out_node_id == node1.id):
                    conn_exists = True
                    break
            
            if conn_exists:
                continue
            # If there is a path from node2 to node1, then adding a connection from node1 to node2 creates a cycle
            if self._path_exists(node2.id, node1.id):
                continue
            
            innov_num = innov.get_connection_innovation(node1.id, node2.id)
            new_conn = ConnectionGene(node1.id, node2.id, random.uniform(-1, 1), innov_num, True)
            self.connections.append(new_conn)
            return
    
    def mutate_add_node(self, innov: InnovationTracker):
        enabled_conn = [c for c in self.connections if c.enabled]
        if not enabled_conn:
            return
        connection = random.choice(enabled_conn)    # choose a random enabled connectin
        connection.enabled = False  # Disable the connection

        node_id, conn1_innov, conn2_innov = innov.get_node_innovation(connection.innov) 

        # Create node and connections
        new_node = NodeGene(node_id, "hid", ReLU, random.uniform(-1,1))
        conn1 = ConnectionGene(connection.in_node_id, node_id, 1, conn1_innov, True)
        conn2 = ConnectionGene(node_id, connection.out_node_id, connection.weight, conn2_innov, True)

        self.nodes[node_id] = new_node
        self.connections.extend([conn1, conn2])
    
    def mutate_weights(self, rate=0.8, power=0.5):
        for conn in self.connections:
            if random.random() < rate:
                if random.random() < 0.1:
                    conn.weight = random.uniform(-1, 1)
                else:
                    conn.weight += random.gauss(0, power)
                    conn.weight = max(-5, min(5, conn.weight))  # Clamp weights
    
    def mutate_bias(self, rate=0.7, power=0.5):
        for node in self.nodes.values():
            if node.layer != "input" and random.random() < rate:
                if random.random() < 0.1:
                    node.bias = random.uniform(-1, 1)
                else:
                    node.bias += random.gauss(0, power)
                    node.bias = max(-5, min(5, node.bias))

    def mutate(self, innov, conn_mutation_rate=0.05, node_mutation_rate=0.03, weight_mutation_rate=0.8, bias_mutation_rate=0.7):
        self.mutate_weights(weight_mutation_rate)
        self.mutate_bias(bias_mutation_rate)

        if random.random() < conn_mutation_rate:
            self.mutate_add_connection(innov)
        
        if random.random() < node_mutation_rate:
            self.mutate_add_node(innov)

6. Repeating the Process Within the Main Evolutionary Loop

Once all the offspring has been generated and the new population is formed, the current generation ends, and the new population becomes the starting point for the next evolutionary cycle. This is handled by a main evolutionary loop, which orchestrates the whole algorithm.

def evolution(population, fitness_scores, speciator: Speciator, innov: InnovationTracker, stagnation_limit: int = 15):
new_population = []

# Assign fitness to genomes
for genome, fitness in zip(population, fitness_scores):
genome.fitness = fitness

# Speciate population
speciator.speciate(population)
species_list = speciator.get_species()
species_list.sort(key=lambda s: s.best_fitness, reverse=True) # Sort species by best_fitness
print(f"Species created: {len(species_list)}")

# Remove stagnant species
surviving_species = []
if species_list:
surviving_species.append(species_list[0]) # Keep the best one regardless of stagnation
for s in species_list[1:]:
if s.stagnant_generations < stagnation_limit:
surviving_species.append(s)

species_list = surviving_species
print(f"Species that survived: {len(species_list)}")

total_adjusted_fitness = sum(s.adjusted_fitness for s in species_list)
print(f"Total adjusted fitness: {total_adjusted_fitness}")

# elitism
for species in species_list:
if species.members:
best_genome = max(species.members, key=lambda g: g.fitness)
new_population.append(best_genome)

remaining_offspring = len(population) - len(new_population)

# Allocate the remaining offspring
for species in species_list:
if total_adjusted_fitness > 0:
offspring_count = int((species.adjusted_fitness / total_adjusted_fitness) * remaining_offspring) # The more fit species will have more offspring
else:
offspring_count = remaining_offspring // len(species_list) # If all the species performed poorly, assign offspring evenly between them

if offspring_count > 0:
offspring = reproduce_species(species, offspring_count, innov)
new_population.extend(offspring)

# Ensure there are enough individuals (we could have less because of the rounding error)
while len(new_population) < len(population):
best_species = max(species_list, key=lambda s: s.adjusted_fitness)
offspring = reproduce_species(best_species, 1, innov)
new_population.extend(offspring)

return new_population

7. Running the Algorithm

def run_neat_xor(save_best=False, generations=50, pop_size=50, target_fitness=3.9, speciator_threshold=2.0):
    NUM_INPUTS = 2
    NUM_OUTPUTS = 1
    
    # Initialize Innovation Number and Speciator
    innov = InnovationTracker()
    speciator = Speciator(speciator_threshold)

    # Create initial population
    population = create_initial_population(pop_size, NUM_INPUTS, NUM_OUTPUTS, innov)
    
    # Stats
    best_fitness_history = []
    avg_fitness_history = []
    species_count_history = []
    
    # main loop
    for gen in range(generations):
        fitness_scores = [fitness_xor(g) for g in population]
        print(f"fitness: {fitness_scores}")
        
        # get the stats
        best_fitness = max(fitness_scores)
        avg_fitness = sum(fitness_scores) / len(fitness_scores)
        best_fitness_history.append(best_fitness)
        avg_fitness_history.append(avg_fitness)
        print(f"Generation {gen}: Best={best_fitness}, Avg={avg_fitness}")
        
        # Check if we achieved the target fitness
        if best_fitness >= target_fitness:
            print(f"Problem was solved in {gen} generations")
            print(f"Best fitness achieved: {max(best_fitness_history)}")
            
            best_genome = population[fitness_scores.index(best_fitness)]

            if save_best:
                with open("best_genome.pkl", "wb") as f:
                    pickle.dump(best_genome, f)

            return best_genome, best_fitness_history, avg_fitness_history, species_count_history
        
        population = evolution(population, fitness_scores, speciator, innov)

    print(f"Couldn't solve the XOR problem in {generations} generations")
    print(f"Best fitness achieved: {max(best_fitness_history)}")
    return None, best_fitness_history, avg_fitness_history, species_count_history

Complete Code

Github code

Related Posts

Leave a Reply

Your email address will not be published. Required fields are marked *