Custom‐Operators‐Tutorial - adylagad/CSCI-561_Genetic-Algorithm GitHub Wiki
Custom Operators Tutorial
Learn how to create and integrate your own genetic operators.
🎯 What You'll Learn
By the end of this tutorial, you'll be able to:
- ✅ Create custom crossover operators
- ✅ Create custom mutation operators
- ✅ Create custom selection strategies
- ✅ Create custom initialization strategies
- ✅ Integrate them into the GA framework
📋 Prerequisites
- Basic Python knowledge
- Understanding of GA Explained
- Familiarity with the codebase structure
🚀 Tutorial 1: Custom Mutation Operator
Let's create a Reverse Subsequence Mutation operator.
Step 1: Understand the Interface
All mutation operators must implement:
from genetic_algorithm.core.types import Tour
from abc import ABC, abstractmethod
class MutationStrategy(ABC):
@abstractmethod
def mutate(self, tour: Tour) -> Tour:
"""Mutate a tour and return modified copy."""
pass
Step 2: Create the Class
Create genetic_algorithm/operators/reverse_mutation.py:
"""Reverse Subsequence Mutation operator."""
import random
from typing import List
from genetic_algorithm.core.types import Tour
from genetic_algorithm.operators.mutation import MutationStrategy
class ReverseSubsequenceMutation(MutationStrategy):
"""
Reverse Subsequence Mutation.
Selects a random subsequence and reverses it.
This preserves most of the tour while making a local change.
Example:
Before: [1, 2, 3, 4, 5, 6]
-------
After: [1, 2, 5, 4, 3, 6] (reversed [3,4,5])
"""
def mutate(self, tour: Tour) -> Tour:
"""
Mutate tour by reversing a random subsequence.
Args:
tour: Tour to mutate
Returns:
Mutated tour (modified copy)
"""
# Copy to avoid modifying original
mutated = tour.copy()
# Select two random positions
size = len(mutated)
pos1 = random.randint(0, size - 1)
pos2 = random.randint(0, size - 1)
# Ensure pos1 < pos2
if pos1 > pos2:
pos1, pos2 = pos2, pos1
# Reverse the subsequence
mutated[pos1:pos2+1] = reversed(mutated[pos1:pos2+1])
return mutated
Step 3: Test Your Operator
Add test in test_ga.py:
def test_reverse_subsequence_mutation(self):
"""Test reverse subsequence mutation."""
from genetic_algorithm.operators.reverse_mutation import ReverseSubsequenceMutation
mutation = ReverseSubsequenceMutation()
tour = [0, 1, 2, 3, 4]
mutated = mutation.mutate(tour)
# Should contain same cities
self.assertEqual(set(mutated), set(tour))
# Should be different (with high probability)
self.assertNotEqual(mutated, tour)
# Original unchanged
self.assertEqual(tour, [0, 1, 2, 3, 4])
Step 4: Use in main.py
from genetic_algorithm.operators.reverse_mutation import ReverseSubsequenceMutation
# Create mutation operator
mutation = ReverseSubsequenceMutation()
# Use in GA
ga = GeneticAlgorithm(
cities=cities,
fitness_evaluator=fitness_evaluator,
selection=selection,
crossover=crossover,
mutation=mutation, # Your custom mutation!
population_initializer=initializer,
config=config
)
Step 5: Run and Compare
python main.py
Compare results with default SwapMutation:
- Which converges faster?
- Which finds better solutions?
- Why?
🔄 Tutorial 2: Custom Crossover Operator
Let's create Edge Recombination Crossover (ERX).
Step 1: Understand Edge Recombination
ERX preserves edges (connections between cities) from parents:
Parent 1: A-B-C-D-E
Edges: A-B, B-C, C-D, D-E, E-A
Parent 2: A-C-E-D-B
Edges: A-C, C-E, E-D, D-B, B-A
Offspring: Build tour preserving these edges
Step 2: Implementation
Create genetic_algorithm/operators/erx_crossover.py:
"""Edge Recombination Crossover operator."""
import random
from typing import Dict, Set, List
from genetic_algorithm.core.types import Tour
from genetic_algorithm.operators.crossover import CrossoverStrategy
class EdgeRecombinationCrossover(CrossoverStrategy):
"""
Edge Recombination Crossover (ERX).
Preserves edge information from both parents.
Good for TSP because it maintains adjacency relationships.
Algorithm:
1. Build edge table from both parents
2. Start with random city
3. Move to neighbor with fewest unused edges
4. Repeat until tour complete
"""
def crossover(self, parent1: Tour, parent2: Tour) -> Tour:
"""
Create offspring using edge recombination.
Args:
parent1: First parent tour
parent2: Second parent tour
Returns:
Offspring tour
"""
# Build edge table
edge_table = self._build_edge_table(parent1, parent2)
# Start with random city
current = random.choice(parent1)
offspring = [current]
unvisited = set(parent1) - {current}
# Build tour
while unvisited:
# Remove current city from all edge lists
for city in edge_table:
edge_table[city].discard(current)
# Find next city
neighbors = edge_table[current] & unvisited
if neighbors:
# Choose neighbor with fewest edges
next_city = min(
neighbors,
key=lambda c: len(edge_table[c] & unvisited)
)
else:
# No neighbors available, choose random
next_city = random.choice(list(unvisited))
offspring.append(next_city)
unvisited.remove(next_city)
current = next_city
return offspring
def _build_edge_table(
self,
parent1: Tour,
parent2: Tour
) -> Dict[int, Set[int]]:
"""Build edge table from both parents."""
edge_table: Dict[int, Set[int]] = {city: set() for city in parent1}
# Add edges from parent1
for i, city in enumerate(parent1):
prev_city = parent1[i - 1]
next_city = parent1[(i + 1) % len(parent1)]
edge_table[city].update([prev_city, next_city])
# Add edges from parent2
for i, city in enumerate(parent2):
prev_city = parent2[i - 1]
next_city = parent2[(i + 1) % len(parent2)]
edge_table[city].update([prev_city, next_city])
return edge_table
Step 3: Test ERX
def test_erx_crossover(self):
"""Test edge recombination crossover."""
from genetic_algorithm.operators.erx_crossover import EdgeRecombinationCrossover
crossover = EdgeRecombinationCrossover()
parent1 = [0, 1, 2, 3, 4]
parent2 = [4, 3, 2, 1, 0]
offspring = crossover.crossover(parent1, parent2)
# Valid tour
self.assertEqual(len(offspring), len(parent1))
self.assertEqual(set(offspring), set(parent1))
Step 4: Benchmark Against Order Crossover
Create benchmark script:
import time
from genetic_algorithm.operators.order_crossover import OrderCrossover
from genetic_algorithm.operators.erx_crossover import EdgeRecombinationCrossover
# Test with both crossovers
for Crossover in [OrderCrossover, EdgeRecombinationCrossover]:
crossover = Crossover()
# Run GA
start = time.time()
result = ga.run()
end = time.time()
print(f"{Crossover.__name__}:")
print(f" Best cost: {result.best_cost:.2f}")
print(f" Time: {end - start:.2f}s")
print(f" Generations: {result.statistics['total_generations']}")
🎲 Tutorial 3: Custom Selection Strategy
Let's create Tournament Selection.
Step 1: Understand Tournament Selection
1. Randomly pick K individuals (tournament size)
2. Select the best from this group
3. Repeat for second parent
Example (K=3):
Population: [Tour1, Tour2, Tour3, Tour4, Tour5]
Fitness: [0.001, 0.002, 0.0015, 0.0012, 0.0018]
Tournament 1: Pick [Tour1, Tour3, Tour5]
Best = Tour5 (fitness 0.0018)
Tournament 2: Pick [Tour2, Tour3, Tour4]
Best = Tour2 (fitness 0.002)
Parents: (Tour5, Tour2)
Step 2: Implementation
Create genetic_algorithm/operators/tournament_selection.py:
"""Tournament Selection strategy."""
import random
from typing import Tuple, List
from genetic_algorithm.core.types import Population, Tour, FitnessScores
from genetic_algorithm.operators.selection import SelectionStrategy
class TournamentSelection(SelectionStrategy):
"""
Tournament Selection.
Randomly select K individuals and pick the best.
Higher tournament size = more selection pressure.
Advantages:
- No fitness scaling needed
- Adjustable selection pressure
- Efficient for large populations
"""
def __init__(self, tournament_size: int = 5):
"""
Initialize tournament selection.
Args:
tournament_size: Number of individuals in each tournament
"""
if tournament_size < 2:
raise ValueError("Tournament size must be >= 2")
self.tournament_size = tournament_size
def select(
self,
population: Population,
fitness_scores: FitnessScores
) -> Tuple[Tour, Tour]:
"""
Select two parents using tournament selection.
Args:
population: Current population
fitness_scores: Fitness of each individual
Returns:
Tuple of two selected parents
"""
parent1 = self._tournament(population, fitness_scores)
parent2 = self._tournament(population, fitness_scores)
return parent1, parent2
def _tournament(
self,
population: Population,
fitness_scores: FitnessScores
) -> Tour:
"""Run one tournament."""
# Random sample
size = min(self.tournament_size, len(population))
indices = random.sample(range(len(population)), size)
# Find best in tournament
best_idx = max(indices, key=lambda i: fitness_scores[i])
return population[best_idx].copy()
Step 3: Compare Selection Strategies
# Roulette Wheel
selection1 = RouletteWheelSelection()
# Tournament
selection2 = TournamentSelection(tournament_size=5)
# Test both and compare:
# - Convergence speed
# - Final solution quality
# - Diversity maintained
🌱 Tutorial 4: Custom Initialization
Let's create Savings Algorithm Initialization.
Step 1: Understand Savings Algorithm
Clarke-Wright savings algorithm:
1. Start with "star" configuration (all cities from depot)
2. Calculate savings for merging routes: s(i,j) = d(0,i) + d(0,j) - d(i,j)
3. Merge routes with highest savings
4. Continue until one tour
Step 2: Implementation
Create genetic_algorithm/initialization/savings_init.py:
"""Savings Algorithm initialization."""
from typing import List, Tuple
from genetic_algorithm.core.types import CityList, Population, Tour
from genetic_algorithm.initialization.base import PopulationInitializer
from genetic_algorithm.utils.distance import euclidean_distance
class SavingsInitializer(PopulationInitializer):
"""
Initialize population using Clarke-Wright Savings Algorithm.
Creates one high-quality tour using savings heuristic,
then generates variations for the rest of the population.
"""
def __init__(self, population_size: int):
"""Initialize with population size."""
super().__init__(population_size)
def initialize(self, cities: CityList) -> Population:
"""
Create initial population.
Args:
cities: List of cities
Returns:
Initial population
"""
# Create one savings-based tour
savings_tour = self._create_savings_tour(cities)
# Create variations
population = [savings_tour]
for _ in range(self.population_size - 1):
# Create variation by swapping random cities
variant = savings_tour.copy()
# ... add randomization ...
population.append(variant)
return population
def _create_savings_tour(self, cities: CityList) -> Tour:
"""Create tour using savings algorithm."""
n = len(cities)
# Calculate all savings
savings: List[Tuple[float, int, int]] = []
for i in range(1, n):
for j in range(i + 1, n):
s = (euclidean_distance(cities[0], cities[i]) +
euclidean_distance(cities[0], cities[j]) -
euclidean_distance(cities[i], cities[j]))
savings.append((s, i, j))
# Sort by savings (descending)
savings.sort(reverse=True)
# Build tour (simplified)
tour = list(range(n))
# ... implement merging logic ...
return tour
🎯 Best Practices
1. Always Copy Input
# ✅ Good
def mutate(self, tour: Tour) -> Tour:
mutated = tour.copy()
# Modify mutated
return mutated
# ❌ Bad
def mutate(self, tour: Tour) -> Tour:
# Modifies original!
tour[0], tour[1] = tour[1], tour[0]
return tour
2. Validate Output
def crossover(self, parent1: Tour, parent2: Tour) -> Tour:
offspring = self._do_crossover(parent1, parent2)
# Validate
assert len(offspring) == len(parent1), "Invalid tour length"
assert set(offspring) == set(parent1), "Invalid cities"
return offspring
3. Document Algorithm
class MyOperator:
"""
Brief description.
Detailed explanation of the algorithm.
Example:
Input: ...
Output: ...
References:
[1] Paper name, Author, Year
"""
4. Add Configuration
class TournamentSelection(SelectionStrategy):
def __init__(self, tournament_size: int = 5):
self.tournament_size = tournament_size
# Now users can tune the parameter!
🧪 Testing Your Operators
Unit Tests
class TestMyOperator(unittest.TestCase):
def test_basic_functionality(self):
"""Test basic operation."""
operator = MyOperator()
result = operator.apply(input_data)
self.assertIsNotNone(result)
def test_preserves_validity(self):
"""Test output is valid tour."""
operator = MyOperator()
tour = [0, 1, 2, 3, 4]
result = operator.apply(tour)
self.assertEqual(len(result), len(tour))
self.assertEqual(set(result), set(tour))
def test_determinism(self):
"""Test reproducibility with seed."""
random.seed(42)
result1 = operator.apply(input_data)
random.seed(42)
result2 = operator.apply(input_data)
self.assertEqual(result1, result2)
Integration Tests
def test_full_ga_with_custom_operator(self):
"""Test complete GA run with custom operator."""
ga = GeneticAlgorithm(
cities=test_cities,
mutation=MyCustomMutation(),
...
)
result = ga.run()
self.assertIsNotNone(result.best_tour)
self.assertGreater(result.best_fitness, 0)
📊 Benchmarking
Create benchmarking script:
"""Benchmark different operators."""
import time
from typing import Dict, Any
def benchmark_operator(
operator_class,
operator_params: Dict[str, Any],
num_runs: int = 5
):
"""Benchmark an operator."""
results = []
for run in range(num_runs):
operator = operator_class(**operator_params)
# Run GA
start = time.time()
result = ga.run()
end = time.time()
results.append({
'cost': result.best_cost,
'time': end - start,
'generations': result.statistics['total_generations']
})
# Aggregate
avg_cost = sum(r['cost'] for r in results) / num_runs
avg_time = sum(r['time'] for r in results) / num_runs
print(f"{operator_class.__name__}:")
print(f" Avg Cost: {avg_cost:.2f}")
print(f" Avg Time: {avg_time:.2f}s")
# Run benchmarks
benchmark_operator(SwapMutation, {})
benchmark_operator(ReverseSubsequenceMutation, {})
benchmark_operator(TournamentSelection, {'tournament_size': 5})
🚀 Advanced: Hybrid Operators
Combine multiple strategies:
class HybridMutation(MutationStrategy):
"""Apply multiple mutation operators."""
def __init__(self, operators: List[MutationStrategy], weights: List[float]):
self.operators = operators
self.weights = weights
def mutate(self, tour: Tour) -> Tour:
# Choose operator based on weights
operator = random.choices(self.operators, weights=self.weights)[0]
return operator.mutate(tour)
# Use it:
mutation = HybridMutation(
operators=[SwapMutation(), ReverseSubsequenceMutation()],
weights=[0.7, 0.3] # 70% swap, 30% reverse
)
📚 Further Reading
- Architecture Deep Dive - Understand the codebase
- Performance Tuning - Optimize your operators
- API Reference - Complete API docs
✅ Checklist
- Understand the operator interface
- Implement your operator
- Add unit tests
- Test with full GA
- Benchmark against defaults
- Document your operator
- Share your results!
Ready to optimize? → Performance Tuning Guide