Performance Tuning Guide
Optimize your genetic algorithm for better solutions and faster convergence.
🎯 Goals
- Better Solution Quality - Find shorter tours
- Faster Convergence - Reach good solutions quickly
- Stable Performance - Consistent results across runs
⚙️ Parameter Tuning
Population Size
Formula: population_size = multiplier × number_of_cities (the individual counts below assume a 500-city problem)
# Small population (faster, less exploration)
config = GAConfig(population_size_multiplier=0.5) # 250 individuals
# Medium population (balanced)
config = GAConfig(population_size_multiplier=1.0) # 500 individuals ✅
# Large population (slower, more exploration)
config = GAConfig(population_size_multiplier=2.0) # 1000 individuals
Effects:
| Size | Speed | Quality | Diversity | Best For |
|---|---|---|---|---|
| Small (0.5×) | ⚡ Fast | 😐 OK | 📉 Low | Quick tests |
| Medium (1×) | ⚖️ Balanced | ✅ Good | 📊 Medium | General use |
| Large (2×) | 🐌 Slow | 🎯 Best | 📈 High | Hard problems |
Recommendation: Start with 1× multiplier, increase if stuck in local optima.
Mutation Rate
Default: 2% (0.02)
# Low mutation (more exploitation)
config = GAConfig(mutation_rate=0.01) # 1%
# Medium mutation (balanced)
config = GAConfig(mutation_rate=0.02) # 2% ✅
# High mutation (more exploration)
config = GAConfig(mutation_rate=0.10) # 10%
Effects:
| Rate | Convergence | Exploration | Stability | Best For |
|---|---|---|---|---|
| Low (1%) | Fast | Low | High | Near-optimal populations |
| Medium (2-5%) | Balanced | Balanced | Good | General use |
| High (10%+) | Slow | High | Low | Escaping local optima |
Recommendation: 2-5% for most cases. Increase if solution quality plateaus.
Number of Generations
Default: 3500
# Quick run (testing)
config = GAConfig(number_of_generations=1000)
# Normal run
config = GAConfig(number_of_generations=3500) # ✅
# Long run (best quality)
config = GAConfig(number_of_generations=10000)
Rule of Thumb:
Minimum generations = 5 × number_of_cities
Recommended = 7 × number_of_cities
Maximum = 20 × number_of_cities (diminishing returns)
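For a 500-city problem these rules give roughly 2,500, 3,500, and 10,000 generations, which lines up with the 3500 default. A tiny hypothetical helper (not part of the repo) makes the arithmetic explicit:
def suggested_generations(num_cities: int) -> dict:
    """Rule-of-thumb generation counts (hypothetical helper, not repo API)."""
    return {
        "minimum": 5 * num_cities,
        "recommended": 7 * num_cities,
        "maximum": 20 * num_cities,  # diminishing returns beyond this
    }
print(suggested_generations(500))
# {'minimum': 2500, 'recommended': 3500, 'maximum': 10000}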
Convergence Threshold
Default: 500 generations without improvement
# Impatient (stop early)
config = GAConfig(convergence_threshold=100)
# Balanced
config = GAConfig(convergence_threshold=500) # ✅
# Patient (wait longer)
config = GAConfig(convergence_threshold=1000)
Effects:
- Low (100): Faster runtime, may miss improvements
- Medium (500): Good balance
- High (1000+): Longer runtime, thorough search
Recommendation: 10-20% of total generations
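Applying that guideline directly, the threshold can be derived from the generation count; this is a hypothetical helper, not repo API:
def suggested_threshold(number_of_generations: int, fraction: float = 0.15) -> int:
    """Convergence threshold as roughly 10-20% of total generations (sketch)."""
    return int(number_of_generations * fraction)
print(suggested_threshold(3500))  # 525, close to the 500 default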
🔄 Operator Selection
Crossover Strategies
# Order Crossover (default)
crossover = OrderCrossover(selection)
# ✅ Pros: Fast, preserves order
# ❌ Cons: May lose some edge information
# PMX Crossover
crossover = PMXCrossover(selection)
# ✅ Pros: Preserves position information
# ❌ Cons: Slightly slower
# 🎯 Try when: Order Crossover plateaus
# Edge Recombination (custom)
crossover = EdgeRecombinationCrossover(selection)
# ✅ Pros: Preserves edges (good for TSP)
# ❌ Cons: Complex, slower
# 🎯 Try when: Need best quality
Benchmark Results (500 cities):
| Crossover | Avg Cost | Time | Convergence |
|---|---|---|---|
| Order | 48,234 | 3.2min | Gen 2847 |
| PMX | 47,891 | 3.8min | Gen 3012 |
| ERX | 47,456 | 5.1min | Gen 3234 |
Initialization Strategies
# Random Initialization (default)
initializer = RandomInitializer()
# ✅ Pros: Unbiased, fast
# ❌ Cons: Poor initial quality
# Nearest Neighbor Initialization
initializer = NearestNeighborInitializer()
# ✅ Pros: Much better initial solutions
# ❌ Cons: Slightly biased
# 🎯 Use when: Want faster convergence
Impact:
| Strategy | Initial Best | Initial Avg | Final Best | Generations Saved |
|---|---|---|---|---|
| Random | ~120,000 | ~150,000 | 48,234 | 0 (baseline) |
| Nearest Neighbor | ~65,000 | ~70,000 | 48,156 | ~500 (14%) |
Recommendation: Use Nearest Neighbor for faster convergence!
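For reference, nearest-neighbor construction starts from some city and repeatedly hops to the closest unvisited one. The sketch below illustrates the idea with plain Euclidean distances; it is an illustrative sketch, not the repo's NearestNeighborInitializer:
import math
import random

def nearest_neighbor_tour(cities):
    """Greedy nearest-neighbor tour over a list of (x, y) coordinates (sketch only)."""
    def dist(a, b):
        return math.dist(cities[a], cities[b])
    start = random.randrange(len(cities))
    unvisited = set(range(len(cities))) - {start}
    tour = [start]
    while unvisited:
        # Always hop to the closest city not yet visited
        nearest = min(unvisited, key=lambda c: dist(tour[-1], c))
        tour.append(nearest)
        unvisited.remove(nearest)
    return tour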
Mutation Strategies
# Swap Mutation (default)
mutation = SwapMutation(mutation_rate=0.02)
# ✅ Pros: Simple, effective
# ❌ Cons: Small changes only
# Reverse Subsequence (custom)
mutation = ReverseSubsequenceMutation(mutation_rate=0.02)
# ✅ Pros: Larger changes, good for escaping local optima
# ❌ Cons: May be too disruptive
# Hybrid Mutation (advanced)
mutation = HybridMutation(
    operators=[SwapMutation(), ReverseSubsequenceMutation()],
    weights=[0.8, 0.2]
)
# ✅ Pros: Best of both worlds
# 🎯 Try when: Need balanced exploration/exploitation
📊 Monitoring Performance
Key Metrics to Watch
# After running GA:
stats = result.statistics
# 1. Convergence Generation
convergence = stats['convergence_generation']
total = stats['total_generations']
efficiency = (convergence / total) * 100
if efficiency < 50:
    print("✅ Good! Converged early")
elif efficiency < 80:
    print("⚠️ OK, but could be better")
else:
    print("❌ Not converging well")
# 2. Cost Improvement
initial_cost = 150000 # Typical random
final_cost = stats['best_cost']
improvement = (initial_cost - final_cost) / initial_cost * 100
print(f"Improved by {improvement:.1f}%")
# Target: >60% improvement
# 3. Diversity
best = stats['best_cost']
worst = stats['worst_cost']
diversity = (worst - best) / best * 100
print(f"Diversity: {diversity:.1f}%")
# Target: 10-30% diversity
Performance Patterns
Pattern 1: Fast Convergence, Poor Quality
Generation 0: 120,000
Generation 500: 52,000 ← Fast drop
Generation 1000: 51,800
Generation 2000: 51,750 ← Plateaued early
Solution: Increase diversity
- Increase population size
- Increase mutation rate
- Try different initialization
Pattern 2: Slow Convergence, Good Quality
Generation 0: 120,000
Generation 1000: 85,000 ← Slow progress
Generation 2000: 62,000
Generation 3000: 48,500 ← Still improving
Solution: This may be fine as-is. If you need to speed it up:
- Decrease population size
- Use better initialization (Nearest Neighbor)
- Adjust selection pressure
Pattern 3: No Improvement
Generation 0: 120,000
Generation 1000: 118,000 ← Barely improving
Generation 2000: 117,500
Generation 3000: 117,000 ← Stuck!
Solution: This usually points to an implementation problem
- Check operator implementations
- Try different operators
- Verify fitness calculation
- Check for bugs
🎯 Optimization Recipes
Recipe 1: Quick and Dirty
Goal: Fast results for testing
config = GAConfig(
    population_size_multiplier=0.5,  # Small population
    number_of_generations=1000,      # Few generations
    convergence_threshold=100,       # Stop early
    mutation_rate=0.05               # Higher mutation
)
initializer = RandomInitializer() # Simple init
crossover = OrderCrossover(selection) # Simple crossover
Expected: Decent solution in ~1 minute
Recipe 2: Balanced Performance
Goal: Good quality in reasonable time
config = GAConfig(
    population_size_multiplier=1.0,  # Standard population ✅
    number_of_generations=3500,      # Standard ✅
    convergence_threshold=500,       # Balanced ✅
    mutation_rate=0.02               # Standard ✅
)
initializer = NearestNeighborInitializer() # Better init
crossover = OrderCrossover(selection) # Fast crossover
Expected: Good solution in ~3-5 minutes
Recipe 3: Best Quality
Goal: Best possible solution
config = GAConfig(
    population_size_multiplier=2.0,  # Large population
    number_of_generations=10000,     # Many generations
    convergence_threshold=1000,      # Patient
    mutation_rate=0.03               # Moderate mutation
)
initializer = NearestNeighborInitializer() # Better init
crossover = PMXCrossover(selection) # Better crossover
mutation = HybridMutation(...) # Advanced mutation
Expected: Best solution in ~15-20 minutes
Recipe 4: Escape Local Optima
Goal: Break out of plateaus
config = GAConfig(
    population_size_multiplier=1.5,  # Larger population
    number_of_generations=5000,
    convergence_threshold=300,       # Restart earlier
    mutation_rate=0.08               # High mutation! 🔥
)
# Use diverse initialization
initializer = HybridInitializer(
    strategies=[RandomInitializer(), NearestNeighborInitializer()],
    ratios=[0.3, 0.7]
)
Expected: More exploration, may find better solutions
🔬 Advanced Techniques
1. Adaptive Mutation Rate
class AdaptiveMutation(MutationStrategy):
    """Mutation rate adapts based on diversity."""

    def __init__(self, base_rate: float = 0.02):
        self.base_rate = base_rate
        self.generation = 0

    def mutate(self, population: Population, stats: dict) -> Population:
        # Increase mutation if diversity is low
        diversity = stats.get('diversity', 0.5)
        if diversity < 0.1:    # Low diversity
            rate = self.base_rate * 3    # Triple mutation!
        elif diversity > 0.3:  # High diversity
            rate = self.base_rate * 0.5  # Half mutation
        else:
            rate = self.base_rate
        # Apply mutation
        ...
2. Elitism Tuning
# Keep top N% of population
elite_count = int(0.1 * population_size) # Keep top 10%
# Higher elitism (5-15%): Faster convergence
# Lower elitism (1-3%): More exploration
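A minimal sketch of how elitism could be wired into a generation step, assuming tours are scored by cost (lower is better); the helper below is hypothetical, not repo API:
def apply_elitism(population, costs, offspring, elite_fraction=0.1):
    """Copy the best elite_fraction of the current population into the next generation."""
    elite_count = max(1, int(elite_fraction * len(population)))
    # Indices of the cheapest tours in the current population
    elite_indices = sorted(range(len(population)), key=lambda i: costs[i])[:elite_count]
    elites = [population[i] for i in elite_indices]
    # Elites take the place of an equal number of offspring so the size stays constant
    return elites + offspring[:len(population) - elite_count]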
3. Island Model (Parallel)
# Run multiple independent GAs
islands = [
    GeneticAlgorithm(config, ...),
    GeneticAlgorithm(config, ...),
    GeneticAlgorithm(config, ...)
]
# Periodically exchange best solutions
for generation in range(total_generations):
    for island in islands:
        island.evolve()
    if generation % 100 == 0:  # Every 100 generations
        # Exchange best individuals between islands
        migrate_best_solutions(islands)
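One way the migration step could look, assuming each island exposes a way to read its best tour and replace its worst one (these accessor names are hypothetical, not repo API):
def migrate_best_solutions(islands):
    """Ring migration sketch: each island's best tour replaces its neighbor's worst."""
    best_tours = [island.best_individual() for island in islands]
    for i, island in enumerate(islands):
        # Receive the best tour from the previous island in the ring
        incoming = best_tours[(i - 1) % len(islands)]
        island.replace_worst(incoming)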
4. Local Search Hybrid
def hybrid_ga_with_local_search():
    """Combine GA with local search."""
    # Run GA
    result = ga.run()
    best_tour = result.best_tour
    # Apply 2-opt local search to refine
    refined_tour = two_opt(best_tour, cities)
    # Often improves by 2-5%!
    return refined_tour
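For reference, a basic 2-opt pass keeps reversing tour segments as long as doing so shortens the tour. The sketch below assumes plain Euclidean distances over (x, y) coordinates and is not taken from the repo:
import math

def two_opt(tour, cities):
    """First-improvement 2-opt on a list of city indices (illustrative sketch)."""
    def dist(a, b):
        return math.dist(cities[a], cities[b])
    improved = True
    while improved:
        improved = False
        for i in range(1, len(tour) - 2):
            for j in range(i + 1, len(tour) - 1):
                # Gain from swapping edges (i-1, i)+(j, j+1) for (i-1, j)+(i, j+1)
                delta = (dist(tour[i - 1], tour[j]) + dist(tour[i], tour[j + 1])
                         - dist(tour[i - 1], tour[i]) - dist(tour[j], tour[j + 1]))
                if delta < -1e-9:
                    tour[i:j + 1] = tour[i:j + 1][::-1]
                    improved = True
    return tour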
📈 Benchmarking Framework
Create systematic benchmarks:
"""Benchmark script."""
import json
from dataclasses import dataclass
from typing import List
@dataclass
class BenchmarkResult:
config_name: str
best_cost: float
avg_cost: float
runtime: float
convergence_gen: int
def run_benchmark(
config_variants: List[dict],
num_runs: int = 5
) -> List[BenchmarkResult]:
"""Run comprehensive benchmarks."""
results = []
for variant in config_variants:
print(f"\nTesting: {variant['name']}")
costs = []
times = []
conv_gens = []
for run in range(num_runs):
# Setup
config = GAConfig(**variant['params'])
ga = setup_ga(config)
# Run
start = time.time()
result = ga.run()
end = time.time()
# Record
costs.append(result.best_cost)
times.append(end - start)
conv_gens.append(result.statistics['convergence_generation'])
# Aggregate
results.append(BenchmarkResult(
config_name=variant['name'],
best_cost=min(costs),
avg_cost=sum(costs) / len(costs),
runtime=sum(times) / len(times),
convergence_gen=int(sum(conv_gens) / len(conv_gens))
))
return results
# Define variants
variants = [
    {
        'name': 'Baseline',
        'params': {'mutation_rate': 0.02, 'population_size_multiplier': 1.0}
    },
    {
        'name': 'High Mutation',
        'params': {'mutation_rate': 0.10, 'population_size_multiplier': 1.0}
    },
    {
        'name': 'Large Population',
        'params': {'mutation_rate': 0.02, 'population_size_multiplier': 2.0}
    },
]
# Run and analyze
results = run_benchmark(variants, num_runs=5)
# Print comparison
for r in results:
    print(f"\n{r.config_name}:")
    print(f"  Avg Cost: {r.avg_cost:.2f}")
    print(f"  Runtime: {r.runtime:.2f}s")
    print(f"  Convergence: Gen {r.convergence_gen}")
🎓 Pro Tips
1. Start with Defaults
Don't over-tune initially. Defaults work well:
config = GAConfig() # Use defaults first!
2. Change One Thing at a Time
# ❌ Bad: Change everything
config = GAConfig(
    mutation_rate=0.10,
    population_size_multiplier=2.0,
    number_of_generations=10000
)
# Can't tell what helped!
# ✅ Good: Iterate
config1 = GAConfig(mutation_rate=0.10) # Test mutation
config2 = GAConfig(population_size_multiplier=2.0) # Test pop size
3. Monitor Diversity
# Track diversity over time
diversity_history = []
for generation in range(total_generations):
    best = min(population_costs)
    worst = max(population_costs)
    diversity = (worst - best) / best
    diversity_history.append(diversity)
# Plot to see if diversity is maintained
4. Use Better Initialization
Nearest Neighbor initialization almost always helps:
# Easy win: Just change this
initializer = NearestNeighborInitializer() # Instead of Random
5. Know When to Stop Tuning
Tuning has diminishing returns:
- 5-10% improvement from tuning is excellent
- 1-2% improvement is typical
- <0.5% improvement means you're at the limit
📊 Expected Performance
For 500 Cities
| Configuration | Best Cost | Avg Cost | Runtime | Convergence |
|---|---|---|---|---|
| Quick | ~52,000 | ~55,000 | 1 min | Gen 800 |
| Balanced | ~48,000 | ~49,500 | 3-5 min | Gen 2800 |
| Best Quality | ~46,500 | ~47,500 | 15-20 min | Gen 5500 |
Quality Benchmarks
Random Tour: ~150,000 (Baseline)
Greedy (Nearest): ~65,000 (56% better)
GA (Quick): ~52,000 (65% better)
GA (Balanced): ~48,000 (68% better)
GA (Optimized): ~46,500 (69% better)
Known Best (if any): ~45,000 (70% better) ← Theoretical limit
🐛 Troubleshooting Performance Issues
Issue: Not Converging
Symptoms: Cost stays high, no improvement
Diagnosis:
# Check if operators are working
print("Initial best:", initial_best_cost)
print("Final best:", final_best_cost)
print("Improvement:", (initial - final) / initial * 100, "%")
# Should see >50% improvement
Solutions:
- Verify operators are called correctly
- Check fitness calculation (lower cost must map to higher fitness; see the sketch below)
- Try different operators
- Increase generations
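For the fitness check in particular, the cost-to-fitness mapping must be monotone decreasing. A minimal sketch of one common convention (not necessarily the repo's exact mapping):
def fitness_from_cost(cost: float, epsilon: float = 1e-9) -> float:
    """Common convention: shorter tours get strictly higher fitness (sketch only)."""
    return 1.0 / (cost + epsilon)

assert fitness_from_cost(48_000) > fitness_from_cost(52_000)  # shorter tour must win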
Issue: Converges Too Fast
Symptoms: Plateaus very early, poor diversity
Diagnosis:
print("Convergence generation:", stats['convergence_generation'])
print("Total generations:", stats['total_generations'])
print("Ratio:", stats['convergence_generation'] / stats['total_generations'])
# If ratio < 0.3, converging too fast
Solutions:
- Increase population size
- Increase mutation rate
- Use tournament selection with a small tournament size (lower selection pressure; see the sketch below)
- Reduce elitism
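Tournament size is the usual knob for selection pressure: smaller tournaments give weaker tours a real chance of being picked. A minimal sketch, independent of the repo's selection classes:
import random

def tournament_select(population, costs, tournament_size=2):
    """Return the lowest-cost tour among tournament_size random entrants (sketch).

    Smaller tournament_size means lower selection pressure and more diversity.
    """
    entrants = random.sample(range(len(population)), tournament_size)
    winner = min(entrants, key=lambda i: costs[i])
    return population[winner]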
Issue: Too Slow
Symptoms: Takes forever to run
Diagnosis:
import cProfile
# Profile the code
cProfile.run('ga.run()')
# Find bottlenecks
Solutions:
- Reduce population size
- Reduce generations
- Use faster operators (Order Crossover)
- Optimize distance calculations (cache a precomputed distance matrix; see the sketch below)
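Caching distances usually means computing the full pairwise matrix once instead of recomputing per tour evaluation. A minimal NumPy sketch, assuming 2D coordinates (not code from the repo):
import numpy as np

def build_distance_matrix(coords: np.ndarray) -> np.ndarray:
    """Precompute all pairwise Euclidean distances; coords has shape (n, 2)."""
    diff = coords[:, None, :] - coords[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))

def tour_cost(tour, dist_matrix):
    """Tour length via the cached matrix, including the closing edge back to the start."""
    return sum(dist_matrix[tour[i], tour[(i + 1) % len(tour)]] for i in range(len(tour)))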
📚 Further Reading
- Custom Operators Tutorial - Create optimized operators
- Architecture Deep Dive - Understand performance implications
- Configuration Reference - All parameters explained
- FAQ - Common performance questions
Having issues? → Troubleshooting