TSP‐Overview - adylagad/CSCI-561_Genetic-Algorithm GitHub Wiki
Traveling Salesman Problem (TSP) Overview
Understanding the classic optimization problem this framework solves.
🎯 What is TSP?
The Traveling Salesman Problem asks:
"Given a list of cities and the distances between them, what is the shortest possible route that visits each city exactly once and returns to the starting city?"
Simple Example
4 cities: A, B, C, D
Distances:
A-B: 10km B-C: 15km C-D: 12km
A-C: 8km B-D: 20km
A-D: 18km
Question: What's the shortest tour?
All Possible Tours
Tour 1: A → B → C → D → A = 10 + 15 + 12 + 18 = 55km
Tour 2: A → B → D → C → A = 10 + 20 + 12 + 8 = 50km ✓
Tour 3: A → C → B → D → A = 8 + 15 + 20 + 18 = 61km
Tour 4: A → C → D → B → A = 8 + 12 + 20 + 10 = 50km ✓
Tour 5: A → D → B → C → A = 18 + 20 + 15 + 8 = 61km
Tour 6: A → D → C → B → A = 18 + 12 + 15 + 10 = 55km
Best tours: #2 and #4 with 50km
Easy with 4 cities, but...
😱 The Complexity Problem
Why TSP is Hard
Number of possible tours grows factorially:
| Cities | Possible Tours | Time to Check All (1µs each) |
|---|---|---|
| 5 | 12 | 0.000012 seconds |
| 10 | 181,440 | 0.18 seconds |
| 15 | 43 billion | 12 hours |
| 20 | 60 quintillion | 1,927 years |
| 50 | 3.04×10⁶⁴ | Longer than universe exists |
| 500 | ∞ | Impossible |
Formula: (n-1)! / 2 tours for n cities
Our Input: 500 Cities
Possible tours: ~10^1134
For perspective:
- Atoms in universe: ~10^80
- This is 10^1054 times more!
Conclusion: We can't check all tours. We need a smart approach! → Genetic Algorithm
🌍 TSP in 3D Space
This implementation uses 3D coordinates:
City format: (x, y, z)
Example cities:
City 1: (66, 153, 121)
City 2: (116, 152, 189)
City 3: (95, 47, 90)
...
Distance between cities:
d = √[(x₂-x₁)² + (y₂-y₁)² + (z₂-z₁)²]
Example:
City 1 to City 2:
d = √[(116-66)² + (152-153)² + (189-121)²]
d = √[2500 + 1 + 4624]
d = √7125 ≈ 84.4 units
📊 TSP Variants
1. Symmetric TSP (This Implementation)
Distance A→B = Distance B→A
A →(10km)→ B
A ←(10km)← B (Same distance both ways)
2. Asymmetric TSP
Distance A→B ≠ Distance B→A (e.g., one-way streets, wind)
A →(10km)→ B
A ←(15km)← B (Different distances)
3. Euclidean TSP (This Implementation)
Cities have coordinates, distance is Euclidean
City A: (0, 0)
City B: (3, 4)
Distance: √(3² + 4²) = 5
4. Metric TSP
Triangle inequality holds: d(A,C) ≤ d(A,B) + d(B,C)
B
/|\
5/ | \4
/ | \
A 3 | C
Direct A→C (3) ≤ Via B: A→B→C (5+4)
🎯 Real-World Applications
1. Logistics & Delivery
Amazon delivery truck needs to deliver 50 packages:
- Each stop is a "city"
- Find shortest route
- Save time, fuel, money
Impact: 1% route improvement = millions saved
2. Manufacturing
Circuit board drilling:
- Drill 1000 holes
- Minimize drill movement
- Faster production
3. DNA Sequencing
Assemble DNA fragments:
- Each fragment is a "city"
- Find correct order
- Critical for genome projects
4. Telescope Scheduling
Observe 100 stars:
- Minimize telescope movement
- Maximize observation time
- Optimize expensive telescope time
🧮 Solution Quality Measures
Optimal vs Approximate
Optimal Solution: 48,000km (guaranteed best)
Our GA Solution: 48,234km (0.5% from optimal)
For 500 cities:
- Optimal: Unknown (can't calculate)
- Random tour: ~150,000km
- Greedy: ~65,000km
- Our GA: ~48,000km
Performance Metrics
Solution Quality:
Gap = (GA_solution - Best_known) / Best_known × 100%
<1%: Excellent
<5%: Good
<10%: Acceptable
> 10%: Poor
Runtime:
Our GA (500 cities): ~5 minutes
Exact algorithm: Impossible
Trade-off: Quick approximate vs slow optimal
🔍 TSP Hardness: NP-Complete
What does NP-Complete mean?
NP: Non-deterministic Polynomial time
- Easy to verify a solution
- Hard to find the solution
Complete: As hard as any problem in NP
- If you solve TSP efficiently, you solve ALL NP problems!
Practical Implications
Can't find optimal solution for large instances
↓
Use approximation algorithms
↓
Genetic Algorithm is one approach
↓
Get "good enough" solution quickly
📈 TSP Difficulty Factors
Problem Size (Number of Cities)
Small (< 20): Solvable exactly
Medium (20-100): Challenging but manageable
Large (100-1000): Need heuristics
Huge (>1000): Very difficult
City Distribution
Clustered cities:
●●
●● ●
●
(Easier - natural groups)
Random distribution:
● ●
● ● ●
● ●
(Harder - no obvious structure)
Dimension
2D: Easier to visualize
3D: Our implementation
4D+: Very abstract
🛠️ Solution Approaches
Exact Algorithms
Branch and Bound, Dynamic Programming
✅ Guarantees optimal solution ❌ Only practical for small instances (<20 cities)
Heuristic Algorithms
Nearest Neighbor, 2-opt, 3-opt
✅ Fast ❌ No quality guarantee ✅ Good for initial solutions
Metaheuristics
Genetic Algorithm, Simulated Annealing, Ant Colony
✅ Good balance of quality and speed ✅ Suitable for large instances ❌ No optimality guarantee
Our Approach: Genetic Algorithm
Why GA for TSP?
1. Handles 500 cities well
2. Finds good (not optimal) solutions
3. Reasonable runtime (~5 min)
4. Flexible and extensible
5. No problem-specific knowledge needed
📝 Input Format for Our Implementation
File: input/input1.txt
Line 1: Number of cities
500
Lines 2-501: City coordinates (x, y, z)
66 153 121
116 152 189
95 47 90
...
Each line represents one city in 3D space.
🎲 Challenges in TSP
1. Local Optima
Current tour: 50,000km (local best)
Small changes: 51,000km, 52,000km (all worse)
But there exists: 48,000km (global best)
Need to "jump" to find it!
GA Solution: Mutation helps escape local optima
2. Premature Convergence
Generation 100: Population converges to 55,000km
But better solutions exist!
GA Solution: Maintain diversity through population
3. Scale
500 cities = 125,000 distances to consider
Search space: 10^1134 possible tours
GA Solution: Smart sampling of search space
📚 Further Reading
- GA Explained - How GAs solve TSP
- Architecture Deep Dive - Implementation details
- Performance Tuning - Optimize TSP solutions
- Quick Start - Run TSP solver
🧪 Try It
Run the TSP solver on 500 cities:
python main.py
Watch it find a tour of ~48,000km in minutes - something exact algorithms can't do!
Next: See how it's implemented → Architecture Deep Dive