TSP‐Overview - adylagad/CSCI-561_Genetic-Algorithm GitHub Wiki

Traveling Salesman Problem (TSP) Overview

Understanding the classic optimization problem this framework solves.

🎯 What is TSP?

The Traveling Salesman Problem asks:

"Given a list of cities and the distances between them, what is the shortest possible route that visits each city exactly once and returns to the starting city?"

Simple Example


4 cities: A, B, C, D

Distances:
A-B: 10km B-C: 15km C-D: 12km
A-C: 8km B-D: 20km
A-D: 18km

Question: What's the shortest tour?

All Possible Tours


Tour 1: A → B → C → D → A = 10 + 15 + 12 + 18 = 55km
Tour 2: A → B → D → C → A = 10 + 20 + 12 + 8 = 50km ✓
Tour 3: A → C → B → D → A = 8 + 15 + 20 + 18 = 61km
Tour 4: A → C → D → B → A = 8 + 12 + 20 + 10 = 50km ✓
Tour 5: A → D → B → C → A = 18 + 20 + 15 + 8 = 61km
Tour 6: A → D → C → B → A = 18 + 12 + 15 + 10 = 55km

Best tours: #2 and #4 with 50km

Easy with 4 cities, but...

😱 The Complexity Problem

Why TSP is Hard

Number of possible tours grows factorially:

Cities Possible Tours Time to Check All (1µs each)
5 12 0.000012 seconds
10 181,440 0.18 seconds
15 43 billion 12 hours
20 60 quintillion 1,927 years
50 3.04×10⁶⁴ Longer than universe exists
500 Impossible

Formula: (n-1)! / 2 tours for n cities

Our Input: 500 Cities


Possible tours: ~10^1134

For perspective:

-   Atoms in universe: ~10^80
-   This is 10^1054 times more!

Conclusion: We can't check all tours. We need a smart approach! → Genetic Algorithm

🌍 TSP in 3D Space

This implementation uses 3D coordinates:


City format: (x, y, z)

Example cities:
City 1: (66, 153, 121)
City 2: (116, 152, 189)
City 3: (95, 47, 90)
...

Distance between cities:
d = √[(x₂-x₁)² + (y₂-y₁)² + (z₂-z₁)²]

Example:
City 1 to City 2:
d = √[(116-66)² + (152-153)² + (189-121)²]
d = √[2500 + 1 + 4624]
d = √7125 ≈ 84.4 units

📊 TSP Variants

1. Symmetric TSP (This Implementation)

Distance A→B = Distance B→A


A →(10km)→ B
A ←(10km)← B (Same distance both ways)

2. Asymmetric TSP

Distance A→B ≠ Distance B→A (e.g., one-way streets, wind)


A →(10km)→ B
A ←(15km)← B (Different distances)

3. Euclidean TSP (This Implementation)

Cities have coordinates, distance is Euclidean


City A: (0, 0)
City B: (3, 4)
Distance: √(3² + 4²) = 5

4. Metric TSP

Triangle inequality holds: d(A,C) ≤ d(A,B) + d(B,C)


     B
    /|\
  5/ | \4
   / | \
 A 3 | C

Direct A→C (3) ≤ Via B: A→B→C (5+4)

🎯 Real-World Applications

1. Logistics & Delivery


Amazon delivery truck needs to deliver 50 packages:

-   Each stop is a "city"
-   Find shortest route
-   Save time, fuel, money

Impact: 1% route improvement = millions saved

2. Manufacturing


Circuit board drilling:

-   Drill 1000 holes
-   Minimize drill movement
-   Faster production

3. DNA Sequencing


Assemble DNA fragments:

-   Each fragment is a "city"
-   Find correct order
-   Critical for genome projects

4. Telescope Scheduling


Observe 100 stars:

-   Minimize telescope movement
-   Maximize observation time
-   Optimize expensive telescope time

🧮 Solution Quality Measures

Optimal vs Approximate


Optimal Solution: 48,000km (guaranteed best)
Our GA Solution: 48,234km (0.5% from optimal)

For 500 cities:

-   Optimal: Unknown (can't calculate)
-   Random tour: ~150,000km
-   Greedy: ~65,000km
-   Our GA: ~48,000km

Performance Metrics

Solution Quality:


Gap = (GA_solution - Best_known) / Best_known × 100%

<1%: Excellent
<5%: Good
<10%: Acceptable

> 10%: Poor

Runtime:


Our GA (500 cities): ~5 minutes
Exact algorithm: Impossible

Trade-off: Quick approximate vs slow optimal

🔍 TSP Hardness: NP-Complete

What does NP-Complete mean?

NP: Non-deterministic Polynomial time

  • Easy to verify a solution
  • Hard to find the solution

Complete: As hard as any problem in NP

  • If you solve TSP efficiently, you solve ALL NP problems!

Practical Implications


Can't find optimal solution for large instances
↓
Use approximation algorithms
↓
Genetic Algorithm is one approach
↓
Get "good enough" solution quickly

📈 TSP Difficulty Factors

Problem Size (Number of Cities)


Small (< 20): Solvable exactly
Medium (20-100): Challenging but manageable
Large (100-1000): Need heuristics
Huge (>1000): Very difficult

City Distribution


Clustered cities:
●●
●● ●
●
(Easier - natural groups)

Random distribution:
● ●
● ● ●
● ●
(Harder - no obvious structure)

Dimension


2D: Easier to visualize
3D: Our implementation
4D+: Very abstract

🛠️ Solution Approaches

Exact Algorithms

Branch and Bound, Dynamic Programming

✅ Guarantees optimal solution ❌ Only practical for small instances (<20 cities)

Heuristic Algorithms

Nearest Neighbor, 2-opt, 3-opt

✅ Fast ❌ No quality guarantee ✅ Good for initial solutions

Metaheuristics

Genetic Algorithm, Simulated Annealing, Ant Colony

✅ Good balance of quality and speed ✅ Suitable for large instances ❌ No optimality guarantee

Our Approach: Genetic Algorithm


Why GA for TSP?

1. Handles 500 cities well
2. Finds good (not optimal) solutions
3. Reasonable runtime (~5 min)
4. Flexible and extensible
5. No problem-specific knowledge needed

📝 Input Format for Our Implementation


File: input/input1.txt

Line 1: Number of cities
500

Lines 2-501: City coordinates (x, y, z)
66 153 121
116 152 189
95 47 90
...

Each line represents one city in 3D space.

🎲 Challenges in TSP

1. Local Optima


Current tour: 50,000km (local best)
Small changes: 51,000km, 52,000km (all worse)

But there exists: 48,000km (global best)
Need to "jump" to find it!

GA Solution: Mutation helps escape local optima

2. Premature Convergence


Generation 100: Population converges to 55,000km
But better solutions exist!

GA Solution: Maintain diversity through population

3. Scale


500 cities = 125,000 distances to consider
Search space: 10^1134 possible tours

GA Solution: Smart sampling of search space

📚 Further Reading

🧪 Try It

Run the TSP solver on 500 cities:

python main.py

Watch it find a tour of ~48,000km in minutes - something exact algorithms can't do!


Next: See how it's implemented → Architecture Deep Dive