SPOF (Single Point of Failure) - VittorioDeMarzi/hero-beans GitHub Wiki

SPOF: a component with no backup. If it goes down, the whole service goes down with it.

To avoid it, use redundancy and failover.

Redundancy: having multiple copies of a critical component, so that if one fails, another can take over.
Failover: automatically switching to a redundant component when the primary one fails.
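The two definitions above can be sketched in a few lines of plain Java. This is only an illustration, not how any particular product implements failover: the class name `Failover` and its `call` method are hypothetical, and each replica is modelled as a `Supplier` that either returns a result or throws.

```java
import java.util.List;
import java.util.function.Supplier;

/** Hypothetical helper: tries redundant replicas in order until one succeeds. */
final class Failover {
    private Failover() {}

    static <T> T call(List<Supplier<T>> replicas) {
        RuntimeException last = null;
        for (Supplier<T> replica : replicas) {
            try {
                return replica.get();   // first replica that answers wins
            } catch (RuntimeException e) {
                last = e;               // remember the failure, try the next copy
            }
        }
        // Redundancy is exhausted: every copy failed.
        throw new IllegalStateException("all replicas failed", last);
    }
}
```

With a primary that throws and a standby that answers, `Failover.call(List.of(primary, standby))` returns the standby's result. With only one entry in the list, that entry is the SPOF.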

Disclaimer

"Avoid adopting technologies just to solve problems that haven’t happened yet." This page is not a definitive guide. Test, weigh the trade-offs, and decide for yourself; don't follow it blindly.

How to solve it?

Spring Boot

  • Run multiple replicas (Docker containers / Kubernetes pods).
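  • Put them behind a load balancer (Nginx, or a managed one such as ALB; see the Nginx section).
  • If one instance runs out of memory (OOM) or restarts, the load balancer sends traffic to the remaining healthy instances.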
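What the load balancer does for the replicas can be sketched as a round-robin picker that skips instances failing a health check. This is a simplified model, not Nginx's or ALB's actual algorithm; the class `RoundRobin`, the instance names, and the health predicate are all hypothetical.

```java
import java.util.List;
import java.util.Optional;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Predicate;

/** Hypothetical round-robin picker that skips instances failing a health check. */
final class RoundRobin {
    private final List<String> instances;
    private final Predicate<String> healthy;
    private final AtomicInteger next = new AtomicInteger();

    RoundRobin(List<String> instances, Predicate<String> healthy) {
        this.instances = List.copyOf(instances);
        this.healthy = healthy;
    }

    /** Returns the next healthy instance, or empty if every replica is down. */
    Optional<String> pick() {
        // Try each instance at most once per call, starting where the last call stopped.
        for (int i = 0; i < instances.size(); i++) {
            int idx = Math.floorMod(next.getAndIncrement(), instances.size());
            String candidate = instances.get(idx);
            if (healthy.test(candidate)) {
                return Optional.of(candidate);
            }
        }
        return Optional.empty();
    }
}
```

With three replicas and the second one unhealthy, successive calls to `pick()` alternate between the first and the third, so a single crashed instance does not interrupt service.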

MySQL

  • X Don't run MySQL in a single Docker container: if the container or its host dies, the database goes with it.
  • Use RDS MySQL Multi-AZ (or Aurora).
    • Automated failover if the primary goes down.
    • Backups & monitoring built-in.
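  • Add connection failover in Spring Boot. HikariCP does not fail over between hosts by itself; it hands the JDBC URL to the driver, and MySQL Connector/J accepts a multi-host failover URL. With RDS Multi-AZ you keep a single endpoint whose DNS record moves to the standby, so make sure stale connections are retired (HikariCP maxLifetime) and the JVM does not cache DNS lookups indefinitely.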
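For a self-managed setup with more than one MySQL host, a Connector/J failover URL lists the hosts in priority order, separated by commas. The helper below is a minimal sketch that only builds such a URL string. The class name, host names, and database name are hypothetical, and `failOverReadOnly=false` is included on the assumption that the fallback host accepts writes.

```java
import java.util.List;

/** Hypothetical helper that builds a MySQL Connector/J failover URL. */
final class MySqlFailoverUrl {
    private MySqlFailoverUrl() {}

    static String build(List<String> hostsInPriorityOrder, String database) {
        if (hostsInPriorityOrder.isEmpty()) {
            throw new IllegalArgumentException("at least one host is required");
        }
        // Connector/J treats the first host as primary and the rest as fallbacks.
        String hosts = String.join(",", hostsInPriorityOrder);
        // Assumption: the fallback host accepts writes, so connections should not
        // be switched to read-only after failing over.
        return "jdbc:mysql://" + hosts + "/" + database + "?failOverReadOnly=false";
    }
}
```

The resulting string can be set as `spring.datasource.url` or passed to `HikariConfig.setJdbcUrl(...)`. With RDS Multi-AZ this is usually unnecessary, because the single RDS endpoint already points at whichever instance is currently primary.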

Nginx

  • X Don’t run a single Nginx on one EC2 instance: that instance is a SPOF.
  • Use AWS ALB (Application Load Balancer) instead of managing Nginx yourself.
    • Handles TLS termination.
    • Distributes requests across multiple app instances.
    • Scales automatically.

Infrastructure

  • If you run everything in one VM, the VM itself is a SPOF.
  • Prefer cloud-managed services (here: AWS).
  • X One EC2 instance = SPOF.
  • Use multiple AZs:
    • ALB → routes across multiple AZs automatically.
    • ECS/EKS → deploy tasks/pods in at least 2 AZs.
    • RDS Multi-AZ → standby in another AZ.
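Why "at least 2 AZs" matters can be checked with a little arithmetic. The sketch below spreads replicas round-robin across zones and counts how many survive if the most heavily loaded zone is lost. It is a toy model with hypothetical names (`AzPlacement`, the zone labels); in practice the ECS or EKS scheduler decides the placement.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical placement sketch: spread replicas round-robin across availability zones. */
final class AzPlacement {
    private AzPlacement() {}

    static Map<String, Integer> spread(int replicas, List<String> zones) {
        if (zones.isEmpty()) {
            throw new IllegalArgumentException("at least one zone is required");
        }
        Map<String, Integer> perZone = new LinkedHashMap<>();
        for (String zone : zones) {
            perZone.put(zone, 0);
        }
        for (int i = 0; i < replicas; i++) {
            perZone.merge(zones.get(i % zones.size()), 1, Integer::sum);
        }
        return perZone;
    }

    /** Replicas still running if the zone holding the most replicas is lost. */
    static int survivorsAfterWorstZoneLoss(Map<String, Integer> perZone) {
        int total = 0;
        int worst = 0;
        for (int count : perZone.values()) {
            total += count;
            worst = Math.max(worst, count);
        }
        return total - worst;
    }
}
```

Three replicas in a single zone leave zero survivors when that zone fails. The same three replicas across two zones leave at least one, which is the difference between an outage and degraded capacity.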