SPOF (Single Point of Failure) - VittorioDeMarzi/hero-beans GitHub Wiki

SPOF: a single component in the system or infrastructure whose failure alone brings the whole service down.

To avoid it, use redundancy and failover.

Redundancy: having multiple copies of a critical component so that if one fails, another can take over.
Failover: the automatic switch to a redundant component when the primary one fails.
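
The two definitions above can be sketched in a few lines. This is only an illustration, not how any particular product implements failover: `FailoverSketch` and `callWithFailover` are hypothetical names, and each "replica" is simply a `Supplier` that either answers or throws.

```java
import java.util.List;
import java.util.function.Supplier;

// Hypothetical helper, not part of any library: tries redundant replicas in order.
final class FailoverSketch {
    private FailoverSketch() {}

    /** Returns the result of the first replica that does not throw. */
    static <T> T callWithFailover(List<Supplier<T>> replicas) {
        RuntimeException last = null;
        for (Supplier<T> replica : replicas) {
            try {
                return replica.get();   // first healthy replica wins
            } catch (RuntimeException e) {
                last = e;               // remember the failure, move on to the next copy
            }
        }
        // Redundancy exhausted: every copy failed, so the caller sees the outage.
        throw new IllegalStateException("all replicas failed", last);
    }

    public static void main(String[] args) {
        Supplier<String> primary = () -> { throw new RuntimeException("primary is down"); };
        Supplier<String> standby = () -> "served by standby";
        System.out.println(callWithFailover(List.of(primary, standby))); // prints "served by standby"
    }
}
```

With only one entry in the list, the first failure is already an outage, which is exactly the SPOF case.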

Disclaimer

"Avoid adopting technologies just to solve problems that haven’t happened yet." This page is not a definitive guide. Test, weigh the options, and decide; don't follow it blindly.

How to solve it?

Spring Boot

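Run multiple replicas of the application (Docker containers / Kubernetes pods).
Put them behind a load balancer such as Nginx (a single Nginx is itself a SPOF, see the Nginx section).
If one instance runs out of memory or restarts, the others keep handling traffic.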
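
As a rough picture of what the load balancer does for the replicas, here is a minimal round-robin sketch that skips instances marked as down. It is a simplified stand-in, not Nginx's actual algorithm; `RoundRobinBalancer` and the instance names `app-1` to `app-3` are made up for the example.

```java
import java.util.List;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical, simplified stand-in for a load balancer in front of the replicas.
final class RoundRobinBalancer {
    private final List<String> instances;
    private final Set<String> down = ConcurrentHashMap.newKeySet();
    private final AtomicInteger cursor = new AtomicInteger();

    RoundRobinBalancer(List<String> instances) {
        this.instances = List.copyOf(instances);
    }

    void markDown(String instance) { down.add(instance); }

    void markUp(String instance) { down.remove(instance); }

    /** Picks the next healthy instance, skipping any that are marked down. */
    String next() {
        for (int i = 0; i < instances.size(); i++) {
            int index = Math.floorMod(cursor.getAndIncrement(), instances.size());
            String candidate = instances.get(index);
            if (!down.contains(candidate)) {
                return candidate;
            }
        }
        throw new IllegalStateException("no healthy instance left");
    }

    public static void main(String[] args) {
        RoundRobinBalancer lb = new RoundRobinBalancer(List.of("app-1", "app-2", "app-3"));
        lb.markDown("app-2"); // e.g. the container was OOM-killed and is restarting
        for (int i = 0; i < 4; i++) {
            System.out.println(lb.next()); // app-1, app-3, app-1, app-3
        }
    }
}
```

Traffic keeps flowing as long as at least one replica is healthy; with a single replica, `markDown` on it means every request fails.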

MySQL

X Don't run MySQL in a single Docker container
Use RDS MySQL Multi-AZ (or Aurora).
- Automated failover if the primary goes down.
- Backups & monitoring built-in.
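Handle failover on the client side in Spring Boot: multi-host failover is a feature of the MySQL Connector/J JDBC URL, not of HikariCP itself. HikariCP passes the URL to the driver and replaces broken connections in the pool.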
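
A minimal sketch of a multi-host JDBC URL, assuming MySQL Connector/J as the driver (its failover syntax lists the primary first, then the fallbacks, separated by commas). The host names and the database name `herobeans` are placeholders, and `MysqlFailoverUrl` is a hypothetical helper. Note that RDS Multi-AZ exposes a single endpoint whose DNS record moves to the standby on failover, so a host list like this mainly applies when you manage the replicas yourself.

```java
import java.util.List;

// Hypothetical helper: assembles a MySQL Connector/J failover URL from a host list.
final class MysqlFailoverUrl {
    private MysqlFailoverUrl() {}

    static String build(List<String> hosts, String database) {
        if (hosts.isEmpty()) {
            throw new IllegalArgumentException("at least one host is required");
        }
        // Connector/J treats the first host as the primary and the rest as fallbacks.
        return "jdbc:mysql://" + String.join(",", hosts) + "/" + database;
    }

    public static void main(String[] args) {
        // Placeholder hosts; replace with your own endpoints.
        String url = build(
                List.of("db-primary.example.internal:3306", "db-standby.example.internal:3306"),
                "herobeans");
        // The value would go into application.properties; HikariCP hands it to the driver unchanged.
        System.out.println("spring.datasource.url=" + url);
    }
}
```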

Nginx

X Don’t run a single Nginx on one EC2 instance: that instance becomes the SPOF.
Use AWS ALB (Application Load Balancer) instead of managing Nginx yourself.
- Handles TLS termination.
- Distributes requests across multiple app instances.
- Scales automatically.

Infrastructure

If you run everything in one VM, the VM itself is a SPOF.
Prefer cloud-managed services; the examples here use AWS.
X One EC2 instance = SPOF.
Use multiple AZs (Availability Zones):
- ALB → routes across multiple AZs automatically.
- ECS/EKS → deploy tasks/pods in at least 2 AZs.
- RDS Multi-AZ → standby in another AZ.
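
To see why "at least 2 AZs" matters, the sketch below spreads replicas over zones and counts what survives when one zone is lost. It is a planning illustration only: real placement is done by the ECS/EKS scheduler, and `ZoneSpread` and the zone names `az-a` / `az-b` are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical planning helper; real placement is done by the ECS/EKS scheduler.
final class ZoneSpread {
    private ZoneSpread() {}

    /** Distributes replicas over the zones as evenly as possible (round-robin). */
    static Map<String, Integer> spread(int replicas, List<String> zones) {
        if (zones.isEmpty()) {
            throw new IllegalArgumentException("need at least one zone");
        }
        Map<String, Integer> perZone = new LinkedHashMap<>();
        for (String zone : zones) {
            perZone.put(zone, 0);
        }
        for (int i = 0; i < replicas; i++) {
            perZone.merge(zones.get(i % zones.size()), 1, Integer::sum);
        }
        return perZone;
    }

    /** Number of replicas still running if the given zone is lost entirely. */
    static int survivorsWithout(String lostZone, Map<String, Integer> perZone) {
        int total = 0;
        for (Map.Entry<String, Integer> entry : perZone.entrySet()) {
            if (!entry.getKey().equals(lostZone)) {
                total += entry.getValue();
            }
        }
        return total;
    }

    public static void main(String[] args) {
        Map<String, Integer> oneZone = spread(4, List.of("az-a"));
        Map<String, Integer> twoZones = spread(4, List.of("az-a", "az-b"));
        System.out.println(survivorsWithout("az-a", oneZone));  // 0: the zone was a SPOF
        System.out.println(survivorsWithout("az-a", twoZones)); // 2: half the capacity survives
    }
}
```

The surviving replicas still have to carry the full load, so spreading across zones only helps if the remaining capacity is enough or can scale up quickly.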