On-Premises Failover Scenario – 4 Servers Across 2 Zones
Physical Layout
- Zone A (Rack A): Server A1, Server A2
- Zone B (Rack B): Server B1, Server B2
- Each zone has its own rack and power circuit, and ideally its own room or building.
- Redundant network paths and power per zone.
⚙️ Typical Architecture
- HA Load Balancer: LVS or HAProxy with Keepalived for VIP failover, or F5 (active-passive or active-active)
- Heartbeat/Health Monitoring: Keepalived, or Corosync with Pacemaker
- Shared State (optional): GlusterFS, DRBD, etcd, or replicated DB
🔁 Failover Scenarios
1. Single Server Failure (e.g., A1 goes down)
- Load balancer marks A1 as unhealthy.
- A2, B1, B2 continue serving traffic.
- No impact on availability, as long as the three remaining servers have enough capacity for the current load.
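How a load balancer decides that A1 is unhealthy can be sketched as a counter over consecutive check results. The snippet below is a minimal illustration, not the behavior of any specific product: the `BackendHealth` class, the `healthy_backends` helper, and the threshold values are assumptions made for this example. The thresholds are loosely modeled on the `rise` and `fall` options that HAProxy exposes for server checks.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class BackendHealth:
    """Tracks one backend's state from a stream of health-check results."""
    fall: int = 3         # consecutive failures before marking down (assumed value)
    rise: int = 2         # consecutive successes before marking up (assumed value)
    healthy: bool = True
    _streak: int = 0      # run of results that contradict the current state

    def record(self, check_passed: bool) -> bool:
        """Feed one check result and return the resulting healthy flag."""
        if check_passed == self.healthy:
            # Result agrees with the current state: reset the streak.
            self._streak = 0
        else:
            self._streak += 1
            threshold = self.rise if check_passed else self.fall
            if self._streak >= threshold:
                self.healthy = check_passed
                self._streak = 0
        return self.healthy


def healthy_backends(pool: dict[str, BackendHealth]) -> list[str]:
    """Names of backends that should currently receive traffic."""
    return sorted(name for name, state in pool.items() if state.healthy)


# Hypothetical pool matching the four servers in this scenario.
pool = {name: BackendHealth() for name in ("A1", "A2", "B1", "B2")}
for _ in range(3):          # A1 fails three checks in a row
    pool["A1"].record(False)
print(healthy_backends(pool))  # ['A2', 'B1', 'B2']
```

Requiring several consecutive failures avoids removing a server because of one dropped probe, at the cost of a slightly slower reaction to a real outage.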
2. Zone Failure (e.g., entire Rack A fails — A1 and A2)
- Power/network failure in Zone A.
- Load balancer detects both A1 and A2 as down.
- All traffic is redirected to B1 and B2.
- B1 and B2 must be sized to carry the full load between them, so each server needs capacity for at least half of peak traffic.
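The capacity requirement for a zone failure can be checked with simple arithmetic. The sketch below assumes traffic is spread evenly across surviving servers; the function name and the request rates are hypothetical values chosen for illustration, not measurements.

```python
def utilization_after_failure(peak_load: float, per_server_capacity: float,
                              total_servers: int, failed_servers: int) -> float:
    """Fraction of capacity in use on each surviving server at peak load.

    Assumes the load balancer spreads traffic evenly across survivors.
    """
    survivors = total_servers - failed_servers
    if survivors <= 0:
        raise ValueError("no servers left to carry the load")
    return peak_load / (survivors * per_server_capacity)


# Hypothetical numbers: 1200 req/s at peak, each server handles 1000 req/s.
normal = utilization_after_failure(1200, 1000, total_servers=4, failed_servers=0)
zone_down = utilization_after_failure(1200, 1000, total_servers=4, failed_servers=2)
print(f"all servers up: {normal:.0%}")     # all servers up: 30%
print(f"zone A down:    {zone_down:.0%}")  # zone A down:    60%
```

A result above 100% for the zone-down case means the surviving zone cannot absorb peak traffic and the deployment needs more capacity per server or more servers per zone.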
3. Intermittent Network Failure in One Zone
- Lost heartbeats can make each zone believe the other is down, leaving the cluster partitioned (“split”).
- Use quorum-based or fencing mechanisms (STONITH) to avoid split-brain.
- Pacemaker with Corosync can track cluster membership and decide which partition is allowed to keep running resources.
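The quorum rule behind split-brain avoidance is a strict-majority test. The sketch below is a generic illustration with assumed function names, not the implementation used by any particular cluster stack. It shows why a clean split between two zones of two servers leaves neither side with quorum, and how one extra vote from a tiebreaker outside both zones changes that.

```python
def votes_needed(total_votes: int) -> int:
    """Smallest number of votes that forms a strict majority."""
    return total_votes // 2 + 1


def has_quorum(visible_votes: int, total_votes: int) -> bool:
    """True if a partition that sees `visible_votes` may keep running."""
    return visible_votes >= votes_needed(total_votes)


# Four voters (A1, A2, B1, B2), zones cut off from each other: 2 vs 2.
print(has_quorum(2, 4))  # False: neither zone holds a majority

# Five voters: the four servers plus one tiebreaker vote in a third location.
print(has_quorum(3, 5))  # True: the zone that still reaches the tiebreaker
print(has_quorum(2, 5))  # False: the isolated zone must stop
```

With only four votes, quorum alone would stop both zones during a split, which is why this layout usually relies on fencing, a tiebreaker vote, or both.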
4. Load Balancer Node Fails
- Use an HA load balancer pair with VRRP (Keepalived) or hardware failover.
- When the active node fails, the virtual IP (VIP) moves to the standby node, which takes over traffic.
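The VIP handover can be pictured as a priority election among the load balancer nodes that are still alive. The sketch below is a simplified model, not Keepalived's code: the node names `lb1` and `lb2` and their priorities are hypothetical, and ties between equal priorities are not modeled.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class LbNode:
    name: str
    priority: int       # higher priority wins the election, as in VRRP
    alive: bool = True


def vip_holder(nodes: list[LbNode]) -> str | None:
    """Name of the node that should hold the VIP, or None if all are down."""
    candidates = [node for node in nodes if node.alive]
    if not candidates:
        return None
    return max(candidates, key=lambda node: node.priority).name


# Hypothetical pair: lb1 is the preferred active node, lb2 the standby.
print(vip_holder([LbNode("lb1", 150), LbNode("lb2", 100)]))               # lb1
print(vip_holder([LbNode("lb1", 150, alive=False), LbNode("lb2", 100)]))  # lb2
```

In this model the higher-priority node reclaims the VIP as soon as it returns, which corresponds to preemption; some deployments disable that to avoid a second interruption after recovery.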
5. Storage or DB Node Fails
- If using shared storage or clustered DBs:
  - Replicate data between zones (synchronously if possible, so that an acknowledged write survives the loss of one zone).
  - Use quorum-aware systems (an odd number of voting nodes is ideal; with two zones, consider an external arbitrator).
  - DRBD or GlusterFS with quorum enabled can help avoid data corruption.