Kong – Failover Test

On-Premises Failover Scenario – 4 Servers Across 2 Zones

Physical Layout

  • Zone A (Rack A): Server A1, Server A2
  • Zone B (Rack B): Server B1, Server B2
  • Zones occupy separate racks and power circuits, and ideally separate rooms or buildings.
  • Redundant network paths and power per zone.
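
The layout above can be modeled as plain data, which the failover reasoning in the later sections operates on. A minimal sketch (server and zone names are the ones from this document; the helper is hypothetical):

```python
# The 4-server / 2-zone topology as data, plus a helper that lists
# which servers are still eligible for traffic given a set of failures.
TOPOLOGY = {
    "zone-a": ["a1", "a2"],
    "zone-b": ["b1", "b2"],
}

def available_servers(topology, down=frozenset()):
    """Return servers not marked down, grouped by zone."""
    return {
        zone: [s for s in servers if s not in down]
        for zone, servers in topology.items()
    }

# Whole Zone A dark: only B1/B2 remain in rotation.
print(available_servers(TOPOLOGY, down={"a1", "a2"}))
# {'zone-a': [], 'zone-b': ['b1', 'b2']}
```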

⚙️ Typical Architecture

  • HA Load Balancer: LVS / HAProxy / Keepalived / F5 (active-passive or active-active)
  • Heartbeat/Health Monitoring: Keepalived, Corosync, or Pacemaker
  • Shared State (optional): GlusterFS, DRBD, etcd, or replicated DB

🔁 Failover Scenarios

1. Single Server Failure (e.g., A1 goes down)

  • Load balancer marks A1 as unhealthy.
  • A2, B1, B2 continue serving traffic.
  • No impact to availability.
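
The "marks A1 as unhealthy" step is typically threshold-based: a server is ejected after a few consecutive failed probes and readmitted after a few consecutive successes (HAProxy calls these `fall` and `rise`). A sketch with assumed thresholds:

```python
# Threshold-based health tracking, as load balancers like HAProxy do it:
# eject after `fall` consecutive probe failures, readmit after `rise`
# consecutive successes.
class HealthTracker:
    def __init__(self, fall=3, rise=2):
        self.fall, self.rise = fall, rise
        self.failures = {}
        self.successes = {}
        self.healthy = {}

    def observe(self, server, probe_ok):
        self.healthy.setdefault(server, True)
        if probe_ok:
            self.failures[server] = 0
            self.successes[server] = self.successes.get(server, 0) + 1
            if self.successes[server] >= self.rise:
                self.healthy[server] = True
        else:
            self.successes[server] = 0
            self.failures[server] = self.failures.get(server, 0) + 1
            if self.failures[server] >= self.fall:
                self.healthy[server] = False

    def in_rotation(self, servers):
        return [s for s in servers if self.healthy.get(s, True)]

tracker = HealthTracker()
for _ in range(3):          # A1 fails three probes in a row
    tracker.observe("a1", probe_ok=False)
print(tracker.in_rotation(["a1", "a2", "b1", "b2"]))  # ['a2', 'b1', 'b2']
```

Requiring several consecutive failures avoids flapping a server out of rotation on a single lost probe.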

2. Zone Failure (e.g., entire Rack A fails — A1 and A2)

  • Power/network failure in Zone A.
  • Load balancer detects both A1 and A2 as down.
  • All traffic is redirected to B1 and B2.
  • Capacity-plan so that B1 and B2 alone can absorb the full peak load.
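
The capacity question can be checked with back-of-envelope arithmetic (all numbers below are illustrative assumptions, not measurements from this setup):

```python
# With per-server capacity C and peak load P, the surviving zone's two
# servers must satisfy 2 * C * headroom >= P, where `headroom` leaves
# margin for retries and the failover traffic spike itself.
def zone_can_absorb(peak_rps, per_server_rps, servers_in_zone=2, headroom=0.8):
    """True if one zone can carry the full peak below `headroom` utilisation."""
    usable = servers_in_zone * per_server_rps * headroom
    return usable >= peak_rps

# Example: 4 servers normally share 6,000 rps; each can handle 4,000 rps.
print(zone_can_absorb(peak_rps=6000, per_server_rps=4000))  # True
print(zone_can_absorb(peak_rps=7000, per_server_rps=4000))  # False
```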

3. Intermittent Network Failure in One Zone

  • Heartbeat may detect nodes as “split”.
  • Use quorum-based membership and fencing mechanisms (e.g., STONITH) to avoid split-brain.
  • Pacemaker/Corosync can help in cluster management and decision-making.
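
The quorum rule these tools apply is simple: a partition may keep running only if it sees a strict majority of votes; otherwise it must stop (or be fenced) rather than risk split-brain. A sketch:

```python
# A partition has quorum only with a strict majority of the total votes.
def has_quorum(votes_seen, total_votes):
    return votes_seen > total_votes // 2

# The 4-node cluster split 2/2 down the zone boundary: neither side has
# quorum, so both stop. This is why an odd vote count (e.g., a 5th
# arbitrator vote outside both zones) is recommended.
print(has_quorum(2, 4))  # False
print(has_quorum(3, 5))  # True
```

Note that with exactly two equal zones, any clean zone-vs-zone partition is a tie, so a tiebreaker outside both zones is essentially mandatory.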

4. Load Balancer Node Fails

  • Use HA load balancer pair with VRRP (Keepalived) or hardware failover.
  • Virtual IP (VIP) is moved to the standby node.
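
With Keepalived, the VIP move is driven by VRRP. An illustrative fragment of `keepalived.conf` (interface name, router ID, priorities, and the VIP are placeholders, not values from this setup):

```
vrrp_instance VI_1 {
    state MASTER            # set to BACKUP on the standby node
    interface eth0          # NIC carrying VRRP advertisements
    virtual_router_id 51    # must match on both nodes
    priority 150            # standby uses a lower value, e.g. 100
    advert_int 1            # advertisement interval in seconds
    virtual_ipaddress {
        10.0.0.100/24       # the VIP that clients connect to
    }
}
```

When the master stops advertising, the standby claims the VIP after missing a few advertisement intervals, so clients keep using the same address.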

5. Storage or DB Node Fails

  • If using shared storage or clustered DBs:
    • Ensure data replication (synchronous if possible).
    • Use quorum-aware systems (an odd number of nodes is ideal; otherwise add an external arbitrator/witness).
    • DRBD or GlusterFS with quorum can help avoid data corruption.
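
At the write level, "synchronous replication with quorum" means a write commits only once a majority of replicas acknowledge it, so a single failed storage node cannot silently diverge the data. A minimal sketch (hypothetical helper, not a DRBD or GlusterFS API):

```python
# A write commits only with acknowledgements from a majority of replicas.
def quorum_write(replica_acks, total_replicas):
    """True if enough replicas persisted the write to commit it."""
    needed = total_replicas // 2 + 1
    return replica_acks >= needed

print(quorum_write(replica_acks=2, total_replicas=3))  # True: 2-of-3 majority
print(quorum_write(replica_acks=1, total_replicas=2))  # False: 2-node tie risk
```

The 2-replica case shows why an odd replica count (or an external arbitrator) matters: with only two copies, losing one leaves the survivor unable to prove it holds the latest data.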
