Name	Name	Last commit message	Last commit date
parent directory ..
init	init
DOCUMENTATION.md	DOCUMENTATION.md
README.md	README.md
docker-compose.yml	docker-compose.yml
go.mod	go.mod
go.sum	go.sum
main.go	main.go

Simulate Automatic Failover Using Docker

Tech Stack

Programming Language: Go 1.26.0
Container: Docker Compose
Database: PostgreSQL 18 (1 primary + 1 hot standby)

Overview

Automatic failover keeps a database available when the primary dies: a manager detects the outage and promotes a standby to take over as the new primary, then redirects writes to it. This project implements the core control loop — health probing, a failure threshold, and pg_promote() — in ~150 lines so the mechanism is explicit. Production tools (Patroni, repmgr, pg_auto_failover) wrap this with distributed leader election and fencing.

The demo actually stops the primary container mid-run, so you watch a real outage get detected and recovered with no data loss.

Related project: simulate_read_replicas sets up the same primary/standby streaming replication. Here the focus is what happens when the primary fails.

Key Concepts Demonstrated

Health checking — periodic SELECT 1 probes with a timeout.
Failure threshold — fail over only after N consecutive misses (avoid flapping).
Promotion — pg_promote() makes the standby writable; wait for pg_is_in_recovery() = false.
Write redirection — the manager routes writes to the promoted node.
No data loss — rows written before the crash survive on the promoted node.

Architecture

        ┌──────────────── Failover Manager (Go) ────────────────┐
        │ loop: SELECT 1 on primary every 1s                    │
        │ 3 consecutive failures → pg_promote(standby)          │
        │ wait pg_is_in_recovery()=false → route writes to it   │
        └───────────────┬───────────────────────┬───────────────┘
                        │ probes                 │ promote
              ┌─────────▼────────┐     stream    ┌▼─────────────────┐
              │ Primary  :5470   │ ────WAL────►  │ Standby  :5471   │
              │ (stopped mid-demo)│               │ → NEW PRIMARY    │
              └──────────────────┘               └──────────────────┘

How to Run

docker compose up -d      # primary + standby (standby clones via pg_basebackup)
docker compose ps         # wait until both healthy
go mod tidy
go run main.go            # the program stops the primary container itself

The program calls docker stop failover_primary to simulate the crash. If your user can't run docker without sudo, the program prints the command to run manually in another terminal.

Expected Output

A row is written and replicated to the standby. The manager then stops the primary, its health probes start failing, and after 3 misses it promotes the standby. A post-failover write succeeds on the new primary, and the final count includes both pre- and post-failover rows — proving no data was lost.

Restore After the Demo

docker compose up -d postgres_primary   # bring the old node back (now a stale ex-primary)
# In production you'd re-clone it as a standby of the new primary (pg_rewind / pg_basebackup).
docker compose down -v                  # or wipe everything

A Note on Split-Brain

If the old primary comes back still thinking it's the primary, you have two writable nodes = split-brain and divergent data. Real HA tools prevent this with fencing (STONITH), a distributed consensus store (etcd/Consul) for a single source of truth on who is leader, and quorum so a minority partition won't promote itself. This demo omits fencing for clarity — don't ship it as-is.

Cleanup

docker compose down -v

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

README.md

Simulate Automatic Failover Using Docker

Tech Stack

Overview

Key Concepts Demonstrated

Architecture

How to Run

Expected Output

Restore After the Demo

A Note on Split-Brain

Cleanup

FilesExpand file tree

simulate_automatic_failover

Directory actions

More options

Directory actions

More options

Latest commit

History

simulate_automatic_failover

Folders and files

parent directory

README.md

Simulate Automatic Failover Using Docker

Tech Stack

Overview

Key Concepts Demonstrated

Architecture

How to Run

Expected Output

Restore After the Demo

A Note on Split-Brain

Cleanup