Skip to content

FAILOVER_SCOPE.md — Failover & Promotion

Status

  • Phase: 6
  • Authority: Normative
  • Depends on: FAILOVER_VISION.md
  • Frozen Dependencies: Phases 0–5

1. Purpose of This Document

This document defines the exact scope boundaries of Phase 6.

Its role is to make Phase 6: - Explicit - Finite - Non-expanding - Immune to scope creep

Anything not explicitly allowed here is forbidden.


2. In-Scope Capabilities (Allowed)

Phase 6 is limited to the following capabilities.

2.1 Explicit Replica Promotion

Phase 6 defines: - When a replica may be promoted - What conditions must be satisfied - How promotion is validated - How authority is transferred

Promotion is: - Explicitly triggered - Deterministically evaluated - Either accepted or rejected atomically


2.2 Promotion Safety Validation

Before promotion, the system MUST be able to prove:

  • Replica WAL position is sufficient
  • No acknowledged writes would be lost
  • No dual-primary condition can arise
  • MVCC visibility rules remain intact
  • Replication invariants from Phase 5 are preserved

If proof fails, promotion MUST be rejected.


2.3 Failover State Modeling

Phase 6 introduces: - A formal failover-related state model - Explicit state transitions - Explicit forbidden transitions

No implicit or inferred transitions are allowed.


2.4 Authority Rebinding

Phase 6 defines: - How write authority is transferred - When the old primary is considered invalid - How the new primary assumes authority

Authority rebinding is: - Single-writer - Non-overlapping - Explicitly observable


2.5 Observability & Explanation (Additive Only)

Phase 6 may add: - New observability events - New explanation surfaces

But: - Must reuse existing observability infrastructure - Must remain passive - Must not influence control flow


3. Explicit Non-Goals (Forbidden)

Phase 6 MUST NOT introduce any of the following.

3.1 Automatic Failover

Forbidden: - Automatic leader election - Background promotion - Health-check-driven failover - Time-based decisions

All promotion is explicit.


3.2 Consensus Protocols

Phase 6 MUST NOT: - Introduce Raft, Paxos, Zab, etc. - Add quorum voting - Add majority-based authority

Single-writer authority remains absolute.


3.3 Multi-Primary or Split-Brain Handling

Phase 6 does NOT: - Allow dual primaries - Tolerate split brain - Attempt conflict resolution

If safety cannot be proven, the system must halt or reject.


3.4 Replication Redesign

Phase 6 MUST NOT: - Change WAL semantics - Change replication protocol - Change snapshot behavior - Change recovery logic

Replication from Phase 5 is frozen.


3.5 Admin UI or Operator Tooling

Phase 6 does NOT include: - Admin dashboards - Web UI - CLI convenience tooling - Operator workflows

These belong to a later phase.


4. Scope Boundaries with Frozen Phases

4.1 Phase 5 Boundary

Phase 6 may: - Read Phase 5 replication state - Validate Phase 5 invariants - React to Phase 5 states

Phase 6 may NOT: - Modify Phase 5 state machines - Add hidden transitions - Alter replication correctness rules


4.2 Phase 0–4 Boundary

Phase 6 MUST NOT: - Affect WAL durability semantics - Affect MVCC visibility - Affect recovery determinism - Affect observability guarantees


5. Failure Handling Scope

Phase 6 is responsible for: - Defining promotion failure cases - Making failures explicit - Explaining why promotion failed

Phase 6 is NOT responsible for: - Repairing failures - Masking failures - Retrying promotion automatically


6. Completeness Criteria

Phase 6 scope is considered complete when:

  • All allowed behaviors are specified
  • All forbidden behaviors are explicitly excluded
  • No ambiguity exists about system behavior
  • No frozen phase semantics are touched

7. Scope Lock Rule

Once FAILOVER_SCOPE.md is approved:

Any request not covered by this document
MUST be deferred to a later phase.


END OF DOCUMENT