REPLICATION_LOG_FLOW.md
AeroDB: Replication Log Flow & State Propagation
Status
- This document is authoritative
- It defines how history moves from Primary to Replica
- WAL is the only replicated unit of truth
- No implementation details or optimizations appear here
- Phase-1 and MVCC semantics are frozen and assumed correct
1. Purpose of Log Flow Design
Replication correctness in AeroDB depends on one fact:
All authoritative history is encoded in the WAL.
This document defines:
- What is replicated
- In what order
- With what completeness guarantees
- How gaps and inconsistencies are detected
- When replication must stop
Replication is history replication, not state reconstruction.
2. Replication Units
2.1 WAL as the Replication Unit
The WAL is:
- Totally ordered
- Checksummed
- CommitId-governed
- Deterministic under replay
Replication must replicate WAL records verbatim.
Replication must not:
- Re-encode WAL entries
- Reorder WAL entries
- Merge or split WAL entries
- Infer missing WAL entries
2.2 Replicated Record Scope
All WAL records are replicated, including but not limited to:
- Commit identity records
- Version persistence records
- Snapshot markers
- GC records
- Checkpoint markers
If a WAL record affects recovery or visibility, it must be replicated.
3. Ordering Guarantees
3.1 Strict Order Preservation
Replication must preserve:
- Byte-level WAL order
- CommitId ordering
- Logical causality
For any two WAL records A and B: if A precedes B in the Primary’s WAL, then A must precede B in the Replica’s WAL.
No exceptions.
3.2 Prefix Application Rule
At any time, a Replica’s applied WAL must be an exact prefix of the Primary’s WAL.
Replicas may lag, but must never:
- Skip records
- Reorder records
- Invent records
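The prefix rule can be illustrated with a small check. This is a minimal sketch for illustration only, not part of the specification; the function name `is_prefix` and the modelling of a WAL as a list of raw byte records are assumptions made here.

```python
from typing import Sequence


def is_prefix(replica_wal: Sequence[bytes], primary_wal: Sequence[bytes]) -> bool:
    """Return True if the replica's applied records are exactly the first
    len(replica_wal) records of the primary's WAL, byte for byte."""
    if len(replica_wal) > len(primary_wal):
        return False  # the replica holds records the primary never wrote
    return all(r == p for r, p in zip(replica_wal, primary_wal))


primary = [b"commit:1", b"commit:2", b"commit:3"]
print(is_prefix([b"commit:1", b"commit:2"], primary))   # lagging replica
print(is_prefix([b"commit:1", b"commit:3"], primary))   # skipped a record
print(is_prefix([b"commit:2", b"commit:1"], primary))   # reordered records
print(is_prefix(primary + [b"commit:4"], primary))      # invented a record
```

Only the first call satisfies the rule; the other three correspond to the skip, reorder, and invent cases listed above.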
4. WAL Transfer Semantics
4.1 Transfer Model (Abstract)
Replication is defined abstractly as:
- Primary emits WAL records
- Replica receives WAL records
- Replica appends them to its WAL
- Replica replays them deterministically
The transport mechanism is out of scope.
4.2 Durability Boundary
A WAL record is considered replicated only when:
- It is durably appended to the Replica’s WAL
- Its checksum is verified
- Its ordering is validated
Partial receipt does not count.
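One way to picture the boundary: a record is acknowledged only after all three conditions hold. The sketch below is illustrative and assumes a hypothetical record shape (`WalRecord` with a sequence number and a CRC-32 computed by the Primary); the real WAL format and checksum algorithm are not defined in this document.

```python
import os
import zlib
from dataclasses import dataclass
from typing import BinaryIO


@dataclass(frozen=True)
class WalRecord:
    seq: int        # hypothetical ordering marker
    payload: bytes  # record bytes exactly as emitted by the Primary
    crc: int        # hypothetical: CRC-32 of payload


def replicate(wal: BinaryIO, last_seq: int, rec: WalRecord) -> int:
    """Append rec to the replica WAL and return the new last sequence number.

    The record counts as replicated only once this function returns.
    """
    if zlib.crc32(rec.payload) != rec.crc:
        raise ValueError("checksum mismatch")
    if rec.seq != last_seq + 1:
        raise ValueError(f"expected seq {last_seq + 1}, got {rec.seq}")
    wal.write(rec.payload)   # verbatim append, no re-encoding
    wal.flush()              # push Python's buffer to the OS
    os.fsync(wal.fileno())   # ask the OS to persist it to stable storage
    return rec.seq
```

Both checks run before the write, so a rejected record leaves the local WAL untouched.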
5. Gap Detection
5.1 Explicit Gap Definition
A WAL gap exists if:
- Replica receives record N+1
- Record N is missing or corrupted
Gaps are detected by:
- WAL sequence metadata
- Checksums
- Explicit ordering markers
5.2 Gap Handling Rule
If a gap is detected:
- Replica must stop applying WAL
- Replica enters ReplicationHalted
- No reads or writes are allowed
Gaps are fatal until resolved explicitly.
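The halt rule behaves like a one-way state transition. The following sketch is illustrative only: the `Replica` class, its `applied_seq` counter, and the use of dense integer sequence numbers are assumptions, while `ReplicationHalted` is the state named above.

```python
from enum import Enum, auto
from typing import List


class ReplicaState(Enum):
    REPLICATING = auto()
    REPLICATION_HALTED = auto()


class Replica:
    def __init__(self) -> None:
        self.state = ReplicaState.REPLICATING
        self.applied_seq = 0            # sequence number of the last applied record
        self.applied: List[bytes] = []

    def receive(self, seq: int, payload: bytes) -> None:
        if self.state is ReplicaState.REPLICATION_HALTED:
            raise RuntimeError("replication halted: gap must be resolved explicitly")
        if seq != self.applied_seq + 1:
            # The expected record is missing: halt instead of guessing.
            self.state = ReplicaState.REPLICATION_HALTED
            raise RuntimeError(f"WAL gap: expected {self.applied_seq + 1}, got {seq}")
        self.applied.append(payload)
        self.applied_seq = seq

    def read(self) -> List[bytes]:
        if self.state is ReplicaState.REPLICATION_HALTED:
            raise RuntimeError("reads are not allowed while halted")
        return list(self.applied)
```

Once halted, the sketch offers no method that returns to `REPLICATING`; resolving a gap is deliberately left as an explicit, out-of-band action.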
6. Snapshot vs WAL Bootstrap
Replication supports two bootstrap paths, both correctness-equivalent.
6.1 WAL-Only Bootstrap
Allowed only if:
- Replica has an empty state
- Full WAL history is available
- WAL replay cost is acceptable
Replica behavior:
- Receive WAL from genesis
- Replay deterministically
6.2 Snapshot-Based Bootstrap
Allowed when:
- WAL history is too large
- Replica is too far behind
- Operator initiates snapshot transfer
Snapshot must:
- Represent a valid MVCC cut
- Include commit boundary
- Be self-contained
After snapshot restore:
- Replica resumes WAL replay strictly after snapshot boundary
7. Snapshot + WAL Consistency Rule
If snapshot boundary commit is C_snap:
- WAL replay must begin at the first WAL entry after C_snap
- C_snap must not be replayed
- WAL entries > C_snap must be contiguous
Violation → fatal error.
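A sketch of the boundary check follows. It is illustrative only and rests on an assumption this document does not state: that CommitIds are dense integers, so "contiguous" means each id is exactly one greater than the previous. The function name `replay_after_snapshot` is hypothetical.

```python
from typing import Iterable, Iterator


def replay_after_snapshot(c_snap: int, commit_ids: Iterable[int]) -> Iterator[int]:
    """Yield the CommitIds to replay after restoring a snapshot taken at c_snap.

    Raises ValueError (treated as fatal) if the stream would replay the
    snapshot boundary again or if it contains a hole.
    """
    expected = c_snap + 1
    for cid in commit_ids:
        if cid <= c_snap:
            raise ValueError(f"CommitId {cid} is at or before snapshot boundary {c_snap}")
        if cid != expected:
            raise ValueError(f"non-contiguous WAL: expected {expected}, got {cid}")
        yield cid
        expected += 1


print(list(replay_after_snapshot(10, [11, 12, 13])))
```

Because the function is a generator, a violation surfaces when the offending entry is reached, after the valid entries before it have been yielded.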
8. Replica WAL Validation
Before applying replicated WAL, Replica must verify:
- WAL record integrity
- CommitId monotonicity
- No divergence from local history
- No conflicting records
If validation fails → replication halts.
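The divergence and monotonicity checks can be sketched as a comparison between incoming records and local history. This is an illustration under assumptions: records are modelled as `(commit_id, raw_bytes)` pairs, and an incoming batch is allowed to overlap local history (for example after a reconnect) as long as the overlap is byte-identical. Neither choice is prescribed by this document.

```python
from typing import List, Tuple

Record = Tuple[int, bytes]  # hypothetical shape: (commit_id, raw record bytes)


def validate_batch(local: List[Record], incoming: List[Record]) -> List[Record]:
    """Return the records from `incoming` that are new and safe to append.

    Raises ValueError on divergence from local history or on
    non-monotonic CommitIds; the caller is expected to halt replication.
    """
    by_id = dict(local)
    last_id = local[-1][0] if local else 0
    fresh: List[Record] = []
    for commit_id, raw in incoming:
        if commit_id in by_id:
            # Overlap with local history must match byte for byte.
            if by_id[commit_id] != raw:
                raise ValueError(f"divergence at CommitId {commit_id}")
            continue
        if commit_id <= last_id:
            raise ValueError(f"non-monotonic CommitId {commit_id} after {last_id}")
        fresh.append((commit_id, raw))
        last_id = commit_id
    return fresh
```

Integrity checking (checksums) is omitted here to keep the sketch focused on history comparison.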
9. Interaction with MVCC
Replication must preserve:
- CommitId assignment order
- Version creation order
- Snapshot boundaries
- GC eligibility semantics
Replication must not:
- Recompute MVCC state
- Skip MVCC metadata
- Compact history implicitly
Replica MVCC state must match Primary for the same WAL prefix.
10. Crash Semantics During Replication
10.1 Crash During WAL Transfer
If Replica crashes:
- Before WAL append → record is lost, retransmission required
- After WAL append → record is durable
Recovery replays WAL deterministically.
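The before/after distinction can be illustrated by scanning a log after restart and keeping only complete records. This sketch assumes a hypothetical on-disk frame (little-endian payload length and CRC-32, followed by the payload); the actual WAL layout is not defined here. It also treats any bad frame as a torn tail, whereas a real implementation would need to distinguish a torn final record from corruption earlier in the log, which is fatal under section 8.

```python
import struct
import zlib
from typing import List, Tuple

HEADER = struct.Struct("<II")  # hypothetical frame header: payload length, CRC-32


def frame(payload: bytes) -> bytes:
    """Encode one record in the hypothetical frame format."""
    return HEADER.pack(len(payload), zlib.crc32(payload)) + payload


def recover(log: bytes) -> Tuple[List[bytes], int]:
    """Return (complete records, byte offset where the valid log ends)."""
    records: List[bytes] = []
    offset = 0
    while offset + HEADER.size <= len(log):
        length, crc = HEADER.unpack_from(log, offset)
        start = offset + HEADER.size
        payload = log[start:start + length]
        if len(payload) < length or zlib.crc32(payload) != crc:
            break  # incomplete record: it was never durably appended
        records.append(payload)
        offset = start + length
    return records, offset
```

After recovery, the Replica would request retransmission starting from the record that follows the last complete one.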
10.2 Crash During Snapshot Transfer
If crash occurs:
- Snapshot is either:
  - Fully applied, or
  - Fully discarded
No partial snapshot state is valid.
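A common way to obtain all-or-nothing behavior is to stage the snapshot under a temporary name and rename it into place. The sketch below is an illustration, not AeroDB's mechanism: it assumes the snapshot is a single file, and the `.partial` suffix and function names are hypothetical. It relies on `os.replace`, which is atomic on POSIX file systems.

```python
import os


def install_snapshot(data: bytes, final_path: str) -> None:
    """Write the snapshot to a staging file, then atomically publish it."""
    partial = final_path + ".partial"
    with open(partial, "wb") as f:
        f.write(data)
        f.flush()
        os.fsync(f.fileno())          # make the staged bytes durable first
    os.replace(partial, final_path)   # readers see the old state or the new one


def discard_partial(final_path: str) -> bool:
    """On startup, remove a staging file left behind by a crash.

    Returns True if a partial snapshot was found and discarded.
    """
    partial = final_path + ".partial"
    if os.path.exists(partial):
        os.remove(partial)
        return True
    return False
```

A crash before the rename leaves only a `.partial` file, which `discard_partial` removes; a crash after it leaves a complete snapshot. Making the rename itself durable would also require syncing the containing directory, which this sketch omits.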
11. Explicitly Forbidden Behaviors
Replication must never:
- Apply WAL out of order
- Skip WAL records
- “Fill gaps” heuristically
- Reconstruct WAL from state
- Apply WAL without checksum verification
- Continue replication after divergence
Correctness > liveness.
12. Log Flow Summary
Replication log flow guarantees:
- Single authoritative history
- Deterministic state propagation
- Explicit detection of inconsistency
- Safe lag, never silent divergence
If history cannot be proven identical, replication must stop.