MVCC_WAL_INTERACTION.md
AeroDB — MVCC and WAL Interaction
Status
- This document is authoritative
- It defines how MVCC state is made durable and recoverable
- WAL is the sole source of truth
- No implementation-level encoding is specified
1. Role of the WAL in MVCC
In AeroDB, the Write-Ahead Log (WAL) is:
- The only authority on committed state
- The source of deterministic ordering
- The mechanism by which MVCC survives crashes
All MVCC-relevant state must be represented in the WAL.
2. Commit Identity Assignment
2.1 Deterministic Assignment
- Commit identities are assigned exactly once
- Assignment occurs as part of commit
- The ordering of commit identities is:
    - Total
    - Strict
    - Replayable
No commit identity exists outside the WAL.
2.2 Monotonicity
- Commit identities are strictly increasing
- Gaps are allowed only if explicitly represented in the WAL
- Ordering must be reconstructible solely from WAL order
Commit identity order is derived from WAL order, not inferred from any other state.
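The monotonicity rules can be sketched as a small checker. This is an illustration only: it assumes commit identities are consecutive integers and that an explicit gap is logged as a hypothetical `Gap` record, neither of which is fixed by this document.

```python
from dataclasses import dataclass
from typing import Iterable, Union


@dataclass(frozen=True)
class Commit:
    commit_id: int


@dataclass(frozen=True)
class Gap:
    # Hypothetical record that explicitly marks identities first..last as skipped.
    first: int
    last: int


def check_commit_order(records: Iterable[Union[Commit, Gap]], start: int = 0) -> int:
    """Return the last identity accounted for, or raise on a violation."""
    last = start
    for rec in records:
        if isinstance(rec, Gap):
            # A gap must begin right after the last identity seen.
            if rec.first != last + 1 or rec.last < rec.first:
                raise ValueError(f"gap {rec} does not follow identity {last}")
            last = rec.last
        else:
            # Anything other than last + 1 is a regression or an unlogged gap.
            if rec.commit_id != last + 1:
                raise ValueError(f"identity {rec.commit_id} cannot follow {last}")
            last = rec.commit_id
    return last
```

Under these assumptions, `[Commit(1), Commit(2), Gap(3, 5), Commit(6)]` is accepted, while `[Commit(1), Commit(3)]` is rejected because the missing identity 2 is not represented.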
3. WAL Records and MVCC State
3.1 Required MVCC WAL Semantics
The WAL must record, in an unambiguous sequence:
- Intent to commit a write set
- The versions created by that write set
- The commit identity assignment
Recovery must be able to reconstruct:
- Exact version chains
- Exact commit ordering
- Visibility boundaries
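As a rough sketch of these semantics, the three kinds of information can be modelled as three record types, with reconstruction as a single pass over them. The names `Intent`, `Version`, `Commit`, and `reconstruct` are hypothetical. Since this document specifies no encoding, the in-memory shapes below are assumptions made for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Intent:          # intent to commit a write set
    txn: int


@dataclass(frozen=True)
class Version:         # one version created by that write set
    txn: int
    key: str
    value: str


@dataclass(frozen=True)
class Commit:          # the commit identity assignment
    txn: int
    commit_id: int


def reconstruct(wal):
    """Rebuild version chains, commit order, and the visibility boundary."""
    pending = {}                  # txn -> versions logged but not yet committed
    chains = defaultdict(list)    # key -> [(commit_id, value), ...], oldest first
    commit_order = []
    for rec in wal:
        if isinstance(rec, Intent):
            pending[rec.txn] = []
        elif isinstance(rec, Version):
            pending[rec.txn].append(rec)
        elif isinstance(rec, Commit):
            # Versions join a chain only when their commit record is reached.
            for v in pending.pop(rec.txn):
                chains[v.key].append((rec.commit_id, v.value))
            commit_order.append(rec.commit_id)
    boundary = commit_order[-1] if commit_order else None
    return dict(chains), commit_order, boundary
```

Write sets that never reach a `Commit` record stay in `pending` and are dropped when the pass ends, so they contribute nothing to the reconstructed state.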
3.2 Atomic Visibility Guarantee
- Visibility is tied to commit identity durability
- A version becomes visible only after its commit identity is durable
- WAL fsync is the visibility barrier
If a commit identity is not durable, the commit does not exist.
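The write path implied by this guarantee might look like the following sketch. The `Wal` class and its text record format are hypothetical, and `os.fsync` stands in for whatever durability primitive the target platform actually requires.

```python
import os
import tempfile


class Wal:
    """Toy append-only log that publishes a commit only after fsync returns."""

    def __init__(self, path):
        self._f = open(path, "ab")
        self.durable_commit_id = 0   # highest identity known to be on stable storage

    def commit(self, commit_id: int, payload: bytes) -> None:
        self._f.write(b"COMMIT %d %s\n" % (commit_id, payload))
        self._f.flush()              # move Python's buffer into the OS
        os.fsync(self._f.fileno())   # visibility barrier: wait for the disk
        self.durable_commit_id = commit_id   # only now may readers observe it

    def is_visible(self, commit_id: int) -> bool:
        return commit_id <= self.durable_commit_id

    def close(self) -> None:
        self._f.close()


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        wal = Wal(os.path.join(d, "wal.log"))
        wal.commit(1, b"k=v")
        print(wal.is_visible(1), wal.is_visible(2))
        wal.close()
```

The ordering inside `commit` is the point of the sketch: the in-memory marker that readers consult is advanced strictly after the fsync call, never before it.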
4. Crash Scenarios
4.1 Crash Before Commit Record
If a crash occurs:
- Before commit identity is persisted:
    - No versions are visible
    - No partial MVCC state survives
    - Recovery discards any incomplete version data
4.2 Crash After Commit Record
If a crash occurs:
- After commit identity persistence:
    - All versions associated with that commit are visible
    - Recovery must reconstruct full version chains
There is no intermediate state.
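Both scenarios can be exercised by truncating a log at every possible byte offset. The sketch below assumes a hypothetical line-oriented format (`V txn key value` for a version, `C txn commit_id` for a commit) in which a record counts as persisted only once its trailing newline is present. A real log would more likely use checksums or length prefixes to detect a torn tail.

```python
def recover(log: bytes):
    """Return {key: [(commit_id, value), ...]} for fully persisted commits."""
    # Everything after the last newline is a torn, partially written record.
    complete, _, _torn_tail = log.rpartition(b"\n")
    pending = {}   # txn -> [(key, value), ...] awaiting a commit record
    visible = {}
    for line in complete.split(b"\n"):
        fields = line.decode().split()
        if not fields:
            continue
        if fields[0] == "V":
            _, txn, key, value = fields
            pending.setdefault(txn, []).append((key, value))
        elif fields[0] == "C":
            _, txn, commit_id = fields
            for key, value in pending.pop(txn, []):
                visible.setdefault(key, []).append((int(commit_id), value))
    # Whatever is still pending never reached a commit record: discard it.
    return visible


full = b"V 1 a x\nV 1 b y\nC 1 7\n"
# A crash at any offset before the final byte leaves nothing visible.
assert all(recover(full[:cut]) == {} for cut in range(len(full)))
# Once the commit record is complete, every version of the write set is visible.
assert recover(full) == {"a": [(7, "x")], "b": [(7, "y")]}
```

The two assertions correspond to sections 4.1 and 4.2: no prefix of the log yields a partially visible write set.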
5. Interaction with Checkpointing
- Checkpoints represent a stable MVCC cut
- All versions visible at checkpoint time must be included
- Commit identities beyond the checkpoint boundary are excluded
Checkpointing does not alter MVCC semantics.
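One way to picture a stable MVCC cut is as a filter over version chains by commit identity. The `checkpoint` function and the dictionary layout below are illustrative assumptions, not AeroDB's checkpoint format.

```python
def checkpoint(chains, boundary):
    """Copy every version whose commit identity is at or below `boundary`.

    chains: {key: [(commit_id, value), ...]} with each chain oldest first.
    """
    cut = {}
    for key, versions in chains.items():
        kept = [(cid, value) for cid, value in versions if cid <= boundary]
        if kept:
            cut[key] = kept
    # The live chains are left untouched: taking a cut changes no MVCC state.
    return {"boundary": boundary, "chains": cut}
```

With chains `{"a": [(1, "x"), (3, "z")], "b": [(4, "w")]}` and a boundary of 3, the cut keeps both versions of `a` and excludes `b`, whose only version was committed beyond the boundary.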
6. Interaction with Snapshots
- Snapshots capture:
    - Version data
    - MVCC metadata
    - Commit identity boundary
- Snapshots must be sufficient to:
    - Serve read views
    - Resume WAL replay
Snapshots are self-contained MVCC states.
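The two uses of a snapshot can be sketched as follows, assuming a snapshot is a dictionary holding version chains plus a commit identity boundary. The helper names `read_at` and `resume` are hypothetical.

```python
import copy


def read_at(chains, key, read_id):
    """Newest value of `key` committed at or before `read_id`, else None."""
    for commit_id, value in reversed(chains.get(key, [])):
        if commit_id <= read_id:
            return value
    return None


def resume(snapshot, wal_commits):
    """Apply committed write sets that are newer than the snapshot boundary.

    wal_commits: iterable of (commit_id, [(key, value), ...]) in WAL order.
    """
    chains = copy.deepcopy(snapshot["chains"])
    boundary = snapshot["boundary"]
    for commit_id, writes in wal_commits:
        if commit_id <= boundary:
            continue   # already captured by the snapshot
        for key, value in writes:
            chains.setdefault(key, []).append((commit_id, value))
        boundary = commit_id
    return {"boundary": boundary, "chains": chains}
```

A read view needs only the snapshot's own chains and boundary, and replay needs only the boundary to know where to continue, which is the sense in which the snapshot is self-contained.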
7. WAL Replay Rules
During recovery:
- WAL is replayed in strict order
- Commit identities are re-established deterministically
- Version chains are reconstructed
- Visibility rules are reapplied
Recovery does not:
- Reassign commit identities
- Infer missing data
- Skip MVCC records
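The prohibitions above suggest a replayer that fails loudly rather than guessing. The sketch below uses hypothetical tuple-shaped records and a hypothetical `ReplayError`. It reuses each logged commit identity verbatim and raises on anything it cannot account for.

```python
class ReplayError(Exception):
    """Raised when replay would otherwise have to infer or skip something."""


def replay_strict(wal):
    """Replay records in order and return [(commit_id, write_set), ...].

    Record shapes (assumed): ("intent", txn), ("version", txn, key, value),
    ("commit", txn, commit_id).
    """
    pending = {}
    committed = []
    last_id = None
    for pos, rec in enumerate(wal):
        kind = rec[0]
        if kind == "intent":
            pending[rec[1]] = []
        elif kind == "version":
            _, txn, key, value = rec
            if txn not in pending:
                raise ReplayError(f"record {pos}: version without intent")
            pending[txn].append((key, value))
        elif kind == "commit":
            _, txn, commit_id = rec
            if txn not in pending:
                raise ReplayError(f"record {pos}: commit without intent")
            if last_id is not None and commit_id <= last_id:
                raise ReplayError(f"record {pos}: identity {commit_id} not increasing")
            # The logged identity is used as-is; replay never assigns a new one.
            committed.append((commit_id, pending.pop(txn)))
            last_id = commit_id
        else:
            # Unknown records are an error, not something to skip over.
            raise ReplayError(f"record {pos}: unknown kind {kind!r}")
    return committed
```

Stopping with an error is a design choice in this sketch: a replayer that repaired or skipped a bad record would be inferring state that the WAL does not contain.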
8. Garbage Collection Interaction
- GC decisions must be WAL-represented
- Version removal must be replayable
- GC must never:
    - Remove versions needed for recovery
    - Alter commit identity ordering
GC is subordinate to WAL correctness.
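A minimal sketch of WAL-represented GC separates planning, which emits records, from application, which replays them. The `("gc", key, commit_id)` record shape and the `horizon` parameter (the oldest commit identity that any reader or recovery pass may still need) are assumptions introduced for this example.

```python
def plan_gc(chains, horizon):
    """Emit GC records for versions superseded at or below `horizon`.

    A version is removable only if a newer version of the same key was
    committed at or below the horizon, so no reader at the horizon needs it.
    """
    records = []
    for key in sorted(chains):   # sorted for deterministic record order
        versions = chains[key]
        for (commit_id, _), (next_id, _) in zip(versions, versions[1:]):
            if next_id <= horizon:
                records.append(("gc", key, commit_id))
    return records


def apply_gc(chains, records):
    """Replay GC records; the same records always yield the same chains."""
    out = {key: list(versions) for key, versions in chains.items()}
    for _, key, commit_id in records:
        out[key] = [(cid, value) for cid, value in out[key] if cid != commit_id]
    return out
```

Because removal only filters entries out of a chain, the relative order of the surviving commit identities is unchanged, and the newest version of every key is always retained.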
9. Explicitly Forbidden WAL Behaviors
The WAL must never:
- Contain ambiguous commit boundaries
- Implicitly encode MVCC state
- Allow out-of-order visibility
- Allow visibility without durability
If MVCC state is not in the WAL, it does not exist.
10. Summary
MVCC relies on WAL to provide:
- Deterministic commit ordering
- Crash-safe visibility
- Recoverable version history
The WAL remains the single source of truth.