REPLICATION_PROOFS.md¶
AeroDB Phase 2 — MVCC Correctness Proofs¶
Status¶
- This document is authoritative
- It provides argument-based proofs of MVCC correctness
- Proofs are invariant-driven, not implementation-dependent
- No algorithms or data structures appear here
1. Proof Methodology¶
All proofs in this document are based on:
- Explicit invariants (Phase-1 + Phase-2)
- Deterministic WAL semantics
- Exhaustive failure enumeration
- Absence of heuristics or timing assumptions
Each proof answers one question:
“Why is this behavior always correct, regardless of execution order or crash timing?”
2. Proof of Visibility Correctness¶
Claim¶
For any read operation, the set of visible versions is correct and deterministic.
Argument¶
-
Visibility is defined solely by:
-
Version commit identity
- Read view upper bound
-
Commit identities are:
-
Totally ordered
- WAL-governed
- Deterministically replayed
- A read view is immutable once established
-
Visibility rule selects:
-
The maximum commit identity ≤ read upper bound
Therefore:
- Visibility does not depend on timing
- Visibility does not depend on concurrency
- Visibility does not depend on execution order
Conclusion: Visibility is deterministic and correct by construction.
3. Proof of Snapshot Isolation¶
Claim¶
AeroDB MVCC provides deterministic snapshot isolation.
Argument¶
- A read view fixes a commit identity boundary
- No versions beyond this boundary are visible
- All reads within an operation use the same boundary
- Writes only become visible after commit identity durability
Therefore:
- No dirty reads are possible
- No non-repeatable reads are possible
- No phantom visibility changes are possible
Conclusion: Snapshot isolation is guaranteed without locks or heuristics.
4. Proof of Write Atomicity¶
Claim¶
Writes are atomically visible or invisible, never partial.
Argument¶
- Visibility is tied to commit identity durability
- Commit identity is WAL-persisted as a single authority
- Versions without a durable commit identity are invisible
- Versions with a durable commit identity must be complete or recovery fails
Therefore:
- Partial visibility cannot occur
- Atomicity is enforced at the visibility boundary
Conclusion: Write atomicity is guaranteed across crashes.
5. Proof of Recovery Determinism¶
Claim¶
Recovery reconstructs the exact MVCC state deterministically.
Argument¶
- WAL is replayed in strict order
- Commit identities are derived from WAL order
- Version creation is replayed deterministically
- GC and snapshot events are WAL-represented
Therefore:
- Recovery outcome is uniquely determined by WAL
- No ambiguity exists after replay
Conclusion: Recovery is deterministic and reproducible.
6. Proof of Crash Safety¶
Claim¶
All crash points resolve to valid MVCC states.
Argument¶
- All failure points are enumerated in
MVCC_FAILURE_MATRIX.md -
Each crash point maps to:
-
Commit exists fully, or
- Commit does not exist at all
- No intermediate visibility state is allowed
Therefore:
- Crashes cannot introduce undefined states
- Corruption is detected, not repaired
Conclusion: Crash safety is complete and explicit.
7. Proof of Snapshot Correctness¶
Claim¶
Snapshots represent valid, self-contained MVCC states.
Argument¶
-
Snapshots capture:
-
All versions ≤ snapshot boundary
- All MVCC metadata
- WAL replay resumes strictly after snapshot boundary
- Snapshots impose retention constraints on GC
Therefore:
- Snapshots can serve reads independently
- Snapshots preserve visibility semantics
Conclusion: Snapshots are correct MVCC cuts.
8. Proof of Garbage Collection Safety¶
Claim¶
GC never removes a version that could be visible.
Argument¶
-
GC eligibility requires:
-
Version commit identity < visibility lower bound
- No snapshot retention
- Recovery safety
- Visibility lower bound is explicit and deterministic
- GC actions are WAL-recorded
Therefore:
- No visible or potentially visible version is reclaimed
- GC is replay-safe
Conclusion: GC is safe by proof, not probability.
9. Proof of Phase-1 Compatibility¶
Claim¶
MVCC does not alter Phase-1 behavior.
Argument¶
-
Phase-1 behavior is equivalent to:
-
Single-version chains
- Read views always at latest commit
- MVCC generalizes, never replaces, Phase-1 semantics
- All Phase-1 invariants remain unchanged
Therefore:
- Identical inputs produce identical outputs
- Phase-1 tests remain valid
Conclusion: MVCC is a strict superset of Phase-1 behavior.
10. Proof of Replication Readiness¶
Claim¶
MVCC semantics are replication-safe.
Argument¶
-
All MVCC state is:
-
WAL-represented
- Deterministic
- Commit identities define a total order
- Visibility rules are stateless beyond WAL
Therefore:
- MVCC state can be reconstructed on another node
- No local-only behavior exists
Conclusion: Replication can consume MVCC without redefining it.
11. Global Correctness Theorem¶
Statement¶
Given the Phase-1 invariants, the Phase-2 MVCC invariants, and the WAL as the sole authority, AeroDB MVCC guarantees:
- Deterministic visibility
- Atomic writes
- Snapshot isolation
- Crash-safe recovery
- Safe garbage collection
- Phase-1 behavioral equivalence
Under all execution orders and crash scenarios.
12. End of MVCC Design Phase¶
At this point:
- MVCC design is complete
- All invariants are specified
- All failure modes are defined
- No implementation assumptions remain