CHECKPOINT PIPELINING — PHASE 3¶
Status¶
- Phase: 3
- Authority: Normative
- Scope: Checkpoint execution performance optimization
- Dependencies:
- PERF_VISION.md
- PERF_INVARIANTS.md
- PERF_PROOF_RULES.md
- PERFORMANCE_BASELINE.md
- CRITICAL_PATHS.md
- SEMANTIC_EQUIVALENCE.md
- FAILURE_MODEL_PHASE3.md
- ROLLBACK_AND_DISABLEMENT.md
- PERFORMANCE_OBSERVABILITY.md
This document specifies Checkpoint Pipelining as a correctness-preserving optimization. If any rule herein cannot be proven, this optimization MUST NOT be implemented.
1. Purpose¶
Baseline AeroDB checkpointing is strictly sequential:
- Snapshot creation
- Snapshot persistence
- Snapshot fsync
- Checkpoint marker write
- WAL truncation
This is maximally clear but can cause: - Long pause times - Poor write throughput during checkpoints
Checkpoint Pipelining improves performance by: - Overlapping preparatory work with normal operation - While preserving all durability, ordering, and recovery semantics
Checkpoint Pipelining does not: - Change snapshot semantics - Change checkpoint semantics - Change WAL truncation rules - Introduce speculative persistence
2. Baseline Reference (Normative)¶
Baseline checkpoint behavior is defined in:
PERFORMANCE_BASELINE.md§7CRITICAL_PATHS.md§5
Baseline invariants:
- Snapshot represents a precise MVCC cut
- Snapshot must be fully durable before checkpoint marker
- WAL truncation occurs only after checkpoint durability
- Recovery selects last valid checkpoint deterministically
These invariants MUST remain true.
3. Definition of Checkpoint Pipelining¶
3.1 Conceptual Definition¶
Checkpoint Pipelining allows:
Overlapping non-authoritative checkpoint preparation work
with normal database operation, while deferring all authoritative durability decisions to the baseline ordering.
Only work that has no correctness authority may be pipelined.
3.2 Explicit Non-Definition (What This Is NOT)¶
Checkpoint Pipelining does NOT allow:
- Writing a checkpoint marker early
- Truncating WAL early
- Using incomplete snapshots
- Making snapshots visible before durability
- Time-based checkpoint triggers
- Background speculative cleanup
If any step alters checkpoint authority, the optimization is invalid.
4. Mechanical Description¶
4.1 Baseline Checkpoint Path (Simplified)¶
- Select checkpoint CommitId
- Freeze snapshot visibility
- Enumerate persistent state
- Write snapshot files
- fsync snapshot
- Write checkpoint marker
- fsync marker
- Truncate WAL
4.2 Pipelined Checkpoint Path¶
Checkpoint Pipelining permits the following restructuring:
Phase A — Preparation (Pipeline-Eligible)¶
- Snapshot CommitId selection
- Snapshot visibility freeze
- Snapshot enumeration
- Snapshot file writes (not yet authoritative)
These steps: - Produce tentative snapshot artifacts - Have no recovery authority - May overlap with normal reads and writes
Phase B — Authority (Non-Pipelined)¶
- Snapshot fsync
- Checkpoint marker write
- Checkpoint marker fsync
- WAL truncation
These steps: - Remain strictly ordered - Are identical to baseline semantics - Define durability and recovery authority
4.3 Pipeline Rules¶
- Phase A work MUST be restart-discardable
- Phase B work MUST preserve baseline ordering exactly
- No read or write may observe Phase-A artifacts as authoritative
- No recovery logic may consult Phase-A artifacts
5. Invariant Preservation Matrix¶
(Referenced from PERF_INVARIANTS.md)
Durability¶
- D-1 (Acknowledged Write Durability): Preserved
- D-2 (Atomic Commit Boundary): Preserved
- D-3 (No Silent Downgrade): Preserved
Determinism¶
- DET-1 (Crash Determinism): Preserved
- DET-2 (Replay Equivalence): Preserved
- DET-3 (Bounded Execution): Preserved
MVCC¶
- MVCC-1 (Snapshot Isolation): Preserved
- MVCC-2 (CommitId Authority): Preserved
- MVCC-3 (Version Chain Integrity): Preserved
Replication¶
- REP-1, REP-2, REP-3: Preserved
Failure & Recovery¶
- FR-1, FR-2, FR-3: Preserved
Observability¶
- OBS-1, OBS-2: Preserved
Disablement¶
- DIS-1, DIS-2, DIS-3: Preserved
No invariant is weakened.
6. Semantic Equivalence Argument¶
Checkpoint Pipelining is semantically equivalent to baseline because:
- The checkpoint CommitId is identical
- The snapshot contents are identical
- The durability boundary is identical
- The checkpoint marker is written at the same semantic point
- WAL truncation rules are unchanged
Pipelined work only prepares data earlier; it does not change when that data becomes authoritative.
7. Failure Matrix¶
7.1 Crash During Phase A (Preparation)¶
- Baseline: no checkpoint
- Pipelined: no authoritative checkpoint
Recovery: - Discards tentative snapshot artifacts - Uses previous checkpoint
Equivalent.
7.2 Crash Between Phase A and Phase B¶
- No snapshot fsync
- No checkpoint marker
Recovery: - Tentative snapshot ignored - WAL replay continues from last checkpoint
Equivalent.
7.3 Crash During Snapshot fsync (Phase B)¶
- fsync incomplete → snapshot not durable
Recovery: - Snapshot rejected - WAL replay used
Equivalent.
7.4 Crash After Snapshot fsync, Before Marker fsync¶
- Snapshot durable
- Marker not durable
Recovery: - Snapshot not selected - WAL replay used
Equivalent.
7.5 Crash After Marker fsync, Before WAL Truncation¶
- Checkpoint valid
- WAL not yet truncated
Recovery: - Checkpoint selected - WAL replay resumes correctly
Equivalent.
8. Recovery Proof¶
- Recovery logic remains unchanged
- Recovery selects checkpoints based only on durable markers
- Tentative artifacts are ignored
- Replay behavior is deterministic
No optimization-specific replay logic exists.
9. Disablement & Rollback¶
9.1 Disablement Mechanism¶
Checkpoint Pipelining MUST be disableable via:
- Compile-time flag or
- Startup configuration
Disablement restores: - Fully sequential checkpoint behavior
9.2 Compatibility Proof¶
- Snapshot format unchanged
- Checkpoint marker format unchanged
- WAL format unchanged
Pipelined artifacts are compatible or discardable.
9.3 No Ghost State¶
- Tentative snapshot artifacts are clearly marked or isolated
- No persistent flags indicate “in-progress checkpoint”
- No metadata leaks into recovery logic
10. Observability¶
Permitted metrics (passive only):
- checkpoint.pipeline.prepare_duration
- checkpoint.pipeline.authority_duration
- checkpoint.pipeline.aborted_count
Metrics MUST NOT: - Influence scheduling - Influence pipeline depth - Influence checkpoint triggering
11. Testing Requirements¶
Checkpoint Pipelining MUST introduce:
- Crash tests at every pipeline boundary
- Recovery equivalence tests
- Disablement equivalence tests
- WAL truncation correctness tests
- Replication compatibility tests
All existing tests MUST pass unmodified.
12. Explicit Non-Goals¶
Checkpoint Pipelining does NOT aim to:
- Change checkpoint frequency
- Reduce checkpoint durability guarantees
- Make checkpoints incremental
- Introduce background checkpoints
It improves overlap only.
13. Final Rule¶
A checkpoint is defined by when it becomes durable,
not by when work starts.
Checkpoint Pipelining is valid only if recovery, replicas, and clients cannot observe any difference.
END OF DOCUMENT