WAL BATCHING — PHASE 3¶
Status¶
- Phase: 3
- Authority: Normative
- Scope: Write-ahead log emission optimization
- Dependencies:
- PERF_VISION.md
- PERF_INVARIANTS.md
- PERF_PROOF_RULES.md
- PERFORMANCE_BASELINE.md
- CRITICAL_PATHS.md
- SEMANTIC_EQUIVALENCE.md
- FAILURE_MODEL_PHASE3.md
- ROLLBACK_AND_DISABLEMENT.md
- PERFORMANCE_OBSERVABILITY.md
This document specifies WAL Batching as a correctness-preserving optimization. If any requirement in this document cannot be proven, WAL Batching MUST NOT be implemented.
1. Purpose¶
Baseline AeroDB performs:
- One logical WAL record write per commit
- One physical write per record (even if contiguous)
This results in: - Excessive syscall overhead - Redundant write amplification
WAL Batching reduces overhead by: - Writing multiple WAL records in a single physical write - Without changing WAL semantics, ordering, or durability
WAL Batching affects how WAL is written, not what WAL means.
2. Baseline Reference (Normative)¶
Baseline WAL behavior is defined in:
PERFORMANCE_BASELINE.md§3.2CRITICAL_PATHS.md§2.2
Baseline invariants:
- WAL is append-only
- Logical record boundaries are explicit
- Each record is checksummed
- Ordering is strict
- fsync defines durability
These invariants MUST remain true.
3. Definition of WAL Batching¶
3.1 Conceptual Definition¶
WAL Batching allows:
Multiple logically independent WAL records
to be written to disk using a single physical write operation,
while preserving logical record boundaries and ordering.
Each WAL record remains: - Individually checksummed - Individually parseable - Individually replayable
Only the physical I/O shape changes.
3.2 Non-Definition (What WAL Batching Is NOT)¶
WAL Batching does NOT mean:
- Combining records into a new logical format
- Removing per-record checksums
- Changing replay semantics
- Reordering records
- Delaying fsync
- Time-based buffering
- Adaptive buffering based on load
If batching changes record semantics, it is invalid.
4. Mechanical Description¶
4.1 Baseline WAL Write Path¶
For each commit:
- Serialize WAL record
- Compute checksum
- Write record to WAL file
- (Later) fsync WAL
Each record write is independent.
4.2 WAL Batching Write Path¶
Under WAL Batching, the following mechanical change is permitted:
- Serialize WAL record A
- Serialize WAL record B
- Serialize WAL record C
- Concatenate serialized records into a contiguous buffer
- Perform one write() for the buffer
- (Later) fsync WAL
Rules: - Serialization order MUST match commit order - No record may be split across buffers - Record boundaries MUST be preserved in the byte stream
4.3 Batch Formation Rules¶
Batches MAY be formed only by:
- Sequential availability of serialized records
- Explicit bounded buffering
Batches MUST NOT be formed by: - Timers - Load heuristics - Background aggregation - Dynamic resizing based on latency
Batch size MUST be: - Explicitly bounded - Deterministic - Configuration-defined or compile-time defined
5. Invariant Preservation Matrix¶
(Referenced from PERF_INVARIANTS.md)
Durability¶
- D-1 (Acknowledged Write Durability): Preserved
- D-2 (Atomic Commit Boundary): Preserved
- D-3 (No Silent Downgrade): Preserved
Determinism¶
- DET-1 (Crash Determinism): Preserved
- DET-2 (Replay Equivalence): Preserved
- DET-3 (Bounded Execution): Preserved
MVCC¶
- MVCC-1, MVCC-2, MVCC-3: Preserved
Replication¶
- REP-1, REP-2, REP-3: Preserved
Failure & Recovery¶
- FR-1, FR-2, FR-3: Preserved
Observability¶
- OBS-1, OBS-2: Preserved
Disablement¶
- DIS-1, DIS-2, DIS-3: Preserved
No invariant is weakened.
6. Semantic Equivalence Argument¶
WAL Batching is semantically equivalent to baseline because:
- The WAL byte stream is a strict concatenation of baseline records
- Record boundaries are unchanged
- Record ordering is unchanged
- Checksums validate each record independently
- fsync semantics are unchanged
- Replay reads records identically
From the perspective of: - Clients - Recovery - Replicas
The WAL is indistinguishable from baseline emission.
7. Failure Matrix¶
7.1 Crash Before Write¶
- Baseline: no WAL data written
- Batched: no WAL data written
Equivalent.
7.2 Crash During Batched Write¶
Possible outcomes: - No bytes written - Partial buffer written
In all cases: - Partial records are detected by checksum - Replay behavior matches baseline partial-write behavior
Equivalent.
7.3 Crash After Batched Write, Before fsync¶
- Baseline: data may be lost
- Batched: data may be lost
Equivalent.
7.4 Crash After fsync¶
- Baseline: records durable
- Batched: records durable
Equivalent.
7.5 Disk Error During Write¶
- Error surfaced immediately
- Commit fails
- No acknowledgment
Equivalent.
8. Recovery Proof¶
- WAL replay logic is unchanged
- Replay reads sequential records
- Checksum validation is per-record
- No batching metadata exists
Replay determinism is preserved.
9. Disablement & Rollback¶
9.1 Disablement Mechanism¶
WAL Batching MUST be disableable via:
- Compile-time flag or
- Startup configuration
Disablement restores: - One write per WAL record
9.2 Compatibility Proof¶
- WAL format unchanged
- Snapshot format unchanged
- Checkpoint format unchanged
Batched WAL is readable without batching enabled.
9.3 No Ghost State¶
- No persistent batching metadata
- No WAL flags
- No snapshot annotations
All batching state is in-memory and discardable.
10. Observability¶
Permitted metrics (passive only):
- wal_batch.records_per_write
- wal_batch.bytes_per_write
- wal_batch.write_syscalls
Metrics MUST NOT: - Influence batch size - Influence flushing - Influence grouping
11. Testing Requirements¶
WAL Batching MUST introduce:
- WAL equivalence tests
- Partial-write fault injection tests
- Crash-recovery tests
- Enable → write → crash → disable → recover tests
- Replication prefix validation tests
All existing tests MUST pass unmodified.
12. Explicit Non-Goals¶
WAL Batching does NOT aim to:
- Change fsync frequency (see Group Commit)
- Reduce durability guarantees
- Introduce adaptive buffering
- Change WAL semantics
It optimizes syscall efficiency only.
13. Final Rule¶
WAL Batching is valid only if the WAL
could have been written record-by-record
and no observer could ever tell.
If replay, replication, or recovery can distinguish it, WAL Batching is invalid.
END OF DOCUMENT