Write-Ahead Log
Purpose¶
This document defines the Write-Ahead Log (WAL) design for aerodb.
The WAL is the authoritative durability mechanism in Phase 0. No acknowledged write exists unless it is fully persisted in the WAL.
This document is binding. Any implementation that violates these rules is incorrect.
WAL Design Principles¶
The WAL in aerodb prioritizes:
- Durability over throughput
- Determinism over optimization
- Simplicity over cleverness
- Explicit failure over silent recovery
The WAL is intentionally conservative. Phase 0 does not attempt to optimize write performance.
Authoritative Role of the WAL¶
The WAL is:
- The single source of truth for crash recovery
- Required for every acknowledged write
- Fully replayed on every startup in Phase 0
The WAL is never bypassed.
WAL Scope (Phase 0)¶
Included in WAL¶
- Document INSERT operations
- Document UPDATE operations
- Document DELETE operations
- Schema version reference per document
- Operation sequence number
Explicitly Excluded from WAL¶
- Index mutations
- Schema creation or migration
- Configuration changes
- Checkpoints
- Metadata updates
Indexes are derived state and are rebuilt from storage.
WAL File Layout¶
File Properties¶
- Append-only
- Single file
- Never truncated in Phase 0
- Opened with exclusive write access
WAL Record Ordering¶
- WAL records are strictly ordered by a monotonically increasing sequence number
- Sequence numbers start at
1 - Sequence numbers never repeat or rewind
Sequence numbers define the total order of writes.
WAL Record Structure¶
Each WAL record is self-contained and self-describing.
+------------------------+
| Record Length (u32) |
+------------------------+
| Record Type (u8) |
+------------------------+
| Sequence Number (u64) |
+------------------------+
| Payload (variable) |
+------------------------+
| Checksum (u32) |
+------------------------+
Field Definitions¶
| Field | Description |
|---|---|
| Record Length | Total byte length of the record |
| Record Type | INSERT / UPDATE / DELETE |
| Sequence Number | Global monotonic operation ID |
| Payload | Operation-specific data |
| Checksum | CRC32 or equivalent over entire record except checksum |
WAL Payload Format¶
Common Payload Fields¶
Every WAL payload MUST include:
- Collection identifier
- Document primary key
- Schema version identifier
- Full document body (post-operation state)
Payload Rules¶
- WAL records always store the full document state
- No delta encoding
- No partial updates
- Deletes are represented as tombstone records
This guarantees deterministic replay.
WAL Record Types¶
INSERT¶
Represents insertion of a new document.
Rules: - Document must fully conform to schema - Full document body stored in payload
UPDATE¶
Represents replacement of an existing document.
Rules: - WAL stores the entire new document - Partial updates are forbidden - Schema version must be explicit
DELETE¶
Represents deletion of a document.
Rules: - WAL stores a tombstone payload - Document primary key remains referenced - Schema version included for validation
Write Rules (Critical)¶
WAL Write Sequence¶
For every write operation:
- Construct WAL record
- Append record to
wal.log - Flush WAL to disk using
fsync - Only after fsync may the operation proceed
- Only after storage write completes may the operation be acknowledged
Acknowledgment before fsync is forbidden.
fsync Policy (Phase 0)¶
- Every WAL append is followed by
fsync - No batching
- No group commit
- No async durability
This is intentional.
WAL Integrity Guarantees¶
Checksum Enforcement¶
- Every WAL record includes a checksum
- Checksum covers:
- Record header
- Payload
- Sequence number
Any checksum mismatch is corruption.
WAL Corruption Policy¶
Phase 0 Rule: Zero Tolerance¶
If any WAL corruption is detected:
- Startup halts immediately
- No partial replay
- No skipping records
- No repair attempts
The database refuses to start.
This is mandatory.
WAL Replay Rules¶
Replay Order¶
- WAL records are replayed strictly in sequence number order
- Replay always starts from the first record
- Replay is single-threaded
Replay Behavior¶
For each WAL record:
- Validate checksum
- Validate record structure
- Validate schema version existence
- Apply document state to document storage
- If record is DELETE, write tombstone
If any step fails → recovery halts.
WAL and Idempotency¶
Replay Idempotency Rule¶
Applying the same WAL record twice must result in the same final state.
This is achieved by: - Full-document replacement - Primary-key-based overwrite semantics
No record may depend on previous application state.
WAL and Document Storage Interaction¶
- WAL durability precedes storage mutation
- Storage writes are considered secondary
- On crash:
- WAL is authoritative
- Storage is reconciled to WAL state
Indexes are rebuilt after replay completes.
WAL and Recovery Determinism¶
Recovery must be:
- Deterministic
- Repeatable
- Independent of timing or environment
Given the same WAL file, recovery must always produce the same storage state.
WAL Growth Policy (Phase 0)¶
Phase 0 Rule¶
- WAL grows without bound
- WAL is never truncated
- WAL is never compacted
This is a conscious trade-off.
WAL and Clean Shutdown¶
Clean Shutdown Marker¶
- On clean shutdown, aerodb writes a shutdown marker to metadata
- WAL contents remain unchanged
- WAL is not truncated
On restart, WAL is still fully replayed.
Forbidden WAL Behaviors¶
The following are explicitly forbidden:
- Skipping WAL entries
- Partial WAL replay
- WAL batching without fsync
- Delta-based WAL entries
- WAL-driven index mutation
- Silent corruption handling
Any of these violate core invariants.
Invariants Enforced by WAL¶
| Invariant | Enforcement |
|---|---|
| D1 | fsync before acknowledgment |
| R1 | WAL precedes all storage writes |
| R2 | Sequential deterministic replay |
| R3 | Explicit recovery success/failure |
| K1 | Checksums on every record |
| K2 | Halt-on-corruption policy |
| C1 | Full-document atomicity |
Phase 0 Limitations (Intentional)¶
- No checkpointing
- No WAL truncation
- No WAL compression
- No encryption-at-rest
- No replication hooks
All intentionally deferred.
Final Statement¶
The WAL is the foundation of aerodb’s reliability.
If WAL correctness is compromised: - Data safety is compromised - Recovery is compromised - Trust is compromised
There are no acceptable shortcuts.
The WAL must remain boring, strict, and absolute.