READ PATH OPTIMIZATION — PHASE 3
Status
- Phase: 3
- Authority: Normative
- Scope: Read-only performance optimizations
- Dependencies:
  - PERF_VISION.md
  - PERF_INVARIANTS.md
  - PERF_PROOF_RULES.md
  - PERFORMANCE_BASELINE.md
  - CRITICAL_PATHS.md
  - SEMANTIC_EQUIVALENCE.md
  - FAILURE_MODEL_PHASE3.md
  - ROLLBACK_AND_DISABLEMENT.md
  - PERFORMANCE_OBSERVABILITY.md
This document specifies read-path optimizations that preserve snapshot isolation, determinism, and replication safety.
Any optimization described here MUST be provably read-only and MUST NOT alter MVCC visibility, ordering, or failure behavior.
1. Purpose
Baseline AeroDB read behavior prioritizes correctness:
- Explicit MVCC visibility checks
- Deterministic version-chain traversal
- No speculative caching
- No read shortcuts
This yields correct but potentially expensive reads.
Read Path Optimization improves performance by:
- Eliminating redundant work
- Reducing repeated traversal
- Avoiding unnecessary allocations
While strictly preserving:
- Snapshot isolation
- Visibility semantics
- Deterministic behavior
2. Baseline Reference (Normative)
Baseline read behavior is defined in:
- PERFORMANCE_BASELINE.md §4
- CRITICAL_PATHS.md §3
Baseline invariants:
- Every read evaluates MVCC visibility explicitly
- Version chains are traversed in CommitId order
- No read observes uncommitted or future data
- Reads do not mutate persistent state
These properties MUST remain true.
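As a minimal sketch of the baseline behavior listed above (an illustration, not AeroDB's actual code), the following models a version chain as a list in ascending CommitId order and checks visibility explicitly for every version. The names `Version`, `is_visible`, and `baseline_read` are hypothetical, and the visibility rule used here (committed, and CommitId not greater than the snapshot's CommitId) is an assumption made for the example.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Version:
    commit_id: int   # hypothetical: position in the global commit order
    committed: bool  # hypothetical: False for in-flight writes
    data: bytes


def is_visible(version: Version, snapshot_commit_id: int) -> bool:
    # Assumed rule: committed, and not newer than the snapshot.
    return version.committed and version.commit_id <= snapshot_commit_id


def baseline_read(chain: Sequence[Version], snapshot_commit_id: int) -> Optional[Version]:
    """Check every version in ascending CommitId order; keep the newest visible one."""
    selected = None
    for version in chain:  # chain is assumed sorted by ascending commit_id
        if is_visible(version, snapshot_commit_id):
            selected = version
    return selected  # read-only: the chain is never modified


chain = [
    Version(1, True, b"v1"),
    Version(4, True, b"v4"),
    Version(7, True, b"v7"),
    Version(9, False, b"draft"),  # uncommitted, must never be observed
]
```

In this sketch a snapshot at CommitId 5 selects the version committed at 4, and no snapshot ever selects the uncommitted entry.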
3. Definition of Read Path Optimization
3.1 Conceptual Definition
A read-path optimization is any change that:
Reduces the cost of determining which committed version is visible to a snapshot, without changing the result.
All optimizations are:
- Read-only
- Snapshot-scoped
- Deterministic
- Discardable
3.2 Explicit Non-Definition (What This Is NOT)
Read Path Optimization does NOT allow:
- Skipping MVCC visibility checks
- Pre-committing visibility
- Read-your-own-writes shortcuts
- Cross-snapshot caching
- Adaptive or speculative behavior
- Time-based cache expiry
If visibility is inferred instead of proven, the optimization is invalid.
4. Allowed Optimization Classes
Only the following read optimizations are permitted.
4.1 Snapshot-Local Visibility Caching
Description
Within a single snapshot, cache the result of:
- Version-chain traversal
- Visibility decision
Keyed by:
- Document identifier
- Snapshot CommitId
Rules
- Cache is valid ONLY for one snapshot
- Cache MUST be discarded when snapshot ends
- Cache entries are immutable
Proof Obligation
- Cached result equals traversal result
- No snapshot mutation occurs
- No cross-snapshot leakage
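One way to picture snapshot-local caching is a cache owned by the snapshot object itself, so that it cannot outlive or leak beyond that snapshot. The sketch below is illustrative only: `Snapshot`, `resolve`, and the sample data are hypothetical names, and `resolve` stands in for the full traversal plus visibility check.

```python
from typing import Callable, Dict, List, Optional, Tuple


class Snapshot:
    """Hypothetical snapshot handle that owns its own visibility cache."""

    def __init__(self, commit_id: int, resolve: Callable[[str, int], Optional[bytes]]):
        self.commit_id = commit_id
        self._resolve = resolve  # full traversal + visibility check
        self._cache: Dict[str, Optional[bytes]] = {}
        self._open = True

    def read(self, doc_id: str) -> Optional[bytes]:
        if not self._open:
            raise RuntimeError("snapshot is closed")
        # The snapshot CommitId is fixed for this object, so doc_id alone
        # identifies the (document identifier, snapshot CommitId) key.
        if doc_id not in self._cache:
            self._cache[doc_id] = self._resolve(doc_id, self.commit_id)
        return self._cache[doc_id]  # bytes are immutable, so entries cannot change

    def close(self) -> None:
        # The cache is discarded with the snapshot; nothing outlives it.
        self._cache.clear()
        self._open = False


calls: List[Tuple[str, int]] = []


def resolve(doc_id: str, snapshot_commit_id: int) -> Optional[bytes]:
    calls.append((doc_id, snapshot_commit_id))  # records how often traversal runs
    versions = {"a": [(1, b"a1"), (5, b"a5")]}.get(doc_id, [])
    visible = [data for commit_id, data in versions if commit_id <= snapshot_commit_id]
    return visible[-1] if visible else None


snap = Snapshot(3, resolve)
first = snap.read("a")   # runs the traversal
second = snap.read("a")  # served from the snapshot-local cache
snap.close()
```

A second snapshot constructed with a different CommitId starts with an empty cache and resolves the document again, which is the behavior the cross-snapshot rule asks for.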
4.2 Deterministic Short-Circuit Traversal
Description
If version chains are ordered by CommitId:
- Stop traversal once first visible version is found
(This matches baseline semantics but avoids unnecessary work.)
Rules
- Order MUST be explicit
- Visibility MUST be checked
- No speculative skipping
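Proof Obligation
- Versions with a later CommitId than the selected version are not visible to the snapshot
- Versions with an earlier CommitId than the selected version are superseded by it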
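A minimal sketch of short-circuit traversal, assuming the chain is walked from newest to oldest CommitId (the traversal direction is an assumption of this example). The tuple layout and the name `short_circuit_read` are hypothetical. The function verifies the ordering of the versions it examines and still applies the visibility check to each of them.

```python
from typing import Optional, Sequence, Tuple

Version = Tuple[int, bool, bytes]  # (commit_id, committed, data); hypothetical layout


def short_circuit_read(
    chain_newest_first: Sequence[Version], snapshot_commit_id: int
) -> Optional[Version]:
    previous_commit_id = None
    for version in chain_newest_first:
        commit_id, committed, _ = version
        # Order is explicit: refuse a chain that is not strictly descending.
        if previous_commit_id is not None and commit_id >= previous_commit_id:
            raise ValueError("version chain is not in descending CommitId order")
        previous_commit_id = commit_id
        # Visibility is checked for every version that is examined.
        if committed and commit_id <= snapshot_commit_id:
            # Versions already passed were not visible; versions not yet
            # examined are older and therefore superseded by this one.
            return version
    return None


chain = [
    (9, False, b"draft"),
    (7, True, b"v7"),
    (4, True, b"v4"),
    (1, True, b"v1"),
]
selected = short_circuit_read(chain, 5)
```

Here the traversal examines three versions and stops at the one committed at 4; the version committed at 1 is never touched.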
4.3 Zero-Copy Read Materialization
Description
Avoid copying document data when:
- Data is immutable
- Lifetime exceeds read duration
- Snapshot guarantees immutability
Rules
- No mutable aliasing
- No shared ownership beyond snapshot
- Memory safety must be explicit
Proof Obligation
- Data cannot change
- Snapshot lifetime bounds usage
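The idea can be illustrated in Python with a read-only `memoryview` over an immutable `bytes` buffer. This is a sketch under stated assumptions: the page layout, offsets, and the name `materialize` are hypothetical. In Python the view keeps the underlying buffer alive, whereas in a language without garbage collection the lifetime bound would have to be stated and enforced explicitly.

```python
def materialize(page: bytes, offset: int, length: int) -> memoryview:
    """Return a read-only view of one document inside an immutable page, without copying."""
    # Slicing a memoryview creates a new view over the same buffer; no bytes are copied.
    return memoryview(page)[offset:offset + length]


page = b"HEADERdocument-bodyTRAILER"  # hypothetical immutable storage page
view = materialize(page, 6, 13)

# The view is read-only because the underlying bytes object is immutable.
is_read_only = view.readonly
# A copy is made only if the caller explicitly asks for an owned value.
owned_copy = bytes(view)
```

Because the view is read-only, there is no mutable aliasing of the page, and an attempt to assign through the view raises `TypeError`.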
4.4 Index-Assisted Visibility Filtering (Read-Only)
Description
Use existing in-memory indexes to narrow candidate version sets, without skipping visibility checks.
Rules
- Indexes are advisory only
- Visibility check is mandatory
- Index absence must not affect correctness
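The rules above can be sketched with a lookup by field value. In this hypothetical example the index only narrows which documents are examined; each candidate is still resolved through the visibility check and the value is re-checked against the visible version. The store layout, the `city` field, and the function names are assumptions for illustration, and the index is assumed to list every document that has ever held the value (it may over-report, never under-report).

```python
from typing import Dict, List, Optional, Set, Tuple

Version = Tuple[int, str]         # (commit_id, city); committed versions only
Store = Dict[str, List[Version]]  # doc_id -> versions in ascending commit_id order


def visible_version(versions: List[Version], snapshot_commit_id: int) -> Optional[Version]:
    selected = None
    for version in versions:
        if version[0] <= snapshot_commit_id:  # explicit visibility check
            selected = version
    return selected


def find_by_city(
    store: Store,
    city: str,
    snapshot_commit_id: int,
    index: Optional[Dict[str, Set[str]]] = None,
) -> List[str]:
    # With an index, examine only its candidates; without one, examine everything.
    candidates = index.get(city, set()) if index is not None else set(store)
    matches = []
    for doc_id in sorted(candidates):
        version = visible_version(store.get(doc_id, []), snapshot_commit_id)
        # The index is advisory: the visible version decides, not the index entry.
        if version is not None and version[1] == city:
            matches.append(doc_id)
    return matches


store = {
    "d1": [(1, "Oslo"), (5, "Rome")],
    "d2": [(2, "Rome")],
    "d3": [(3, "Oslo")],
}
index = {"Oslo": {"d1", "d3"}, "Rome": {"d1", "d2"}}
```

Calling `find_by_city` with and without `index` returns the same list for any snapshot, which is what "index absence must not affect correctness" requires in this sketch.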
5. Forbidden Read Optimizations
The following are explicitly forbidden:
- Skipping visibility checks
- Caching across snapshots
- Global read caches
- Heuristic pruning
- Time-based invalidation
- Read-write coupling
- Replica-only visibility shortcuts
- Opportunistic consistency
If correctness depends on cache freshness, the optimization is invalid.
6. Invariant Preservation Matrix
(Referenced from PERF_INVARIANTS.md)
Durability
- D-1, D-2, D-3: Not Applicable (Read-Only)
Determinism
- DET-1 (Crash Determinism): Preserved
- DET-2 (Replay Equivalence): Preserved
- DET-3 (Bounded Execution): Preserved
MVCC
- MVCC-1 (Snapshot Isolation): Preserved
- MVCC-2 (CommitId Authority): Preserved
- MVCC-3 (Version Chain Integrity): Preserved
Replication
- REP-1, REP-2, REP-3: Preserved
Failure & Recovery
- FR-1, FR-2, FR-3: Preserved
Observability
- OBS-1, OBS-2: Preserved
Disablement
- DIS-1, DIS-2, DIS-3: Preserved
7. Semantic Equivalence Argument
Read Path Optimization is semantically equivalent to baseline because:
- For any snapshot and document:
- The same version is selected
- The same data is returned
- Visibility rules are evaluated identically
- No persistent state is modified
- No ordering is changed
Optimizations only reduce how much work is done, not what is decided.
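The equivalence claim can be checked mechanically on small inputs. The sketch below compares a full-scan read with an early-exit read for every snapshot CommitId that could distinguish them. All names and the tuple layout are hypothetical, and a check like this complements the proof obligations rather than replacing them.

```python
from typing import List, Optional, Tuple

Version = Tuple[int, bool, bytes]  # (commit_id, committed, data), ascending commit_id


def is_visible(version: Version, snapshot_commit_id: int) -> bool:
    return version[1] and version[0] <= snapshot_commit_id


def baseline_read(chain: List[Version], snapshot_commit_id: int) -> Optional[Version]:
    selected = None
    for version in chain:  # examines every version
        if is_visible(version, snapshot_commit_id):
            selected = version
    return selected


def optimized_read(chain: List[Version], snapshot_commit_id: int) -> Optional[Version]:
    for version in reversed(chain):  # newest first, stops at the first visible version
        if is_visible(version, snapshot_commit_id):
            return version
    return None


def equivalent_for_all_snapshots(chain: List[Version]) -> bool:
    # Snapshot ids from 0 to one past the newest version cover every distinct outcome.
    max_id = max((version[0] for version in chain), default=0)
    return all(
        baseline_read(chain, s) == optimized_read(chain, s)
        for s in range(0, max_id + 2)
    )


chain = [(1, True, b"v1"), (4, True, b"v4"), (7, False, b"draft"), (9, True, b"v9")]
result = equivalent_for_all_snapshots(chain)
```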
8. Failure Matrix
8.1 Crash During Read
- Baseline: read aborted, no state change
- Optimized: read aborted, no state change
Equivalent.
8.2 Crash After Cache Population
- Cache is in-memory only
- Cache is discarded on restart
Equivalent.
8.3 Memory Allocation Failure
- Baseline: read fails
- Optimized: read fails
Equivalent.
9. Recovery Proof
- Read optimizations do not affect WAL
- No persistent read state exists
- Recovery logic is unchanged
Replay behavior is identical.
10. Disablement & Rollback
10.1 Disablement Mechanism
Read optimizations MUST be disableable via:
- Compile-time flag or
- Startup configuration
Disablement restores:
- Baseline traversal
- Baseline allocation behavior
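A startup-configuration switch could look like the following sketch. `ReadPathConfig`, its `optimizations_enabled` field, and `make_reader` are hypothetical names, not AeroDB settings; the point is only that the read path is chosen once from configuration and that both choices return the same result.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple

Version = Tuple[int, bytes]  # (commit_id, data); committed versions, ascending order
Reader = Callable[[List[Version], int], Optional[Version]]


@dataclass(frozen=True)
class ReadPathConfig:
    optimizations_enabled: bool = False  # hypothetical startup setting


def baseline_read(chain: List[Version], snapshot_commit_id: int) -> Optional[Version]:
    selected = None
    for version in chain:  # examines the whole chain
        if version[0] <= snapshot_commit_id:
            selected = version
    return selected


def optimized_read(chain: List[Version], snapshot_commit_id: int) -> Optional[Version]:
    for version in reversed(chain):  # newest first, stops early
        if version[0] <= snapshot_commit_id:
            return version
    return None


def make_reader(config: ReadPathConfig) -> Reader:
    # Chosen once at startup; nothing persistent records which path was used.
    return optimized_read if config.optimizations_enabled else baseline_read


chain = [(1, b"v1"), (4, b"v4"), (7, b"v7")]
reader_on = make_reader(ReadPathConfig(optimizations_enabled=True))
reader_off = make_reader(ReadPathConfig(optimizations_enabled=False))
```

A disablement equivalence test in this style would assert that `reader_on` and `reader_off` agree for every snapshot CommitId.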
10.2 Compatibility Proof
- No WAL changes
- No snapshot changes
- No checkpoint changes
Data is always readable.
10.3 No Ghost State
- No persistent caches
- No hidden metadata
- No optimization markers
All state is in-memory and snapshot-scoped.
11. Observability
Permitted metrics (passive only):
- read_path.version_chain_traversals
- read_path.cache_hits
- read_path.cache_misses
- read_path.materialization_copies
Metrics MUST NOT:
- Influence caching
- Influence traversal
- Influence allocation
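"Passive" can be read as write-only from the read path's point of view: the read path increments counters but never branches on their values. The sketch below illustrates that shape using three of the metric names listed above; `ReadPathMetrics` and `cached_read` are hypothetical names.

```python
from collections import Counter
from typing import Callable, Dict, Optional


class ReadPathMetrics:
    """Counters that the read path increments but never reads."""

    def __init__(self) -> None:
        self._counts: Counter = Counter()

    def increment(self, name: str) -> None:
        self._counts[name] += 1

    def export(self) -> Dict[str, int]:
        # Intended for an external reporter only, not for the read path.
        return dict(self._counts)


def cached_read(
    doc_id: str,
    cache: Dict[str, Optional[bytes]],
    resolve: Callable[[str], Optional[bytes]],
    metrics: ReadPathMetrics,
) -> Optional[bytes]:
    # Control flow depends only on the cache contents, never on metric values.
    if doc_id in cache:
        metrics.increment("read_path.cache_hits")
        return cache[doc_id]
    metrics.increment("read_path.cache_misses")
    metrics.increment("read_path.version_chain_traversals")
    cache[doc_id] = resolve(doc_id)
    return cache[doc_id]


metrics = ReadPathMetrics()
cache: Dict[str, Optional[bytes]] = {}
first = cached_read("a", cache, lambda doc_id: b"a1", metrics)
second = cached_read("a", cache, lambda doc_id: b"a1", metrics)
```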
12. Testing Requirements
Read Path Optimization MUST introduce:
- Snapshot equivalence tests
- Cache correctness tests
- Cross-snapshot isolation tests
- Crash-during-read tests
- Disablement equivalence tests
All existing tests MUST pass unmodified.
13. Explicit Non-Goals
Read Path Optimization does NOT aim to:
- Change isolation levels
- Improve write performance
- Introduce speculative reads
- Relax MVCC rules
It optimizes execution cost only.
14. Final Rule
A faster read that returns the wrong version
is worse than a slow read.
Read-path optimizations are valid only when they are invisible to all observers.
END OF DOCUMENT