43-Fix Pipeline Audit
The Starting Point
The reader pipeline — transcribe.js, book.html, reading.html, plex.py, ingest.py — had shipped fast over two prior sessions. Working, but not audited. The goal: systematic quality pass before building on top of it.
The Audit
10 parallel agents, each examining one dimension:
- Chapter index consistency (zero-based edge cases)
- Multi-format audio (MP3 vs M4A handling)
- Prologue sync (pause detection, interval leaks)
- Memory and performance (DOM accumulation, cache misses)
- Race conditions (rapid seeks, chapter switches)
- Error handling (fetch failures, audio errors)
- Audiomark pipeline (rate limiting, dedup)
- Progress bar accuracy (album-level seek math)
- Book resolution cache (invalidation, fallback)
- Cross-page feature parity (reading.html vs book.html)
Result: 43 issues. 1 Critical, 14 High, 23 Medium, 12 Low.
The Critical Find
bpFreshStart() was effectively a no-op. It called reset(), which cleared partKey, then tried to read partKey on the next line. Every call to bpFreshStart() silently did nothing: no transcription restart, no state cleanup. Fixed by storing partKey in a separate bpPartKey variable before reset() clears it.
This is the kind of bug that passes every manual test because the system degrades gracefully. Transcription just… doesn’t restart after seeks. You’d never notice unless you were looking for it.
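The read-after-reset shape of this bug is easy to reproduce in isolation. The sketch below is a minimal reconstruction, not the actual transcribe.js code: the state object, restartTranscription(), and the key value are hypothetical stand-ins, and bpPartKey is shown as a local where the real fix may use a longer-lived variable.

```javascript
// Hypothetical stand-in for the transcriber state; only partKey matters here.
const state = { partKey: null, restarts: [] };

function reset() {
  state.partKey = null; // reset() clears partKey, as described above
}

// Hypothetical restart routine: silently does nothing without a key.
function restartTranscription(partKey) {
  if (!partKey) return false;
  state.restarts.push(partKey);
  return true;
}

// Buggy shape: partKey is read after reset() has already cleared it.
function bpFreshStartBuggy() {
  reset();
  return restartTranscription(state.partKey); // always null at this point
}

// Fixed shape: capture the key first, then reset.
function bpFreshStartFixed() {
  const bpPartKey = state.partKey;
  reset();
  return restartTranscription(bpPartKey);
}

state.partKey = "book-1/part-3";
const buggyResult = bpFreshStartBuggy(); // false: nothing restarted

state.partKey = "book-1/part-3";
const fixedResult = bpFreshStartFixed(); // true: restart received the key
```

Neither version throws, which is why the buggy one survives manual testing: the only observable difference is that a restart never happens.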
Fix Architecture
5 agents, partitioned by file ownership:
- Agent 1, transcribe.js: sentence cache, lineEls cache, GC improvements
- Agent 2, book.html: generation guards, listener cleanup, ?? over ||
- Agent 3, reading.html: same patterns, 14 fetch error handling sites
- Agent 4, plex.py + proxy/plex.py: Prologue pause detection, prologueStop() interval cleanup
- Agent 5, ingest.py: rate limiting (15s audiomark, 5s highlight), dedup (30s same-book), time.monotonic()
Agent 6 verified all fixes read-only after the others finished.
Zero merge conflicts. The partition-by-file strategy works because most bugs live in one file even when they manifest across pages.
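One plausible reading of the zero-based chapter index issues and the "?? over ||" fix: a fallback written with || treats chapter index 0 as missing, while ?? only falls through on null or undefined. The sketch below assumes that reading; the function names and fallback value are hypothetical and not taken from book.html.

```javascript
// Hypothetical helpers: pick a chapter index from saved state, with a fallback.
function resumeChapterWithOr(saved, fallback) {
  return saved || fallback; // 0 is falsy, so chapter 0 is treated as "missing"
}

function resumeChapterWithNullish(saved, fallback) {
  return saved ?? fallback; // only null/undefined fall through to the fallback
}

const orResult = resumeChapterWithOr(0, 5); // 5: first chapter is lost
const nullishResult = resumeChapterWithNullish(0, 5); // 0: first chapter kept
const missingResult = resumeChapterWithNullish(undefined, 5); // 5: real gap
```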
LoB Integration
Built app/proxy/lob.py — proxy module to the Library of Babel service. New POST /api/ingest/highlight endpoint accepts text highlights and asynchronously enriches them via LoB using create_task(). The enrichment is fire-and-forget: the highlight saves immediately, LoB connections populate in the background.
Error Handling Sweep
Every fetch() call in both HTML files got r.ok and data.error checks. 14 sites in reading.html, 4 in book.html. Audio error handlers moved to module level (persistent, not re-registered per play). Before this, a network blip during transcription would silently break the waterfall with no user feedback.
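The two checks are needed because fetch() resolves even on 4xx/5xx responses, and an API can return a 200 with an error in the body. A minimal sketch of the pattern follows; the helper names are hypothetical, and the { error: "..." } payload shape is an assumption based on the data.error check mentioned above.

```javascript
// Hypothetical helpers; names and error messages are illustrative only.

// HTTP-level check: fetch() resolves on 4xx/5xx, so r.ok must be tested.
function checkStatus(r, url) {
  if (!r.ok) throw new Error(`HTTP ${r.status} for ${url}`);
  return r;
}

// Payload-level check: assumes failures are reported as { error: "..." }.
function checkPayload(data, url) {
  if (data && data.error) {
    throw new Error(`API error for ${url}: ${data.error}`);
  }
  return data;
}

// fetchImpl is injectable so the sketch can be exercised without a network.
async function fetchJsonChecked(url, options = {}, fetchImpl = fetch) {
  const r = checkStatus(await fetchImpl(url, options), url);
  return checkPayload(await r.json(), url);
}
```

A call site would wrap fetchJsonChecked() in try/catch and surface the message to the user instead of letting the waterfall stall silently.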
Chrome Smoke Test
Used Chrome MCP to verify all three pages (home, reading, book) load with zero JS console errors. Ran JS stability tests confirming:
- Loop generation guards kill stale callbacks
- Listener cleanup prevents loadedmetadata stacking
- Sentence cache avoids per-frame DOM queries
- Rate limiting rejects rapid audiomark attempts
Only pre-existing issue: “radar failed 500” — unrelated to this session’s work.
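A generation guard, as verified above, can be sketched with a counter that every new loop increments; callbacks scheduled by an older loop compare their captured value and bail out. This is a minimal illustration under assumed names (createLoopGuard, isCurrent), not the code in book.html.

```javascript
// Hypothetical loop controller: each start() bumps a generation counter,
// and callbacks created by an older generation become no-ops.
function createLoopGuard() {
  let generation = 0;
  return {
    start() {
      generation += 1;
      const mine = generation;
      // The returned check is false once a newer start() or stop() has run.
      return () => mine === generation;
    },
    stop() {
      generation += 1;
    },
  };
}

const guard = createLoopGuard();
const ticks = [];

const firstIsCurrent = guard.start();
const firstTick = () => {
  if (!firstIsCurrent()) return; // stale callback: do nothing
  ticks.push("first");
};
firstTick(); // runs: first loop is current

const secondIsCurrent = guard.start(); // e.g. a chapter switch
const secondTick = () => {
  if (!secondIsCurrent()) return;
  ticks.push("second");
};
firstTick(); // ignored: first loop is now stale
secondTick(); // runs
```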
What Shipped
Two commits:
- 0b31540: 43 audit fixes across 6 files
- ad8396a: LoB integration + error handling sweep
Key Learnings
Partition parallel agents by file, not by concern. A race condition bug might span book.html and transcribe.js, but the fix lives in one file. Assign to that file’s agent.
{ once: true } does not mean “only one listener.” It means “remove after firing.” Five rapid calls stack five listeners. Track and remove manually.
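The stacking behavior can be shown with a plain EventTarget standing in for an audio element. The onNextMetadata() helper and the pending variable below are hypothetical names for the "track and remove manually" approach, not the actual fix.

```javascript
// Naive: five rapid calls each add a fresh once-listener.
const naiveTarget = new EventTarget();
let naiveCalls = 0;
for (let i = 0; i < 5; i++) {
  naiveTarget.addEventListener(
    "loadedmetadata",
    () => { naiveCalls += 1; },
    { once: true }
  );
}
// All five fire on the first event; only then are they removed.
naiveTarget.dispatchEvent(new Event("loadedmetadata"));

// Tracked: remember the pending listener and remove it before adding another.
const trackedTarget = new EventTarget();
let trackedCalls = 0;
let pending = null;

function onNextMetadata(target, handler) {
  if (pending) target.removeEventListener("loadedmetadata", pending);
  pending = handler;
  target.addEventListener("loadedmetadata", handler, { once: true });
}

for (let i = 0; i < 5; i++) {
  onNextMetadata(trackedTarget, () => { trackedCalls += 1; });
}
trackedTarget.dispatchEvent(new Event("loadedmetadata"));
```

After the dispatches, naiveCalls is 5 and trackedCalls is 1: only the most recently registered handler survives in the tracked version.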
time.monotonic() over time.time() for rate limiting. Immune to NTP corrections and clock drift. The difference is academic until it isn’t.
Audit code, not docs. CLAUDE.md said rate limiting existed. The code said otherwise. The code was right.