From Plex Token Hunt to “Hey Siri, Audiomark”

Started with “check synology-garden for the Plex API key.” Ended with a working voice-triggered audiobook transcription pipeline that runs end to end in about 5 seconds.

The Starting Point

Life-dashboard had reading/highlight support but nothing for audiobooks. Plex has been my audiobook player for years — 205 books, 3.5 years of listening history — but that data was trapped inside the Plex database. The goal was integration.

First task: find the Plex token. Dug through the synology-garden project (NAS management scripts) where it was stored in a config file.

Phase 1: Backfill History

Plex’s /status/sessions/history/all endpoint exposes complete listening history with pagination. Built an async backfill script:

Paginated through all history entries
Filtered to audiobook library (excluding music/video)
Extracted title, author, duration, timestamps
Fetched book descriptions from album metadata
Converted to UnifiedEvent format with deterministic IDs
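
A minimal sketch of the pagination and library-filter steps, not the actual backfill script. The page fetcher is injected so the loop can be exercised without a Plex server. In a real run, fetch_page would call the history endpoint with a start offset and page size (Plex paging is usually done with X-Plex-Container-Start and X-Plex-Container-Size; treat that as an assumption here). The names iter_history and is_audiobook, and the librarySectionID key, are also assumptions for illustration.

```python
from typing import Callable, Dict, Iterator, List

# Hypothetical signature: given (start, size), return one page of history entries.
FetchPage = Callable[[int, int], List[Dict]]


def iter_history(fetch_page: FetchPage, page_size: int = 100) -> Iterator[Dict]:
    """Yield every history entry, walking fixed-size pages until a short page appears."""
    start = 0
    while True:
        page = fetch_page(start, page_size)
        yield from page
        if len(page) < page_size:
            # A short (or empty) page means there is nothing left to fetch.
            break
        start += page_size


def is_audiobook(entry: Dict, audiobook_section_id: str) -> bool:
    """Keep only entries from the audiobook library.

    Assumes each entry carries a 'librarySectionID' field; adjust the key
    if your server's payload differs.
    """
    return str(entry.get("librarySectionID")) == audiobook_section_id
```

Usage would look like `[e for e in iter_history(fetch_page) if is_audiobook(e, "3")]`, where "3" stands in for whatever section ID the audiobook library has.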

Result: 1,061 events, 205 unique books, Sept 2022 through May 2026. Idempotent — safe to re-run.
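
The idempotency comes from the deterministic IDs: SHA256 over source, timestamp, and a content prefix, as listed under Key Technical Decisions. A sketch of that idea follows. The "|" separator and the 64-character prefix length are assumptions, not the values the project uses.

```python
import hashlib


def event_id(source: str, timestamp: str, content: str, prefix_len: int = 64) -> str:
    """Derive a stable ID from the fields that identify an event.

    The same (source, timestamp, content prefix) always hashes to the same ID,
    so re-ingesting an event overwrites it instead of duplicating it.
    """
    # The separator (an assumption) keeps field boundaries unambiguous.
    raw = "|".join([source, timestamp, content[:prefix_len]])
    return hashlib.sha256(raw.encode("utf-8")).hexdigest()
```

With IDs like this, the store only needs an upsert keyed on the ID for a second backfill run to be a no-op.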

Phase 2: Periodic Sync

Added a 30-minute background loop (alongside the existing Threads proxy sync) that checks Plex for new listening sessions and ingests them. No more manual backfill needed — new listens appear automatically.
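
A background loop like this can be sketched with plain asyncio. The function name run_periodically and the job signature are assumptions; the real loop lives next to the Threads proxy sync and may look different.

```python
import asyncio
from typing import Awaitable, Callable


async def run_periodically(job: Callable[[], Awaitable[None]], interval_s: float) -> None:
    """Run `job` forever, sleeping `interval_s` seconds between runs.

    A failed run is logged and retried on the next cycle, so one bad
    Plex response does not kill the loop. Cancelling the task stops it.
    """
    while True:
        try:
            await job()
        except Exception as exc:  # CancelledError is not an Exception, so cancel still works
            print(f"sync failed, will retry next cycle: {exc!r}")
        await asyncio.sleep(interval_s)
```

At app startup this would be launched with something like `asyncio.create_task(run_periodically(sync_plex, 30 * 60))`, where sync_plex is a hypothetical coroutine that fetches and ingests new sessions.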

Phase 3: Now-Playing in /api/reading/current

Updated the reading/current endpoint to combine:

Text highlights (existing)
Audiobook listening events (new)
Plex real-time now-playing status

If you’re currently listening to an audiobook, it shows up with live progress.

Phase 4: The Audiomark Endpoint

This is where it got interesting. The realization:

Plex exposes the exact file path on the NAS. The NAS is mounted locally. ffmpeg can seek into any position in a container file in 0.15 seconds. mlx_whisper transcribes 2 minutes of audio in 4 seconds on M2 Pro.

That chain means: query what’s playing, extract the audio around the current position, transcribe it, save it. Total time: ~5 seconds.

Built POST /api/ingest/audiomark:

Hit Plex /status/sessions for current playback
Map NAS path to local mount (/volume1/ to /Volumes/)
Calculate album-wide progress (sum all tracks, not just current file)
ffmpeg extract +/- 1 minute around current position
mlx_whisper transcribe the clip
Store as searchable highlight event
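
The path-mapping and extraction steps can be sketched as below. This only builds the ffmpeg command; it does not run it. The helper names are hypothetical, and the 16 kHz mono WAV output is an assumption (it is the sample rate Whisper models work with). Putting -ss before -i asks ffmpeg to seek on the input, which is what keeps the extraction fast.

```python
from typing import List, Tuple

NAS_PREFIX = "/volume1/"    # path prefix as Plex reports it on the NAS
LOCAL_PREFIX = "/Volumes/"  # where the NAS share is mounted locally


def to_local_path(nas_path: str) -> str:
    """Rewrite a NAS-side path to the local mount."""
    if not nas_path.startswith(NAS_PREFIX):
        raise ValueError(f"unexpected path outside {NAS_PREFIX}: {nas_path}")
    return LOCAL_PREFIX + nas_path[len(NAS_PREFIX):]


def clip_window(position_s: float, half_width_s: float = 60.0) -> Tuple[float, float]:
    """Return (start, duration) for a clip centered on the playback position.

    The start is clamped at 0 so a bookmark in the first minute still works.
    """
    start = max(0.0, position_s - half_width_s)
    end = position_s + half_width_s
    return start, end - start


def ffmpeg_clip_cmd(src: str, out_wav: str, position_s: float) -> List[str]:
    """Build the ffmpeg argument list that extracts the clip."""
    start, duration = clip_window(position_s)
    return [
        "ffmpeg", "-y",            # overwrite the previous clip
        "-ss", f"{start:.3f}",     # input seek (placed before -i)
        "-t", f"{duration:.3f}",   # how much audio to read
        "-i", src,
        "-vn",                     # skip embedded cover art
        "-ac", "1", "-ar", "16000",
        out_wav,
    ]
```

The list can be passed to `subprocess.run(cmd, check=True)`. If the window runs past the end of the file, ffmpeg simply stops at end of input, so no upper clamp is needed.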

Phase 5: Docker to Host Migration

The pipeline requires:

NAS mount access (not available in Docker)
ffmpeg binary
mlx_whisper (MLX on the Apple Silicon GPU via Metal; MLX does not use the Neural Engine)

Moved life-dashboard from Docker to host uvicorn. Updated the systemd-equivalent (launchd plist) accordingly.

The Debugging

mlx_whisper output path: It writes transcription files to --output-dir using the input file’s basename, not the full path. When the input was /tmp/audiomark_clip.wav, the output was audiomark_clip.txt in the output dir. Took a minute to catch.
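
One way to avoid guessing is to compute the expected transcript path up front from the behavior described above. The helper name transcript_path is hypothetical.

```python
from pathlib import Path


def transcript_path(input_audio: str, output_dir: str, ext: str = "txt") -> Path:
    """Where the transcript lands: output_dir + the input's basename with a new extension."""
    return Path(output_dir) / f"{Path(input_audio).stem}.{ext}"
```

For the clip above, `transcript_path("/tmp/audiomark_clip.wav", "/tmp/out")` gives `/tmp/out/audiomark_clip.txt`.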

Album progress vs track progress: Plex reports offset within the current track. A book split across 47 files might show “95% complete” when you’re 95% through track 12 of 47. Had to fetch the full track list and sum durations.
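
A sketch of the album-wide calculation, assuming durations and the playback offset are both in milliseconds and tracks are in playback order. The function name album_progress is hypothetical.

```python
from typing import Sequence


def album_progress(track_durations_ms: Sequence[int], current_index: int, offset_ms: int) -> float:
    """Fraction of the whole book completed, in the range 0.0 to 1.0.

    Elapsed time is every fully played track before the current one,
    plus the offset inside the current track.
    """
    total = sum(track_durations_ms)
    if total <= 0:
        return 0.0
    elapsed = sum(track_durations_ms[:current_index]) + offset_ms
    return min(1.0, elapsed / total)
```

With 47 equal-length tracks, being 95% through track 12 (index 11) works out to roughly 25% of the book. A single-file book reduces to offset divided by duration, so the same function covers both cases.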

Proper noun mangling: whisper-base is fast but inaccurate on names. “Crysknife” -> “Christ’s knife”, “Leto” -> “Latos”. Acceptable for bookmarking (you know what book you’re in), but worth noting.

The Result

A Siri Shortcut that sends a single POST request. Five seconds later, the passage you’re listening to is transcribed and searchable in the life-dashboard event log. Title, author, progress, and the actual text of the passage — all indexed.

This is the audiobook bookmarking workflow I’ve been trying to build for years. The missing piece was realizing that Plex + local NAS mount + Apple Silicon inference eliminates every bottleneck that made previous attempts impractical.

Key Technical Decisions

Deterministic event IDs: SHA256(source + timestamp + content_prefix). Re-running backfill is safe.
Album-wide progress: Sum all tracks, not just current file. Consistent across single-file and multi-file books.
whisper-base over whisper-large: 4s vs 30s+ transcription time. Speed matters for a voice-triggered UX.
+/- 1 minute extraction: Captures context around the moment you triggered the bookmark.
Host uvicorn over Docker: Required for NAS mount + ffmpeg + MLX access. Trade-off accepted.

What Shipped

Feature	Details
Plex audiobook backfill	1,061 events, 205 books, Sept 2022 - May 2026
Periodic sync	30-min background loop, automatic new listen ingestion
/api/reading/current	Combined highlights + listens + now-playing
POST /api/ingest/audiomark	Voice-triggered passage transcription, ~5s total
Highlight quick-sync	Highlight ingest auto-checks Plex for title/author context