Safari Autoplay Policy and FFmpeg Container Mismatch


TL;DR: Safari’s autoplay policy blocks audio.play() when the call runs after an await, because the user gesture context is gone by then. FFmpeg’s -c copy fails with a container error when the source codec isn’t allowed in the container implied by the output extension. Neither is documented anywhere useful.

Safari Autoplay: The Gesture Context Problem

Safari requires a “user gesture” to initiate audio playback. The catch: if audio.play() runs after an await, even inside a handler that was triggered by a click, Safari can treat the gesture context as expired and reject the call.

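// BROKEN: gesture context lost after await
button.onclick = async () => {
  const res = await fetch('/api/stream-info');
  const data = await res.json(); // fetch resolves to a Response; parse the body
  audio.src = data.url;
  audio.play(); // Safari rejects this promise (NotAllowedError)
};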

// FIX: set autoplay before the async gap
button.onclick = async () => {
  audio.autoplay = true; // primed synchronously, inside the gesture
  const res = await fetch('/api/stream-info');
  const data = await res.json();
  audio.src = data.url;  // autoplay triggers on load
};

Alternative: prime the audio element with a silent WAV on the synchronous click, then swap src later. Both work. The key insight is that Safari tracks gesture context at the call-stack level, not the event-handler level.
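The silent-clip alternative can be sketched roughly like this. It assumes that an element which has already started playback inside a gesture stays allowed to play after its src is swapped, which is the behavior described above but worth verifying on the Safari versions you target. The names primeThenPlay, silentSrc, and loadStreamUrl are hypothetical; the audio element is passed in so the logic is not tied to any one page.

```javascript
// Sketch: unlock an <audio> element during the click, then swap in the real source.
// `silentSrc` (URL of a short silent clip) and `loadStreamUrl` (async function that
// resolves to the real stream URL) are hypothetical and supplied by the caller.
function primeThenPlay(audio, silentSrc, loadStreamUrl) {
  // These statements run synchronously inside the click handler.
  audio.src = silentSrc;
  const primed = Promise.resolve(audio.play()).catch(() => {
    // Ignore a failed priming attempt; the real play() below reports the outcome.
  });
  const pending = loadStreamUrl(); // fetch metadata while the silent clip starts

  return Promise.all([primed, pending]).then(([, url]) => {
    audio.src = url;
    return audio.play(); // rejects if playback is still blocked
  });
}

// Possible wiring, assuming the endpoint returns JSON with a `url` field:
// button.onclick = () =>
//   primeThenPlay(audio, '/silence.wav', async () => {
//     const res = await fetch('/api/stream-info');
//     return (await res.json()).url;
//   }).catch(console.error);
```

The handler itself is not async here, so the first play() call is guaranteed to sit in the same call stack as the click.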

This matters for any app that fetches metadata before playing — which is every app that doesn’t hardcode URLs.

FFmpeg Container Mismatch

When clipping audio with -c copy (no re-encode), FFmpeg picks the output container from the file extension and copies the stream into it unchanged, so the source codec has to be one that container accepts. If the source is MP3 and you write to .m4a:

# BROKEN: MP3 stream into M4A container
ffmpeg -ss 100 -i source.mp3 -t 60 -c copy clip.m4a
# Error: codec not supported in container

The fix is format-aware output:

# Take the extension from the source path and reuse it for the output
source="source.mp3"
ext="${source##*.}"
ffmpeg -ss 100 -i "$source" -t 60 -c copy "clip.${ext}"

This applies to both the transcription-segment pipeline (extracting clips for whisper) and the audiomark clip pipeline (saving shareable clips). The life-dashboard codebase now detects source format from the file extension and propagates it through both pipelines.
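The extension is only a stand-in for the codec (an .m4a can hold AAC or ALAC, for example). A stricter variant, sketched here and not what the codebase above does, asks ffprobe for the codec name and maps it to an extension. The mapping table and the function names are assumptions for illustration. Both functions only build argument lists; running ffprobe and ffmpeg is left to the caller.

```javascript
// Sketch only: keys are ffprobe `codec_name` values; the extension choices are
// one plausible mapping, not the project's actual table.
const CODEC_TO_EXT = new Map([
  ['mp3', 'mp3'],
  ['aac', 'm4a'],
  ['alac', 'm4a'],
  ['flac', 'flac'],
  ['vorbis', 'ogg'],
  ['opus', 'opus'],
]);

// ffprobe arguments that print just the first audio stream's codec name.
function probeCodecArgs(source) {
  return [
    '-v', 'error',
    '-select_streams', 'a:0',
    '-show_entries', 'stream=codec_name',
    '-of', 'default=noprint_wrappers=1:nokey=1',
    source,
  ];
}

// ffmpeg arguments that stream-copy a clip into a container that fits `codec`.
function clipArgs(source, codec, startSec, durationSec, outBase) {
  const ext = CODEC_TO_EXT.get(codec);
  if (!ext) {
    // Fail before ffmpeg does, with a message that names the codec.
    throw new Error(`no container mapping for codec "${codec}"`);
  }
  return [
    '-ss', String(startSec),
    '-i', source,
    '-t', String(durationSec),
    '-c', 'copy',
    `${outBase}.${ext}`,
  ];
}
```

In Node, each array could be handed to child_process.execFile('ffprobe', ...) and execFile('ffmpeg', ...) in turn, which also sidesteps shell quoting of file names.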

Plex Rating Key Levels

A related gotcha from the same session: Plex returns items at different levels depending on the API endpoint. Recently-played history can return either an album-level item or a track-level item for the same audiobook. On a track, ratingKey identifies the track itself and parentRatingKey identifies the album.

The fix: check the type field. If it’s "track", climb one level to parentRatingKey before querying for album metadata. Both the book_detail and stream-info endpoints need this check.
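A minimal sketch of that check, using the field names from the text (type, ratingKey, parentRatingKey). The function name albumRatingKey is hypothetical, and keys are returned as strings on the assumption that they end up in a URL path.

```javascript
// Sketch: normalize a Plex history item to its album-level rating key.
function albumRatingKey(item) {
  if (item.type === 'track') {
    // Track-level item: the album is one level up.
    if (item.parentRatingKey == null) {
      throw new Error('track item has no parentRatingKey');
    }
    return String(item.parentRatingKey);
  }
  // Anything else is treated as already album-level.
  return String(item.ratingKey);
}
```

Calling this once at the top of both endpoints keeps the normalization in one place instead of repeating the type check in each handler.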

Takeaways

  • Safari autoplay: prime audio.autoplay = true synchronously, before any async work
  • FFmpeg container: always match output extension to source codec when using -c copy
  • Plex keys: always normalize to album-level before metadata queries
  • All three bugs share a pattern: the API does what you asked, not what you meant. The contract is stricter than the mental model.