Life Dashboard: 55K Events, 12 Agents, One Evening
Started with a project skeleton and placeholder data. Ended with a fully operational personal telemetry platform ingesting 55,354 events across 11 sources spanning five years.
The Starting Point
Life-dashboard existed as a FastAPI project with SQLite, a few stub routes, and sample data. The goal was to make it real: ingest actual personal data, build proper infrastructure, wire up the frontend.
The scope expanded. Predictably.
Postgres Migration
SQLite was fine for prototyping. It stops being fine when you want JSONB columns, connection pooling, materialized views, and concurrent writes from background tasks. Migrated to Postgres with asyncpg. The UnifiedEvent schema uses typed columns for universal fields and JSONB for source-specific payloads.
Custom asyncpg codecs handle JSON serialization. Connection pooling via asyncpg’s built-in pool. The migration was clean because the original schema was already event-sourced — just needed to swap the storage layer.
Nine Adapters, Eleven Sources
Each data source gets its own adapter that normalizes raw data into UnifiedEvent records:
- lifetracker — charger connects, driving sessions, bedtime triggers, Ioniq key events (8K+ events from Aug 2021)
- clipboard — clipboard copies with embedded health data, two schema versions (v1 nested, v2 flat) (3.7K entries)
- journal — daily freeform text and serialized bookmark records (729 entries)
- bookmarks — book highlights with title/author/page (410 entries)
- mileage — odometer readings and trip distances
- locations — lat/lng coordinate pairs from various triggers
- driving — structured driving records from old Data Jar backups that filled timeline gaps
- sleep — 645 nights of sleep data from ByTheWeiCo
- bytheweico — 36K Threads Analysis records and 1.7K Plex audiobook events
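The adapter pattern can be sketched as a typed record plus a per-source normalizer. Everything here (field names, the `mileage:` ID prefix, the raw-dict shape) is an assumed illustration of the shape described above, not the project's actual schema:

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Any

@dataclass
class UnifiedEvent:
    # Typed universal fields; source-specific data lives in the JSONB payload.
    event_id: str
    source: str
    event_type: str
    timestamp: datetime
    payload: dict[str, Any] = field(default_factory=dict)

def normalize_mileage(raw: dict[str, Any]) -> UnifiedEvent:
    """Illustrative adapter: turn a raw odometer reading into a UnifiedEvent."""
    ts = datetime.fromisoformat(raw["date"]).replace(tzinfo=timezone.utc)
    return UnifiedEvent(
        event_id=f"mileage:{raw['date']}",  # deterministic, so re-ingest dedupes
        source="mileage",
        event_type="odometer_reading",
        timestamp=ts,
        payload={"odometer": raw["odometer"]},
    )
```

Each adapter is a pure function from raw records to UnifiedEvents, which is what makes them easy to build in parallel.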
The hardest part was Data Jar. Apple’s Data Jar app stores everything in a recursive type/value wrapper structure. Every level of nesting has {"type": "...", "value": ...}. The unwrapper had to be recursive and handle every Data Jar type (string, number, date, array, dictionary, boolean).
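The unwrapper looks roughly like this. The exact type names Data Jar emits, and the date value being ISO-formatted, are assumptions based on the structure described above:

```python
from datetime import datetime
from typing import Any

def unwrap_data_jar(node: Any) -> Any:
    """Recursively strip Data Jar's {"type": ..., "value": ...} wrappers."""
    if not (isinstance(node, dict) and "type" in node and "value" in node):
        return node  # already a plain value
    kind, value = node["type"], node["value"]
    if kind == "dictionary":
        return {k: unwrap_data_jar(v) for k, v in value.items()}
    if kind == "array":
        return [unwrap_data_jar(item) for item in value]
    if kind == "date":
        return datetime.fromisoformat(value)  # ISO format is an assumption
    # string, number, boolean: the wrapped value is already usable
    return value
```

The recursion only descends through containers, so deeply nested backups unwrap in one pass.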
The 4,185 Broken Timestamps
Ingesting real data immediately surfaced a data quality problem. 4,185 timestamps contained Unicode narrow no-break space characters (U+202F) in AM/PM formatted time strings. This is an iOS locale artifact — some system locales insert U+202F between the time and AM/PM.
Python’s strptime mishandled these, silently producing incorrect timestamps. The fix was a whitespace normalization step before parsing. But finding 4,185 instances meant the daily summaries had been subtly wrong all along: the clean sample data used during development never triggered the bug, so it only surfaced with real data.
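The normalization step is small. This sketch maps U+202F (and its cousin U+00A0) to a plain ASCII space before parsing; the format string is an assumption, not the project's actual one:

```python
import re
from datetime import datetime

# U+202F (narrow no-break space) and U+00A0 (no-break space) both show up
# in locale-formatted time strings; map them to a plain ASCII space.
_UNICODE_SPACES = re.compile(r"[\u202f\u00a0]")

def parse_time(raw: str, fmt: str = "%m/%d/%Y %I:%M %p") -> datetime:
    """Normalize unicode whitespace, then strptime as usual."""
    return datetime.strptime(_UNICODE_SPACES.sub(" ", raw), fmt)
```

Running the normalization unconditionally costs one regex pass per timestamp and makes the parser immune to the locale artifact.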
Fibonacci Audit Sampling
After the timestamp fiasco, I needed a data quality check that would run automatically. The Fibonacci-batch random sampling system runs on every server boot.
It samples at Fibonacci-spaced offsets (1, 1, 2, 3, 5, 8, 13, totaling 33 samples) through the sorted event list. Each sample gets validated: schema conformance, timestamp bounds, source-specific invariants. The non-uniform spacing deliberately over-samples boundaries where edge cases cluster.
33 samples is cheap enough to run on every boot. If any fail, the dashboard logs warnings before serving data.
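One plausible reading of the scheme, sketched under stated assumptions: the Fibonacci numbers are batch sizes, and each batch draws from an equal-width window of the sorted list. How the real system maps batches onto the list is not specified above, so this placement is a guess:

```python
import random

FIB_BATCHES = [1, 1, 2, 3, 5, 8, 13]  # sums to 33 samples per boot

def fibonacci_sample(events: list) -> list:
    """Draw 33 samples in Fibonacci-sized batches across the sorted event list."""
    if len(events) < sum(FIB_BATCHES):
        return list(events)  # tiny dataset: just audit everything
    window = len(events) // len(FIB_BATCHES)
    samples = []
    for i, batch in enumerate(FIB_BATCHES):
        lo, hi = i * window, (i + 1) * window
        samples.extend(random.sample(events[lo:hi], batch))  # denser in later windows
    return samples
```

Whatever the exact placement, the key property holds: a fixed 33-sample budget, so the audit's cost is constant regardless of how large the event table grows.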
SQL-First Architecture
Every API endpoint maps to a named query in queries.py. Routes are thin wrappers: parse params, execute query, format response. No ORM, no query builder, no scattered SQL.
The SQL explorer with presets matching every API endpoint lets you run the exact same queries the API uses. Full transparency into what the data layer is doing.
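The pattern amounts to a named-query registry plus a thin executor. The query names and SQL below are illustrative stand-ins for what lives in queries.py, not the project's actual queries:

```python
# Named-query registry in the spirit of queries.py.
QUERIES: dict[str, str] = {
    "events_by_source": """
        SELECT source, COUNT(*) AS n
        FROM unified_events
        GROUP BY source
        ORDER BY n DESC
    """,
    "events_for_day": """
        SELECT * FROM unified_events
        WHERE timestamp::date = $1
        ORDER BY timestamp
    """,
}

async def run_named(pool, name: str, *args):
    """Thin route helper: look up the SQL by name and execute it (asyncpg)."""
    async with pool.acquire() as conn:
        return await conn.fetch(QUERIES[name], *args)
```

Because routes and the SQL explorer both go through the same registry, "what is the API actually running" is always answerable by reading one dict.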
Geocoding Pipeline
Location events arrive as coordinate pairs. A background task reverse-geocodes them via Nominatim at 1 req/sec (respecting the usage policy). 525+ addresses resolved. The pipeline is resumable — tracks which coordinates are done, picks up on restart.
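The resumable loop can be sketched like this, with the geocoder abstracted to any async callable (in the real pipeline, a Nominatim client) and the done-set standing in for whatever persistence the project uses:

```python
import asyncio

async def geocode_pending(coords, lookup, done: set, rate_s: float = 1.0):
    """Reverse-geocode coordinate pairs not yet in `done`, at most one per rate_s.

    `done` persists across restarts, which is what makes the pipeline resumable.
    """
    resolved = {}
    for pair in coords:
        if pair in done:
            continue  # already geocoded on a previous run
        resolved[pair] = await lookup(*pair)
        done.add(pair)
        await asyncio.sleep(rate_s)  # respect Nominatim's 1 req/sec policy
    return resolved
```

Marking a pair done immediately after resolution means a crash mid-run loses at most the in-flight request.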
Daily Summary Materialized View
A materialized view computes one row per day across all 11 sources. 1,618 days covered. Pre-aggregated counts per source, total events, date range. Refreshes on ingest. Makes “what happened today” queries instant instead of scanning 55K rows.
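In the SQL-first spirit, the view and its refresh hook might look like this. Table, view, and column names are assumptions, not the project's actual schema:

```python
# Illustrative DDL for the per-day rollup.
DAILY_SUMMARY_DDL = """
CREATE MATERIALIZED VIEW IF NOT EXISTS daily_summary AS
SELECT timestamp::date AS day,
       source,
       COUNT(*)        AS events
FROM unified_events
GROUP BY day, source
"""

REFRESH_SQL = "REFRESH MATERIALIZED VIEW daily_summary"

async def refresh_daily_summary(pool):
    """Called after each ingest so 'what happened today' stays instant."""
    async with pool.acquire() as conn:
        await conn.execute(REFRESH_SQL)
```

Refreshing on ingest rather than on read trades a little write-path latency for constant-time dashboard queries.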
Globe.gl and HTMX Frontend
Real location data wired to Globe.gl as points. Driving records rendered as arcs between start/end coordinates. HTMX fragments pull from real database queries. The globe is the hero visualization; the HTMX panels are the operational dashboard.
Upstream Service Proxying
Proxied 16 Threads Analysis endpoints through life-dashboard. This lets the dashboard aggregate data from threads-analysis (port 4323), cinder (4242), and other upstream services without the frontend needing to know about multiple backends.
iOS Shortcuts Ingest Endpoint
POST /api/ingest/lifetracker accepts new events from iOS Shortcuts. The deterministic event ID scheme means the Shortcut can fire multiple times on flaky connections without creating duplicates.
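A deterministic ID scheme of this kind hashes the event's content so that retries collide on purpose. Which fields feed the hash is an assumption here:

```python
import hashlib
import json

def deterministic_event_id(source: str, timestamp: str, payload: dict) -> str:
    """Derive a stable ID from event content so a Shortcut retry maps to
    the same row instead of a duplicate."""
    blob = json.dumps(
        {"source": source, "timestamp": timestamp, "payload": payload},
        sort_keys=True,  # canonical key order: same content, same hash
    )
    return hashlib.sha256(blob.encode()).hexdigest()
```

Paired with an insert that ignores conflicts on the ID column (e.g. `ON CONFLICT (event_id) DO NOTHING` in Postgres), the endpoint becomes idempotent for free.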
API Call Logging
Every API request logged to api_log: timestamp, path, method, response time, status code. The audit trail pattern from enterprise SAD methodology. When you proxy 16 upstream endpoints and accept POST data from iOS Shortcuts, you want to see who’s querying what.
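In the real app this is presumably middleware writing rows to Postgres; this sketch keeps the same fields but logs to an in-memory list via a decorator, just to show the shape of each audit record:

```python
import time

api_log: list[dict] = []  # stand-in for the api_log table

def logged(path: str, method: str):
    """Wrap a handler so every call appends an audit row: timestamp, path,
    method, response time, status code."""
    def wrap(handler):
        def inner(*args, **kwargs):
            start = time.perf_counter()
            status = 200
            try:
                return handler(*args, **kwargs)
            except Exception:
                status = 500
                raise
            finally:
                api_log.append({
                    "ts": time.time(),
                    "path": path,
                    "method": method,
                    "ms": (time.perf_counter() - start) * 1000,
                    "status": status,
                })
        return inner
    return wrap
```

The `finally` block guarantees a row is written whether the handler returns or raises, which is the property an audit trail needs.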
Enterprise SAD
Wrote a 500-line Software Architecture Document using financial compliance framework methodology. 15 sections covering system context, component decomposition, data flow, security boundaries, deployment topology, and operational procedures. Then started a PCI-grade version applying the published PCI DSS standard’s structure to personal data architecture.
Also created 7 project-specific compliance skills for automated scanning and auditing.
Data Archaeology
The old Data Jar backups contained driving records that weren’t in the primary export. Auditing these backups and running differential ingestion filled gaps in the timeline. The deterministic ID scheme made this safe — overlapping records were silently deduplicated.
The Numbers
| Metric | Value |
|---|---|
| Total events ingested | 55,354 |
| Data sources | 11 |
| Time span | Aug 2021 — Apr 2026 |
| Data adapters built | 9 |
| Broken timestamps fixed | 4,185 |
| Addresses geocoded | 525+ |
| Days in summary view | 1,618 |
| Agents spawned | 12 |
| Upstream endpoints proxied | 16 |
Agents
12 agents total, all completed. Used for parallel adapter development, geocoding pipeline, frontend wiring, SAD generation, compliance skill creation, and data archaeology across old backups. The parallel agent pattern works especially well for data adapters since each one is a self-contained transformation with no shared state.
What’s Next
The dashboard is operational but the visualization layer is minimal. Globe.gl shows where you’ve been; the HTMX panels show what happened today. The gap is temporal navigation — scrubbing through five years of daily summaries with the globe animating to match. That’s the next session.