Life Dashboard: 55K Events, 12 Agents, One Evening

Tags: claude-code, life-dashboard, data-engineering, postgres, fastapi

Started with a project skeleton and placeholder data. Ended with a fully operational personal telemetry platform ingesting 55,354 events across 11 sources spanning five years.


The Starting Point

Life-dashboard existed as a FastAPI project with SQLite, a few stub routes, and sample data. The goal was to make it real: ingest actual personal data, build proper infrastructure, wire up the frontend.

The scope expanded. Predictably.

Postgres Migration

SQLite was fine for prototyping. It stops being fine when you want JSONB columns, connection pooling, materialized views, and concurrent writes from background tasks. Migrated to Postgres with asyncpg. The UnifiedEvent schema uses typed columns for universal fields and JSONB for source-specific payloads.

Custom asyncpg codecs handle JSON serialization. Connection pooling via asyncpg’s built-in pool. The migration was clean because the original schema was already event-sourced — just needed to swap the storage layer.

Nine Adapters, Eleven Sources

Each data source gets its own adapter that normalizes raw data into UnifiedEvent records:

  • lifetracker — charger connects, driving sessions, bedtime triggers, Ioniq key events (8K+ events from Aug 2021)
  • clipboard — clipboard copies with embedded health data, two schema versions (v1 nested, v2 flat) (3.7K entries)
  • journal — daily freeform text and serialized bookmark records (729 entries)
  • bookmarks — book highlights with title/author/page (410 entries)
  • mileage — odometer readings and trip distances
  • locations — lat/lng coordinate pairs from various triggers
  • driving — structured driving records from old Data Jar backups that filled timeline gaps
  • sleep — 645 nights of sleep data from ByTheWeiCo
  • bytheweico — 36K Threads Analysis records and 1.7K Plex audiobook events
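The shared shape all nine adapters target might look like the sketch below. The field names on `UnifiedEvent` and the adapter's method name are my guesses, not the project's actual schema:

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Any, Iterable, Iterator

@dataclass(frozen=True)
class UnifiedEvent:
    """Typed universal fields plus a source-specific payload (-> JSONB)."""
    event_id: str
    source: str
    event_type: str
    timestamp: datetime
    payload: dict[str, Any] = field(default_factory=dict)

class MileageAdapter:
    """Example adapter: normalize raw odometer rows into UnifiedEvents."""
    source = "mileage"

    def parse(self, rows: Iterable[dict[str, Any]]) -> Iterator[UnifiedEvent]:
        for row in rows:
            ts = datetime.fromisoformat(row["date"]).replace(tzinfo=timezone.utc)
            yield UnifiedEvent(
                event_id=f"{self.source}:{row['date']}",  # deterministic per record
                source=self.source,
                event_type="odometer_reading",
                timestamp=ts,
                payload={"odometer": row["odometer"]},  # source-specific fields
            )
```

Because each adapter is a pure transformation from raw rows to `UnifiedEvent`s, adapters share no state and can be developed in parallel.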

The hardest part was Data Jar. The Data Jar app (an iOS key-value store that pairs with Shortcuts) wraps everything in a recursive type/value structure. Every level of nesting is {"type": "...", "value": ...}. The unwrapper had to be recursive and handle every Data Jar type (string, number, date, array, dictionary, boolean).
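A minimal recursive unwrapper for that wrapper structure could look like this (the type-tag strings match the ones listed above; the exact date encoding is an assumption):

```python
from datetime import datetime
from typing import Any

def unwrap(node: Any) -> Any:
    """Recursively strip Data Jar's {"type": ..., "value": ...} wrappers."""
    if not (isinstance(node, dict) and "type" in node and "value" in node):
        return node  # already a plain value
    kind, value = node["type"], node["value"]
    if kind == "dictionary":
        return {k: unwrap(v) for k, v in value.items()}
    if kind == "array":
        return [unwrap(v) for v in value]
    if kind == "date":
        return datetime.fromisoformat(value)  # assumes ISO-8601 date strings
    if kind == "boolean":
        return bool(value)
    if kind == "number":
        return float(value)
    return value  # strings (and anything unrecognized) pass through
```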

The 4,185 Broken Timestamps

Ingesting real data immediately surfaced a data quality problem. 4,185 timestamps contained Unicode narrow no-break space characters (U+202F) in AM/PM formatted time strings. This is an iOS locale artifact — some system locales insert U+202F between the time and AM/PM.

Python’s strptime choked on these, and the failure mode was silent: downstream code ended up with incorrect timestamps. The fix was a whitespace-normalization step before parsing. But finding 4,185 instances meant every daily summary built during development had been subtly wrong.
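The normalization step amounts to collapsing Unicode space variants to a plain ASCII space before handing the string to strptime. The format string below is illustrative, not the project's actual one:

```python
import re
from datetime import datetime

# iOS locales may insert U+202F (narrow no-break space) or U+00A0
# (no-break space) between the time and the AM/PM marker.
_ODD_SPACES = re.compile("[\u202f\u00a0]")

def parse_local_time(raw: str, fmt: str = "%b %d, %Y at %I:%M %p") -> datetime:
    """Normalize Unicode space variants to ASCII, then parse."""
    return datetime.strptime(_ODD_SPACES.sub(" ", raw), fmt)
```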

Fibonacci Audit Sampling

After the timestamp fiasco, I needed a data quality check that would run automatically. The Fibonacci-batch random sampling system runs on every server boot.

It samples in Fibonacci-sized batches (1 + 1 + 2 + 3 + 5 + 8 + 13 = 33 samples total) at spaced offsets through the sorted event list. Each sample gets validated: schema conformance, timestamp bounds, source-specific invariants. The non-uniform spacing deliberately over-samples boundaries, where edge cases cluster.

33 samples is cheap enough to run on every boot. If any fail, the dashboard logs warnings before serving data.
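One plausible reconstruction of the sampling pass, assuming each Fibonacci batch is drawn from its own region of the sorted list (the real system's batch-to-region mapping may differ):

```python
import random

FIB_BATCHES = (1, 1, 2, 3, 5, 8, 13)  # batch sizes; 33 samples total

def fib_batch_sample(n: int, seed: int = 0) -> list[int]:
    """Pick 33 sample indices from a sorted list of n events: one
    Fibonacci-sized random batch per equal-width region."""
    rng = random.Random(seed)
    k = len(FIB_BATCHES)
    picks: list[int] = []
    for i, size in enumerate(FIB_BATCHES):
        lo = n * i // k                       # region start
        hi = max(lo + 1, n * (i + 1) // k)    # region end (exclusive)
        pool = range(lo, hi)
        picks.extend(rng.sample(pool, min(size, len(pool))))
    return sorted(picks)
```

Each sampled index is then pulled from the event list and run through the validators; any failure logs a warning before the dashboard serves data.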

SQL-First Architecture

Every API endpoint maps to a named query in queries.py. Routes are thin wrappers: parse params, execute query, format response. No ORM, no query builder, no scattered SQL.

An SQL explorer ships with presets matching every API endpoint, so you can run the exact queries the API runs. Full transparency into what the data layer is doing.
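The pattern reduces to a dict of named SQL strings plus one thin executor. The query text and names here are illustrative, not the project's actual queries.py:

```python
# queries.py pattern: every endpoint maps to exactly one named query.
QUERIES = {
    "events_by_day": """
        SELECT source, count(*) AS n
        FROM unified_events
        WHERE timestamp::date = $1
        GROUP BY source
        ORDER BY n DESC
    """,
}

async def run_named(pool, name: str, *args):
    """Thin executor used by every route: look up SQL by name, fetch,
    return plain dicts. Routes stay parse-params / execute / format."""
    rows = await pool.fetch(QUERIES[name], *args)
    return [dict(r) for r in rows]
```

Because the SQL lives in one flat namespace, the explorer's presets can simply enumerate `QUERIES` to mirror the API.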

Geocoding Pipeline

Location events arrive as coordinate pairs. A background task reverse-geocodes them via Nominatim at 1 req/sec (respecting the usage policy). 525+ addresses resolved. The pipeline is resumable — tracks which coordinates are done, picks up on restart.
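The resumable loop can be sketched as follows, with the actual Nominatim HTTP call injected so the bookkeeping stays testable (function and parameter names are mine):

```python
import asyncio
from typing import Awaitable, Callable

async def geocode_pending(
    coords: list[tuple[float, float]],
    done: set[tuple[float, float]],
    reverse: Callable[[float, float], Awaitable[str]],
    delay: float = 1.0,  # Nominatim usage policy: at most 1 request/second
) -> dict[tuple[float, float], str]:
    """Resumable reverse-geocoding pass: skip coordinates already in
    `done`, throttle lookups, and record results as we go."""
    resolved: dict[tuple[float, float], str] = {}
    for pair in coords:
        if pair in done:
            continue  # already geocoded on a previous run
        resolved[pair] = await reverse(*pair)
        done.add(pair)  # persisting `done` is what makes restarts cheap
        await asyncio.sleep(delay)
    return resolved
```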

Daily Summary Materialized View

A materialized view computes one row per day across all 11 sources. 1,618 days covered. Pre-aggregated counts per source, total events, date range. Refreshes on ingest. Makes “what happened today” queries instant instead of scanning 55K rows.
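A simplified version of such a view, with hypothetical table and column names (the real schema likely aggregates more per-source detail):

```python
# Sketch of the DDL the dashboard might run at setup time.
DAILY_SUMMARY_VIEW = """
CREATE MATERIALIZED VIEW IF NOT EXISTS daily_summary AS
SELECT
    timestamp::date        AS day,
    count(*)               AS total_events,
    count(DISTINCT source) AS active_sources,
    min(timestamp)         AS first_event,
    max(timestamp)         AS last_event
FROM unified_events
GROUP BY 1;
"""

# Executed after each ingest so "what happened today" stays current.
REFRESH_SUMMARY = "REFRESH MATERIALIZED VIEW daily_summary;"
```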

Globe.gl and HTMX Frontend

Real location data wired to Globe.gl as points. Driving records rendered as arcs between start/end coordinates. HTMX fragments pull from real database queries. The globe is the hero visualization; the HTMX panels are the operational dashboard.

Upstream Service Proxying

Proxied 16 Threads Analysis endpoints through life-dashboard. This lets the dashboard aggregate data from threads-analysis (port 4323), cinder (4242), and other upstream services without the frontend needing to know about multiple backends.
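At its core this is a registry mapping service names to local ports (the ports below come from the post; the helper is a hypothetical sketch, with the actual forwarding done by an HTTP client inside the route):

```python
# Upstream services the dashboard fronts for the frontend.
UPSTREAMS = {"threads-analysis": 4323, "cinder": 4242}

def upstream_url(service: str, path: str) -> str:
    """Build the local URL a proxy route forwards a request to."""
    port = UPSTREAMS[service]
    return f"http://127.0.0.1:{port}/{path.lstrip('/')}"
```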

iOS Shortcuts Ingest Endpoint

POST /api/ingest/lifetracker accepts new events from iOS Shortcuts. The deterministic event ID scheme means the Shortcut can fire multiple times on flaky connections without creating duplicates.
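One way to get that idempotency, assuming the ID is derived by hashing the event's identifying fields (the field choice here is mine, not the project's documented scheme):

```python
import hashlib

def event_id(source: str, event_type: str, timestamp_iso: str) -> str:
    """Deterministic ID: the same event always hashes to the same ID,
    so a re-fired Shortcut can be absorbed with
    INSERT ... ON CONFLICT (event_id) DO NOTHING."""
    key = f"{source}|{event_type}|{timestamp_iso}"
    return hashlib.sha256(key.encode()).hexdigest()[:16]
```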

API Call Logging

Every API request is logged to api_log: timestamp, path, method, response time, status code. It’s the audit-trail pattern from the enterprise SAD methodology. When you proxy 16 upstream endpoints and accept POST data from iOS Shortcuts, you want to see who’s querying what.
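In a FastAPI app this is naturally middleware. A framework-agnostic ASGI sketch (the sink would be an async INSERT into api_log; names are mine):

```python
import time

class ApiLogMiddleware:
    """Minimal ASGI middleware: time each HTTP request and hand a record
    (path, method, status, duration) to a sink that writes api_log."""

    def __init__(self, app, sink):
        self.app = app
        self.sink = sink  # e.g. a callable that queues an INSERT

    async def __call__(self, scope, receive, send):
        if scope["type"] != "http":
            return await self.app(scope, receive, send)
        start = time.perf_counter()
        seen = {}

        async def send_wrapper(message):
            if message["type"] == "http.response.start":
                seen["status"] = message["status"]  # capture the status code
            await send(message)

        await self.app(scope, receive, send_wrapper)
        self.sink({
            "path": scope["path"],
            "method": scope["method"],
            "status": seen.get("status"),
            "ms": round((time.perf_counter() - start) * 1000, 2),
        })
```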

Enterprise SAD

Wrote a 500-line Software Architecture Document using financial compliance framework methodology. 15 sections covering system context, component decomposition, data flow, security boundaries, deployment topology, and operational procedures. Then started a PCI-grade version applying the published PCI DSS standard’s structure to personal data architecture.

Also created 7 project-specific compliance skills for automated scanning and auditing.

Data Archaeology

The old Data Jar backups contained driving records that weren’t in the primary export. Auditing these backups and running differential ingestion filled gaps in the timeline. The deterministic ID scheme made this safe — overlapping records were silently deduplicated.

The Numbers

Metric                       Value
---------------------------  -------------------
Total events ingested        55,354
Data sources                 11
Time span                    Aug 2021 – Apr 2026
Data adapters built          9
Broken timestamps fixed      4,185
Addresses geocoded           525+
Days in summary view         1,618
Agents spawned               12
Upstream endpoints proxied   16

Agents

12 agents total, all completed. Used for parallel adapter development, geocoding pipeline, frontend wiring, SAD generation, compliance skill creation, and data archaeology across old backups. The parallel agent pattern works especially well for data adapters since each one is a self-contained transformation with no shared state.

What’s Next

The dashboard is operational but the visualization layer is minimal. Globe.gl shows where you’ve been; the HTMX panels show what happened today. The gap is temporal navigation — scrubbing through five years of daily summaries with the globe animating to match. That’s the next session.