Threads Analysis: 50K Posts, 69 Endpoints, Zero Cloud
50,704 Posts. Two API Servers. One Mac Mini. Zero Cloud Costs.
It’s 2am and I’m staring at a Grafana dashboard with 14 panels all showing green. Six hours ago this repo was a GitHub URL with some WebXR galaxy code and no local clone. Now there are two API servers running, a Postgres database with vector embeddings for 41,943 posts, enrichment pipelines that have classified every single post by sentiment and energy and intent, a knowledge graph with 12,039 edges, and a haiku oracle that texts me poetry generated from my own Threads posts via iMessage.
I need to write this down before I forget how we got here.
The Starting Point
The threads-analysis repo existed on GitHub — private, holding the WebXR galaxy visualizations from the Fireflies session and the information-theory pipeline from the original platform build. But no local clone on this machine. No .env. No data files. Just code.
So step one was git clone. Step two was “let’s build the API layer I’ve been meaning to build.” Step nineteen was a haiku oracle sending poetry to my phone.
Scope creep is a feature, not a bug.
The Architecture
Two servers. One database. All local.
┌─────────────────────────────────────────────┐
│ Mac Mini (M2 Pro) │
│ │
│ ┌──────────────┐ ┌──────────────┐ │
│ │ Flask API │ │ Node API │ │
│ │ :4323 │ │ :4322 │ │
│ │ 69 endpoints │ │ 26 endpoints│ │
│ │ Swagger/ │ │ Gemma 4 RAG │ │
│ │ ReDoc/Scalar │ │ │ │
│ └──────┬───────┘ └──────┬───────┘ │
│ │ │ │
│ └────────┬───────────┘ │
│ │ │
│ ┌────────▼────────┐ │
│ │ PostgreSQL │ │
│ │ pgvector:pg17 │ │
│ │ 50,706 posts │ │
│ │ 41,943 vectors │ │
│ └─────────────────┘ │
│ │
│ ┌──────────────┐ ┌──────────────┐ │
│ │ Ollama │ │ Grafana │ │
│ │ Gemma 4 │ │ 14 panels │ │
│ │ nomic-embed │ │ │ │
│ └──────────────┘ └──────────────┘ │
│ │
│ All accessible over Tailscale │
└─────────────────────────────────────────────┘
Flask API (:4323) — 69 Endpoints
Started at 58, ended at 69. The Flask server is the primary API — handles all the CRUD, the analysis endpoints, the enrichment pipeline triggers, the genealogy museums, the knowledge graph queries. Six different API doc UIs because I couldn’t pick one:
- Swagger UI — the classic
- ReDoc — clean, readable
- RapiDoc — interactive
- Scalar — modern, beautiful
- Elements — Stoplight’s thing
- RapiPDF — generates PDF docs from OpenAPI spec
Yeah, six doc renderers for one API is overkill. But now I can compare them side by side and I have opinions about all of them. Scalar wins.
Node API (:4322) — 26 Endpoints + Gemma 4 RAG
The Node server handles the AI-powered stuff. Gemma 4 running through Ollama for retrieval-augmented generation. You ask it a question about my Threads posts and it pulls relevant vectors from pgvector, feeds them as context, and generates an answer grounded in actual data.
26 endpoints covering search, RAG queries, the haiku oracle, and vector similarity lookups.
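The retrieval-then-generate step is easy to sketch. This is a hypothetical prompt-assembly helper, not the Node server's actual code — the real retrieval query and Ollama call are elided:

```python
def build_rag_prompt(question, posts):
    """Assemble a grounded prompt from retrieved posts (illustrative helper).

    `posts` is a list of (timestamp, text) tuples as a vector-similarity
    lookup might return them; the retrieved text becomes the only context
    the model is allowed to answer from.
    """
    context = "\n".join(f"- [{ts}] {text}" for ts, text in posts)
    return (
        "Answer the question using only the Threads posts below.\n\n"
        f"Posts:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )
```

The point of the pattern: the model never answers from its weights alone, so every claim is traceable to a post that was actually in the context window.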
The Data Pipeline
Sync: 50,704 Posts
Pulled everything from the Threads API. 19,997 original posts and 30,707 replies. The sync handles pagination, rate limiting, incremental updates. Every post goes into Postgres with full metadata.
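The shape of the sync loop is the standard cursor-pagination drain. A minimal sketch, with the fetcher and writer injected so it says nothing about the actual Threads API field names:

```python
def sync_all(fetch_page, save_batch):
    """Drain a cursor-paginated feed into the database (sketch).

    `fetch_page(cursor)` returns (posts, next_cursor), with next_cursor
    None on the last page; `save_batch(posts)` upserts a batch into
    Postgres. Rate limiting would wrap fetch_page in the real pipeline.
    """
    cursor, total = None, 0
    while True:
        posts, cursor = fetch_page(cursor)
        save_batch(posts)
        total += len(posts)
        if cursor is None:
            return total
```

Incremental updates fall out of the same loop: start from the last-seen cursor instead of `None`.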
Vector Embeddings: 41,943 Text Posts
Not every post has text (some are pure images/video). For the 41,943 that do, I generated 768-dimensional embeddings using nomic-embed-text running locally through Ollama. No OpenAI API calls. No embedding costs. Just a local model grinding through 42K posts on the M2 Pro.
The embeddings live in pgvector — proper HNSW indexing for fast cosine similarity search. Ask “find posts similar to this one” and it returns results in milliseconds.
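Under the hood, "similar" means cosine similarity over those 768-dimensional vectors. pgvector's `<=>` operator returns cosine *distance* (1 − similarity); the metric itself is just this:

```python
import math

def cosine_similarity(a, b):
    """The metric behind pgvector's `<=>` operator, which returns
    1 - cosine_similarity as a distance."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)
```

The HNSW index exists so Postgres can approximate the nearest neighbors under this metric without scanning all 41,943 vectors per query.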
Enrichment Pipeline
Every text post got classified across four dimensions:
| Dimension | What It Captures |
|---|---|
| Sentiment | Positive / negative / neutral + confidence score |
| Energy | High / low / calm / chaotic |
| Intent | Inform / persuade / vent / joke / question / reflect |
| Language | Primary language detection |
All running through local models. The whole enrichment pass across 41,943 posts took a while but cost exactly zero dollars.
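With a local model, classification is mostly prompt discipline: constrain the output to a fixed label set per dimension and parse the single word back. A sketch of that shape — the label lists mirror the table above, but the prompt wording is mine, not the pipeline's:

```python
# Label sets per enrichment dimension (from the table above).
DIMENSIONS = {
    "sentiment": ["positive", "negative", "neutral"],
    "energy": ["high", "low", "calm", "chaotic"],
    "intent": ["inform", "persuade", "vent", "joke", "question", "reflect"],
}

def classification_prompt(dimension, text):
    """Build a constrained-choice prompt for one dimension of one post."""
    labels = ", ".join(DIMENSIONS[dimension])
    return (
        f"Classify the {dimension} of this post. "
        f"Reply with exactly one of: {labels}.\n\nPost: {text}"
    )
```

Forcing a closed label set is what makes 41,943 local-model calls aggregatable: every response lands in a known enum, ready for a Grafana panel.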
The Genealogy Museums
This is where it got weird and I loved it.
Tech Genealogy Museum
49 topics extracted from the corpus. Not just tags — actual technology topics with post counts and temporal distribution. Claude shows up 452 times, ChatGPT 248. The museum tracks how tech discourse evolves over time across my posting history.
792 co-occurrence edges map which technologies get discussed together. Claude and API show up together a lot. Python and data. visionOS and spatial computing. It’s a map of tech adjacency in my own thinking.
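A co-occurrence edge is just "these two topics appeared in the same post," counted over the corpus. A minimal sketch of how the 792 edges could be derived from per-post topic sets:

```python
from collections import Counter
from itertools import combinations

def cooccurrence_edges(posts_topics):
    """Count undirected topic co-occurrence across posts.

    `posts_topics` is a list of topic sets, one per post; every unordered
    pair that shares a post adds 1 to that pair's edge weight.
    """
    edges = Counter()
    for topics in posts_topics:
        for a, b in combinations(sorted(topics), 2):
            edges[(a, b)] += 1
    return edges
```

Sorting inside each post keeps (Claude, API) and (API, Claude) as one edge, so the result is a weighted undirected adjacency map — the raw material for the museum's graph.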
Pedagogy Genealogy Museum
30 topics focused on learning, teaching, knowledge transfer. Then 150 additional posts discovered purely through vector similarity — posts that weren’t explicitly tagged as pedagogical but whose embeddings clustered near known pedagogy content. The vectors found teaching moments I didn’t know I was having.
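One way those 150 posts could have surfaced is centroid proximity: average the embeddings of the known pedagogy posts, then flag untagged posts whose vectors sit near that centroid. A sketch under that assumption — the method and the 0.6 threshold are illustrative, not the pipeline's actual parameters:

```python
import math

def centroid(vectors):
    """Component-wise mean of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(v[i] for v in vectors) / n for i in range(len(vectors[0]))]

def discover_similar(known, candidates, threshold=0.6):
    """Return candidate post ids whose embedding is cosine-close to the
    centroid of known-pedagogy embeddings.

    `candidates` is a list of (post_id, vector) pairs; threshold is a
    made-up illustrative cutoff.
    """
    c = centroid(known)

    def cos(a, b):
        dot = sum(x * y for x, y in zip(a, b))
        na = math.sqrt(sum(x * x for x in a))
        nb = math.sqrt(sum(y * y for y in b))
        return dot / (na * nb)

    return [pid for pid, v in candidates if cos(v, c) >= threshold]
```

In practice this would run as a single pgvector query against the centroid rather than a Python scan, but the geometry is the same.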
The Haiku Oracle
This is the one that made me laugh out loud at 1am.
The haiku oracle picks random posts from across different time periods, feeds them to Gemma 4, and asks it to distill the essence into a haiku. Then it sends the haiku to me via iMessage using the macOS osascript bridge. Every haiku gets logged with a UUID and the source posts are recorded in a graph — so you can trace any haiku back to the original thoughts that inspired it.
I got a haiku about something I posted in 2023 and it was genuinely insightful about a pattern I didn’t notice at the time.
An AI reading my old posts and texting me poetry about them. On a Mac mini. Over Tailscale. At 1am. The future is so specific.
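The osascript bridge is small enough to sketch in full. This uses one common AppleScript pattern for Messages (only works on macOS with Messages signed in); the function names and escaping are mine, not the oracle's actual code:

```python
import subprocess

def imessage_script(recipient, body):
    """Build the AppleScript that sends `body` to `recipient` via iMessage.

    Quotes and backslashes are escaped so a haiku containing them can't
    break out of the AppleScript string literal.
    """
    safe = body.replace("\\", "\\\\").replace('"', '\\"')
    return (
        'tell application "Messages" to send '
        f'"{safe}" to buddy "{recipient}" of '
        '(service 1 whose service type is iMessage)'
    )

def send_haiku(recipient, haiku):
    """Fire the script through the osascript CLI (macOS only)."""
    subprocess.run(["osascript", "-e", imessage_script(recipient, haiku)],
                   check=True)
```

The UUID logging and source-post graph live on the database side; this is just the last hop from Postgres to my phone.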
Knowledge Graph & Interaction Network
The knowledge graph grew from the original session’s 1,638 nodes / 11,155 edges to 1,896 nodes / 12,039 edges. The new nodes come from the tech and pedagogy genealogy analysis — more entities extracted, more relationships mapped.
The interaction graph captures 1,746 interactions across 422 users. Who I reply to, who replies to me, the shape of my Threads social network as a directed graph. Up from the original 409 users / 1,668 interactions.
Grafana Dashboard
14 panels covering:
- Post volume over time (with reply breakdown)
- Sentiment distribution
- Energy classification
- Top topics by post count
- Embedding cluster visualization
- Enrichment pipeline progress
- API health checks
- Knowledge graph growth metrics
- Sync status and freshness
All pointing at the local Postgres instance. Accessible over Tailscale from my MacBook or phone.
The Extras
iOS Shortcuts guide — 7 recipes for querying the API from your phone. “What was I posting about this time last year?” as a home screen widget.
visionOS native app design doc — 7 visualization concepts for taking the Fireflies WebXR experience to native SwiftUI + RealityKit. The WebXR version is good. A native version with proper spatial audio and shared spaces would be something else entirely.
Auto-generating llms.txt — the API generates its own llms.txt file from live routes. If an LLM hits the server, it can read the machine-readable endpoint catalog and figure out how to use the API without human documentation.
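Generating that file reduces to walking the live route table and rendering one line per endpoint. A sketch of the rendering half, with the route catalog passed in as plain tuples (in Flask you'd pull these from `app.url_map`); the actual file's layout and descriptions are the server's own:

```python
def render_llms_txt(title, routes):
    """Render a minimal llms.txt from a route catalog.

    `routes` is a list of (method, path, summary) tuples; output is a
    markdown-ish catalog an LLM can read to discover the API, sorted by
    path so the file is stable across restarts.
    """
    lines = [f"# {title}", ""]
    for method, path, summary in sorted(routes, key=lambda r: r[1]):
        lines.append(f"- {method} {path}: {summary}")
    return "\n".join(lines) + "\n"
```

Because it renders from live routes rather than a hand-written doc, the catalog can't drift out of date when endpoint number 70 lands.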
Monitor script — checks every 15 minutes that both servers are up, Postgres is responding, Ollama is loaded. If anything’s down, sends an iMessage alert.
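The interesting part of a monitor like that is the decision step, which is a pure function and easy to show. The probing (HTTP health checks, a Postgres ping, an Ollama model check) is elided; the alert wording here is illustrative:

```python
def downed_services(status):
    """Given {service_name: is_up}, return an alert body, or None if all green.

    Sorting the names keeps repeated alerts byte-identical, which makes
    de-duplicating them trivial.
    """
    down = [name for name, up in status.items() if not up]
    if not down:
        return None
    return "DOWN: " + ", ".join(sorted(down))
```

Returning `None` on all-green means the iMessage send is simply skipped on a quiet cycle — no "everything is fine" spam every 15 minutes.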
The Code Review Gauntlet
Ran three parallel code reviews with different lenses:
- Reuse review — found duplicated logic across Flask and Node servers
- Quality review — error handling gaps, missing type hints, inconsistent response formats
- Efficiency review — N+1 queries, unnecessary round-trips, unindexed columns
11 fixes applied from the combined findings. Then a security audit scrubbed any leaked secrets before the push to GitHub.
The Numbers
| Metric | Count |
|---|---|
| Posts synced | 50,704 |
| Original posts | 19,997 |
| Replies | 30,707 |
| Posts in Postgres | 50,706 |
| Text posts with vectors | 41,943 |
| Vector dimensions | 768 |
| Flask endpoints | 69 |
| Node endpoints | 26 |
| Knowledge graph nodes | 1,896 |
| Knowledge graph edges | 12,039 |
| Interaction graph users | 422 |
| Interaction graph edges | 1,746 |
| Tech genealogy topics | 49 |
| Pedagogy topics | 30 |
| Co-occurrence edges | 792 |
| Grafana panels | 14 |
| iOS Shortcut recipes | 7 |
| visionOS concepts | 7 |
| Code review fixes | 11 |
| Cloud cost | $0 |
What I Learned This Session
pgvector on local Postgres is seriously underrated. Everyone reaches for Pinecone or Weaviate. But if your data fits on one machine — and 50K posts absolutely fits on one machine — pgvector with HNSW indexing is fast, free, and you already know SQL. No new query language. No managed service. No API rate limits.
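The entire "vector database" surface, in pgvector terms, is a couple of statements. Table and column names here are illustrative, but the `hnsw` index method and the `<=>` cosine-distance operator are pgvector's actual syntax:

```sql
-- Approximate nearest-neighbor index over the embedding column.
CREATE INDEX ON posts USING hnsw (embedding vector_cosine_ops);

-- Top-5 posts most similar to post 42; <=> is cosine distance.
SELECT id, text
FROM posts
ORDER BY embedding <=> (SELECT embedding FROM posts WHERE id = 42)
LIMIT 5;
```

That's the whole pitch: similarity search as an ORDER BY clause, joinable against every other column you already have.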
Two API servers is fine. Flask for the structured data layer, Node for the AI/RAG layer. They share a database. They have different strengths. Trying to force everything into one framework would have been slower to build and harder to maintain.
Vector similarity finds things taxonomy can’t. The pedagogy museum’s 150 vector-discovered posts are the proof. My tagging system would never have caught those posts as pedagogical. But the embeddings knew. The geometry of meaning is more flexible than any classification system I could design by hand.
Ollama + local models changes the economics completely. Embedding 42K posts through OpenAI’s API would have cost real money. Running nomic-embed-text locally cost electricity and time. For batch processing where latency doesn’t matter, local inference is the obvious choice.
Stack
| Layer | Tech |
|---|---|
| Primary API | Flask (Python), 69 endpoints |
| AI/RAG API | Node.js, 26 endpoints |
| Database | PostgreSQL 17 + pgvector |
| Embeddings | nomic-embed-text (768d) via Ollama |
| LLM | Gemma 4 via Ollama |
| Monitoring | Grafana (14 panels) |
| Alerts | iMessage via osascript |
| Hardware | Mac mini M2 Pro |
| Network | Tailscale |
| Docs | Swagger, ReDoc, RapiDoc, Scalar, Elements, RapiPDF |
All local. All private. All mine.