Threads Analysis API — Technical Reference

Tags: API, threads, data, pipeline, documentation

What This Is

The Threads Analysis API is a Flask REST API serving ~50K posts from @maybe_foucault on Threads. It runs on port 4323, backed by Postgres 17 with pgvector, and is designed to be called from iOS Shortcuts, LLM agents, and the Fireflies spatial computing app.

69 endpoints. No auth. JSON everything. Auto-generated OpenAPI spec.


Base URL & Access

Tailscale:  http://100.71.141.45:4323

| UI | Path |
|----|------|
| Swagger UI | /docs/swagger |
| ReDoc | /docs/redoc |
| RapiDoc | /docs/rapidoc |
| Scalar | /docs/scalar |
| Stoplight Elements | /docs/elements |
| PDF Export | /docs/rapipdf |
| OpenAPI spec | /docs/openapi.json |
| LLM instructions (full) | /llms.txt |
| LLM instructions (compact) | /llms-mini.txt |

Both llms.txt files are auto-generated from live Flask routes — they always reflect what’s actually deployed.


Authentication

None. The API is accessible only over Tailscale. If you’re on the tailnet, you’re in.


Response Format

Every endpoint returns the same envelope:

```json
{
  "data": "...(object or array)",
  "count": 42,
  "query": { "type": "search", "q": "foucault" },
  "generated_at": "2026-04-06T12:00:00+00:00"
}
```

Post-returning endpoints use "posts" instead of "data" as the top-level key.
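A client can unwrap both envelope shapes with one small helper. A minimal sketch, assuming only the envelope fields documented above:

```python
def unwrap(envelope: dict):
    """Extract the payload from the standard response envelope.

    Post-returning endpoints use "posts" as the top-level key;
    everything else uses "data".
    """
    return envelope["posts"] if "posts" in envelope else envelope["data"]

resp = {
    "posts": [{"id": "1", "text": "hello"}],
    "count": 1,
    "query": {"type": "latest"},
    "generated_at": "2026-04-06T12:00:00+00:00",
}
assert unwrap(resp) == [{"id": "1", "text": "hello"}]
```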


Post Object

| Field | Type | Description |
|-------|------|-------------|
| id | string | Threads post ID |
| text | string | Post body text |
| timestamp | ISO string | When it was posted |
| ago | string | Human-readable relative time (“3 hours ago”) |
| variety | string | original, reply, quote, or repost |
| tags | string[] | Multi-label taxonomy tags (20 categories) |
| primary_tag | string | Single dominant tag |
| surprise | float | Self-information per word in bits (higher = more surprising) |
| word_count | int | Word count |
| permalink | string | Threads permalink URL |
| metrics.views | int | View count |
| metrics.likes | int | Like count |
| metrics.replies | int | Reply count |
| sentiment | float | -1 to 1 (negative to positive) |
| energy | string | low, mid, or high |
| intent | string | statement, reaction, question, share, social, shitpost |
| language | string | en, de, vi, es, etc. |
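The surprise field is self-information per word in bits. The exact estimator isn't documented here; a plausible unigram-model sketch (an assumption, not the deployed code):

```python
import math
from collections import Counter

def surprise_per_word(text: str, corpus_counts: Counter, total: int) -> float:
    """Average self-information -log2 p(w) per word, with add-one smoothing."""
    words = text.lower().split()
    if not words:
        return 0.0
    vocab = len(corpus_counts)
    bits = 0.0
    for w in words:
        p = (corpus_counts[w] + 1) / (total + vocab)  # Laplace smoothing
        bits += -math.log2(p)
    return bits / len(words)

corpus = Counter("the cat sat on the mat the cat".split())
total = sum(corpus.values())
# Rare words score higher (more surprising) than common ones
assert surprise_per_word("zebra", corpus, total) > surprise_per_word("the", corpus, total)
```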

Time-Based (8 endpoints)

| Method | Path | Description | Key Params |
|--------|------|-------------|------------|
| GET | /posts/now | Posts in last 30 minutes | limit (default 500) |
| GET | /posts/hour | Posts in last hour | limit |
| GET | /posts/today | Posts since midnight UTC | limit |
| GET | /posts/week | Posts in last 7 days | limit |
| GET | /posts/month | Posts in last 30 days | limit |
| GET | /posts/since | Posts since N minutes ago | minutes (default 60), limit |
| GET | /posts/between | Posts between two ISO dates | from, to (required), limit |
| GET | /posts/latest | Last N posts | n (default 10) |
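All eight routes reduce to one parameterized interval query. A sketch of the /posts/since logic against an assumed posts table (not the deployed code):

```python
def since_query(minutes: int = 60, limit: int = 500):
    """Build a parameterized SQL query for posts newer than N minutes.

    Parameterized placeholders only -- never interpolate user input into SQL.
    """
    sql = (
        "SELECT id, text, timestamp FROM posts "
        "WHERE timestamp >= now() - make_interval(mins => %s) "
        "ORDER BY timestamp DESC LIMIT %s"
    )
    return sql, (minutes, limit)

sql, params = since_query(minutes=30, limit=10)
assert params == (30, 10)
assert "%s" in sql  # placeholders, not f-strings
```

The same shape covers /posts/now (minutes=30), /posts/hour (minutes=60), and so on; only the interval changes.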

Search & Discovery (8 endpoints)

| Method | Path | Description | Key Params |
|--------|------|-------------|------------|
| GET | /posts/search | Full-text search (Postgres GIN) | q (required) |
| GET | /posts/tag/{tag} | Posts with a given tag | |
| GET | /posts/tag/{tag}/latest | Latest N posts in a tag | n (default 5) |
| GET | /posts/random | One random post (excludes reposts) | |
| GET | /posts/random/{tag} | Random post from a tag | |
| GET | /posts/{post_id} | Single post by ID | |
| GET | /posts/similar/{post_id} | Vector similarity search (pgvector cosine) | |
| GET | /posts/semantic-search | Search by meaning via embeddings | q (required) |

The two vector endpoints (similar and semantic-search) use nomic-embed-text 768d embeddings stored in pgvector. similar finds posts near a given post. semantic-search embeds your query string on the fly via Ollama, then finds the 20 nearest posts.
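For intuition, pgvector's cosine distance operator `<=>` computes 1 minus cosine similarity; a pure-Python equivalent:

```python
import math

def cosine_distance(a, b):
    """1 - cosine similarity, matching pgvector's <=> operator."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (na * nb)

# Identical vectors have distance 0; orthogonal vectors have distance 1
assert abs(cosine_distance([1.0, 0.0], [1.0, 0.0])) < 1e-9
assert abs(cosine_distance([1.0, 0.0], [0.0, 1.0]) - 1.0) < 1e-9
```

In SQL the same ranking is expressed as `ORDER BY embedding <=> $1 LIMIT 20`, computed inside Postgres rather than in application code.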


Analytics (8 endpoints)

| Method | Path | Description | Key Params |
|--------|------|-------------|------------|
| GET | /stats/overview | Total posts, date range, posts today/week/month | |
| GET | /stats/streak | Current consecutive posting days | |
| GET | /stats/top | Top N posts by metric | by (views/likes/replies), n (default 10) |
| GET | /stats/top/today | Most engaged posts from today | |
| GET | /stats/hourly | Posts per hour today | |
| GET | /stats/daily | Posts per day this week | |
| GET | /stats/tags | Tag distribution with counts and percentages | |
| GET | /stats/velocity | Posting rate (posts/day) over 7, 30, 90 days | |

Social (3 endpoints)

| Method | Path | Description | Key Params |
|--------|------|-------------|------------|
| GET | /social/mentions | Who I mention most (top 20) | |
| GET | /social/interactions | Reply interactions from conversations table (top 30) | |
| GET | /social/conversations/{post_id} | Full reply thread for a post | |

Knowledge Graph (2 endpoints)

| Method | Path | Description | Key Params |
|--------|------|-------------|------------|
| GET | /graph/topics | Tag clusters with connection strengths (from kg_nodes/kg_edges) | |
| GET | /graph/related/{tag} | Tags related to a given tag via knowledge graph edges | |

Built from PMI-weighted co-occurrence, temporal proximity, and TF-IDF concept nodes.
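Pointwise mutual information scores a tag pair by how much more often it co-occurs than chance would predict. A sketch of the edge weighting, consistent in spirit with the pipeline but not a copy of it:

```python
import math
from collections import Counter
from itertools import combinations

def pmi_edges(posts_tags):
    """Compute PMI for each tag pair across posts.

    posts_tags: list of tag sets, one per post.
    PMI(a, b) = log2( p(a, b) / (p(a) * p(b)) )
    """
    n = len(posts_tags)
    tag_counts = Counter(t for tags in posts_tags for t in set(tags))
    pair_counts = Counter(
        pair for tags in posts_tags for pair in combinations(sorted(set(tags)), 2)
    )
    edges = {}
    for (a, b), c in pair_counts.items():
        p_ab = c / n
        p_a, p_b = tag_counts[a] / n, tag_counts[b] / n
        edges[(a, b)] = math.log2(p_ab / (p_a * p_b))
    return edges

edges = pmi_edges([{"tech", "ai"}, {"tech", "ai"}, {"food"}, {"tech"}])
# "tech" and "ai" co-occur more than chance predicts -> positive PMI
assert edges[("ai", "tech")] > 0
```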


Digests (3 endpoints)

| Method | Path | Description | Key Params |
|--------|------|-------------|------------|
| GET | /digest/today | Structured summary: counts, top tags, top post | |
| GET | /digest/week | Weekly summary with daily breakdown | |
| GET | /digest/brief | One-paragraph natural language summary (designed for Apple Intelligence) | |

Enrichment & Insights (5 endpoints)

| Method | Path | Description | Key Params |
|--------|------|-------------|------------|
| GET | /vibe/now | Today’s vibe breakdown (falls back to intent if no vibe tags) | |
| GET | /mood | Current mood: sentiment + energy from last 24h | |
| GET | /drift | Topic drift: what you’re posting more/less about vs last month | |
| GET | /bridges | Posts bridging 3+ topic clusters | |
| GET | /who-am-i | Full personality snapshot: tags, sentiment, vibes, energy, peak hours, top people | |

/who-am-i is the “mirror” endpoint — it pulls together every enrichment dimension into one response. Good for LLM context injection.
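One way to use it for context injection: flatten the snapshot into a system-prompt block before handing it to a model. A hypothetical formatter; the field names below (peak_hours, top_people) are assumptions based on the endpoint description, not a documented schema:

```python
def to_context(snapshot: dict) -> str:
    """Render a /who-am-i style snapshot as LLM-ready context lines."""
    lines = ["## Poster profile"]
    for key in ("tags", "sentiment", "energy", "peak_hours", "top_people"):
        if key in snapshot:  # skip any dimensions the API omitted
            lines.append(f"- {key}: {snapshot[key]}")
    return "\n".join(lines)

ctx = to_context({"tags": ["tech", "ai"], "sentiment": 0.3, "energy": "mid"})
assert ctx.startswith("## Poster profile")
assert "- sentiment: 0.3" in ctx
```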


Analysis (5 endpoints)

| Method | Path | Description | Key Params |
|--------|------|-------------|------------|
| GET | /analysis/sentiment | Sentiment over time | window (day/week/month), tag (optional) |
| GET | /analysis/energy | Energy distribution | since (ISO date, optional) |
| GET | /analysis/intent | Intent distribution | since (ISO date, optional) |
| GET | /analysis/hours | Posting pattern by hour of day (with avg sentiment) | |
| GET | /analysis/language | Language distribution across all posts | |

Tech Genealogy (5 endpoints)

The tech genealogy tracks 49 technology topics across all posts — programming languages, frameworks, infrastructure, AI tools. It builds a co-occurrence graph showing which technologies get discussed together and how your tech interests evolved over time.

| Method | Path | Description | Key Params |
|--------|------|-------------|------------|
| GET | /genealogy/topics | All tech topics with frequency and date range | |
| GET | /genealogy/timeline | Monthly topic frequencies | topic (optional) |
| GET | /genealogy/connections | Topic-to-topic co-occurrence graph | topic (optional) |
| GET | /genealogy/evolution | Chronological posts mentioning a topic | topic (required) |
| GET | /genealogy/brief | Natural language summary of tech journey | |
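The keyword-extraction step behind these endpoints can be sketched as a regex scan that emits (post_id, topic) mention rows. The patterns below are illustrative, not the deployed 49-topic map:

```python
import re

TOPIC_PATTERNS = {  # illustrative subset of the topic map
    "python": re.compile(r"\bpython\b", re.I),
    "postgres": re.compile(r"\bpostgres(ql)?\b", re.I),
    "ollama": re.compile(r"\bollama\b", re.I),
}

def extract_mentions(post_id, text):
    """Return one (post_id, topic) row per topic pattern matched in the text."""
    return [(post_id, topic) for topic, pat in TOPIC_PATTERNS.items() if pat.search(text)]

rows = extract_mentions("p1", "Moved the Python app onto PostgreSQL last night")
assert ("p1", "python") in rows and ("p1", "postgres") in rows
```

Each row would then be stored alongside the post's timestamp, which is all the timeline and co-occurrence endpoints need.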

Pedagogy Genealogy (6 endpoints)

Same structure as tech genealogy but for pedagogical methods and teaching concepts (30 topics). Includes a vector-search endpoint for semantic discovery — it embeds your query and finds pedagogically relevant posts even when keywords don’t match.

| Method | Path | Description | Key Params |
|--------|------|-------------|------------|
| GET | /pedagogy/topics | All pedagogy topics with frequency and date range | |
| GET | /pedagogy/timeline | Monthly topic frequencies | topic (optional) |
| GET | /pedagogy/connections | Pedagogy topic co-occurrence graph | topic (optional) |
| GET | /pedagogy/evolution | Chronological posts mentioning a topic | topic (required) |
| GET | /pedagogy/brief | Natural language summary of pedagogical journey | |
| GET | /pedagogy/vector-search | Semantic search for pedagogical posts | q (required) |

Haiku Oracle (2 endpoints)

| Method | Path | Description | Key Params |
|--------|------|-------------|------------|
| GET | /haiku/latest | Latest haiku with source post graph | |
| GET | /haiku/all | All haikus with source counts (last 50) | |

The haiku oracle runs 2-4 times a day. It picks random posts, feeds them to Gemma 4, and generates a haiku. Each haiku gets a UUID and is linked back to its source posts via haiku_edges — so you can trace which posts inspired which haiku. The latest haiku also gets sent to iMessage.
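The period-bucketed sampling might look like this; the bucket boundaries below are assumptions, not the oracle's actual cutoffs:

```python
import random
from datetime import datetime, timedelta, timezone

def pick_sources(posts, per_period=2):
    """Sample posts from ancient/old/recent/fresh age buckets.

    posts: list of dicts with a "timestamp" datetime field.
    """
    now = datetime.now(timezone.utc)
    buckets = {"ancient": [], "old": [], "recent": [], "fresh": []}
    for p in posts:
        age = now - p["timestamp"]
        if age > timedelta(days=730):      # illustrative cutoffs
            buckets["ancient"].append(p)
        elif age > timedelta(days=180):
            buckets["old"].append(p)
        elif age > timedelta(days=30):
            buckets["recent"].append(p)
        else:
            buckets["fresh"].append(p)
    picked = []
    for period, pool in buckets.items():
        for p in random.sample(pool, min(per_period, len(pool))):
            picked.append({**p, "period": period})
    return picked
```

The sampled posts, with their period labels, become the haiku_edges rows that make each haiku traceable back to its sources.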


Utility (5 endpoints)

| Method | Path | Description | Key Params |
|--------|------|-------------|------------|
| GET | /health | Health check (verifies DB connection) | |
| GET | /llms.txt | Full LLM instruction file (auto-generated from live routes) | |
| GET | /llms-mini.txt | Compact LLM instruction file | |
| GET | /docs/swagger | Swagger UI | |
| GET | /docs/openapi.json | Raw OpenAPI 3.0 spec | |

Data Pipeline

```
Threads API --> posts.json --> Postgres (pgvector)
                  |
           analysis pipeline --> tags, surprise, knowledge graph
                  |
           enrichment --> sentiment, energy, intent, language
                  |
           genealogy --> tech museum (49 topics) + pedagogy museum (30 topics)
                  |
           embeddings --> nomic-embed-text 768d --> semantic search
```

The sync worker pulls from the Threads API every 30 minutes. Posts go through information-theoretic analysis (Shannon entropy, Zipf's law, Heaps’ law), heuristic classification into 20 tags with 35 sub-tags, and then enrichment for sentiment, energy, intent, and language — all computed locally, no LLM needed. The genealogy tables are built by keyword extraction across tech and pedagogy domains. Embeddings are generated via Ollama (nomic-embed-text) and stored as pgvector 768d vectors.
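A minimal sketch of the Shannon-entropy step of that analysis, per post, in bits:

```python
import math
from collections import Counter

def word_entropy(text: str) -> float:
    """Shannon entropy H = -sum p(w) * log2 p(w) over the post's words."""
    words = text.lower().split()
    if not words:
        return 0.0
    counts = Counter(words)
    n = len(words)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

assert word_entropy("a a a a") == 0.0          # one repeated word: no uncertainty
assert abs(word_entropy("a b") - 1.0) < 1e-9   # two equally likely words: 1 bit
```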


iOS Shortcuts

Every GET endpoint works directly in iOS Shortcuts:

  1. Get Contents of URL action with the Tailscale URL
  2. Get Dictionary Value to extract posts or data
  3. Repeat with Each to iterate results

Tailscale needs to be running on your iPhone/iPad. No auth headers needed. The /digest/brief endpoint returns a single natural language paragraph — ideal for Siri/Apple Intelligence integration.


Example Queries

| Question | Endpoint |
|----------|----------|
| What did I post today? | GET /posts/today |
| Search for Foucault | GET /posts/search?q=foucault |
| My tech journey | GET /genealogy/brief |
| Find similar posts | GET /posts/similar/{id} |
| Current mood | GET /mood |
| Haiku oracle | GET /haiku/latest |
| Who am I? | GET /who-am-i |

Build Your Own — Starter Prompts

These are copy-paste prompts for Claude Code (or any coding agent). Each one bootstraps a piece of the pipeline against YOUR data. Swap “Threads” for whatever dataset you have — tweets, journal entries, Obsidian notes, Slack messages.

Prompt 1: The Foundation

I have [DESCRIBE YOUR DATA — e.g., “50K social media posts in a JSON file with text, timestamp, and author fields”].

Build me:

  1. A Postgres schema that captures the data + metadata
  2. A seed script that loads the JSON into Postgres
  3. A Docker Compose with Postgres + a Flask API
  4. 10 GET endpoints: /posts/latest, /posts/today, /posts/search?q=, /posts/random, /stats/overview, /stats/tags, /posts/tag/{tag}, /health, /llms.txt (auto-generated from routes), and an OpenAPI spec at /docs
  5. Bind to 0.0.0.0 so I can access it over Tailscale

Use flask-openapi3 with all doc UIs. Parameterized SQL only. JSON responses with { data, count, query, generated_at }.

Prompt 2: Semantic Search (pgvector + Ollama)

My Postgres has [N] text posts. Ollama is running locally with nomic-embed-text.

  1. Upgrade Postgres to pgvector/pgvector:pg17
  2. Add a vector(768) column to my posts table
  3. Write a batch embedding script that calls Ollama nomic-embed-text for each post, 10 concurrent, with progress logging
  4. Add two Flask endpoints: /posts/semantic-search?q= (embeds the query, cosine similarity search) and /posts/similar/{id} (find posts similar to a given post)
  5. Run it on my first 100 posts as a test

Prompt 3: Enrichment (No LLM Needed)

I have [N] text posts in Postgres.

Write an enrichment script that computes these for every post — no LLM calls, just text analysis:

  • sentiment: lexicon-based, -1 to 1
  • energy: low/mid/high based on caps ratio, exclamation marks, length
  • intent: question/reaction/share/social/statement/shitpost (regex classification)
  • language: detect en/es/de/vi/other from character patterns
  • hour_bucket: extract UTC hour from timestamp
  • is_weekend: boolean

Add columns to Postgres, batch update all posts, and add Flask endpoints: /mood (24h summary), /vibe/now (today’s breakdown), /analysis/sentiment?window=week, /analysis/hours (posting pattern by hour).
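What the core of that script might look like; the thresholds and patterns below are illustrative guesses, not values the prompt mandates:

```python
import re

def classify_energy(text: str) -> str:
    """low/mid/high from caps ratio and exclamation marks (illustrative thresholds)."""
    letters = [c for c in text if c.isalpha()]
    caps = sum(c.isupper() for c in letters) / max(len(letters), 1)
    score = caps + 0.2 * text.count("!")
    if score > 0.5:
        return "high"
    return "mid" if score > 0.15 else "low"

def classify_intent(text: str) -> str:
    """Regex-first intent classification (illustrative subset of the rules)."""
    if "?" in text:
        return "question"
    if re.search(r"https?://", text):
        return "share"
    if re.search(r"@\w+", text):
        return "social"
    return "statement"

assert classify_intent("what is this?") == "question"
assert classify_intent("see https://example.com") == "share"
assert classify_energy("WOW!!! AMAZING!!!") == "high"
```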

Prompt 4: Knowledge Genealogy (The Museum)

I have [N] posts with text in Postgres + embeddings.

Build a “genealogy” pipeline that tracks how I talk about [DOMAIN — e.g., “technology”, “cooking”, “fitness”] over time:

  1. Define 30-50 topic keywords with regex patterns (e.g., ‘python’: /\bpython\b/i)
  2. Scan all posts, extract topic mentions, store in a genealogy table (post_id, topic, timestamp)
  3. Compute co-occurrence edges: topics that appear in the same post or within 24 hours of each other
  4. Add Flask endpoints: /genealogy/topics, /genealogy/timeline?topic=, /genealogy/connections, /genealogy/evolution?topic=, /genealogy/brief (natural language summary)
  5. The brief endpoint should say things like “Top topic: X (N mentions since Month Year). Strongest connection: X <-> Y.”

Also do a vector-discovery pass: pick the 5 longest keyword-matched posts as seeds, find the 50 nearest posts by embedding similarity, add any with similarity > 0.5 as ‘vector-discovered’ entries.

Prompt 5: The Haiku Oracle (Creative Agent)

I have [N] posts in Postgres spanning [DATE RANGE].

Build a “haiku oracle” agent that:

  1. Picks 8-10 random posts across time periods (ancient, old, recent, fresh, + 1 high-surprise post)
  2. Sends them to a local LLM (Ollama gemma4:e4b) with a prompt to write one haiku (5-7-5) capturing the thread connecting them
  3. Stores the haiku with a UUID in a graph structure: haikus table (uuid, haiku, model, generated_at) + haiku_edges table (haiku_uuid, post_id, period, post_text snapshot, post_timestamp)
  4. Sends the haiku via iMessage using AppleScript (read phone from .env ALERT_PHONE)
  5. Has a --loop mode that runs 2-4 times per day at random hours
  6. Flask endpoints: /haiku/latest (with full source graph), /haiku/all

Prompt 6: iOS Shortcuts Integration

My Flask API is running at http://[TAILSCALE_IP]:4323 with [N] endpoints.

Write me an iOS Shortcuts guide with step-by-step recipes for:

  1. “My Last Posts” — ask for a number, GET /posts/latest?n={input}, format and display
  2. “Search Threads” — ask for text, URL-encode, GET /posts/search?q={encoded}, display results
  3. “Daily Digest” — GET /digest/brief, show the natural language summary
  4. “Ask My Data” — get question, pass JSON to ChatGPT via “Ask ChatGPT” action for interpretation

The key insight: use the API for data retrieval, use on-device Apple Intelligence or ChatGPT for the interpretation layer. Cheap local API + smart output processing.


The Philosophy

This isn’t a tutorial about building an API. It’s about treating your own data as a first-class citizen.

Every post you’ve ever written is a data point. Every @mention is a graph edge. Every topic shift is a signal. The tools to analyze all of this used to cost thousands in cloud compute. Now it runs on a Mac mini with free local models.

The prompts above aren’t prescriptive — they’re starting points. Your Claude instance will adapt them to your data shape, your schema, your questions. The pedagogy here is: give someone the foundation and the tools, and the curiosity does the rest.

Start with Prompt 1. Get your data into Postgres. See it in Swagger UI. Then go deeper.


Built in one session. 69 endpoints. Zero cloud costs. The data was always there — it just needed infrastructure.