Threads Analysis API — Technical Reference

Tags: API, threads, data, pipeline, documentation

What This Is

The Threads Analysis API is a Flask REST API serving ~50K posts from @maybe_foucault on Threads. It runs on port 4323, backed by Postgres 17 with pgvector, and is designed to be called from iOS Shortcuts, LLM agents, and the Fireflies spatial computing app.

69 endpoints. No auth. JSON everything. Auto-generated OpenAPI spec.


Base URL & Access

Tailscale:  http://100.71.141.45:4323

| UI | Path |
|----|------|
| Swagger UI | /docs/swagger |
| ReDoc | /docs/redoc |
| RapiDoc | /docs/rapidoc |
| Scalar | /docs/scalar |
| Stoplight Elements | /docs/elements |
| PDF Export | /docs/rapipdf |
| OpenAPI spec | /docs/openapi.json |
| LLM instructions (full) | /llms.txt |
| LLM instructions (compact) | /llms-mini.txt |

Both llms.txt files are auto-generated from live Flask routes — they always reflect what’s actually deployed.


Authentication

None. The API is accessible only over Tailscale. If you’re on the tailnet, you’re in.


Response Format

Every endpoint returns the same envelope:

```json
{
  "data": "...(object or array)",
  "count": 42,
  "query": { "type": "search", "q": "foucault" },
  "generated_at": "2026-04-06T12:00:00+00:00"
}
```

Post-returning endpoints use "posts" instead of "data" as the top-level key.
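A client can unwrap both envelope shapes with one small helper. A minimal sketch, assuming only the envelope fields documented above:

```python
def unwrap(envelope: dict):
    """Extract the payload from the standard response envelope.

    Post-returning endpoints use "posts" as the top-level key;
    everything else uses "data".
    """
    return envelope["posts"] if "posts" in envelope else envelope["data"]

resp = {
    "posts": [{"id": "1", "text": "hello"}],
    "count": 1,
    "query": {"type": "latest"},
    "generated_at": "2026-04-06T12:00:00+00:00",
}
assert unwrap(resp) == [{"id": "1", "text": "hello"}]
```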


Post Object

| Field | Type | Description |
|-------|------|-------------|
| id | string | Threads post ID |
| text | string | Post body text |
| timestamp | ISO string | When it was posted |
| ago | string | Human-readable relative time (“3 hours ago”) |
| variety | string | original, reply, quote, or repost |
| tags | string[] | Multi-label taxonomy tags (20 categories) |
| primary_tag | string | Single dominant tag |
| surprise | float | Self-information per word in bits (higher = more surprising) |
| word_count | int | Word count |
| permalink | string | Threads permalink URL |
| metrics.views | int | View count |
| metrics.likes | int | Like count |
| metrics.replies | int | Reply count |
| sentiment | float | -1 to 1 (negative to positive) |
| energy | string | low, mid, or high |
| intent | string | statement, reaction, question, share, social, shitpost |
| language | string | en, de, vi, es, etc. |
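The surprise field is self-information per word in bits. The exact estimator isn't documented here; a plausible unigram-model sketch (an assumption, not the deployed code):

```python
import math
from collections import Counter

def surprise_per_word(text: str, corpus_counts: Counter, total: int) -> float:
    """Average self-information -log2 p(w) per word, with add-one smoothing."""
    words = text.lower().split()
    if not words:
        return 0.0
    vocab = len(corpus_counts)
    bits = 0.0
    for w in words:
        p = (corpus_counts[w] + 1) / (total + vocab)  # Laplace smoothing
        bits += -math.log2(p)
    return bits / len(words)

corpus = Counter("the cat sat on the mat the cat".split())
total = sum(corpus.values())
# Rare words score higher (more surprising) than common ones
assert surprise_per_word("zebra", corpus, total) > surprise_per_word("the", corpus, total)
```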

Time-Based (8 endpoints)

| Method | Path | Description | Key Params |
|--------|------|-------------|------------|
| GET | /posts/now | Posts in last 30 minutes | limit (default 500) |
| GET | /posts/hour | Posts in last hour | limit |
| GET | /posts/today | Posts since midnight UTC | limit |
| GET | /posts/week | Posts in last 7 days | limit |
| GET | /posts/month | Posts in last 30 days | limit |
| GET | /posts/since | Posts since N minutes ago | minutes (default 60), limit |
| GET | /posts/between | Posts between two ISO dates | from, to (required), limit |
| GET | /posts/latest | Last N posts | n (default 10) |
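All eight routes reduce to one parameterized interval query. A sketch of the /posts/since logic against an assumed posts table (not the deployed code):

```python
def since_query(minutes: int = 60, limit: int = 500):
    """Build a parameterized SQL query for posts newer than N minutes.

    Parameterized placeholders only -- never interpolate user input into SQL.
    """
    sql = (
        "SELECT id, text, timestamp FROM posts "
        "WHERE timestamp >= now() - make_interval(mins => %s) "
        "ORDER BY timestamp DESC LIMIT %s"
    )
    return sql, (minutes, limit)

sql, params = since_query(minutes=30, limit=10)
assert params == (30, 10)
assert "%s" in sql  # placeholders, not f-strings
```

The same shape covers /posts/now (minutes=30), /posts/hour (minutes=60), and so on; only the interval changes.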

Search & Discovery (8 endpoints)

| Method | Path | Description | Key Params |
|--------|------|-------------|------------|
| GET | /posts/search | Full-text search (Postgres GIN) | q (required) |
| GET | /posts/tag/{tag} | Posts with a given tag | |
| GET | /posts/tag/{tag}/latest | Latest N posts in a tag | n (default 5) |
| GET | /posts/random | One random post (excludes reposts) | |
| GET | /posts/random/{tag} | Random post from a tag | |
| GET | /posts/{post_id} | Single post by ID | |
| GET | /posts/similar/{post_id} | Vector similarity search (pgvector cosine) | |
| GET | /posts/semantic-search | Search by meaning via embeddings | q (required) |

The two vector endpoints (similar and semantic-search) use nomic-embed-text 768d embeddings stored in pgvector. similar finds posts near a given post. semantic-search embeds your query string on the fly via Ollama, then finds the 20 nearest posts.
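For intuition, pgvector's cosine distance operator `<=>` computes 1 minus cosine similarity; a pure-Python equivalent:

```python
import math

def cosine_distance(a, b):
    """1 - cosine similarity, matching pgvector's <=> operator."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (na * nb)

# Identical vectors have distance 0; orthogonal vectors have distance 1
assert abs(cosine_distance([1.0, 0.0], [1.0, 0.0])) < 1e-9
assert abs(cosine_distance([1.0, 0.0], [0.0, 1.0]) - 1.0) < 1e-9
```

In SQL the same ranking is expressed as `ORDER BY embedding <=> $1 LIMIT 20`, computed inside Postgres rather than in application code.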


Analytics (8 endpoints)

| Method | Path | Description | Key Params |
|--------|------|-------------|------------|
| GET | /stats/overview | Total posts, date range, posts today/week/month | |
| GET | /stats/streak | Current consecutive posting days | |
| GET | /stats/top | Top N posts by metric | by (views/likes/replies), n (default 10) |
| GET | /stats/top/today | Most engaged posts from today | |
| GET | /stats/hourly | Posts per hour today | |
| GET | /stats/daily | Posts per day this week | |
| GET | /stats/tags | Tag distribution with counts and percentages | |
| GET | /stats/velocity | Posting rate (posts/day) over 7, 30, 90 days | |

Social (3 endpoints)

| Method | Path | Description | Key Params |
|--------|------|-------------|------------|
| GET | /social/mentions | Who I mention most (top 20) | |
| GET | /social/interactions | Reply interactions from conversations table (top 30) | |
| GET | /social/conversations/{post_id} | Full reply thread for a post | |

Knowledge Graph (2 endpoints)

| Method | Path | Description | Key Params |
|--------|------|-------------|------------|
| GET | /graph/topics | Tag clusters with connection strengths (from kg_nodes/kg_edges) | |
| GET | /graph/related/{tag} | Tags related to a given tag via knowledge graph edges | |

Built from PMI-weighted co-occurrence, temporal proximity, and TF-IDF concept nodes.
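Pointwise mutual information scores a tag pair by how much more often it co-occurs than chance would predict. A sketch of the edge weighting, consistent in spirit with the pipeline but not a copy of it:

```python
import math
from collections import Counter
from itertools import combinations

def pmi_edges(posts_tags):
    """Compute PMI for each tag pair across posts.

    posts_tags: list of tag sets, one per post.
    PMI(a, b) = log2( p(a, b) / (p(a) * p(b)) )
    """
    n = len(posts_tags)
    tag_counts = Counter(t for tags in posts_tags for t in set(tags))
    pair_counts = Counter(
        pair for tags in posts_tags for pair in combinations(sorted(set(tags)), 2)
    )
    edges = {}
    for (a, b), c in pair_counts.items():
        p_ab = c / n
        p_a, p_b = tag_counts[a] / n, tag_counts[b] / n
        edges[(a, b)] = math.log2(p_ab / (p_a * p_b))
    return edges

edges = pmi_edges([{"tech", "ai"}, {"tech", "ai"}, {"food"}, {"tech"}])
# "tech" and "ai" co-occur more than chance predicts -> positive PMI
assert edges[("ai", "tech")] > 0
```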


Digests (3 endpoints)

| Method | Path | Description | Key Params |
|--------|------|-------------|------------|
| GET | /digest/today | Structured summary: counts, top tags, top post | |
| GET | /digest/week | Weekly summary with daily breakdown | |
| GET | /digest/brief | One-paragraph natural language summary (designed for Apple Intelligence) | |

Enrichment & Insights (5 endpoints)

| Method | Path | Description | Key Params |
|--------|------|-------------|------------|
| GET | /vibe/now | Today’s vibe breakdown (falls back to intent if no vibe tags) | |
| GET | /mood | Current mood: sentiment + energy from last 24h | |
| GET | /drift | Topic drift: what you’re posting more/less about vs last month | |
| GET | /bridges | Posts bridging 3+ topic clusters | |
| GET | /who-am-i | Full personality snapshot: tags, sentiment, vibes, energy, peak hours, top people | |

/who-am-i is the “mirror” endpoint — it pulls together every enrichment dimension into one response. Good for LLM context injection.
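One way to use it for context injection: flatten the snapshot into a system-prompt block before handing it to a model. A hypothetical formatter; the field names below (peak_hours, top_people) are assumptions based on the endpoint description, not a documented schema:

```python
def to_context(snapshot: dict) -> str:
    """Render a /who-am-i style snapshot as LLM-ready context lines."""
    lines = ["## Poster profile"]
    for key in ("tags", "sentiment", "energy", "peak_hours", "top_people"):
        if key in snapshot:  # skip any dimensions the API omitted
            lines.append(f"- {key}: {snapshot[key]}")
    return "\n".join(lines)

ctx = to_context({"tags": ["tech", "ai"], "sentiment": 0.3, "energy": "mid"})
assert ctx.startswith("## Poster profile")
assert "- sentiment: 0.3" in ctx
```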


Analysis (5 endpoints)

| Method | Path | Description | Key Params |
|--------|------|-------------|------------|
| GET | /analysis/sentiment | Sentiment over time | window (day/week/month), tag (optional) |
| GET | /analysis/energy | Energy distribution | since (ISO date, optional) |
| GET | /analysis/intent | Intent distribution | since (ISO date, optional) |
| GET | /analysis/hours | Posting pattern by hour of day (with avg sentiment) | |
| GET | /analysis/language | Language distribution across all posts | |

Tech Genealogy (5 endpoints)

The tech genealogy tracks 49 technology topics across all posts — programming languages, frameworks, infrastructure, AI tools. It builds a co-occurrence graph showing which technologies get discussed together and how your tech interests evolved over time.

| Method | Path | Description | Key Params |
|--------|------|-------------|------------|
| GET | /genealogy/topics | All tech topics with frequency and date range | |
| GET | /genealogy/timeline | Monthly topic frequencies | topic (optional) |
| GET | /genealogy/connections | Topic-to-topic co-occurrence graph | topic (optional) |
| GET | /genealogy/evolution | Chronological posts mentioning a topic | topic (required) |
| GET | /genealogy/brief | Natural language summary of tech journey | |
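The keyword-extraction step behind these endpoints can be sketched as a regex scan that emits (post_id, topic) mention rows. The patterns below are illustrative, not the deployed 49-topic map:

```python
import re

TOPIC_PATTERNS = {  # illustrative subset of the topic map
    "python": re.compile(r"\bpython\b", re.I),
    "postgres": re.compile(r"\bpostgres(ql)?\b", re.I),
    "ollama": re.compile(r"\bollama\b", re.I),
}

def extract_mentions(post_id, text):
    """Return one (post_id, topic) row per topic pattern matched in the text."""
    return [(post_id, topic) for topic, pat in TOPIC_PATTERNS.items() if pat.search(text)]

rows = extract_mentions("p1", "Moved the Python app onto PostgreSQL last night")
assert ("p1", "python") in rows and ("p1", "postgres") in rows
```

Each row would then be stored alongside the post's timestamp, which is all the timeline and co-occurrence endpoints need.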

Pedagogy Genealogy (6 endpoints)

Same structure as tech genealogy but for pedagogical methods and teaching concepts (30 topics). Includes a vector-search endpoint for semantic discovery — it embeds your query and finds pedagogically relevant posts even when keywords don’t match.

| Method | Path | Description | Key Params |
|--------|------|-------------|------------|
| GET | /pedagogy/topics | All pedagogy topics with frequency and date range | |
| GET | /pedagogy/timeline | Monthly topic frequencies | topic (optional) |
| GET | /pedagogy/connections | Pedagogy topic co-occurrence graph | topic (optional) |
| GET | /pedagogy/evolution | Chronological posts mentioning a topic | topic (required) |
| GET | /pedagogy/brief | Natural language summary of pedagogical journey | |
| GET | /pedagogy/vector-search | Semantic search for pedagogical posts | q (required) |

Haiku Oracle (2 endpoints)

| Method | Path | Description | Key Params |
|--------|------|-------------|------------|
| GET | /haiku/latest | Latest haiku with source post graph | |
| GET | /haiku/all | All haikus with source counts (last 50) | |

The haiku oracle runs 2-4 times a day. It picks random posts, feeds them to Gemma 4, and generates a haiku. Each haiku gets a UUID and is linked back to its source posts via haiku_edges — so you can trace which posts inspired which haiku. The latest haiku also gets sent to iMessage.
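The period-bucketed sampling might look like this; the bucket boundaries below are assumptions, not the oracle's actual cutoffs:

```python
import random
from datetime import datetime, timedelta, timezone

def pick_sources(posts, per_period=2):
    """Sample posts from ancient/old/recent/fresh age buckets.

    posts: list of dicts with a "timestamp" datetime field.
    """
    now = datetime.now(timezone.utc)
    buckets = {"ancient": [], "old": [], "recent": [], "fresh": []}
    for p in posts:
        age = now - p["timestamp"]
        if age > timedelta(days=730):      # illustrative cutoffs
            buckets["ancient"].append(p)
        elif age > timedelta(days=180):
            buckets["old"].append(p)
        elif age > timedelta(days=30):
            buckets["recent"].append(p)
        else:
            buckets["fresh"].append(p)
    picked = []
    for period, pool in buckets.items():
        for p in random.sample(pool, min(per_period, len(pool))):
            picked.append({**p, "period": period})
    return picked
```

The sampled posts, with their period labels, become the haiku_edges rows that make each haiku traceable back to its sources.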


Utility (5 endpoints)

| Method | Path | Description | Key Params |
|--------|------|-------------|------------|
| GET | /health | Health check (verifies DB connection) | |
| GET | /llms.txt | Full LLM instruction file (auto-generated from live routes) | |
| GET | /llms-mini.txt | Compact LLM instruction file | |
| GET | /docs/swagger | Swagger UI | |
| GET | /docs/openapi.json | Raw OpenAPI 3.0 spec | |

Data Pipeline

```
Threads API --> posts.json --> Postgres (pgvector)
                  |
           analysis pipeline --> tags, surprise, knowledge graph
                  |
           enrichment --> sentiment, energy, intent, language
                  |
           genealogy --> tech museum (49 topics) + pedagogy museum (30 topics)
                  |
           embeddings --> nomic-embed-text 768d --> semantic search
```

The sync worker pulls from the Threads API every 30 minutes. Posts go through information-theoretic analysis (Shannon entropy, Zipf's law, Heaps’ law), heuristic classification into 20 tags with 35 sub-tags, and then enrichment for sentiment, energy, intent, and language — all computed locally, no LLM needed. The genealogy tables are built by keyword extraction across tech and pedagogy domains. Embeddings are generated via Ollama (nomic-embed-text) and stored as pgvector 768d vectors.
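A minimal sketch of the Shannon-entropy step of that analysis, per post, in bits:

```python
import math
from collections import Counter

def word_entropy(text: str) -> float:
    """Shannon entropy H = -sum p(w) * log2 p(w) over the post's words."""
    words = text.lower().split()
    if not words:
        return 0.0
    counts = Counter(words)
    n = len(words)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

assert word_entropy("a a a a") == 0.0          # one repeated word: no uncertainty
assert abs(word_entropy("a b") - 1.0) < 1e-9   # two equally likely words: 1 bit
```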


iOS Shortcuts

Every GET endpoint works directly in iOS Shortcuts:

  1. Get Contents of URL action with the Tailscale URL
  2. Get Dictionary Value to extract posts or data
  3. Repeat with Each to iterate results

Tailscale needs to be running on your iPhone/iPad. No auth headers needed. The /digest/brief endpoint returns a single natural language paragraph — ideal for Siri/Apple Intelligence integration.


Example Queries

| Question | Endpoint |
|----------|----------|
| What did I post today? | GET /posts/today |
| Search for Foucault | GET /posts/search?q=foucault |
| My tech journey | GET /genealogy/brief |
| Find similar posts | GET /posts/similar/{id} |
| Current mood | GET /mood |
| Haiku oracle | GET /haiku/latest |
| Who am I? | GET /who-am-i |

Build Your Own — Starter Prompts

These are copy-paste prompts for Claude Code (or any coding agent). Each one bootstraps a piece of the pipeline against YOUR data. Swap “Threads” for whatever dataset you have — tweets, journal entries, Obsidian notes, Slack messages.

Prompt 1: The Foundation

I have [DESCRIBE YOUR DATA — e.g., “50K social media posts in a JSON file with text, timestamp, and author fields”].

Build me:

  1. A Postgres schema that captures the data + metadata
  2. A seed script that loads the JSON into Postgres
  3. A Docker Compose with Postgres + a Flask API
  4. 10 GET endpoints: /posts/latest, /posts/today, /posts/search?q=, /posts/random, /stats/overview, /stats/tags, /posts/tag/{tag}, /health, /llms.txt (auto-generated from routes), and an OpenAPI spec at /docs
  5. Bind to 0.0.0.0 so I can access it over Tailscale

Use flask-openapi3 with all doc UIs. Parameterized SQL only. JSON responses with { data, count, query, generated_at }.

Prompt 2: Semantic Search (pgvector + Ollama)

My Postgres has [N] text posts. Ollama is running locally with nomic-embed-text.

  1. Upgrade Postgres to pgvector/pgvector:pg17
  2. Add a vector(768) column to my posts table
  3. Write a batch embedding script that calls Ollama nomic-embed-text for each post, 10 concurrent, with progress logging
  4. Add two Flask endpoints: /posts/semantic-search?q= (embeds the query, cosine similarity search) and /posts/similar/{id} (find posts similar to a given post)
  5. Run it on my first 100 posts as a test

Prompt 3: Enrichment (No LLM Needed)

I have [N] text posts in Postgres.

Write an enrichment script that computes these for every post — no LLM calls, just text analysis:

  • sentiment: lexicon-based, -1 to 1
  • energy: low/mid/high based on caps ratio, exclamation marks, length
  • intent: question/reaction/share/social/statement/shitpost (regex classification)
  • language: detect en/es/de/vi/other from character patterns
  • hour_bucket: extract UTC hour from timestamp
  • is_weekend: boolean

Add columns to Postgres, batch update all posts, and add Flask endpoints: /mood (24h summary), /vibe/now (today’s breakdown), /analysis/sentiment?window=week, /analysis/hours (posting pattern by hour).
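What the core of that script might look like; the thresholds and patterns below are illustrative guesses, not values the prompt mandates:

```python
import re

def classify_energy(text: str) -> str:
    """low/mid/high from caps ratio and exclamation marks (illustrative thresholds)."""
    letters = [c for c in text if c.isalpha()]
    caps = sum(c.isupper() for c in letters) / max(len(letters), 1)
    score = caps + 0.2 * text.count("!")
    if score > 0.5:
        return "high"
    return "mid" if score > 0.15 else "low"

def classify_intent(text: str) -> str:
    """Regex-first intent classification (illustrative subset of the rules)."""
    if "?" in text:
        return "question"
    if re.search(r"https?://", text):
        return "share"
    if re.search(r"@\w+", text):
        return "social"
    return "statement"

assert classify_intent("what is this?") == "question"
assert classify_intent("see https://example.com") == "share"
assert classify_energy("WOW!!! AMAZING!!!") == "high"
```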

Prompt 4: Knowledge Genealogy (The Museum)

I have [N] posts with text in Postgres + embeddings.

Build a “genealogy” pipeline that tracks how I talk about [DOMAIN — e.g., “technology”, “cooking”, “fitness”] over time:

  1. Define 30-50 topic keywords with regex patterns (e.g., ‘python’: /\bpython\b/i)
  2. Scan all posts, extract topic mentions, store in a genealogy table (post_id, topic, timestamp)
  3. Compute co-occurrence edges: topics that appear in the same post or within 24 hours of each other
  4. Add Flask endpoints: /genealogy/topics, /genealogy/timeline?topic=, /genealogy/connections, /genealogy/evolution?topic=, /genealogy/brief (natural language summary)
  5. The brief endpoint should say things like “Top topic: X (N mentions since Month Year). Strongest connection: X <-> Y.”

Also do a vector-discovery pass: pick the 5 longest keyword-matched posts as seeds, find the 50 nearest posts by embedding similarity, add any with similarity > 0.5 as ‘vector-discovered’ entries.

Prompt 5: The Haiku Oracle (Creative Agent)

I have [N] posts in Postgres spanning [DATE RANGE].

Build a “haiku oracle” agent that:

  1. Picks 8-10 random posts across time periods (ancient, old, recent, fresh, + 1 high-surprise post)
  2. Sends them to a local LLM (Ollama gemma4:e4b) with a prompt to write one haiku (5-7-5) capturing the thread connecting them
  3. Stores the haiku with a UUID in a graph structure: haikus table (uuid, haiku, model, generated_at) + haiku_edges table (haiku_uuid, post_id, period, post_text snapshot, post_timestamp)
  4. Sends the haiku via iMessage using AppleScript (read phone from .env ALERT_PHONE)
  5. Has a --loop mode that runs 2-4 times per day at random hours
  6. Flask endpoints: /haiku/latest (with full source graph), /haiku/all

Prompt 6: iOS Shortcuts Integration

My Flask API is running at http://[TAILSCALE_IP]:4323 with [N] endpoints.

Write me an iOS Shortcuts guide with step-by-step recipes for:

  1. “My Last Posts” — ask for a number, GET /posts/latest?n={input}, format and display
  2. “Search Threads” — ask for text, URL-encode, GET /posts/search?q={encoded}, display results
  3. “Daily Digest” — GET /digest/brief, show the natural language summary
  4. “Ask My Data” — get question, pass JSON to ChatGPT via “Ask ChatGPT” action for interpretation

The key insight: use the API for data retrieval, use on-device Apple Intelligence or ChatGPT for the interpretation layer. Cheap local API + smart output processing.


The Philosophy

This isn’t a tutorial about building an API. It’s about treating your own data as a first-class citizen.

Every post you’ve ever written is a data point. Every @mention is a graph edge. Every topic shift is a signal. The tools to analyze all of this used to cost thousands in cloud compute. Now it runs on a Mac mini with free local models.

The prompts above aren’t prescriptive — they’re starting points. Your Claude instance will adapt them to your data shape, your schema, your questions. The pedagogy here is: give someone the foundation and the tools, and the curiosity does the rest.

Start with Prompt 1. Get your data into Postgres. See it in Swagger UI. Then go deeper.


Built in one session. 69 endpoints. Zero cloud costs. The data was always there — it just needed infrastructure.