Claude Code Sessions: AI Hedge Fund

Tags: Claude Code, finance, infrastructure, Docker, agents, trading, risk-management, Tailscale, ntfy, automation

From Portfolio PIDs to a Market Data Platform in One Session

Date: March 21, 2026
Project: ai-hedge-fund (forked, extended)
Tools: Claude Code (two terminal sessions), Docker, Postgres, Poetry, Finviz Elite


The Starting Point

I have this fork of virattt’s ai-hedge-fund — a multi-agent trading analysis system where LLMs pretend to be Warren Buffett, Charlie Munger, Michael Burry, and a dozen other investors. Each “analyst” is really just the same Claude model with a different system prompt. The mechanical analysts (technical, fundamentals, growth, sentiment, valuation) actually compute things. The persona agents are what I’ve started calling LLM cosplay — they add cost without adding signal.

The project already had Finviz Elite integration: a client that scrapes financial data, 22 portfolio templates covering everything from mega-cap tech to biotech to REITs, and a filter registry with 105 categories and 3,390 filter values. What it didn’t have was portfolio IDs (PIDs) — Finviz’s internal identifier for saved screener views.

So the session started small. Just add PIDs to 22 portfolios.


Act 1: Validation Cascade

Adding PIDs meant touching portfolios.py, which meant looking at every ticker in every portfolio. 672 tickers across 22 templates. Some of these were written months ago. Stocks get delisted. Companies get acquired. Tickers change.

So I asked Claude to validate them. Not by guessing — by hitting the Finviz API.

The validation ran across all 22 portfolios in parallel. Results:

  • 639 unique tickers checked against live Finviz data
  • 33 stale/delisted tickers caught automatically
  • Acquisitions (Figma’s attempted merger changed ADBE’s profile), delistings (some biotech SPACs that evaporated), ticker renames (META was still listed as FB in one portfolio from 2023)
  • Crypto tokens that had leaked in from training data — AAVE and UNI are DeFi tokens, not US equities

The key insight: portfolio tickers hallucinate over time. Not because the LLM made them up (though some were that), but because markets move. A ticker list from six months ago has entropy. You need live validation as a build step, not a one-time check.
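That build step is simple to sketch. A minimal version, with the live universe stubbed as a set (in the real pipeline it would come from the Finviz client; the function name is illustrative):

```python
# Validation-as-build-step sketch. `live_universe` stands in for a live
# Finviz screener pull; `find_stale_tickers` is an illustrative name.
def find_stale_tickers(portfolio, live_universe):
    """Return tickers that no longer resolve against the live screener."""
    return sorted(t for t in portfolio if t.upper() not in live_universe)

live = {"AAPL", "MSFT", "META"}      # stand-in for a live Finviz pull
portfolio = ["AAPL", "FB", "AAVE"]   # FB became META; AAVE is a DeFi token
print(find_stale_tickers(portfolio, live))  # ['AAVE', 'FB']
```

Run it against every portfolio on each build and stale tickers surface before they poison an analysis run.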


Act 2: “Let’s Put This on Docker”

This is the moment the session pivoted. I was looking at the validated portfolio data and thinking about the rate limit problem. Finviz Elite allows one screener request per 60 seconds. With 22 portfolios and hundreds of filters, real-time analysis is impractical. You need a cache layer.
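The shape of that cache layer is easy to sketch: serve from cache inside a TTL, otherwise wait out the 60-second limit before fetching. In-memory here for illustration; the real build keeps snapshots in Postgres and limiter state in Redis, and the class and parameter names are assumptions:

```python
# Cache + rate-limit sketch (in-memory; names and defaults are illustrative).
import time

class CachedFetcher:
    def __init__(self, fetch, min_interval=60.0, ttl=4 * 3600):
        self.fetch = fetch                # callable that does the live request
        self.min_interval = min_interval  # seconds between live requests
        self.ttl = ttl                    # seconds a cached result stays fresh
        self.last_call = 0.0
        self.cache = {}                   # key -> (fetched_at, value)

    def get(self, key):
        now = time.monotonic()
        hit = self.cache.get(key)
        if hit and now - hit[0] < self.ttl:
            return hit[1]                          # cache hit: no request
        wait = self.min_interval - (now - self.last_call)
        if wait > 0:
            time.sleep(wait)                       # honor the screener limit
        value = self.fetch(key)
        self.last_call = time.monotonic()
        self.cache[key] = (self.last_call, value)
        return value

fetches = []
cached = CachedFetcher(lambda k: fetches.append(k) or f"rows:{k}", min_interval=0.0)
cached.get("sp500-value")
cached.get("sp500-value")  # second call served from cache
print(len(fetches))        # 1: only one live request went out
```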

Which means you need a database. Which means you need infrastructure.

The conversation went something like:

“What if we just… put this on Docker with Postgres?”

And then the session became something else entirely.


Act 3: Research Agents in Parallel

This is where Claude Code becomes a thinking partner rather than a code generator. Instead of building sequentially, I spawned parallel research agents:

Agent 1: Yahoo Finance API research
While I worked on the Docker setup, an agent researched Yahoo Finance as a data source. yfinance (the Python library) is unofficial but widely used. No API key needed. Rate limits are generous for personal use. It covers fundamentals, historical prices, options chains, institutional holdings — basically everything Finviz has plus real-time-ish quotes.

Agent 2: Postgres schema design
A second agent designed the database schema for a market data platform. Not just “dump JSON into a table” — a proper normalized schema:

| Table | Purpose |
| --- | --- |
| stocks | Ticker registry with sector, industry, market cap |
| daily_prices | OHLCV with partitioning by date |
| fundamentals | P/E, P/B, ROE, margins — quarterly snapshots |
| analyst_signals | Output from each mechanical analyst per ticker per run |
| portfolio_holdings | Which tickers belong to which portfolio template |
| finviz_snapshots | Raw Finviz data cached with TTL |
| yahoo_snapshots | Raw Yahoo Finance data cached with TTL |

The schema was designed for time-series queries from the start — partitioned tables, proper indexes on (ticker, date), and a signals table that stores analyst outputs as JSONB so the schema doesn’t break when I add new analysts.
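A slice of what that looks like in DDL. This is an illustrative sketch, not the repo's actual docker/schema.sql; table and column names are assumptions:

```sql
-- Illustrative schema slice (names assumed, not the repo's schema.sql).
CREATE TABLE daily_prices (
    ticker  TEXT    NOT NULL,
    date    DATE    NOT NULL,
    open    NUMERIC, high NUMERIC, low NUMERIC, close NUMERIC,
    volume  BIGINT,
    PRIMARY KEY (ticker, date)          -- covers the (ticker, date) lookups
) PARTITION BY RANGE (date);

CREATE TABLE analyst_signals (
    ticker  TEXT        NOT NULL,
    run_at  TIMESTAMPTZ NOT NULL,
    analyst TEXT        NOT NULL,
    payload JSONB       NOT NULL,       -- new analysts add keys, not columns
    PRIMARY KEY (ticker, run_at, analyst)
);
```

The JSONB payload column is the point: adding a sixteenth analyst is an INSERT, not a migration.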

Agent 3: Vercel cost analysis
A third agent researched whether Vercel could host the data pipeline. This one came back with a clear answer.


Act 4: The Cost Insight

The Vercel analysis was illuminating:

| Platform | Monthly Cost | Best For |
| --- | --- | --- |
| Vercel Pro | $25/mo | Static dashboard, serverless API |
| Railway | $5/mo | Persistent Postgres + cron jobs |
| Local Docker | $0/mo | Heavy data pipeline, full control |
| Vercel Free | $0/mo | Static site serving only |

Vercel Pro works for this — you can run serverless functions that query an external database. But a data pipeline that scrapes Finviz every 60 seconds and runs Yahoo Finance bulk downloads doesn’t belong on serverless. You’re paying for compute that sits idle between 10-second bursts of work, and you’re fighting cold starts on a process that needs to maintain rate limit state.

Railway at $5/mo is the sweet spot if you want cloud hosting. But for a personal project where I’m the only user, local Docker at $0 is the obvious choice.


Act 5: Hybrid Architecture

The architecture that emerged:

LOCAL DOCKER (the engine room)
├── Postgres 16 ──── market data, signals, caches
├── data-pipeline ── Finviz scraper (respects 60s rate limit)
│                    Yahoo Finance bulk download (daily)
│                    Signal generators (15 mechanical strategies)
├── redis ────────── Rate limit state, job queue
└── pgAdmin ──────── Local DB admin at localhost:5050

VERCEL (the storefront)
├── Static dashboard ── Next.js reading from Postgres read replica
├── API routes ──────── /api/signals, /api/portfolio, /api/health
└── Cost: $0 ────────── Free tier handles static + light API

Docker does the heavy lifting. Vercel serves the results. The Postgres instance is the bridge — Docker writes, Vercel reads. For a single-user system, this is a direct read from the same database. If I ever wanted to scale it, the Postgres connection string is the only thing that changes.


Act 6: Two Terminals Converge

Here’s the part that still feels like science fiction.

I had two Claude Code sessions running simultaneously. This one — building the market data platform. And another one analyzing my Threads posting data (37,912 posts from @maybe_foucault) for the bythewei.co archive.

The Threads session was doing sentiment analysis. It had classified every post into 20 discourse categories, computed Shannon entropy over the topic distribution, and built a transition matrix showing how topics flow into each other. Standard information theory stuff.

Then it produced a sentiment integration document — a specification for how social media sentiment signals could feed into a financial analysis pipeline. The idea: if you can classify 37,912 social media posts by topic and sentiment, you can do the same thing to financial news, earnings call transcripts, and SEC filings.

That document slots directly into this project’s signal pipeline. The NLP infrastructure built for Threads analysis (tokenization, sentiment classification, entropy computation) is the same infrastructure needed for news sentiment analysis in the hedge fund project.

Two sessions. Two different projects. One shared Postgres instance. The output of one becomes the input of the other.


Act 7: The $0.82 to $0.02 Arc

This is the number that makes the whole session worth writing about.

The original ai-hedge-fund runs all analysis through LLMs. Every ticker, every analyst, every run. The persona agents (Buffett, Munger, Burry, Graham, etc.) are just different system prompts asking Claude to roleplay as a famous investor. Here’s what that costs:

LLM cosplay approach:

  • 6 persona agents x (~2,000 input + ~500 output tokens) per ticker
  • At Claude Sonnet pricing: ~$0.82 per full analysis run (5 tickers)
  • The “analysis” is the LLM pattern-matching on financial data and producing text that sounds like Warren Buffett
  • No reproducibility — run it twice, get different answers

Mechanical computation approach:

  • 15 rule-based signal generators
  • Inputs: price data, fundamentals, ratios — all cached in Postgres
  • Pure Python computation: moving averages, RSI, MACD, Piotroski F-Score, Altman Z-Score, PEG ratio
  • Cost: ~$0.02 per run (just the database query time and compute)
  • Fully reproducible — same inputs always produce same outputs

That’s a 41x cost reduction. And the mechanical signals are better because they’re grounded in decades of academic finance research, not in an LLM’s attempt to channel Benjamin Graham’s ghost.

The 15 signal generators designed in this session:

| # | Signal | Category | Based On |
| --- | --- | --- | --- |
| 1 | Trend Following | Technical | 50/200 SMA crossover |
| 2 | Mean Reversion | Technical | Bollinger Band deviation |
| 3 | Momentum | Technical | RSI + MACD convergence |
| 4 | Volume Profile | Technical | OBV + volume-price trend |
| 5 | Piotroski F-Score | Fundamental | 9-point financial health |
| 6 | Altman Z-Score | Fundamental | Bankruptcy probability |
| 7 | Quality Score | Fundamental | ROE + margins + debt ratios |
| 8 | PEG Ratio | Growth | Price/earnings-to-growth |
| 9 | Revenue Acceleration | Growth | QoQ revenue growth rate of change |
| 10 | Earnings Surprise | Growth | Actual vs. consensus estimates |
| 11 | News Sentiment | Sentiment | NLP on financial news (not LLM) |
| 12 | Insider Activity | Sentiment | SEC Form 4 buy/sell ratio |
| 13 | Institutional Flow | Sentiment | 13F filing changes |
| 14 | DCF Intrinsic Value | Valuation | Discounted cash flow model |
| 15 | Relative Value | Valuation | Sector-relative multiples |

Each one is a Python function. No LLM calls. No prompt engineering. Just math on real data.
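One representative generator, shrunk to toy 3/5-bar windows so it runs on a five-point series. The real versions use 50/200-day windows; the function names are illustrative, not the project's module API:

```python
# Toy SMA-crossover signal (real windows are 50/200 days; names illustrative).
def sma(prices, window):
    return sum(prices[-window:]) / window

def trend_signal(prices, fast=3, slow=5):
    """Bullish when the fast moving average sits above the slow one."""
    if len(prices) < slow:
        return "neutral"
    return "bull" if sma(prices, fast) > sma(prices, slow) else "bear"

print(trend_signal([10, 11, 12, 13, 14]))  # rising series -> bull
print(trend_signal([14, 13, 12, 11, 10]))  # falling series -> bear
```

Same inputs, same output, every time. That determinism is what the persona agents could never offer.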


Act 8: Skills as Automation

The last thing built in the session: a /market-analysis Claude Code skill. Skills are reusable commands that encode a workflow so future sessions don’t have to rediscover it.

The skill does:

  1. Pull latest data from Finviz + Yahoo Finance into Postgres
  2. Run all 15 signal generators across the active portfolio
  3. Aggregate signals into a composite score per ticker
  4. Output a ranked watchlist with bull/bear/neutral ratings
  5. Cache results with a 4-hour TTL

One command. No LLM spend. The entire analysis pipeline that used to require carefully prompting six different AI personas now runs as a single deterministic computation.


What I Actually Learned

Claude Code as thinking partner, not code generator. The most valuable moments weren’t “write me a Dockerfile.” They were the architectural conversations — “should this be on Vercel or Docker?” “what’s the right cache TTL for fundamental data?” “how do you partition a time-series table in Postgres?” The code generation was a side effect of the thinking.

Parallel agents are a superpower. While I was building one thing, three research agents were investigating adjacent questions. The Yahoo Finance research, the Postgres schema, and the Vercel cost analysis all ran concurrently. By the time I needed each answer, it was already there. This is the workflow that makes Claude Code feel different from a chat interface — you’re delegating, not waiting.

Two sessions can collaborate via shared infrastructure. The Threads analysis session and the hedge fund session were independent processes that happened to write to the same Postgres instance. No explicit coordination. No message passing. Just two streams of work that produced complementary artifacts. The sentiment analysis infrastructure from Threads became a component in the financial analysis pipeline.

Replace theater with computation. The persona agents were always theater. They gave the project a cool demo (“look, AI Warren Buffett is analyzing your portfolio!”) but the actual signal was noise. Real financial analysis is unglamorous — it’s moving average crossovers and Piotroski scores and DCF models. The LLM’s job isn’t to be the analyst. It’s to help you build the analyst.

The best infrastructure is boring infrastructure. Docker + Postgres + Python cron jobs. No Kubernetes. No event sourcing. No microservices. Just a database, a scraper, and some math. The entire platform fits in a docker-compose.yml that’s 40 lines long.


Session Stats

| Metric | Value |
| --- | --- |
| Duration | ~6 hours |
| Parallel research agents spawned | 3 |
| Tickers validated | 639 |
| Stale tickers caught | 33 |
| Portfolio templates with PIDs | 22 |
| Signal generators designed | 15 |
| Cost per analysis run (before) | $0.82 |
| Cost per analysis run (after) | $0.02 |
| Docker services in compose | 4 (Postgres, pipeline, Redis, pgAdmin) |
| Vercel monthly cost | $0 (static dashboard on free tier) |
| Claude Code sessions running simultaneously | 2 |

Docker Quick Start

# Start everything
docker compose -f docker/docker-compose.yml -p ai-hedge-fund up postgres pipeline dashboard grafana streamlit-dashboard

# First run: seed portfolios + pull data
poetry run python src/data/scheduler.py --seed
poetry run python src/data/scheduler.py --once finviz
poetry run python src/data/scheduler.py --once backfill # 5yr Yahoo OHLCV

| Port | Service | What |
| --- | --- | --- |
| 3000 | Grafana | 11-panel dashboard (admin/hedgefund) |
| 5434 | Postgres | Raw market data (639 tickers) |
| 8080 | nginx | Static dashboard (auto-refreshes) |
| 8501 | Streamlit | Interactive analysis (5 pages) |

Files Changed

src/tools/finviz/portfolios.py — 22 portfolios with PIDs, 33 stale tickers removed
src/data/pipeline_finviz.py — Finviz snapshot pipeline (22 portfolios, 908 rows)
src/data/pipeline_yahoo.py — Yahoo OHLCV pipeline (639 tickers, 5yr backfill)
src/data/scheduler.py — Scheduler + auto-export after each pull
src/data/export_static.py — JSON export for static dashboard
src/dashboard/app.py — Streamlit dashboard (894 lines, 5 pages)
docker/docker-compose.yml — Postgres + pipeline + nginx + Grafana + Streamlit
docker/schema.sql — 9 tables, monthly/yearly partitions
docker/grafana/ — Pre-provisioned datasource + 11-panel dashboard
docker/nginx.conf — Static dashboard server with security headers
docs/signal-generators.md — 15 mechanical signal generators (63K chars)
docs/threads-sentiment-integration.md — Threads API sentiment pipeline spec
.claude/commands/market-analysis.md — /market-analysis skill

The session started as “add some IDs to a Python dict” and ended as “build a market data platform with three frontends.” That’s not scope creep. That’s what happens when your thinking partner can research three things while you build a fourth.


Session 2: From Market Commentary to Production Trading Model

Date: March 22-23, 2026
Tools: Claude Code, Docker, Postgres, Alpaca Markets API, Finviz Elite


The Wake-Up Call

Session 1 built the infrastructure. Session 2 started by trying to use it — I asked Claude to do a market analysis. It pulled fresh Finviz data, queried Postgres, ran screeners, web-searched for macro context (Iran war, oil at $100, S&P below 200-day SMA), and produced a report.

My partner looked at it and said: “What type of trading model did you use?”

The honest answer: none. It was a market commentary with data sprinkled on top. Better vibes than the LLM cosplay agents, but still vibes. The report said “buy XOM because oil is up” — that’s analysis, not a model. A model has defined entry/exit rules, position sizing math, backtested performance, and risk parameters.


“The Model IS the Risk Management”

My partner dropped three insights that shaped everything:

  1. “Most of the work is risk management” — signal generation is the easy part. Anyone can compute RSI. The hard part is not losing money.

  2. “Stop losses kill profit on volatile securities” — and traders actively hunt stop-loss clusters. If your stop is at an obvious level, someone will run it.

  3. “The whole point of having an execution runtime” — we have Docker running 24/7. We don’t need conditional orders on the exchange. All logic stays internal. The market never sees our stops.

This reframed the entire build. Instead of “build signals then add risk management,” it became “build risk management then add signals.”


The Risk-First Architecture

Position sizing IS the risk management. No stop-losses on the exchange. Ever.

Exit triggers (internal only, never on exchange):
 1. Signal reversal — the signal that opened the position flips
 2. Time decay — held past max holding period (21-126 days by strategy type)
 3. Circuit breaker — portfolio drawdown hits threshold

What we do NOT do:
 • Price-based stop-losses (get hunted)
 • Trailing stops on exchange (visible to everyone)
 • Conditional orders (telegraph your intentions)

The position sizer uses fractional Kelly (25% of Kelly optimal) normalized by volatility. A low-vol utility stock gets a bigger position than a high-vol biotech — not because we like utilities more, but because the risk contribution is the same.
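The vol-normalization half of that sizer can be sketched in a few lines. The 0.5% per-position risk budget and 3% cap below are invented numbers, and the fractional-Kelly multiplier is folded into the budget for brevity:

```python
# Vol-normalized sizing sketch (risk_budget and cap are invented defaults;
# the real sizer also applies a 25%-of-Kelly scaling).
def position_weight(vol_annual, risk_budget=0.005, cap=0.03):
    """Each position contributes roughly the same volatility to the book."""
    return min(risk_budget / vol_annual, cap)

equity = 100_000
print(round(equity * position_weight(0.12)))  # 12%-vol utility -> 3000 (capped)
print(round(equity * position_weight(0.60)))  # 60%-vol biotech -> 833
```

The calm utility hits the cap; the volatile biotech gets a quarter of the dollars for the same risk contribution.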

Circuit breakers:

  • -5% portfolio drawdown: reduce all sizes by 30%
  • -10%: close everything, 100% cash, wait for regime change
  • -15%: halt trading 5 days, manual restart required
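As code, the three tiers are a short ladder (thresholds from the list above; the action names are illustrative):

```python
# Three-tier drawdown circuit breaker (action names illustrative).
def circuit_breaker(drawdown):
    """drawdown is negative, e.g. -0.07 for a 7% peak-to-trough loss."""
    if drawdown <= -0.15:
        return "halt"     # stop trading 5 days, manual restart
    if drawdown <= -0.10:
        return "flatten"  # close everything, 100% cash
    if drawdown <= -0.05:
        return "derisk"   # cut all position sizes by 30%
    return "normal"

print(circuit_breaker(-0.07))  # derisk
```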

Seven Strategies, Zero LLM Calls

We extracted the existing 10 mechanical signals from technicals.py and quant_analyst.py into standalone strategy modules, then added 3 new Finviz-based strategies:

| Strategy | Source | What It Does |
| --- | --- | --- |
| Momentum | technicals.py | SMA alignment, EMA crossovers, ADX strength |
| Mean Reversion | technicals.py | RSI + Bollinger Bands + z-score |
| Quant Momentum | quant_analyst.py | MACD, Parabolic SAR, Heikin-Ashi |
| Pattern Recognition | quant_analyst.py | BB W-bottom/M-top, candlestick patterns |
| Value + Quality | signal-generators.md | Sector-relative P/E + Piotroski F-Score |
| Growth | signal-generators.md | EPS acceleration + earnings surprise |
| Flow/Sentiment | signal-generators.md | Volume, short interest, insider flow (L2 proxy) |

Each strategy is backtested independently. Must show Sharpe > 0.3 out-of-sample or it’s excluded. The ensemble combines them with regime-adaptive weights — momentum gets 35% weight in risk-on, but only 10% in transition.
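The regime-weighted blend looks roughly like this. Only momentum's weights (35% in risk-on, 10% in transition) come from the text; the other strategies and numbers are placeholders, and scores are on a -100..+100 scale:

```python
# Regime-adaptive ensemble sketch. Only momentum's 0.35/0.10 weights come
# from the writeup; everything else here is a placeholder.
REGIME_WEIGHTS = {
    "risk_on":    {"momentum": 0.35, "value": 0.20, "mean_reversion": 0.45},
    "transition": {"momentum": 0.10, "value": 0.45, "mean_reversion": 0.45},
}

def composite(signals, regime):
    weights = REGIME_WEIGHTS[regime]
    return sum(weights[name] * score for name, score in signals.items())

signals = {"momentum": 80, "value": 10, "mean_reversion": -20}
print(composite(signals, "risk_on"))     # momentum dominates the blend
print(composite(signals, "transition"))  # same inputs, momentum muted
```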

6,189 lines of trading code. Zero LLM calls anywhere.


15 Million Rows

This is where it got wild.

Session 1 had 781K rows of Yahoo daily data. My partner said that’s low. Asked about 1-minute bars. Asked about a tick database. Asked why we’re limited to 639 stocks.

So we added Alpaca Markets. Free API. Unlimited historical bars. IEX exchange data (not SIP — that’s paid). Connected over Ethernet for speed.

Then someone said “we have like 100 GB to spare” and it escalated:

poetry run python src/data/pipeline_alpaca.py --backfill 1day --days 1825 --universe

That pulls 5 years of daily bars for every tradable US equity — 12,323 tickers. It ran for about 90 minutes over Ethernet and filled Postgres with:

| Timeframe | Rows | Tickers | History |
| --- | --- | --- | --- |
| Daily | 10,543,411 | 12,237 | 5 years |
| Hourly | 1,797,458 | 539 | 2 years |
| 5-min | 1,488,115 | 539 | 60 days |
| 1-min | 637,196 | 539 | 7 days |
| Yahoo daily | 781,154 | 639 | 5 years |
| Finviz | 8,138 | 639 | snapshots |
| Total | 15,255,472 | 12,237 | 3.4 GB on disk |

The 639 Finviz tickers remain the curated signal universe — the stocks we’ve done fundamental work on. Alpaca gives price data for everything else. Screener presets can discover new tickers outside the 639, and Alpaca can instantly pull their history for backtesting.


Walk-Forward Backtesting

The backtester runs 5 rolling windows:

W1: Train 2021-01 → 2023-12, Test 2024-01 → 2024-06
W2: Train 2021-07 → 2024-06, Test 2024-07 → 2024-12
W3: Train 2022-01 → 2024-12, Test 2025-01 → 2025-06
W4: Train 2022-07 → 2025-06, Test 2025-07 → 2025-12
W5: Train 2023-01 → 2025-12, Test 2026-01 → 2026-03

Anti-overfit rules: minimum 30 trades per window, maximum 5 tunable parameters per strategy, must report both in-sample AND out-of-sample metrics. If in-sample Sharpe is 3.0 but OOS is 0.5, the strategy is overfit.
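A sketch of the window generator behind W1-W5: 3-year train, 6-month test, stepped 6 months. Dates are (year, month) pairs with end-exclusive boundaries, so "train to (2024, 1)" means through 2023-12; in practice the final test window is truncated where the data ends:

```python
# Rolling walk-forward windows (end-exclusive (year, month) boundaries).
def walk_forward(start=(2021, 1), train_months=36, test_months=6, n_windows=5):
    def shift(ym, months):
        year, month = ym
        q, r = divmod(year * 12 + (month - 1) + months, 12)
        return (q, r + 1)

    windows = []
    for i in range(n_windows):
        train_start = shift(start, i * test_months)
        train_end = shift(train_start, train_months)  # also the test start
        test_end = shift(train_end, test_months)
        windows.append((train_start, train_end, train_end, test_end))
    return windows

for train_start, train_end, test_start, test_end in walk_forward():
    print(train_start, "->", train_end, "| test ->", test_end)
```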

The mechanical allocator replaces the LLM portfolio manager entirely:

  • Composite score +70 to +100 → full position (3%)
  • +40 to +69 → standard (2%)
  • +10 to +39 → half (1%)
  • Below +10 → no position
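The whole allocator ladder fits in one function (band edges from the list above; "weight" is the fraction of portfolio equity):

```python
# Composite score -> target position weight (fraction of equity).
def target_weight(score):
    if score >= 70:
        return 0.03  # full position
    if score >= 40:
        return 0.02  # standard
    if score >= 10:
        return 0.01  # half
    return 0.0       # no position

print(target_weight(85), target_weight(50), target_weight(15), target_weight(-30))
```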

Paper trade minimum 3 months before any real capital.


13 Skills

The session ended with building 13 Claude Code skills — reusable slash commands for every common workflow:

| Skill | What It Does |
| --- | --- |
| /market-analysis | Full signal pipeline + allocation recommendation |
| /screener | Run 23 Finviz presets (oversold, GARP, squeeze, etc.) |
| /regime-check | Quick RISK_ON/OFF/NEUTRAL/TRANSITION + why |
| /ticker-deep-dive | All 7 strategies + Finviz data + news on one ticker |
| /pull-data | Fresh Finviz + Yahoo + Alpaca pull, export dashboard |
| /data-status | Row counts, freshness, DB size, container health |
| /backfill | Historical data: --universe for all 12K tickers |
| /paper-status | Open positions, equity curve, recent trades, P&L |
| /portfolio-health | Circuit breaker status, concentration, correlation |
| /docker-status | Check services, restart if needed, show logs |
| /export-report | Markdown report to Desktop with web research |
| /add-tickers | Add symbols, pull Alpaca data, update portfolios |
| /validate-portfolios | Validate all 639 tickers against live Finviz API |

A new Claude session reads the CLAUDE.md, runs /data-status to see what’s fresh, and starts working. No ramp-up needed.


Session 2 Stats

| Metric | Value |
| --- | --- |
| Duration | ~8 hours |
| Agents spawned | ~25 |
| Commits pushed | 16 (ai-hedge-fund) + 3 (bythewei) |
| Trading system | 19 files, 6,189 lines |
| Strategies | 7 mechanical, zero LLM |
| Risk management | Position sizing, vol budget, circuit breakers, regime detection |
| Postgres rows | 15,255,472 |
| Tickers in database | 12,237 (full US equity market) |
| Database size | 3.4 GB |
| Skills created | 11 new (13 total) |
| Cost per run (before → after) | $0.82 → $0.00 |

The Uncomfortable Truth

My partner was right about everything. The first session built infrastructure. The second session tried to use it and discovered the gap between “data platform” and “trading model.” The gap is risk management.

Stop-losses get hunted. Static rules get gamed. LLMs cosplaying as Warren Buffett add cost without signal. The real edge is boring: size positions according to volatility, exit on signal reversal not price levels, and never let the market see your stops.

The model is the risk management. Everything else is inputs.


Session 3: The Mac Mini Command Center

Date: March 27, 2026
Tools: Claude Code (MacBook Air → SSH → Mac mini), Docker, Tailscale, ntfy, Apple Shortcuts


The Pivot

Sessions 1 and 2 built a market data platform on the MacBook. Session 3 moved it to its permanent home — a Mac mini upstairs running as an always-on command center. But the migration was just the trigger. The real build was the notification and control layer on top.

Full writeup: The Mac Mini Command Center


What Got Built

Database migration: 26M rows of Postgres market data (6.1 GB) dumped, compressed to 475 MB, transferred over SMB, restored in 2 minutes. All 12 Docker containers rebuilt and running on the Mac mini.

Self-hosted ntfy: Private notification server on the Mac mini, accessible only via Tailscale. 7 channels with distinct personalities — the docker-events channel is a Drama Queen, hedge-pipelines is a Tired Data Engineer, hedge-signals is a Terse Floor Trader.

iPhone remote control: Apple Shortcuts POST commands to ntfy → Mac mini command listener executes them → results POST back. “Hey Siri, market check” from your Apple Watch queries a 26M-row Postgres database.
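The phone side of that loop is just an HTTP POST. A hedged sketch: the host, topic name, and command vocabulary are assumptions, but ntfy's publish interface really is "POST the message body to http://&lt;host&gt;/&lt;topic&gt;":

```python
# Phone -> ntfy publish sketch (host, topic, and command names are assumed).
import json
import urllib.request

NTFY = "http://macmini.tail-net.ts.net"  # hypothetical Tailscale hostname

def publish(topic, message, title=""):
    """Build an ntfy publish request; pass it to urlopen() to actually send."""
    req = urllib.request.Request(
        f"{NTFY}/{topic}", data=message.encode("utf-8"), method="POST"
    )
    if title:
        req.add_header("Title", title)
    return req

# "Hey Siri, market check" boils down to something like:
req = publish("hedge-commands", json.dumps({"cmd": "market-check"}), title="From Watch")
print(req.full_url)  # http://macmini.tail-net.ts.net/hedge-commands
```

The listener on the Mac mini subscribes to the same topic, runs the command, and publishes the result back to a results channel the Watch is subscribed to.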

Autonomous monitoring: 8 scripts on cron/launchd: pipeline health, trade signals, resource alerts, daily P&L summary, heartbeat monitoring. Mac mini watches itself and reports to my wrist.

Claude-notify upgrade: Notifications now parse last_assistant_message from the hook payload. “Done, come back” became “Done: Fixed iPad layout crash in reader view.” Background Haiku summarization (free via claude -p) sends a polished follow-up.

TRMNL integration: E-ink dashboard scaffolded for ambient awareness — portfolio value, regime status, pipeline health. Different rhythm than ntfy: hourly glances vs real-time alerts.


Session 3 Stats

| Metric | Value |
| --- | --- |
| Duration | ~4 hours |
| Agents spawned | 11 |
| Postgres rows on Mac mini | 26,149,991 |
| Docker containers | 12 + ntfy |
| ntfy channels | 7 |
| Monitoring scripts | 8 |
| Cron jobs | 6 |
| Launchd daemons | 2 |
| iPhone commands available | 6 |
| Total monthly cost | $0 |

The Three-Layer Architecture

AMBIENT  — TRMNL e-ink    — hourly     — "glance at desk, everything's fine"
ACTIVE   — ntfy → Watch   — real-time  — "pipeline failed, react"
INTERACT — SSH + Claude   — on-demand  — "build, analyze, ship"

The Mac mini is the engine. The MacBook is the cockpit. The iPhone is the remote. The TRMNL is the instrument panel. Each device does what it’s best at.