Claude Relay — Google Docs for Coding


The idea is dumb simple: I have the $20/mo Claude Pro plan. My friend has the $200/mo Max plan. What if I could type instructions into a browser dashboard, and their Claude Code session does hours of heavy coding while I watch the results stream back live?

That’s it. That’s the whole project. Google Docs, but instead of a shared document, it’s a shared coding session.

The Plan Asymmetry

Here’s the math. The director (me, $20 plan) sends lightweight text instructions — “refactor the auth middleware,” “add rate limiting,” “fix the XSS in the dashboard.” Pennies of compute per message. The worker (friend’s $200 Max plan) does the actual coding — reading files, writing code, running tests, iterating. Hours of heavy work.

The relay sits in the middle. It doesn’t care who’s on which plan. It just passes messages, validates schemas, scans for secrets, and keeps a live dashboard updated.

Director (browser)  -->  Relay Server (port 4190)  <--  Worker (Claude Code + MCP)
     type instructions        stores messages             reads, codes, sends results
     see results live         manages sessions            shares file tree + diffs

Also works as a peer relay — two Claude Code sessions collaborating on the same codebase. The dashboard has a split-panel peer mode for exactly this.

What Got Built

One session. ~2,620 lines across 29 files. Bun monorepo with 3 packages:

Package                 What                              Lines
@claude-relay/shared    Zod schemas, types, constants       212
@claude-relay/server    Hono HTTP server + dashboard UI   1,391
@claude-relay/mcp       6 MCP tools for Claude Code         789
hooks/scripts/config    Setup, auto-poll, Docker            228

The 6 MCP tools are the interface between Claude Code and the relay:

  • relay_create_session — create a session, get an invite token
  • relay_join_session — join with session ID + invite token
  • relay_send — stage a message (enters the approval queue, does NOT send directly)
  • relay_approve — approve, reject, or list pending messages
  • relay_poll — fetch new messages, cursor auto-advances
  • relay_status — overview of sessions, pending approvals, server health
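To make the "cursor auto-advances" part of relay_poll concrete, here is a minimal in-memory sketch. The MessageLog class, the per-participant cursor, and the seq field are all hypothetical names for illustration, not the relay's actual code.

```typescript
// Hypothetical sketch of cursor-based polling; not the relay's real implementation.
interface RelayMessage {
  seq: number; // position in the session log, starting at 1
  from: string;
  body: string;
}

class MessageLog {
  private messages: RelayMessage[] = [];
  // One cursor per participant: the seq of the last message they have seen.
  private cursors = new Map<string, number>();

  append(from: string, body: string): RelayMessage {
    const msg: RelayMessage = { seq: this.messages.length + 1, from, body };
    this.messages.push(msg);
    return msg;
  }

  // Return everything after the caller's cursor, then advance the cursor
  // so the next poll only yields newer messages.
  poll(participant: string): RelayMessage[] {
    const cursor = this.cursors.get(participant) ?? 0;
    const fresh = this.messages.filter((m) => m.seq > cursor);
    let last = cursor;
    for (const m of fresh) last = m.seq;
    this.cursors.set(participant, last);
    return fresh;
  }
}

const log = new MessageLog();
log.append("director", "refactor the auth middleware");
log.append("director", "add rate limiting");
const firstPoll = log.poll("worker");  // both messages
const secondPoll = log.poll("worker"); // empty: the cursor already advanced
log.append("director", "fix the XSS in the dashboard");
const thirdPoll = log.poll("worker");  // only the newest message
```

The caller never passes a cursor in this sketch. The log remembers it, which is one plausible reading of "auto-advances".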

The approval queue is the key safety feature. relay_send doesn’t transmit anything — it stages the message and scans for API keys, tokens, and absolute paths. Nothing leaves the machine until a human explicitly approves it.
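A stage-then-approve flow like this can be sketched in a few lines. Everything below is an assumption for illustration: the ApprovalQueue class, the outbox array standing in for the network send, and especially the three regex patterns, which are deliberately naive and not the relay's real scanner rules.

```typescript
// Hypothetical sketch: stage a message, scan it, send only after approval.
type Status = "pending" | "approved" | "rejected";

interface StagedMessage {
  id: number;
  body: string;
  warnings: string[]; // labels of sensitive patterns found in the body
  status: Status;
}

// Illustrative patterns only; a real scanner would need a much broader set.
const SENSITIVE_PATTERNS: Array<[string, RegExp]> = [
  ["API key", /\bsk-[A-Za-z0-9_-]{16,}\b/],
  ["bearer token", /\bBearer\s+[A-Za-z0-9._-]{16,}/i],
  ["absolute path", /(?:^|\s)(?:\/(?:Users|home)\/\S+|[A-Za-z]:\\\S+)/],
];

function scanForSecrets(body: string): string[] {
  const found: string[] = [];
  for (const [label, pattern] of SENSITIVE_PATTERNS) {
    if (pattern.test(body)) found.push(label);
  }
  return found;
}

class ApprovalQueue {
  private staged = new Map<number, StagedMessage>();
  private nextId = 1;
  readonly outbox: string[] = []; // stands in for "actually sent to the relay"

  // Staging never sends; it only records the message and its scan result.
  stage(body: string): StagedMessage {
    const msg: StagedMessage = {
      id: this.nextId++,
      body,
      warnings: scanForSecrets(body),
      status: "pending",
    };
    this.staged.set(msg.id, msg);
    return msg;
  }

  // A human decision is the only path into the outbox.
  decide(id: number, approve: boolean): boolean {
    const msg = this.staged.get(id);
    if (!msg || msg.status !== "pending") return false;
    msg.status = approve ? "approved" : "rejected";
    if (approve) this.outbox.push(msg.body);
    return true;
  }

  pending(): StagedMessage[] {
    const out: StagedMessage[] = [];
    this.staged.forEach((m) => {
      if (m.status === "pending") out.push(m);
    });
    return out;
  }
}

const queue = new ApprovalQueue();
const clean = queue.stage("refactor the auth middleware");
const risky = queue.stage("use key sk-abcdefghijklmnop1234 from /Users/me/.env");
const sentBeforeApproval = queue.outbox.length; // nothing has left yet
queue.decide(clean.id, true);
queue.decide(risky.id, false);
```

The warnings in this sketch are advisory: the scan flags the message, and the human still makes the call.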

Quick Start

Docker:

git clone https://github.com/Storiesbywei/claude-relay
cd claude-relay
docker compose up -d
open http://localhost:4190   # macOS; on Linux use xdg-open

Without Docker:

curl -fsSL https://bun.sh/install | bash
git clone https://github.com/Storiesbywei/claude-relay
cd claude-relay
bun install
bun run dev:server

Worker setup (on the machine that has Claude Code):

cd claude-relay
bash scripts/setup.sh              # local relay
bash scripts/setup.sh https://xxx.ngrok-free.app  # remote relay
# Restart Claude Code — relay_* tools are now available

Remote collaboration via ngrok is a one-liner on the host side: ngrok http 4190. Copy the URL. Done.

The Dashboard

Two modes, toggled with a slider at the top.

Director mode — single panel with a message input. You type instructions, see the worker’s responses appear in real-time via SSE streaming. Session management built in: create, join, copy invite token. Connection status indicator shows if you’re live.
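For anyone who hasn't touched SSE: the wire format is plain text, one field per line, with a blank line ending each event. Here is a small sketch of that framing. The relay-message event name and the payload shape are assumptions for illustration, not the relay's actual event schema.

```typescript
// Sketch of Server-Sent Events framing. Field names (event, id, data) come
// from the SSE format itself; the payload below is a made-up example.
interface SseEvent {
  event?: string;
  id?: string;
  data: string;
}

function formatSseEvent({ event, id, data }: SseEvent): string {
  const lines: string[] = [];
  if (event) lines.push(`event: ${event}`);
  if (id) lines.push(`id: ${id}`);
  // Each line of the payload needs its own "data:" field.
  for (const line of data.split(/\r?\n/)) lines.push(`data: ${line}`);
  // A blank line terminates the event.
  return lines.join("\n") + "\n\n";
}

const frame = formatSseEvent({
  event: "relay-message",
  id: "7",
  data: JSON.stringify({ from: "worker", body: "tests passing" }),
});
const multiLine = formatSseEvent({ data: "line one\nline two" });
```

On the browser side, an EventSource listener for the same event name would receive the data field as a string and parse it with JSON.parse.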

Peer mode — split-panel view. Two Claude sessions (Alpha and Beta) exchange knowledge through a relay spine visualization in the center. Comes with 3 simulation demos: Security Audit, Code Review, and Bug Hunt. Hit play and watch two AI agents coordinate.

The workspace vision for Phase 3 is where it gets exciting: a file tree sidebar showing the worker’s project structure, a syntax-highlighted code viewer, and live diffs as the worker edits. Like watching someone’s VS Code in real time. That’s next.

The Testing Story

After the core build, I ran the CTO pattern. Four agents in parallel, 77 tests total. They hit every endpoint, every MCP tool, every edge case — invalid tokens, expired sessions, rate limit boundaries, malformed payloads, the approval queue lifecycle.

Found 1 real bug: the rate limiter’s sliding window wasn’t cleaning up expired entries correctly under concurrent load. Subtle — only surfaces when multiple tokens hit the limit simultaneously. The fix was 3 lines.
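For context, this is roughly what a sliding-window limiter with explicit cleanup looks like. It is a generic sketch, not the relay's code and not the actual 3-line fix. The class name, the sweep method, and the timestamp-array layout are assumptions; only the 30-per-minute default comes from this post.

```typescript
// Hypothetical sliding-window rate limiter keyed by token.
class SlidingWindowLimiter {
  private hits = new Map<string, number[]>(); // token -> request timestamps (ms)
  private limit: number;
  private windowMs: number;

  constructor(limit: number = 30, windowMs: number = 60000) {
    this.limit = limit;
    this.windowMs = windowMs;
  }

  // Prune this token's expired timestamps first, then count what is left.
  allow(token: string, now: number = Date.now()): boolean {
    const cutoff = now - this.windowMs;
    const recent = (this.hits.get(token) ?? []).filter((t) => t > cutoff);
    const allowed = recent.length < this.limit;
    if (allowed) recent.push(now);
    this.hits.set(token, recent);
    return allowed;
  }

  // Drop tokens whose whole window has expired so the Map cannot grow forever.
  sweep(now: number = Date.now()): void {
    const cutoff = now - this.windowMs;
    this.hits.forEach((times, token) => {
      const recent = times.filter((t) => t > cutoff);
      if (recent.length === 0) this.hits.delete(token);
      else this.hits.set(token, recent);
    });
  }

  get trackedTokens(): number {
    return this.hits.size;
  }
}

// Small numbers to make the behavior visible: 3 requests per 1000 ms.
const limiter = new SlidingWindowLimiter(3, 1000);
const burst = [0, 100, 200, 300].map((t) => limiter.allow("tok-a", t));
const afterWindow = limiter.allow("tok-a", 1101); // hits at 0 and 100 have expired
limiter.sweep(5000);
const tokensLeft = limiter.trackedTokens;
```

Passing the clock in as a parameter keeps the limiter deterministic under test, which matters when the bug you are chasing only shows up at window boundaries.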

77/77 passing. The parallel agent approach is still the fastest way to get coverage — each agent owns a test domain, no file conflicts, zero coordination overhead.

The Security Review

Ran a dedicated security agent after the build. Found and fixed:

  • XSS in the dashboard — message content was being rendered as raw HTML. Switched to textContent and added a sanitization layer.
  • CORS — the server was accepting requests from any origin. Locked it down to localhost.
  • Shell injection — the auto-poll hook was passing unsanitized session IDs to a shell command. Added input validation.
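Two of these fixes boil down to small, boring functions. The sketch below shows the general shape: an allowlist check for anything headed to a shell, and HTML escaping for anything that can't go through textContent. The session ID pattern is an assumption (this post doesn't describe the real ID format), and both function names are hypothetical.

```typescript
// Hypothetical allowlist: letters, digits, underscore, hyphen, 1 to 64 chars.
// The relay's real session ID format may differ.
const SESSION_ID_PATTERN = /^[A-Za-z0-9_-]{1,64}$/;

// Reject anything outside the allowlist before it gets near a shell command.
function assertSafeSessionId(id: string): string {
  if (!SESSION_ID_PATTERN.test(id)) {
    throw new Error("invalid session id");
  }
  return id;
}

// Escape the five characters that matter when text is placed into HTML.
// The ampersand is replaced first so earlier escapes are not double-encoded.
function escapeHtml(text: string): string {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&#39;");
}

const okId = assertSafeSessionId("sess_42-abc");

let injectionBlocked = false;
try {
  assertSafeSessionId("abc; rm -rf ~");
} catch {
  injectionBlocked = true;
}

const escaped = escapeHtml('<img src=x onerror="alert(1)">');
```

Allowlisting beats trying to escape shell metacharacters: it is much easier to say what a valid ID looks like than to enumerate every dangerous character.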

Plus the security that was designed in from the start: bearer token auth on all endpoints, rate limiting (30 req/min per token), a sensitive-content scanner in the approval queue, session auto-expiry (default 1 hour), and in-memory-only storage, so nothing survives a restart.
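Session auto-expiry on top of an in-memory Map can be as simple as storing a deadline and checking it on read. This is a sketch under assumptions: the SessionStore class and the lazy delete-on-read strategy are hypothetical, and only the 1-hour default comes from this post.

```typescript
// Hypothetical in-memory session store with a time-to-live.
interface Session {
  id: string;
  expiresAt: number; // epoch milliseconds
}

class SessionStore {
  private sessions = new Map<string, Session>();
  private ttlMs: number;

  constructor(ttlMs: number = 60 * 60 * 1000) {
    this.ttlMs = ttlMs;
  }

  create(id: string, now: number = Date.now()): Session {
    const session: Session = { id, expiresAt: now + this.ttlMs };
    this.sessions.set(id, session);
    return session;
  }

  // Lazy expiry: an expired session is deleted the moment someone asks for it.
  get(id: string, now: number = Date.now()): Session | undefined {
    const session = this.sessions.get(id);
    if (!session) return undefined;
    if (now >= session.expiresAt) {
      this.sessions.delete(id);
      return undefined;
    }
    return session;
  }

  get size(): number {
    return this.sessions.size;
  }
}

const store = new SessionStore(); // default TTL: 1 hour
store.create("s1", 0);
const justBeforeExpiry = store.get("s1", 3599999);
const atExpiry = store.get("s1", 3600000);
const remaining = store.size;
```

Because the Map lives in process memory, a restart clears every session regardless of its deadline.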

Tech Stack

  • Runtime: Bun 1.3+
  • Server: Hono (fast, tiny, middleware-friendly)
  • Validation: Zod (every payload validated before it touches the store)
  • MCP: @modelcontextprotocol/sdk (stdio transport)
  • Dashboard: Vanilla HTML/CSS/JS (no framework, no build step, just files in public/)
  • State: In-memory Maps (Phase 1-3), SQLite planned for Phase 4
  • Container: Docker Compose, single image, ~50MB

The vanilla dashboard choice was deliberate. The relay server serves the HTML directly from public/. No React, no build pipeline, no hydration. Open localhost:4190 and it works. The JavaScript is 857 lines doing mode switching, session management, SSE streaming, and the simulation scripts. That’s it.

What’s Next

Phase 3 is the shared workspace view — file tree sidebar, live diffs, syntax highlighting. Making it actually feel like watching someone code.

Phase 4 adds persistence (SQLite), session history, and multi-worker support.

But the immediate next step is the real test: getting another person on the other end of an ngrok tunnel and running a live coding session. The relay is built and tested. Now it needs two humans and two Claudes in the same room.

github.com/Storiesbywei/claude-relay (private)