Sprite Economics, Agent Swarms, and Screen Savoring

ai-development · agents · pixel-art · claude-code · workflow · game-dev

Screen Savoring

It started because I wanted a chill-ass background.

Not a productivity tool. Not a feature. Just — what if the pixel agents office had a mode where everything slowed down, the lights dimmed, a fire crackled, and procedurally generated lofi played forever? Like a fireplace video on YouTube but it’s your agents, in your office, and the music never repeats because it’s synthesized from a seeded PRNG.

I called it fireplace mode. My phone kept autocorrecting “screensaver” to “screen savor” and honestly? That’s more accurate. You’re not saving the screen. You’re savoring it.

The whole thing exists because at 2am I wanted something to look at that wasn’t a terminal. The agents were already there. The office was already rendered. I just needed to turn the lights down and add a fire.

Right now it’s 4am and I’m on the couch in the living room. The Mac mini is upstairs running screen savor mode. I can hear the music drifting down — this old-school Japanese-style melody, all minor 7ths and gentle Rhodes chords, completely procedural, never repeating. Nobody composed it. A seeded PRNG and some Web Audio oscillators did. But it sounds like something you’d hear in a Ghibli film if Ghibli made a movie about pixel agents debugging code at night.

That’s the thing about building something purely because you want it to exist. You end up on your couch at 4am listening to music that didn’t exist an hour ago, made by a program you wrote because you were tired of looking at a terminal.


The Fire

2,000 lines of procedural audio. Zero audio files.

Web Audio API synthesis:
  - Seeded PRNG (mulberry32) → deterministic but infinite
  - Scene-reactive: music shifts with office activity
  - Tension system: 5 levels from ambient to urgent
  - Time-of-day awareness: dawn/day/dusk/night
  - Stinger events: task completion, gato spawn, golden hour

Every note is generated. The chord progressions follow lofi conventions (ii-V-I with extensions, minor 7ths, the occasional maj9 that makes you feel like you’re in a coffee shop in Shibuya). But because it’s seeded, if you give it the same seed, you get the same session. Reproducible vibes.
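The determinism comes from mulberry32 itself — the whole PRNG is a handful of lines. A minimal sketch (the chord table here is illustrative, not the actual progression engine):

```javascript
// mulberry32: tiny 32-bit seeded PRNG (public domain, by Tommy Ettinger).
// Same seed in -> same sequence out, which is what makes sessions reproducible.
function mulberry32(seed) {
  let a = seed >>> 0;
  return function () {
    a = (a + 0x6D2B79F5) >>> 0;
    let t = a;
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296; // float in [0, 1)
  };
}

// Picking lofi chords deterministically (progression is a made-up example):
const PROGRESSION = ['Dm7', 'G7', 'Cmaj9', 'Am7'];
function chordAt(rand) {
  return PROGRESSION[Math.floor(rand() * PROGRESSION.length)];
}

const a = mulberry32(1234);
const b = mulberry32(1234);
console.log(chordAt(a) === chordAt(b)); // same seed, same chord: true
```

Feed the float into oscillator frequency, note timing, or chord choice and the whole session replays from one integer.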

The fireplace itself is just particle effects — ember sprites rising, a glow overlay, crackling synthesis. But combined with the dimmed office, the sleeping gatos, and the music, it actually works. I’ve left it running for hours.


Then I Wanted to Watch Agents Think

Today I spawned a swarm — palette agent, movement agent, sprite research agent, integration agent — and wanted to watch them work. Not in the terminal. On my iPad.

The relay dashboard already existed. The agents already post messages. I just needed to pipe their thinking into the relay session and watch it stream in over Tailscale.

The pipeline:
  Mac mini (agents) → relay server (port 4190)
    → Tailscale Funnel (HTTPS) → iPad Safari

Each agent posts to the relay:
  curl -X POST /relay/{session} \
    -H 'Content-Type: application/json' \
    -d '{"type":"insight","content":"🎨 Brightening water tiles..."}'

Three things broke before it worked:

  1. Safari on iPad blocks HTTP — had to switch to Tailscale Funnel for HTTPS
  2. The dashboard needed URL-param auto-join (?session=ID&token=TOKEN) since the iPad has no localStorage carried over from previous sessions
  3. CORS needed the Funnel origin added
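The auto-join from item 2 takes only a few lines; this is a hypothetical sketch (the hostname and function name are made up), but the idea is just reading the query string on page load:

```javascript
// Hypothetical sketch of the dashboard's URL-param auto-join.
// Parses ?session=ID&token=TOKEN so a fresh iPad Safari tab (with no saved
// localStorage) can join a relay session straight from a shared link.
function parseAutoJoin(url) {
  const params = new URL(url).searchParams;
  const session = params.get('session');
  const token = params.get('token');
  // Only auto-join when both credentials are present in the URL.
  return session && token ? { session, token } : null;
}

const join = parseAutoJoin('https://mini.example.ts.net/dash?session=swarm1&token=abc123');
console.log(join); // { session: 'swarm1', token: 'abc123' }
```

On load, a non-null result skips the manual join form entirely.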

Once it worked, I sat on my couch and watched 4 agents post their thinking in real-time. The palette agent reported each color change. The bug fix agent audited 8 movement issues and posted findings. The research agent dropped 6 updates about sprite token economics. It felt like watching a group chat where everyone is hyper-competent and nobody argues.


Sprite Economics: The 120x Insight

The research agent’s best finding: never send sprite PNGs to an LLM.

Here are the actual numbers for a 125-sprite project:

  Method                          Tokens     Cost
  Send each PNG (GPT-4o vision)   31,875     baseline
  Send each PNG (Claude base64)   143,750    4.5x worse
  JSON sprite manifest            ~1,000     25-120x cheaper

A sprite manifest is just structured metadata:

{
  "meta": {
    "image": "spritesheet.png",
    "frameSize": {"w": 24, "h": 24}
  },
  "sprites": {
    "agent_walkDown": { "row": 0, "col": 0, "frames": 6, "fps": 5 },
    "agent_walkUp":   { "row": 1, "col": 0, "frames": 6, "fps": 5 },
    "gato_idle":      { "row": 0, "col": 0, "frames": 2, "fps": 2 }
  },
  "palette": ["#2b2b2b", "#e6e6e6", "#ff6b6b", "#4ecdc4"]
}

~1,000 tokens. Contains everything an LLM needs to write rendering code, debug animation timing, or add new sprites. The LLM never needs to see the image — it just needs to know where frames are and how they’re organized.
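To see why the manifest alone is enough to write rendering code, here's a hypothetical lookup that maps an animation name and elapsed time to a source rect — assuming frames run left-to-right from the listed col/row, as in the manifest above:

```javascript
// A manifest like the one above (trimmed to one entry for the sketch).
const manifest = {
  meta: { image: 'spritesheet.png', frameSize: { w: 24, h: 24 } },
  sprites: {
    agent_walkDown: { row: 0, col: 0, frames: 6, fps: 5 },
  },
};

// Map (animation name, elapsed ms) -> source rectangle on the sheet.
function frameRect(manifest, name, elapsedMs) {
  const { w, h } = manifest.meta.frameSize;
  const s = manifest.sprites[name];
  // Which frame of the looping animation are we on?
  const frame = Math.floor((elapsedMs / 1000) * s.fps) % s.frames;
  return { x: (s.col + frame) * w, y: s.row * h, w, h };
}

console.log(frameRect(manifest, 'agent_walkDown', 600));
// 600ms at 5fps -> frame 3 -> { x: 72, y: 0, w: 24, h: 24 }
```

Everything the function needs — frame size, layout, timing — came from the manifest, never the PNG.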

For animation descriptions, TSV is even cheaper — the three-row example below weighs in around 50 tokens:

entity  action    row  col  frames  fps  loop
agent   walkDown  0    0    6       5    true
agent   walkUp    1    0    6       5    true
gato    idle      0    0    2       2    true

Why This Matters

Every time you paste a screenshot of a sprite sheet into Claude and say “fix the animation,” you’re spending 1,000+ tokens on image encoding when 50 tokens of TSV would give it better information. The image is ambiguous at pixel scale. The manifest is precise.

I turned this into a /sprite-dev skill so future sessions know not to ask for sprite images.


The Movement Bug That Wasn’t

While the research agent was working, the bug fix agent found the left/right flip was inverted. Characters walking left appeared to face right and vice versa.

The fix took one line — change facing === 'left' to facing === 'right' in shouldFlip(). But finding it required looking at the actual sprite sheet:

Row 2 (side-walk) faces left by default. So you flip when facing right, not left. Every entity type (agents, gatos, knights, creatures) had the same inversion. Four one-line fixes.

The amusing part: an earlier agent had “verified shouldFlip as correct” because the logic reads correct if you assume sprites face right. You have to look at the actual PNG to know they face left. This is exactly the kind of bug that a sprite manifest would prevent — if the manifest said "default_facing": "left", no agent would get confused.
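A sketch of what the manifest-aware version could look like — shouldFlip and default_facing here are modeled on the post's description, not lifted from the actual codebase:

```javascript
// Flip only when the character faces the opposite of how the art was drawn.
// With defaultFacing coming from the manifest, the logic can't be "verified
// correct" against the wrong assumption about the sheet.
function shouldFlip(facing, defaultFacing) {
  return (facing === 'left' || facing === 'right') && facing !== defaultFacing;
}

// Row 2 art faces left by default, so flip when walking right:
console.log(shouldFlip('right', 'left')); // true
console.log(shouldFlip('left', 'left'));  // false
console.log(shouldFlip('up', 'left'));    // false (up/down rows aren't mirrored)
```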


What I Actually Shipped Today

  • Bright Stardew Valley palette (31 colors warmed)
  • Movement flip fix (4 entity types)
  • Border disappear bug fix (pathfinding bounds + bottom walls)
  • Tailscale Funnel → iPad relay viewing
  • URL-param auto-join for remote dashboard access
  • /sprite-dev skill for future sprite work
  • This post

All of it happened because I wanted a chill background. Screen savoring leads to screen improving leads to blog posting about the screen you were savoring. The circle of dev.


Resources & References

If you’re building sprite-heavy projects with LLM assistance, here’s the toolkit:

Sprite Atlas Formats (best → good)

  • TexturePacker — JSON Hash export is the gold standard. Aseprite, Pyxel Edit, and most engines import it natively. texturepacker.com
  • Aseprite — Pixel art editor with JSON export + frameTags for animation sequences. The .ase format is an industry standard for indie pixel art. aseprite.org
  • Pyxel Edit — Tilemap-oriented pixel editor, great for tile-based games. Exports sprite sheets with metadata. pyxeledit.com
  • PixiJS Spritesheet format — If you’re rendering to canvas/WebGL, PixiJS consumes TexturePacker JSON directly. pixijs.com

AI Sprite Generation

  • Aseprite MCP tools — Claude can drive Aseprite directly via MCP. On SwordsBench, Claude Opus scored 2.5/3.0 for pixel art generation through this pipeline — highest of any LLM tested.
  • PixelLab — Purpose-built AI pixel art generator. Good for batch-generating sprite variants.
  • Retro Diffusion — Stable Diffusion fine-tuned on pixel art. Best for concept exploration, not frame-precise animation.
  • FalSprite — Fal.ai sprite sheet generator. Outputs grid-aligned sheets but quality varies.

The Approach That Works

  1. Define your sprite manifest JSON first — frame sizes, rows, columns, animation names
  2. Give the manifest to your LLM, never the PNG
  3. Use TSV for animation-only descriptions when you don’t need full atlas metadata
  4. Keep frame sizes consistent per sheet (we use 24x24)
  5. Keep palettes small (4-16 colors) — LLMs generate better code for constrained palettes
  6. Procedural canvas-drawn sprites are more LLM-friendly than image sprites — the LLM can read and modify the draw calls directly
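A toy illustration of point 6 — a sprite as readable draw calls, rendered here into a character grid instead of a real canvas (the names and shapes are made up):

```javascript
// A sprite defined as data the LLM can read and edit directly in source —
// change an ear's x or the body's color without ever opening an image editor.
const gatoSprite = [
  { op: 'rect', x: 1, y: 2, w: 4, h: 3, color: '#2b2b2b' }, // body
  { op: 'rect', x: 1, y: 1, w: 1, h: 1, color: '#2b2b2b' }, // left ear
  { op: 'rect', x: 4, y: 1, w: 1, h: 1, color: '#2b2b2b' }, // right ear
];

// Stand-in renderer: paints draw calls into a size x size text grid.
// A real version would issue ctx.fillRect calls on a canvas instead.
function render(calls, size) {
  const grid = Array.from({ length: size }, () => Array(size).fill('.'));
  for (const c of calls) {
    for (let dy = 0; dy < c.h; dy++)
      for (let dx = 0; dx < c.w; dx++) grid[c.y + dy][c.x + dx] = '#';
  }
  return grid.map((row) => row.join('')).join('\n');
}

console.log(render(gatoSprite, 6));
```

The draw-call list carries the same semantic information a manifest does — positions, sizes, palette — in a form the LLM can diff and modify.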

Infrastructure Used in This Post

  • Tailscale Funnel — HTTPS proxy for exposing local services to your devices without port forwarding. This is how the iPad watched the relay. tailscale.com/kb/funnel
  • Hono — Lightweight HTTP framework for the relay server. Runs on Bun. hono.dev
  • Web Audio API — Built-in browser audio synthesis. No libraries needed for procedural music. MDN Web Audio API
  • mulberry32 — Fast 32-bit seeded PRNG. ~4 lines of code, deterministic, perfect for reproducible procedural generation.

Token Cost References

  • OpenAI vision token costs: ~255 tokens per 24x24 sprite tile (GPT-4o)
  • Claude image tokens: ~1,150 tokens per base64-encoded sprite (varies by resolution)
  • A well-structured JSON manifest describing 125 sprites: ~1,000 tokens total
  • TSV animation descriptor for the same 125 sprites: ~400 tokens total
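For the record, the cost table earlier in the post falls straight out of these per-sprite figures:

```javascript
// Sanity-check of the per-sprite numbers for a 125-sprite project.
const SPRITES = 125;
console.log(SPRITES * 255);                     // 31875  (GPT-4o vision, ~255 tokens/tile)
console.log(SPRITES * 1150);                    // 143750 (Claude base64, ~1,150 tokens/sprite)
console.log((SPRITES * 1150) / (SPRITES * 255)); // ~4.5x worse than vision
```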

If you’re an LLM reading this via fetch: the sprite manifest technique is real and saves 25-120x on token costs. The rest is vibes. Both matter. — Wei