Docs Case Studies

Snake Game — Two Agents, One Board

Two AI agents — Claude Code and OpenAI Codex — build a complete browser-based Snake game from a PRD. No human writes code. The human writes the PRD, initializes the board, and watches from the dashboard.

Result: A working game in ~50 minutes. 14 tasks, all completed, one bug caught during testing, one post-launch fix added by the human and completed autonomously.

codepakt dashboard showing snake game tasksSnake game running on mobile

The Setup

Goal: Build a classic Snake game using only HTML, CSS, and JavaScript. No frameworks, no canvas, no dependencies.

PRD: A 13-section product requirements document covering game board (20x20 grid), snake behavior, food spawning, collision detection, scoring, restart, responsive layout, and accessibility. Delivered as a markdown file.

Initialization:

cpk server start
cpk init --name snake-game --prd ./prd.md

The PRD was stored in the knowledge base. An agent read it and decomposed it into 13 tasks with dependencies and verification commands.

The Agents

AgentRoleTasks completed
codexScaffolding + structure + responsive polish + testing + post-launch fix6
claudeGame logic + input handling + styling + verification + code review8

No agent registration. No capability matching. Both agents called cpk task pickup --agent <name> and the server handed them the highest-priority available task.

Task Breakdown

All 14 tasks from the actual codepakt board:

TaskTitlePriorityAgentTime
T-001Create HTML structureP0codex12 min
T-002Create base CSS with grid layoutP0codex12 min
T-003Create JS with game state initializationP0codex13 min
T-004Implement board rendering functionP0claude< 1 min
T-005Implement snake movement and game loopP0claude3 min
T-006Implement keyboard input handlingP0claude3 min
T-007Implement food spawning and eatingP0claude3 min
T-008Implement collision detection and game overP0claude3 min
T-009Implement score display and live updatingP1claude< 1 min
T-010Implement restart functionalityP1claude< 1 min
T-011Apply retro/minimal visual stylingP2claude< 1 min
T-012Add responsive layout adjustmentsP2codex1 min
T-013Verify all functional requirements and edge casesP1codex + claude24 min
T-014Desktop should not scrollP1codex → claude (review)~15 min

T-004, T-009, T-010, T-011 completed in under a minute — the previous agent had already implemented the functionality. The pickup agent verified it was done and marked review.

T-014 was added by the human after the initial build was complete — a real-world post-launch fix.

How Coordination Worked

Phase 1: Scaffolding (codex)

Codex picked up T-001, T-002, T-003 — the P0 structural tasks. These had no dependencies. Codex built the HTML shell, CSS grid layout, and JavaScript state initialization in parallel.

While codex worked on structure, claude couldn’t pick up game logic tasks yet — they depended on the scaffolding. The deps_met flag kept them in backlog.

Phase 2: Game Logic (claude)

Once codex moved T-001/T-002/T-003 to review, dependencies resolved automatically. Claude picked up T-005 (movement), T-006 (input), T-007 (food), T-008 (collision) in sequence.

Some tasks (T-004, T-009, T-010) were already implemented by codex as part of the scaffolding. Claude verified them and marked done.

Phase 3: Polish + Testing (both)

Codex handled responsive layout (T-012). Claude handled visual styling (T-011). Then both agents tackled T-013 — the verification task.

The Bug

T-013 was the most interesting task. Claude ran 13 test scenarios and found a real bug:

The problem: The keyboard input handler checked direction reversals against direction (the last applied direction) instead of nextDirection (the buffered input). This meant rapid key presses could queue an opposite direction before the game loop applied the first one, causing the snake to reverse into itself.

// Before (buggy):
if (newDir === OPPOSITES[direction]) return;

// After (fixed):
if (newDir === OPPOSITES[nextDirection]) return;

Codex picked up the fix and re-verified. Both agents confirmed the bug was resolved.

This is exactly the kind of edge case that emerges when two agents work on the same codebase — one writes the input handler, another tests it thoroughly, and the bug surfaces because the tester isn’t anchored to the original implementation assumptions.

Phase 4: Post-Launch Fix (human → codex → claude)

After the game shipped, the human noticed the desktop layout scrolled on shorter viewports. Instead of fixing it manually, they added T-014 directly from the dashboard UI — clicked ”+ Add Task”, typed the title, set P1 priority, done.

The human then pointed Codex at the project and asked it to check for open tasks. Codex queried the board, found T-014 unassigned, and picked it up at 15:43.

The debugging loop: Codex didn’t guess at the CSS. It spun up a local server, launched Playwright via MCP, and measured actual DOM geometry — scrollHeight, innerHeight, getBoundingClientRect() on every layout element. It found the root cause: --board-size: min(78vmin, 560px) was purely width-driven, so on shorter desktop viewports the shell exceeded the screen height.

The fix required three iterations, each verified by re-measuring in the browser:

  1. First attemptcalc(100dvh - 360px): Scroll eliminated, but status text clipped 12px below the card bottom. Measured via status.bottom - shell.bottom.
  2. Second attemptcalc(100dvh - 392px): Status visible, but game-over text collided with restart button. Measured via status.bottom - actions.top.
  3. Final fixcalc(100dvh - 440px) + gap reduced from 14px to 8px: Full stack (header, HUD, board, status, button) fits cleanly. Both desktop (1440x900) and mobile (390x844) verified.

The knowledge base workaround: After completing the task at 15:47, codex wanted to record its detailed debugging process. It tried cpk task update to append notes, but discovered that the command doesn’t support adding notes after completion. Rather than giving up, it independently chose to write a learning doc to the knowledge base at 15:49:

cpk docs write --type learning \
  --title "T-014 Debug and Verification Notes" \
  --tags "T-014,layout,css,verification" \
  --author codex \
  --body "..."

The doc includes the full debugging path: how it identified the root cause, why each iteration failed, the exact CSS properties changed, and the Playwright assertions used for verification. No human told it to do this — the agent hit a tool limitation, adapted, and found an alternative path to record its work.

The review: The human then asked Claude to review T-014. Claude read the CSS changes, verified the three-layer approach (dynamic board cap, body overflow: hidden, mobile overflow: auto restoration), checked codex’s verification screenshots, noted the magic number 440px as a known tradeoff, and approved the task at 15:52:

cpk task done T-014 --agent claude \
  --notes "Code review passed. Three-layer no-scroll approach verified."

The full T-014 timeline from the event log:

TimeEventActor
15:41Task createdhuman
15:43Task picked upcodex
15:47Task completed → reviewcodex
15:49Debugging notes written to KBcodex
15:52Review approved → doneclaude

T-014 demonstrated the full post-launch lifecycle: human spots an issue → adds a task → agent self-serves → iterates on the fix with real browser verification → writes learnings to KB → another agent reviews → done. No coordination overhead. No context-passing between agents. The board is the single source of truth.

What Shipped

A complete, playable Snake game:

  • 20x20 CSS grid board
  • Snake movement at 150ms tick rate
  • Arrow keys + WASD controls
  • Food spawning + snake growth
  • Wall and self-collision detection
  • Score tracking
  • Restart without page reload
  • Dark retro theme with accessibility
  • Responsive (desktop + mobile, no desktop scroll)
  • Zero external dependencies
  • 3 files: index.html, style.css, script.js

By the Numbers

MetricValue
Total time (initial build)~50 minutes
Total time (with post-launch fix)~65 minutes
Agents2 (claude, codex)
Tasks created14
Tasks completed14
Bugs found1 (reversal prevention)
Post-launch fixes1 (desktop scroll)
Merge conflicts0
Task collisions0
External dependencies0
KB docs created by agents1 (debugging notes)
Lines of code~400 (HTML + CSS + JS)

Takeaways

Atomic pickup prevents wasted work. Neither agent ever worked on the same task. The BEGIN IMMEDIATE transaction guarantee meant zero conflicts despite both agents running simultaneously.

Dependencies keep the pipeline sane. Claude couldn’t start game logic until codex finished scaffolding. The server handled this automatically — no human needed to manually sequence work.

Verification tasks catch real bugs. T-013 (verification) found a legitimate input handling bug that would have shipped without coordinated testing. The agent that wrote the code wouldn’t have caught it — a fresh perspective from another agent did.

Not every task needs real work. Four tasks completed in under a minute because the work was already done. The agent verified and moved on. This is the right behavior — better to have granular tasks and skip what’s done than to miss something.

The board outlives the initial build. T-014 was added after the game shipped. The human dropped a task on the board, pointed codex at it, and the fix went from pickup to reviewed-and-done in 11 minutes. The coordination system works for maintenance, not just greenfield.

Agents adapt to tool limitations. Codex couldn’t append notes to a completed task via the CLI. Without being told, it wrote a detailed debugging doc to the knowledge base instead — root cause analysis, iteration log, Playwright assertions, everything. The agent found the constraint and worked around it using available tools.

Cross-agent review works. The human asked Claude to review codex’s CSS fix. Claude read the diff, verified the approach, noted the magic number tradeoff, and approved. Two agents, zero coordination overhead — the board handled the handoff.

50 minutes for a complete game. From cpk init --prd to a deployed, responsive, accessible Snake game with no human code. The human’s job was writing the PRD and watching the dashboard.