Docs Case Studies

AI Interview Platform — Codepakt on a Production Codebase

Codepakt manages feature work on a production AI interview platform — a real codebase, real users, and a real deployment pipeline. Not a demo. Not a greenfield build. The codepakt board tracks tasks across backend microservices, a frontend SPA, database migrations, message queues, cloud storage, and caching infrastructure. Tasks are added incrementally, completions are reviewed from the dashboard, and the knowledge base enables asynchronous code review between agents.

Result: 41 tasks across 8 epics managed through the codepakt board. 38 completed, 3 remaining. The board is still active — this is ongoing feature work, not a one-shot project.

Why This Case Study Matters

The previous case studies showed codepakt managing greenfield projects — start from zero, build to done. This one answers a harder question: does the board work for ongoing production feature work?

Production code is different:

  • Existing architecture — agents must understand and extend existing patterns, not invent new ones
  • Multi-service — changes span backend microservices, API gateways, message queues, and frontend
  • Real infrastructure — cloud storage, caching layers, database migrations, message brokers
  • Incremental features — tasks arrive as priorities shift, not all at once from a PRD
  • Ongoing work — the board doesn’t “finish.” New epics get added as the product evolves

The Setup

This is an existing SaaS platform with:

  • Backend microservices communicating via message queue
  • API gateway routing to internal services
  • Frontend SPA consuming the gateway
  • Cloud blob storage for file handling
  • Redis for caching and real-time state
  • Relational database with migration-managed schemas

The project was already running. Codepakt was introduced to coordinate feature work across the existing codebase.

The Agents

| Agent | Role | Tasks completed |
|---|---|---|
| claude | Full-stack — backend services, frontend components, database migrations, API design | 30 |
| codex | Backend + frontend — API endpoints, database queries, enum migrations, frontend toggles, code review | 8 |

Claude handled the core architecture and most full-stack features. Codex contributed across backend and frontend — API gateway endpoints, database query enrichment, enum migrations, client-side violation reporting, UI label updates, and admin settings toggles. Codex also acted as a reviewer, writing a KB doc with feedback that informed later work.

Epics and Tasks

Work was organized into 8 epics, created incrementally over 3 days as features were prioritized:

File Handling + CRUD Operations

No epic (ad hoc tasks): 6 tasks covering file upload API exploration, cloud blob storage integration, download URL wiring, and evaluation CRUD operations (backend response changes, frontend API functions, UI wiring with delete buttons).

Tasks arrived individually as needs were identified. The board evolved organically — not from a single PRD decomposition.

Bulk Upload Support

Fresher ATS Integration (5 tasks): ZIP file upload support across the full stack. Frontend file input changes, backend extraction service, routing through document processing pipeline, file size limit increases, and upload progress UI.

Dependency chain enforced build order: frontend upload → backend extraction → routing → then polish tasks (size limits, progress UI). Several tasks auto-resolved — the agent found the work already done by an earlier task in the chain.
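The auto-resolve pattern described here — check whether a task's end state already holds before doing any work — can be sketched in a few lines. This is an illustration of the idea, not codepakt's API; the task name and predicate are hypothetical:

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class Task:
    id: str
    title: str
    # Predicate over the codebase: does the desired end state already hold?
    already_satisfied: Callable[[], bool]

def work_on(task: Task, implement: Callable[[Task], None]) -> str:
    """Pick up a task, but auto-resolve it if an earlier task in the
    chain already produced the required change."""
    if task.already_satisfied():
        return "auto-resolved"  # nothing left to implement; just close it out
    implement(task)
    return "completed"

# Illustrative: the size-limit bump was already made by the extraction task.
limits_raised = True
t = Task("size-limits", "Raise upload size limits", lambda: limits_raised)
print(work_on(t, lambda task: None))
```

The useful property is that a chain of overlapping tasks converges instead of duplicating work: later tasks in the chain become cheap no-ops when an earlier task already landed the change.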

Processing Visibility

Resume Processing Visibility (2 tasks): Redis-based tracking for active document processing. Backend added a Redis index + lookup endpoint; frontend added page-load checks and progress display.

Clean two-task epic: backend creates the infrastructure, frontend consumes it.

Token-Based Access

Per-Candidate Invite Tokens (9 tasks): The largest single epic. Full implementation of a token-based access system: database entity + migration, DTO changes, service-layer CRUD, message queue handlers, token consumption logic, API gateway endpoints, frontend API functions, and two UI flows (candidate-facing and admin-facing).

The dependency chain was deep (7 levels) and linear — each layer depended on the one below it: DB entity → service methods → MQ handlers → consumption logic → gateway endpoints → frontend API → UI flows.

Violation Monitoring

Proctoring Violations Viewer (7 tasks): Hybrid data retrieval from cloud JSON storage with database fallback, message queue handler, API gateway endpoint, database query enrichment, frontend API function, modal component, and admin UI integration.

This was the first epic where both agents worked simultaneously: Codex picked up a database query task while Claude built the core retrieval logic. After Claude completed the service layer, Codex handled the API gateway endpoint. Claude then built the frontend chain: API function → modal component → UI integration. A second pass fixed the service after review feedback.

Cross-agent review: Codex wrote a knowledge base document with review feedback on Claude’s service implementation. Claude returned, picked up the task again, and applied the fixes. The KB doc served as an asynchronous code review — no direct agent-to-agent communication needed.

When an Agent Can’t Solve It — Triage via the Board

During the violation monitoring epic, Claude hit a real debugging wall: client-side violations were being detected and sent via the message queue, but weren’t persisting to the database. Logs showed “Persisted 1 violations” on the sender side, but the DB had zero rows.

Claude investigated and identified three suspected causes — a silent error in the MQ response handling, a race condition between two concurrent save methods overwriting each other’s counts, and possible file watcher errors on the same session. But it couldn’t confirm the root cause without production data.

Instead of guessing at a fix, Claude did two things:

  1. Created T-037 as a backlog triage task — with the full investigation context baked into the description: what was observed, the three hypotheses, and four concrete triage steps (add structured logging, fix the MQ client error handling, fix the race condition, test on prod).

  2. Wrote a learning doc to the knowledge base via cpk docs write — the raw debugging observations with the specific conversation ID, exact behavior (“Saved ?” in RMQ response), and the cross-reference to T-037 for follow-up.

Both the task and the KB entry are visible from the dashboard. The board captured the structured triage plan, the KB captured the raw investigation notes. Whoever picks up T-037 next has both — the what-to-do and the what-was-observed — without re-investigating from scratch.

This is the board working as intended. Not every problem gets solved in one pass. When an agent hits a wall, the right move is to capture what’s known and put it back on the board for later triage.

Client-Side Persistence

Client-Side Violation Persistence (7 tasks): Backend enum extension + migration, new API endpoint, bug fix, frontend API function, two UI integration tasks for sending violations, and a modal label update. Work split across both agents — Codex handled the enum migration, frontend API, violation sending, and modal labels. Claude handled the backend endpoint, bug fix, and the other violation sender.

Feature Toggles

Proctoring Toggle (4 tasks): Adding configuration toggles to enable/disable proctoring at the college and job level. Codex built both frontend admin toggles. Claude handled the backend flag reading and frontend gating logic. 2 tasks in review — this epic is the current working edge.

How Coordination Worked

Incremental task creation

Unlike the greenfield case studies where an agent decomposed a PRD into all tasks at once, this project added tasks incrementally across multiple sessions:

  • 1 exploratory task → 2 follow-ups based on findings → 3 tasks for CRUD operations
  • 5 tasks for a bulk upload epic
  • 2 tasks for a visibility feature
  • 9 tasks for a token-based access system
  • 7 tasks for a monitoring feature
  • 7 tasks for a persistence feature
  • 4 tasks for feature toggles

The board grew organically. Features were scoped, decomposed, and added as the developer identified priorities — exactly how a real team works.

Deep dependency chains

The token system epic (T-014 through T-022) had a 7-level dependency chain: DB migration → service CRUD → MQ handlers → token consumption → gateway endpoints → frontend API → UI flows. Codepakt’s dependency resolver enforced the correct build order automatically.
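Conceptually, enforcing build order over a task graph is a topological sort: a task becomes pickable only when all of its dependencies are done. A minimal Python sketch of the idea (not codepakt's actual resolver; task names are illustrative stand-ins for T-014 through T-022):

```python
from collections import deque

def ready_order(deps: dict[str, list[str]]) -> list[str]:
    """Return tasks in an order where every task appears after all of
    its dependencies (Kahn's algorithm)."""
    indegree = {t: 0 for t in deps}
    dependents: dict[str, list[str]] = {t: [] for t in deps}
    for task, needs in deps.items():
        for dep in needs:
            indegree[task] += 1
            dependents[dep].append(task)

    # Tasks with no unmet dependencies are immediately pickable.
    queue = deque(t for t, d in indegree.items() if d == 0)
    order = []
    while queue:
        task = queue.popleft()
        order.append(task)
        for nxt in dependents[task]:
            indegree[nxt] -= 1
            if indegree[nxt] == 0:
                queue.append(nxt)

    if len(order) != len(deps):
        raise ValueError("cycle in dependency graph")
    return order

# The token epic's 7-level linear chain, layer by layer:
chain = {
    "db-entity": [],
    "service-crud": ["db-entity"],
    "mq-handlers": ["service-crud"],
    "consumption": ["mq-handlers"],
    "gateway": ["consumption"],
    "frontend-api": ["gateway"],
    "ui-flows": ["frontend-api"],
}
print(ready_order(chain))
```

In a linear chain like this one, only a single task is ever pickable at a time; wider graphs expose multiple ready tasks at once, which is what lets two agents work the same epic in parallel without collisions.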

Parallel multi-agent work

The proctoring epic was the first time both agents worked on the same feature simultaneously. Codex started on a database query enrichment task (T-026) while Claude built the core service logic (T-023). No conflicts — the tasks touched different layers of the stack.

Asynchronous code review via KB

After completing T-025, Codex wrote a knowledge base document with review feedback on Claude’s T-023 implementation. When Claude returned to the task, it read the feedback and applied fixes. The KB system acted as an asynchronous review channel — structured, persistent, and without requiring real-time coordination.

By the Numbers

| Metric | Value |
|---|---|
| Total tasks | 41 |
| Tasks completed | 38 |
| Tasks remaining | 3 (2 in review, 1 backlog) |
| Agents | 2 (Claude: 30, Codex: 8) |
| Epics | 8 |
| Duration | Multiple sessions (incremental) |
| Merge conflicts | 0 |
| Task collisions | 0 |
| KB docs created | 3 (API reference, review feedback, triage notes) |
| Stack layers touched | 6 (DB, services, MQ, gateway, frontend API, UI) |

Takeaways

The codepakt board works on production codebases, not just demos. The same cpk task pickup → work → cpk task done → human review cycle that managed toy projects scales to a real multi-service architecture with message queues, cloud storage, and database migrations.

Steer with tasks, not micromanagement. No need to tell agents how to implement features. Add tasks to the board with clear scope, set priorities and dependencies, and review results. Codepakt makes this workflow structured — the board is the interface between intent and execution.

Incremental task creation matches real workflows. Not every project starts with a complete PRD. Tasks were added across multiple sessions as priorities shifted. The board handled this naturally — new tasks slotted into the dependency graph alongside existing work. This is how real teams operate.

The dependency graph enforces correct build order. A 7-level chain (DB → service → MQ → gateway → frontend → UI) built correctly without manual sequencing. Dependencies wired at task creation; codepakt enforced them automatically.

The knowledge base enables asynchronous code review. Codex reviewed Claude’s work by writing a cpk docs write entry with feedback. Claude read the feedback and applied fixes. No real-time coordination needed — the KB is a structured, persistent review channel that’s also visible from the dashboard.

The board captures what agents can’t solve yet. Not every task gets resolved in one pass. When Claude couldn’t confirm a root cause without production data, it created a triage task with the full investigation — three hypotheses, observed behavior, and concrete next steps. The board becomes the handoff document. Whoever picks it up next has everything without re-investigating.

The board outlives any single session. Come back across multiple sessions and the board is exactly where you left it — task state, dependencies, agent assignments, and the full event log preserved. This is the value of a persistent task board over ad hoc agent conversations.

The dashboard gives you visibility across agents. Both agents working on the same feature? Visible on the board. Task blocked by a dependency? Visible. Agent idle? Visible. This observability is what makes multi-agent work manageable — you stay in control.