Case study · Code agent

Repo-aware coding agent for an internal platform team

A repo-aware coding agent that plans, edits, and opens pull requests with passing tests — no hand-holding chat sessions.

+3.2× PR velocity on refactors
Client: Internal platform team · Industry: Software · Team: 14 engineers · Timeline: 6 weeks · Codename: Forge

01 · Problem

A coding agent that knows the repo

The team wanted PR drafts that pass CI on the first try, not a chat window that requires babysitting. Refactors and dependency bumps were eating senior engineering time.

02 · Why it mattered

The cost of leaving it alone

Maintenance work was crowding out roadmap work. The backlog of safe-but-tedious changes grew every sprint, and the team’s flake-prone test suite made every change more expensive than it should have been.

03 · Architecture

Plan, edit, verify, ship

Built on Claude Code with custom MCP servers for the team’s monorepo: code ownership, lint cache, and a flake-aware test runner.

  • Repo-aware planner builds a step list before touching a file
  • Flake-aware runner re-runs only relevant suites and skips known flakes
  • Draft PRs ship with rationale and a risk assessment
  • Self-corrects on CI failure, capped at three cycles before human escalation
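
The capped self-correction step can be sketched roughly as follows. This is a minimal illustration, not Forge's actual code: `run_ci`, `apply_fix`, `CIResult`, and `Outcome` are hypothetical names, and it assumes one "cycle" means one fix attempt followed by one CI re-run.

```python
from dataclasses import dataclass
from typing import Callable

MAX_CYCLES = 3  # the cap named in the case study


@dataclass
class CIResult:
    passed: bool
    log: str = ""


@dataclass
class Outcome:
    status: str  # "ready_for_review" or "escalated"
    cycles: int  # fix-and-rerun cycles actually used


def run_with_self_correction(
    run_ci: Callable[[], CIResult],
    apply_fix: Callable[[str], None],
    max_cycles: int = MAX_CYCLES,
) -> Outcome:
    """Run CI once, then retry with agent fixes until it passes or the cap is hit."""
    result = run_ci()
    cycles = 0
    while not result.passed and cycles < max_cycles:
        apply_fix(result.log)  # hypothetical hook: the agent edits code from the CI log
        cycles += 1
        result = run_ci()
    # Anything still failing after the cap goes to a human instead of looping.
    status = "ready_for_review" if result.passed else "escalated"
    return Outcome(status=status, cycles=cycles)


if __name__ == "__main__":
    # Simulated CI that fails twice, then passes.
    runs = iter([CIResult(False, "lint error"), CIResult(False, "test failed"), CIResult(True)])
    print(run_with_self_correction(lambda: next(runs), lambda log: None))
```

Keeping the cap as a plain loop bound makes the escalation path explicit: a PR that never goes green is handed to a person after the third attempt rather than burning CI minutes.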

Stack: Claude Code · MCP · Git · CI

04 · Implementation

How it was built

  • Week 1–2: MCP servers for codeowners, lint cache, and test selection
  • Week 3–4: agent loop with planning, editing, and CI feedback
  • Week 5: eval suite of 40 historical PRs to benchmark quality
  • Week 6: rollout to the team with usage guardrails and a review policy
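
As an illustration of the test-selection piece from weeks 1–2, the sketch below picks only the suites touched by a change and sets aside known flakes. The `Suite` type, the path-prefix mapping, and the flake set are assumptions made for this example; the case study does not describe how the real MCP server stores or matches them.

```python
from dataclasses import dataclass
from typing import Iterable, List, Set, Tuple


@dataclass(frozen=True)
class Suite:
    name: str
    covers: Tuple[str, ...]  # hypothetical: path prefixes this suite exercises


def select_suites(
    changed_files: Iterable[str],
    suites: Iterable[Suite],
    known_flakes: Set[str],
) -> Tuple[List[str], List[str]]:
    """Return (to_run, skipped): relevant suites, with known flakes set aside."""
    changed = list(changed_files)
    to_run: List[str] = []
    skipped: List[str] = []
    for suite in suites:
        relevant = any(
            path.startswith(prefix) for path in changed for prefix in suite.covers
        )
        if not relevant:
            continue  # untouched suites are not run at all
        if suite.name in known_flakes:
            skipped.append(suite.name)  # recorded so skips stay visible in reports
        else:
            to_run.append(suite.name)
    return to_run, skipped


if __name__ == "__main__":
    suites = [
        Suite("billing", ("services/billing/",)),
        Suite("auth", ("services/auth/",)),
        Suite("search", ("services/search/",)),
    ]
    changed = ["services/billing/invoice.py", "services/auth/token.py"]
    print(select_suites(changed, suites, known_flakes={"auth"}))
```

Returning the skipped suites alongside the selected ones keeps auto-skips countable, which is what a weekly "flakes auto-skipped" figure would depend on.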
05 · Results

What the numbers say

  • PR velocity: 3.2×
  • CI pass rate: 91%
  • Flakes auto-skipped: 14 per week

06 · After launch

What happened next

Agent-authored PRs now handle most refactors and dependency bumps; critical-path features still start with humans. The eval suite runs on every agent update, so capability changes are measured, not guessed.
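
One way to run an eval suite as a gate on agent updates is sketched below, assuming each historical PR is replayed as a pass/fail case and the new pass rate is compared with a stored baseline. `EvalCase`, `pass_rate`, `passes_gate`, and the tolerance parameter are hypothetical; the case study does not say how Forge scores its cases.

```python
from dataclasses import dataclass
from typing import Callable, Iterable


@dataclass(frozen=True)
class EvalCase:
    pr_id: str  # hypothetical identifier for a historical PR
    task: str   # the change request handed to the agent


def pass_rate(cases: Iterable[EvalCase], attempt: Callable[[EvalCase], bool]) -> float:
    """Fraction of cases where the agent's attempt succeeds (e.g. CI goes green)."""
    case_list = list(cases)
    if not case_list:
        raise ValueError("eval suite is empty")
    passed = sum(1 for case in case_list if attempt(case))
    return passed / len(case_list)


def passes_gate(candidate: float, baseline: float, tolerance: float = 0.0) -> bool:
    """Accept an agent update only if it does not regress beyond the tolerance."""
    return candidate >= baseline - tolerance


if __name__ == "__main__":
    cases = [EvalCase(f"pr-{i}", "bump dependency") for i in range(1, 5)]
    # Stand-in for a real agent run: pretend every case except pr-3 succeeds.
    rate = pass_rate(cases, lambda case: case.pr_id != "pr-3")
    print(rate, passes_gate(rate, baseline=0.70))
```

Comparing against a fixed baseline turns "did the update help?" into a yes/no check that can run automatically before a new agent version reaches the team.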

This system is an example of AI Agents & Internal Assistants work.
