Case study · Code agent

Repo-aware coding agent for an internal platform team

A repo-aware coding agent that plans, edits, and opens pull requests with passing tests — no hand-holding chat sessions.

+3.2× PR velocity on refactors

Client: Internal platform team · Industry: Software · Team: 14 engineers · Timeline: 6 weeks · Codename: Forge

01 · Problem

A coding agent that knows the repo

The team wanted PR drafts that pass CI on the first try, not a chat window that requires babysitting. Refactors and dependency bumps were eating senior engineering time.

02 · Why it mattered

The cost of leaving it alone

Maintenance work was crowding out roadmap work. The backlog of safe-but-tedious changes grew every sprint, and the team’s flake-prone test suite made every change more expensive than it should have been.

03 · Architecture

Plan, edit, verify, ship

Built on Claude Code with custom MCP servers for the team’s monorepo: code ownership, lint cache, and a flake-aware test runner.

Repo-aware planner builds a step list before touching a file
Flake-aware runner re-runs only relevant suites and skips known flakes
Draft PRs ship with rationale and a risk assessment
Self-corrects on CI failure, capped at three cycles before human escalation
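The flake-aware runner described above can be sketched roughly as follows. This is a minimal illustration, not the team's implementation: the `Suite` type, the `select_tests` function, the path-prefix matching rule, and the suite names in the usage note are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Suite:
    """Hypothetical description of one test suite in the monorepo."""
    name: str
    paths: tuple[str, ...]  # source path prefixes this suite covers (assumed mapping)


def select_tests(changed_files, suites, known_flakes):
    """Return (suites_to_run, skipped_flakes) for a set of changed files.

    A suite is relevant when any changed file sits under one of its path
    prefixes. Relevant suites listed in known_flakes are reported as skipped
    instead of being run.
    """
    to_run, skipped = [], []
    for suite in suites:
        relevant = any(
            path.startswith(prefix)
            for path in changed_files
            for prefix in suite.paths
        )
        if not relevant:
            continue  # untouched code: do not re-run this suite
        if suite.name in known_flakes:
            skipped.append(suite.name)
        else:
            to_run.append(suite.name)
    return to_run, skipped
```

With hypothetical suites `billing`, `auth`, and `search`, a change touching only billing and auth files would run `billing`, skip `auth` if it is on the flake list, and never consider `search`. Returning the skipped names separately keeps the skip visible, so it can be counted or surfaced in the PR.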

Stack: Claude Code · MCP · Git · CI

04 · Implementation

How it was built

Week 1–2: MCP servers for codeowners, lint cache, and test selection
Week 3–4: agent loop with planning, editing, and CI feedback
Week 5: eval suite of 40 historical PRs to benchmark quality
Week 6: rollout to the team with usage guardrails and a review policy
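The CI feedback loop from weeks 3–4, with the three-cycle cap mentioned in the architecture, might look something like the sketch below. The callables `propose_fix` and `run_ci` are hypothetical stand-ins for the agent's edit step and the CI trigger; only the cap of three comes from the case study.

```python
from typing import Callable

MAX_CYCLES = 3  # correction cap stated in the case study


def run_with_self_correction(
    propose_fix: Callable[[str], None],
    run_ci: Callable[[], tuple[bool, str]],
    max_cycles: int = MAX_CYCLES,
) -> dict:
    """Run CI, let the agent attempt fixes on failure, and stop at the cap.

    run_ci returns (passed, log). propose_fix receives the failing log and is
    expected to push a corrective change (both are assumed interfaces).
    """
    passed, log = run_ci()
    cycles = 0
    while not passed and cycles < max_cycles:
        propose_fix(log)         # agent edits based on the failure output
        cycles += 1
        passed, log = run_ci()   # verify the attempted fix
    status = "passed" if passed else "escalate_to_human"
    return {"status": status, "cycles": cycles}
```

Keeping the cap as an explicit parameter makes the escalation rule easy to audit: after the third failed correction the function stops and reports `escalate_to_human` instead of retrying indefinitely.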

05 · Results

What the numbers say

PR velocity: 3.2×
CI pass rate: 91%
Flakes auto-skipped: 14/wk

06 · After launch

What happened next

Agent-authored PRs now handle most refactors and dependency bumps; critical-path features still start with humans. The eval suite runs on every agent update, so capability changes are measured rather than guessed at.
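One way to turn an eval run into a measurable gate is sketched below. It assumes each historical PR in the suite yields a simple pass/fail result keyed by an ID; the function names, the result format, and the tolerance parameter are illustrative assumptions, not details from the project.

```python
def pass_rate(results: dict[str, bool]) -> float:
    """Fraction of eval cases that passed."""
    if not results:
        raise ValueError("eval results are empty")
    return sum(results.values()) / len(results)


def regressions(baseline: dict[str, bool], candidate: dict[str, bool]) -> list[str]:
    """Cases that passed on the baseline agent but not on the candidate."""
    return sorted(
        case for case, ok in baseline.items()
        if ok and not candidate.get(case, False)
    )


def gate(baseline: dict[str, bool], candidate: dict[str, bool],
         tolerance: float = 0.0) -> bool:
    """Accept the update only if the pass rate has not dropped beyond tolerance."""
    return pass_rate(candidate) >= pass_rate(baseline) - tolerance
```

Reporting per-case regressions alongside the aggregate rate matters because an update can hold the same overall pass rate while breaking a case that previously worked.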

This system is an example of AI Agents & Internal Assistants work.
