
Why agentic coding governance beats raw speed

Agentic coding governance beats speed: connector cards, child receipts, decision stubs, and scope ledgers that make agent diffs defensible after merge.

The ruins of the Khan's tomb and a small minaret in Bolgar, Shishkin (2022-12-06), landscape painting by Ivan Shishkin.
Rogier Muller · May 15, 2026 · 5 min read

The rollback rehearsal stalls on one diff: a refactor bundled with an intent shift nobody flagged, and the question "why did the agent touch this file?" has an answer that lives only in chat. Agentic coding governance is the set of written contracts (scopes, receipts, decision records, and review gates) that makes agent-produced changes defensible after merge. The expensive bug is not slow output. It is permission drift nobody signed.

The autonomy nobody signed for

Counter-thesis: more autonomy is not the unlock most teams think it is; unexplained autonomy is just risk with better throughput numbers.

The wrong path: We believed reviewers would absorb implicit intent. We tried it while connectors multiplied faster than ownership maps, and the first incident review found nobody who could say which rule had governed the change.

Diagnosis: this is Chesterton's fence without labels. Humans optimize for checks passing, so fences get moved by agents that never knew why the fence stood, and reviewers approve the move because the gate was green.

The actual thesis: governance beats speed because speed you cannot explain does not compound; it accrues as debt.

The tension shows up the moment a merge needs defending. Four fixes keep the defense written down before anyone asks.

Four contracts that outlast the sprint

Named fix: Connector card. MCP blast radius: wire connectors quickly and one of them touches data nobody listed on the diagram. The failure is not tool quality; it is the missing operating contract. Write one markdown card per server listing allowed actions, forbidden actions, owner, and rollback, grounded in the MCP specification. Incidents shrink because operators know what "off" looks like.
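A connector card can also be held as structured data and rendered to markdown, so one record answers both "is this action allowed?" and "who owns this server?". The sketch below is one possible shape under stated assumptions, not a format defined by the MCP specification: the `ConnectorCard` class, its field names, and the example server `billing-db` are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConnectorCard:
    """One operating contract per MCP server (hypothetical schema)."""
    server: str
    owner: str
    allowed_actions: tuple[str, ...]
    forbidden_actions: tuple[str, ...]
    rollback: str

    def permits(self, action: str) -> bool:
        # Deny by default: an action must be listed as allowed and not forbidden.
        return action in self.allowed_actions and action not in self.forbidden_actions

    def to_markdown(self) -> str:
        # Render the card as the one-page markdown file operators read.
        lines = [
            f"# Connector card: {self.server}",
            f"- Owner: {self.owner}",
            f"- Allowed actions: {', '.join(self.allowed_actions)}",
            f"- Forbidden actions: {', '.join(self.forbidden_actions)}",
            f"- Rollback: {self.rollback}",
        ]
        return "\n".join(lines)


# Example values are invented for illustration.
card = ConnectorCard(
    server="billing-db",
    owner="payments-team",
    allowed_actions=("read_invoice",),
    forbidden_actions=("delete_invoice",),
    rollback="Disable the server entry in the agent config and revoke its token.",
)
```

Denying anything not explicitly listed is a design choice in this sketch: an action nobody wrote down is treated the same as a forbidden one.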

Named fix: Child receipt block. Recursive handoff blur: chained agents return summaries that omit child-owned paths, the telephone game at machine speed. Every child returns the paths it touched, the commands it ran, and the tests that prove its regression guards hold. Parents stop confidently green-lighting mystery diffs.
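One way to make a "mystery diff" mechanical is to subtract the paths claimed by child receipts from the paths in the merged diff. This is a minimal sketch assuming receipts are collected as plain records; `ChildReceipt`, `unexplained_paths`, and the example paths are hypothetical names, not part of any agent tool.

```python
from dataclasses import dataclass, field


@dataclass
class ChildReceipt:
    """What one child agent reports back to its parent (hypothetical schema)."""
    agent: str
    paths_touched: set[str] = field(default_factory=set)
    commands_run: list[str] = field(default_factory=list)
    tests_passed: list[str] = field(default_factory=list)


def unexplained_paths(diff_paths: set[str], receipts: list[ChildReceipt]) -> set[str]:
    """Return paths in the merged diff that no child receipt claims."""
    claimed: set[str] = set()
    for receipt in receipts:
        claimed |= receipt.paths_touched
    return diff_paths - claimed


# Invented example: one child reported one file, but the diff contains two.
receipts = [
    ChildReceipt(
        agent="child-a",
        paths_touched={"src/parser.py"},
        commands_run=["pytest tests/test_parser.py"],
        tests_passed=["tests/test_parser.py::test_roundtrip"],
    ),
]
diff = {"src/parser.py", "src/config.py"}
mystery = unexplained_paths(diff, receipts)
```

Here `mystery` holds `src/config.py`, the file the parent would otherwise approve without knowing which child changed it.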

Named fix: Decision stub. Review queue theater: CI is green and reviewers still ask "why this approach?" with no written answer. The PR template forces three lines: constraints considered, rejected alternatives, and verification proof. Debate moves from vibes to explicit tradeoffs.
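A template only forces those lines if something checks them. The sketch below shows one way a pre-merge script might flag a missing or empty stub line. It assumes each line is written as `Label: text`; the three topics come from the text above, but the exact label wording and the function name `missing_stub_lines` are hypothetical.

```python
import re

# Hypothetical labels: the topics are from the article, the wording is assumed.
REQUIRED_LABELS = (
    "Constraints considered",
    "Rejected alternatives",
    "Verification proof",
)


def missing_stub_lines(pr_body: str) -> list[str]:
    """Return the decision-stub labels that are absent or left empty in a PR body."""
    missing = []
    for label in REQUIRED_LABELS:
        # Match "Label: some text" on one line; a label followed only by
        # whitespace does not count as an answer.
        pattern = rf"^{re.escape(label)}:[ \t]*\S"
        if not re.search(pattern, pr_body, flags=re.MULTILINE | re.IGNORECASE):
            missing.append(label)
    return missing


# Invented PR body with the middle line left blank.
body = """Constraints considered: must keep the public API stable
Rejected alternatives:
Verification proof: pytest -q passed locally, log linked in the PR
"""
gaps = missing_stub_lines(body)
```

For this body, `gaps` contains only `Rejected alternatives`, because that label is present but has no answer after it.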

Named fix: Scope ledger. Scope fog: rule language sounds precise until reviewers argue about what it meant, because rules compete with chat memory in a split-brain setup. The parent task carries a five-line ledger: goal, allowed paths, forbidden paths, verification command, merge owner. Review shifts from debating prompts to checking ledgers against diffs.
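Checking a ledger against a diff can be as small as a pattern match over the changed paths. The following is a sketch under assumptions: the five fields mirror the ledger described above, while the `ScopeLedger` class, the `out_of_scope` helper, and the glob patterns are hypothetical. It uses `fnmatchcase` from the standard library, where `*` also matches `/`, so `src/parser/*` covers nested files.

```python
from dataclasses import dataclass
from fnmatch import fnmatchcase


@dataclass(frozen=True)
class ScopeLedger:
    """The five-line ledger carried by the parent task (hypothetical schema)."""
    goal: str
    allowed_paths: tuple[str, ...]
    forbidden_paths: tuple[str, ...]
    verification_command: str
    merge_owner: str


def out_of_scope(ledger: ScopeLedger, changed_paths: list[str]) -> list[str]:
    """Return changed paths that are forbidden or not covered by any allowed pattern."""
    violations = []
    for path in changed_paths:
        forbidden = any(fnmatchcase(path, pat) for pat in ledger.forbidden_paths)
        allowed = any(fnmatchcase(path, pat) for pat in ledger.allowed_paths)
        # Forbidden patterns win over allowed ones; unlisted paths are violations too.
        if forbidden or not allowed:
            violations.append(path)
    return violations


# Invented example values.
ledger = ScopeLedger(
    goal="Replace the hand-written lexer with a table-driven one",
    allowed_paths=("src/parser/*", "tests/parser/*"),
    forbidden_paths=("src/parser/generated/*",),
    verification_command="pytest tests/parser -q",
    merge_owner="parser-maintainers",
)
changed = ["src/parser/lexer.py", "src/parser/generated/tables.py", "infra/deploy.yaml"]
violations = out_of_scope(ledger, changed)
```

In this example `violations` lists the generated file (explicitly forbidden) and the deploy config (never allowed), while the lexer change passes.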

The same snapshot works across tools, whether the team runs Cursor, Claude Code, or Codex:

---
description: Delegation boundary snapshot (adapt globs to your repo)
globs:
  - "**/*"
alwaysApply: false
---

- Cursor: keep scopes explicit in `.mdc`; forbid undeclared MCP domains.
- Claude Code: cite `CLAUDE.md` precedence before expanding bash scope.
- Codex: ensure `AGENTS.md` carries replay-friendly verification notes for CLI runs.

Teams anchor the habit in our methodology: Review is where receipts meet responsibility. The daily version of that gate is the four-check pass in morning signal review, and the standing practice lives on the AI coding governance topic page. Reference docs worth bookmarking: Claude's agent overview, Claude Code getting started, and the Codex quickstart.

What the merge gate checks

| Gate | Question |
| --- | --- |
| Receipt match | Does the PR body list scopes + verification transcript? |
| Rules precedence | Which .mdc, SKILL.md, or CLAUDE.md governed behavior? |
| Connector truth | Which MCP servers fired, and were they expected? |
| Reviewer path | Can someone unfamiliar trace intent without chat replay? |

  • Any MCP connectors mentioned list their owners.
  • Verification command output is pasted or linked.
  • Forked agent work lists parent + child responsibilities.
  • Red-folder paths received explicit human acknowledgement.
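Two of these checks reduce to set arithmetic once the inputs are written down. The sketch below assumes the team records which MCP servers a run used and which red-folder paths a human acknowledged; how that data is captured is outside this example. The function names, the prefix-based definition of a red folder, and all example values are hypothetical.

```python
def unexpected_servers(fired: set[str], expected: set[str]) -> set[str]:
    """Connector truth: servers that fired but were not declared up front."""
    return fired - expected


def unacknowledged_red_paths(
    changed_paths: list[str],
    red_prefixes: tuple[str, ...],
    acknowledged: set[str],
) -> list[str]:
    """Changed paths under a red folder that no human has explicitly acknowledged."""
    return [
        path
        for path in changed_paths
        # str.startswith accepts a tuple of prefixes.
        if path.startswith(red_prefixes) and path not in acknowledged
    ]


# Invented example run.
surprises = unexpected_servers(fired={"github", "billing-db"}, expected={"github"})
pending = unacknowledged_red_paths(
    changed_paths=["migrations/0042_drop_column.sql", "src/app.py"],
    red_prefixes=("migrations/", "secrets/"),
    acknowledged=set(),
)
```

In this run `surprises` holds `billing-db` and `pending` holds the migration file, so the gate has two concrete items to resolve before merge.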

If your repo cannot state boundaries plainly, agents will guess, and guessing scales poorly.

Synthesis: speed you cannot explain is debt with good branding. The fast team is the one whose merges need no archaeology.

Best ways to use this research

  • Best for: engineering teams comparing agent operating habits under delivery pressure and deciding which contract to standardize first.
  • Best first artifact: a five-line scope ledger on the next delegated task: goal, allowed paths, forbidden paths, verification command, merge owner.
  • Best comparison angle: compare review evidence, connector scope, and handoff friction before adding another agent run; keep the path with the shortest auditable trail.

Common questions

  • What is agentic coding governance in practice?

    It is the written layer between agent output and merge: connector cards for MCP servers, child receipts for chained agents, decision stubs in PRs, and scope ledgers per task. Each one answers a question reviewers otherwise ask in chat, after the diff has already landed.

  • Does governance slow agentic coding down?

    It slows the first task and speeds up every one after. The checks are lookups, not investigations: does the PR list scopes and a verification transcript, which rules governed behavior, which connectors fired. The slow path is the archaeology that follows an unexplained merge.

  • Why is permission drift the expensive bug?

    Because nobody signed it. Approvals become muscle memory, connectors gain reach, and the blast radius grows quietly until an incident maps it for you. Cards with owners and rollback paths, plus explicit human acknowledgement for red-folder paths, keep drift visible while it is still cheap.

Next step

We run these four contracts against live repos in our training sessions; bring the diff nobody could explain.
