Agentic coding governance that holds in review
Agentic coding governance as an operating guide: connector ownership, scope ledgers, decision stubs, and review receipts for MCP-connected engineering teams.

Agentic coding governance is the small set of written contracts that lets a reviewer defend an agent's merge without replaying the chat. It covers scope ledgers, child receipts, decision stubs, and precedence rules between your rule files. The whole point is one test: can someone unfamiliar trace why a change is safe from the repo alone, not from a conversation that scrolled away?
The merge train clogs for a boring reason. CI is green, and nobody in the queue can say why the change is safe. When that happens with agents like Cursor (Anysphere's AI code editor) or Claude Code (Anthropic's coding agent), a silent PR body is not neutral. It is scheduling debt you pay back later, by hand, during PR archaeology.
Write down where the judgment lives
In agent-heavy repos the bottleneck moved. It used to be typing speed. Now it is traceability. Another prompt template will not fix that, because the missing thing is ownership, not phrasing.
Green CI is a good measure of safety right up until it becomes the target. Once agents optimize for passing checks, "green" stops meaning "understood." That gap shows up the moment someone asks why the agent touched a file and the only answer is buried in chat history.
So governance is less about controlling the model and more about leaving a trail. Short artifacts, written before the run, beat long debates after it.
Add four small contracts that survive a crunch week
Each common failure has a fix small enough to keep under deadline pressure. Here are the four we see earn their place most often.
Chained agents return summaries that quietly skip the paths a child agent owned, and the parent merges on faith. A child receipt fixes it: every child returns the paths it touched, the commands it ran, and the tests that prove its regression guards. Now there is something concrete to check.
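As a rough illustration, a child receipt can be modelled as a small record the parent checks against the diff before merging. This is a minimal sketch, not a prescribed format: the `ChildReceipt` fields, the helper names, and the sample paths are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class ChildReceipt:
    """What one child agent reports back to the parent (hypothetical shape)."""
    agent: str
    paths_touched: list[str]
    commands_run: list[str]
    guard_tests: list[str] = field(default_factory=list)


def unclaimed_paths(diff_paths: list[str], receipts: list[ChildReceipt]) -> list[str]:
    """Return changed paths that no child receipt accounts for."""
    claimed = {path for receipt in receipts for path in receipt.paths_touched}
    return sorted(set(diff_paths) - claimed)


def receipts_missing_proof(receipts: list[ChildReceipt]) -> list[str]:
    """Return agents whose receipts list no commands or no guard tests."""
    return [r.agent for r in receipts if not r.commands_run or not r.guard_tests]


# Illustrative data only; the paths and agent names are made up.
receipts = [
    ChildReceipt("child-a", ["src/billing/tax.py"], ["pytest tests/billing"],
                 ["tests/billing/test_tax.py"]),
    ChildReceipt("child-b", ["src/api/routes.py"], [], []),
]
diff = ["src/billing/tax.py", "src/api/routes.py", "src/auth/session.py"]

print(unclaimed_paths(diff, receipts))   # ['src/auth/session.py']
print(receipts_missing_proof(receipts))  # ['child-b']
```

A non-empty result from either helper is the concrete thing to check: a path nobody claimed, or a child that claimed work without showing how it was verified.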
Reviewers still ask "why this approach?" even when CI is green, and no written answer exists. A decision stub in the PR template forces three lines: constraints considered, rejected alternatives, and verification proof. Debate moves from vibes to explicit tradeoffs.
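One way to make the stub hard to skip is a small check that fails when any of the three lines is absent or blank. The sketch below assumes the PR template writes each line as `Label: text`; the label strings and the function name are hypothetical and should be renamed to match your template.

```python
import re

# Hypothetical labels; rename to match the headings in your PR template.
REQUIRED_STUB_LABELS = (
    "Constraints considered",
    "Rejected alternatives",
    "Verification proof",
)


def missing_stub_lines(pr_body: str) -> list[str]:
    """Return the stub labels that are absent or left blank in the PR body."""
    missing = []
    for label in REQUIRED_STUB_LABELS:
        # Match "Label: some text" on a single line, allowing a leading bullet.
        pattern = rf"^[ \t]*(?:[-*][ \t]*)?{re.escape(label)}[ \t]*:[ \t]*\S.*$"
        if not re.search(pattern, pr_body, flags=re.IGNORECASE | re.MULTILINE):
            missing.append(label)
    return missing


# Illustrative PR body: the second line is present but empty.
body = (
    "Constraints considered: public API must stay stable\n"
    "Rejected alternatives:\n"
    "- Verification proof: pytest tests/api passed locally\n"
)
print(missing_stub_lines(body))  # ['Rejected alternatives']
```

The check only proves the lines exist and are non-empty. Whether the tradeoffs are any good is still the reviewer's call.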
Rule files compete with chat memory, so `.mdc` rules in Cursor read as precise but mean different things to different reviewers. A scope ledger in the parent chat carries five lines: goal, allowed paths, forbidden paths, verification command, merge owner. Review then shifts from arguing about prompts to checking the ledger against the diff.
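Checking the ledger against the diff can be mechanical. The following is a minimal sketch, assuming the ledger's path lines are shell-style globs; the `ScopeLedger` class, the `out_of_scope` helper, and the sample values are hypothetical.

```python
from dataclasses import dataclass
from fnmatch import fnmatchcase


@dataclass(frozen=True)
class ScopeLedger:
    """The five ledger lines as a record (hypothetical shape)."""
    goal: str
    allowed_paths: tuple[str, ...]
    forbidden_paths: tuple[str, ...]
    verification_command: str
    merge_owner: str


def out_of_scope(ledger: ScopeLedger, diff_paths: list[str]) -> list[str]:
    """Return changed paths that hit a forbidden glob or match no allowed glob."""
    violations = []
    for path in diff_paths:
        # Note: in fnmatch patterns, "*" also matches "/" so it spans subfolders.
        forbidden = any(fnmatchcase(path, glob) for glob in ledger.forbidden_paths)
        allowed = any(fnmatchcase(path, glob) for glob in ledger.allowed_paths)
        if forbidden or not allowed:
            violations.append(path)
    return violations


# Illustrative ledger; every value here is made up.
ledger = ScopeLedger(
    goal="Fix rounding in tax calculation",
    allowed_paths=("src/billing/*", "tests/billing/*"),
    forbidden_paths=("src/billing/migrations/*",),
    verification_command="pytest tests/billing",
    merge_owner="billing-lead",
)
diff = ["src/billing/tax.py", "src/billing/migrations/0007.py", "src/auth/session.py"]
print(out_of_scope(ledger, diff))
# ['src/billing/migrations/0007.py', 'src/auth/session.py']
```

Forbidden globs win over allowed ones here, so a broad allow-list cannot quietly cover a folder the ledger fenced off.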
On shared machines, approving bash commands in Claude Code becomes muscle memory. A supremacy clause at the top of CLAUDE.md states which hooks win, which folders need human eyes, and where temporary overrides live, so a session cannot invent policy mid-run.
Here is a delegation snapshot you can drop in and adapt:
---
description: Delegation boundary snapshot (adapt globs to your repo)
globs:
- "**/*"
alwaysApply: false
---
- Cursor: keep scopes explicit in `.mdc`; forbid undeclared MCP domains.
- Claude Code: cite `CLAUDE.md` precedence before expanding bash scope.
- Codex: ensure `AGENTS.md` carries replay-friendly verification notes for CLI runs.
These four contracts are the working core of agentic coding governance, and they pay off first at the handoff between agents.
Give the reviewer questions with file-backed answers
Governance is real when a reviewer can ask four blunt questions and get answers from files, not from memory. If any answer requires "let me find the chat," the trail is broken.
| Gate | Question |
|---|---|
| Rules precedence | Which .mdc, SKILL.md, or CLAUDE.md governed behavior? |
| Connector truth | Which MCP servers fired, and were they expected? |
| Reviewer path | Can someone unfamiliar trace intent without chat replay? |
| Risk routing | Were red folders touched, and who approved? |
Pair those questions with a short receipt the author fills in before requesting review:
- Red-folder paths received explicit human acknowledgement.
- Scopes in the PR body match folders in the diff.
- Primary-doc links were smoke-checked after publishing edits.
- MCP connectors mentioned (if any) list owners.
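Three of those receipt items can be checked by a script before a human looks at the PR (link smoke-checks are left out). This is a sketch under stated assumptions: scopes and red folders are plain folder paths, and the `ReviewReceipt` shape, helper names, and sample data are hypothetical.

```python
from dataclasses import dataclass, field
from pathlib import PurePosixPath


@dataclass
class ReviewReceipt:
    """Author-filled receipt (hypothetical shape)."""
    declared_scopes: list[str]         # folders named in the PR body
    acknowledged_red_paths: list[str]  # red-folder paths a human signed off on
    connector_owners: dict[str, str] = field(default_factory=dict)  # connector -> owner


def is_under(path: str, folder: str) -> bool:
    """True when path is the folder itself or sits somewhere inside it."""
    file_path, folder_path = PurePosixPath(path), PurePosixPath(folder)
    return file_path == folder_path or folder_path in file_path.parents


def receipt_problems(receipt: ReviewReceipt, diff_paths: list[str],
                     red_folders: list[str]) -> list[str]:
    """List every receipt item the diff contradicts."""
    problems = []
    for path in diff_paths:
        if not any(is_under(path, scope) for scope in receipt.declared_scopes):
            problems.append(f"undeclared scope: {path}")
        touches_red = any(is_under(path, red) for red in red_folders)
        if touches_red and path not in receipt.acknowledged_red_paths:
            problems.append(f"red path without acknowledgement: {path}")
    for connector, owner in receipt.connector_owners.items():
        if not owner.strip():
            problems.append(f"connector without owner: {connector}")
    return problems


# Illustrative data only.
receipt = ReviewReceipt(
    declared_scopes=["src/billing", "infra"],
    acknowledged_red_paths=[],
    connector_owners={"issue-tracker": "platform-team", "docs-search": ""},
)
diff = ["src/billing/tax.py", "infra/prod/dns.tf", "src/auth/session.py"]
for problem in receipt_problems(receipt, diff, red_folders=["infra/prod"]):
    print(problem)
# red path without acknowledgement: infra/prod/dns.tf
# undeclared scope: src/auth/session.py
# connector without owner: docs-search
```

An empty list does not approve the change. It only confirms the receipt and the diff tell the same story, which is the precondition for a reviewer to spend time on judgment instead of archaeology.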
Keep the hard calls with humans
Some decisions never go on autopilot. Threat models, customer promises, and blast-radius calls stay with people. Think of agents as the relief crew that does the digging while the humans standing outside the trench still own the blueprint.
That line is what makes the rest of the governance trustworthy. The receipts prove the routine work; the humans own the parts a receipt can never capture.
Common questions
- What is agentic coding governance?
  Agentic coding governance is the set of written contracts that lets a reviewer defend an agent's merge without replaying the chat: scope ledgers, child receipts, decision stubs, and precedence rules like a `CLAUDE.md` supremacy clause. The test is simple. Can someone unfamiliar trace intent from the repo alone, with no access to the original conversation?
- What should a PR carry before an agent merge is approved?
  Three forced lines from the decision stub: constraints considered, rejected alternatives, and verification proof. The scope receipt adds the rest: scopes in the PR body matching folders in the diff, red-folder paths with explicit human acknowledgement, and any MCP connectors listing their owners. That is usually enough for a clean approval.
- What is a scope ledger and when do you need one?
  A scope ledger is a five-line note in the parent chat: goal, allowed paths, forbidden paths, verification command, merge owner. You need it once rule files start competing with chat memory, because review then shifts from debating prompts to checking the ledger against the diff. It takes a minute to write and saves an afternoon of guessing.
- Does governance slow down agent-driven delivery?
  No. The receipts are short by design: five ledger lines, three stub lines, one receipt block per child. The real drag is the alternative, PR archaeology, where reviewers reconstruct intent from chat logs after the fact. Traceability done before the run is cheaper than judgment recovered after it.
- Which rule file wins when Cursor, Claude Code, and Codex disagree?
  Whichever one your precedence rule names first, which is why you write that rule down. State it in `CLAUDE.md` for Claude Code, in `.mdc` scope for Cursor, and in `AGENTS.md` for Codex, then make each cite its source before expanding scope. Without a declared order, the agents quietly invent one.
Docs to keep open
A few primary sources worth a tab while you set this up:
- OpenAI Developers: Codex quickstart
- OWASP: Top 10 for Large Language Model Applications
- NIST: AI Risk Management Framework
- Model Context Protocol specification
Next step
Pick one live repo and add the decision stub to its PR template this week, since it is the cheapest fix with the fastest feedback. If you want help installing the rest without stalling the roadmap, start at the contact page or see how we teach it in our training.