Agent Boundaries That Hold

Set rules, scopes, and checks for agentic coding tools.

Rogier Muller · May 1, 2026 · 5 min read

The situation

Teams are no longer choosing between “AI in the editor” and “AI in the terminal.” They are standardizing how agents read instructions, touch tools, and hand work back for review. That is the governance problem.

The failure mode is familiar: one shared rule file grows until nobody trusts it, one connector gets broad access because setup was easier than review, and one agent-generated change lands without a clear verification path. The result is unclear ownership: nobody can say which rules applied, which tools acted, or who is accountable for the change.

This matters most for teams comparing or mixing Cursor, Claude Code, and Codex. Each one exposes a different surface, but the operating question is the same: what should be persistent, what should be scoped, what should be delegated, and what must be checked before merge?

The goal is a small set of artifacts that make agent work reviewable: instruction files, skills, connector scopes, and a verification loop.

Walkthrough

  1. Start with one shared instruction layer, then split by scope.

For any repo, write down the durable rules that every agent should see: architecture constraints, test expectations, and “do not do this” items. Keep them short. If a rule is only relevant in one folder, move it there.

For Claude Code, that usually means a compact CLAUDE.md for global context plus narrower project notes where needed. For Codex, use AGENTS.md and nested instruction files so local scope wins over a flat root file. For Cursor, move from a single broad rule file to scoped .cursor/rules/*.mdc files with clear descriptions and globs. A compact root file for Codex might look like this:

# AGENTS.md
- Keep changes small and reviewable.
- Run the project test command before asking for review.
- Do not edit generated files unless the task explicitly requires it.

## services/payments/
- Preserve backward-compatible API responses.
- Add or update tests for any behavior change.
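
The same scoped rules can be expressed in Cursor's native format so behavior stays consistent when teams mix surfaces. A minimal sketch, assuming Cursor's .mdc rule layout with description and glob frontmatter; the path and rule text are illustrative:

---
description: Conventions for the payments service; apply when editing its API surface.
globs: services/payments/**
alwaysApply: false
---
- Preserve backward-compatible API responses.
- Add or update tests for any behavior change.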

  2. Treat skills as packaged behavior, not loose prompts.

Skills are useful when a team repeats the same workflow and wants the agent to load only the relevant instructions. That is true across tools, even if the packaging differs.

A good skill description should say when it applies, what it changes, and what success looks like. If the description is vague, the skill will be hard to discover and easy to ignore. Anthropic’s skills docs emphasize progressive disclosure; OpenAI’s skills material points in the same direction for reusable task bundles.

A minimal skill skeleton is enough to start:

---
name: release-note-draft
description: Draft release notes from merged PRs when the change set is already approved.
---
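
Below the frontmatter, the markdown body carries the instructions the agent loads when the skill fires. A hedged sketch of what might follow the skeleton above; the steps are illustrative:

## Steps
1. Collect titles and descriptions from merged PRs in the release window.
2. Group changes by user-facing area and flag anything without a linked PR.
3. Produce a draft in the team's release-note template, then stop for human review.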

  3. Put tool access behind explicit boundaries.

MCP is where governance becomes concrete. It is the connector layer for GitHub, Slack, docs, databases, and other external systems. That makes it powerful and risky.

Review each connector with the same questions: what data can it read, what can it write, and what is the smallest scope that still works? In Claude Code, that means checking .mcp.json and permission modes before broadening access. In Cursor, it means deciding which MCP servers belong in the workspace and which should stay out. In Codex, it means keeping the connector boundary visible in the CLI workflow and not assuming a tool can safely act just because it can connect.
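
For the Claude Code case, the project-scoped connector list lives in .mcp.json. A minimal sketch; the server name, package, and token variable are placeholders, and a real entry deserves the same read/write/scope review as any other connector:

{
  "mcpServers": {
    "issue-tracker": {
      "command": "npx",
      "args": ["-y", "example-issue-tracker-mcp"],
      "env": { "TRACKER_TOKEN": "${TRACKER_TOKEN}" }
    }
  }
}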

  4. Add a deterministic verification loop.

Agent work should end in a repeatable check, not a vague “looks good.” For Codex, that often means a CLI loop that edits, runs tests, and returns a diff you can inspect. For Claude Code, pair the change with a review checklist and any hooks that enforce formatting, validation, or permission checks. For Cursor, use plan mode or a background agent for the change, then require a human review pass on the resulting diff.
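
A minimal sketch of that loop as a script, assuming the Codex CLI's non-interactive exec subcommand and a generic npm test command; swap both for whatever the repo actually uses:

# One agent turn, then deterministic checks; every step leaves evidence.
codex exec "Fix the failing payments tests; keep the diff small."
npm test            # the repeatable check, not "looks good"
git diff --stat     # inspect what was touched before review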

A practical review checklist is short:

  • Did the agent stay inside the intended files?
  • Did it follow the repo’s instruction chain?
  • Did it use only approved connectors?
  • Did tests or validation actually run?
  • Is the diff small enough to review in one pass?

  5. Map each product to one concrete artifact.
  • Cursor: a scoped .cursor/rules/*.mdc file plus a team AGENTS.md note for shared conventions.
  • Claude Code: CLAUDE.md, a hook policy (sketched below), and an MCP permission review checklist.
  • Codex: AGENTS.md, AGENTS.override.md for temporary exceptions, and a verification loop in the CLI.
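
For the hook policy, Claude Code reads hook definitions from its settings file. A minimal sketch, assuming the documented PostToolUse event shape; the matcher and command are illustrative:

{
  "hooks": {
    "PostToolUse": [
      {
        "matcher": "Edit|Write",
        "hooks": [{ "type": "command", "command": "npm run lint" }]
      }
    ]
  }
}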

That mapping keeps the artifacts of work, not the tools, at the center. Teams can swap surfaces later without rewriting the governance model.

  6. Make the review step part of the method, not an afterthought.

The easiest place to lose control is the handoff from agent to reviewer. A small methodology habit helps here: in the Review step, require the agent to summarize what changed, what it touched, and what it could not verify. That summary should be checked against the diff, not accepted on trust.
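
One possible shape for that summary, as a markdown fragment the agent fills in and the reviewer checks against the diff; the field names are illustrative:

## Agent change summary
- What changed: one sentence per logical change.
- Files touched: every path, including config and generated files.
- Checks run: exact commands and their results.
- Not verified: anything the agent could not test or inspect.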

Tradeoffs and limits

This pattern breaks down when teams try to encode every preference in one place. Large instruction files become stale. Too many rules make agents less predictable, not more.

It also breaks when connector scope is treated as a setup task instead of a review task. MCP access that is broad on day one tends to stay broad unless someone revisits it.

Skills help with repeatable work, but they are not a substitute for repo rules. A skill can teach a workflow; it cannot replace architecture constraints or security boundaries.

Finally, no agent surface removes the need for human review. The best you can do is make the review cheaper: smaller diffs, clearer instructions, explicit permissions, and a verification loop that leaves evidence.
