Orchestrate Coding Agents Safely

A practical team convention for orchestrating Claude Code agents with scoped rules, MCP limits, and review gates.

A View near Tivoli (Morning), landscape painting by Thomas Cole (1832).
Rogier Muller · June 27, 2026 · 8 min read

The safest way to run several AI coding agents is to give each one a narrow job, one shared repo convention, and one human review gate. Do not start by wiring every tool to every system.

Multi-agent orchestration is the practice of coordinating multiple agents, tools, and permissions around one engineering task. For Claude Code, Anthropic's coding agent, that usually means scoped CLAUDE.md files, a few reusable skills, explicit MCP boundaries, and review habits your team can repeat during AI coding training.

Give every agent one job

Start with roles, not tools. A good first split is builder, reviewer, and researcher.

The builder edits code. The reviewer reads the diff and asks for proof. The researcher can inspect docs, tickets, incidents, and design notes, but does not write production code.

This matters because agentic coding gets messy when every agent can plan, edit, test, browse, and approve. You get speed, but you also get blurred accountability.

A practical Claude Code workflow might look like this: one session implements a small billing change, a second session reviews the diff against services/billing/CLAUDE.md, and a human owner approves the final PR. If your team also uses Cursor, Anysphere's AI code editor, or OpenAI Codex, OpenAI's coding agent, keep the same roles and names across tools.

The trap is building a tiny org chart of agents. Three clear roles beat eight clever ones.

Put the operating model in the repo

Your convention should live next to the code. Claude Code can use CLAUDE.md as durable project context, while AGENTS.md can carry shared cross-tool rules for other AI coding agents.

Keep root rules short. Put local architecture rules in nested files, such as services/payments/CLAUDE.md or apps/web/CLAUDE.md, so the right constraint appears where the work happens.
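If reviewers want to see which scoped files apply to a given path, a small helper can list them. The sketch below is an illustration for humans and CI, not a description of how Claude Code itself loads memory files. The function name applicable_rule_files and the root-first ordering are assumptions.

```python
from pathlib import Path


def applicable_rule_files(target: Path, repo_root: Path, name: str = "CLAUDE.md") -> list[Path]:
    """Return rule files from the repo root down to the target's directory."""
    repo_root = repo_root.resolve()
    target = target.resolve()
    # Start from the directory that contains the target file.
    current = target if target.is_dir() else target.parent
    if current != repo_root and repo_root not in current.parents:
        raise ValueError(f"{target} is outside {repo_root}")
    found = []
    while True:
        candidate = current / name
        if candidate.is_file():
            found.append(candidate)
        if current == repo_root:
            break
        current = current.parent
    # Root first, most local last, so later entries are the more specific rules.
    return list(reversed(found))
```

Calling it with a file under services/payments would return the root CLAUDE.md first and services/payments/CLAUDE.md last, which mirrors the idea that local files carry the more specific constraint.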

Copy the convention below as a starter and trim it before you add more.

File: /CLAUDE.md
Purpose: Team convention for agentic coding in this repository.

Default behavior:
- Before editing, restate the task, touched files, and risk level in 3 bullets.
- Prefer the smallest safe diff. Do not combine refactors with feature work.
- Run the nearest relevant test command before asking for review.
- If tests cannot run, say why and name the missing command, service, or credential.

Scoped rules:
- Check for a nested CLAUDE.md before changing a package, service, or app.
- Local CLAUDE.md files override root guidance for architecture, test commands, and ownership.
- Do not infer security, billing, or data-retention policy from nearby code. Ask or read the named policy doc.

Agent roles:
- Builder agent: may edit code, tests, fixtures, and docs inside the assigned task scope.
- Reviewer agent: may inspect diffs, run tests, and comment. It must not approve its own generated code.
- Researcher agent: may read docs, tickets, logs, and MCP resources. It must not write production code.

MCP/tool boundary:
- Read-only MCP access is allowed for docs, issue trackers, and code search when relevant to the task.
- Write actions through MCP, including ticket edits, branch changes, deploys, or database writes, require explicit human approval.
- Secrets, production data, and customer records must not be pasted into prompts or saved in memory files.

Hooks and commands:
- A pre-tool-use hook may block writes outside the current repo, package, or ticket scope.
- A pre-commit hook must run formatting and the nearest fast test suite when available.
- Slash commands should encode repeatable workflows, such as /review-diff, /write-test-plan, and /explain-risk.

Review gate:
- A PR is not ready until it includes the agent role used, tests run, files intentionally changed, and known gaps.
- Human review is required for auth, payments, migrations, privacy, deletion, permissions, and dependency updates.
- The reviewer asks for receipts: failing test before fix when practical, passing test after fix, and links to relevant docs or tickets.

Starter checklist for each agent-assisted PR:
- [ ] Task scope was stated before edits.
- [ ] Nested CLAUDE.md or AGENTS.md files were checked.
- [ ] MCP use stayed within the approved read/write boundary.
- [ ] Tests or a clear test limitation were recorded.
- [ ] Security, data, and migration risks were called out.
- [ ] A human owner reviewed the final diff.
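The rule about blocking writes outside the current scope comes down to a path check. Here is a minimal sketch of that check, assuming the hook can hand you the path an agent wants to write. The name is_write_allowed and its parameters are hypothetical, and the wiring into your tool's hook interface is left out because it depends on the tool.

```python
from pathlib import Path


def is_write_allowed(path: str, repo_root: str, allowed_dirs: list[str]) -> bool:
    """True only if path resolves inside one of the allowed directories."""
    root = Path(repo_root).resolve()
    # resolve() collapses ".." segments, so traversal cannot escape the scope.
    # An absolute path replaces root entirely and then fails the scope test.
    target = (root / path).resolve()
    for rel in allowed_dirs:
        scope = (root / rel).resolve()
        # Compare whole path components, not string prefixes, so that
        # "services/billing-old" does not pass as "services/billing".
        if target == scope or scope in target.parents:
            return True
    return False
```

The function fails closed: a path that matches no allowed directory is rejected, which is the behavior you want from a guard that runs before every write.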

The adoption path should be boring. A module owner or staff engineer proposes the convention, the team reviews it in the same PR process as code, and the accepted version lives in /CLAUDE.md plus scoped files for high-risk areas.

If you use multiple tools, mirror the short cross-tool rules into /AGENTS.md. Do not maintain five different policies for five different agents.

The enforcement rule is simple: no agent-assisted PR gets reviewed until the checklist is present in the PR description. This is lightweight AI coding governance, and it scales better than asking everyone to remember the perfect prompt.
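A CI step can enforce that rule mechanically. The sketch below assumes the PR description arrives as a plain string and that the checklist uses Markdown task-list syntax. The function name checklist_gaps is hypothetical, and fetching the description from your code host is out of scope here.

```python
import re

REQUIRED_ITEMS = [
    "Task scope was stated before edits.",
    "Nested CLAUDE.md or AGENTS.md files were checked.",
    "MCP use stayed within the approved read/write boundary.",
    "Tests or a clear test limitation were recorded.",
    "Security, data, and migration risks were called out.",
    "A human owner reviewed the final diff.",
]

# Matches a task-list line such as "- [x] Some item" or "- [ ] Some item".
_ITEM = re.compile(r"^\s*[-*]\s*\[( |x|X)\]\s*(.+?)\s*$")


def checklist_gaps(pr_description: str) -> dict[str, list[str]]:
    """Report checklist items that are missing, or present but unticked."""
    state = {}
    for line in pr_description.splitlines():
        match = _ITEM.match(line)
        if match:
            state[match.group(2)] = match.group(1).lower() == "x"
    missing = [item for item in REQUIRED_ITEMS if item not in state]
    unchecked = [item for item in REQUIRED_ITEMS if state.get(item) is False]
    return {"missing": missing, "unchecked": unchecked}
```

Gate review on the "missing" list only. The human-review item is expected to stay unticked until review actually happens, so "unchecked" is better treated as information for the reviewer than as a blocker.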

Keep MCP powerful but narrow

Model Context Protocol, or MCP, is a standard way for AI systems to connect to external tools and data sources. In practice, it is the bridge from a coding agent to GitHub, docs, issue trackers, databases, design files, or internal knowledge bases.

Treat MCP like production access. A read-only docs server is low risk. A database server with write permissions is not.

For a team workshop, draw two columns on a whiteboard: reads allowed by default, writes require approval. Put issue search, docs lookup, and code search on the left. Put deploys, ticket mutation, database writes, branch protection changes, and customer messaging on the right.
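The two whiteboard columns translate directly into a small policy table. This is a sketch under stated assumptions: the action labels are made-up names for the items in the paragraph above, not real MCP tool identifiers, and decide is a hypothetical helper rather than part of any MCP SDK.

```python
from enum import Enum
from typing import Optional


class Decision(Enum):
    ALLOW = "allow"
    NEEDS_APPROVAL = "needs_approval"


# Left column: reads allowed by default.
DEFAULT_READS = {"issue_search", "docs_lookup", "code_search"}

# Right column: writes that require explicit human approval.
# Listed for documentation; decide() treats every non-read the same way.
APPROVAL_REQUIRED = {
    "deploy",
    "ticket_mutation",
    "database_write",
    "branch_protection_change",
    "customer_messaging",
}


def decide(action: str, approved_by: Optional[str] = None) -> Decision:
    """Allow default reads; everything else needs a named human approver."""
    if action in DEFAULT_READS:
        return Decision.ALLOW
    # Unlisted actions are treated like writes, so the policy fails closed.
    return Decision.ALLOW if approved_by else Decision.NEEDS_APPROVAL
```

Treating unknown actions as writes is a deliberate choice: a newly added MCP server starts on the right-hand column until someone moves it.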

The trap is confusing convenience with safety. A local experiment that edits a personal repo can be loose; an engineering team training path should make the permission boundary visible before anyone ships with it.

Train reviews, not just prompts

A prompt library helps, but review behavior is what keeps the system honest. Train engineers to ask what changed, why it changed, what the agent could not verify, and which rule constrained it.

Use one review checklist for Claude Code, Cursor, Codex, and any other AI IDE your team allows. The code review guardrails should be tool-agnostic: ownership, tests, secrets, data access, migrations, and rollback.

For a deeper review pattern, pair this convention with "Run Review Agents With Receipts." If you are building a broader program, keep the operating model under the related training topic so engineers can find the same rules during onboarding.

The trap is treating coding-agent news as an adoption plan. A new model or plugin may be useful, but your team still needs boring conventions before autonomy feels safe.

Common questions

  • What are coding agents?

    Coding agents are AI systems that can plan, edit, run commands, and explain software changes with some autonomy. The important distinction is action: a chat assistant suggests code, while a coding agent may touch files, invoke tools, run tests, or inspect a repo under the permissions you give it.

  • How many agents should one developer run at once?

    Most developers should start with one builder agent and one reviewer agent. Two roles are enough to separate creation from critique, and they keep the human developer in control of scope, merge decisions, and risky tool permissions.

  • Should we choose from a "top 10 AI coding agents 2026" list?

    Use those lists for discovery, not governance. Your safer decision is to define the workflow first: repo memory, MCP boundaries, test expectations, review gates, and ownership. After that, compare tools against the convention instead of rewriting the convention for every tool.

  • Where should team rules live: CLAUDE.md, AGENTS.md, or prompts?

    Put durable Claude Code rules in CLAUDE.md, cross-tool conventions in AGENTS.md, and task-specific instructions in the prompt or PR. Use nested memory files for local architecture constraints, because a billing service and a frontend package rarely need the same guidance.

  • Can personal multi-agent setups become team workflows?

    Yes, but only after you remove personal assumptions. Replace private aliases, broad credentials, and informal habits with repo files, named roles, MCP allowlists, hooks, and a review checklist. Personal speed is useful; team repeatability is the thing worth keeping.

Make the next PR your pilot

Pick one low-risk repo, add the convention, and require the checklist on the next agent-assisted PR. Keep what the team actually follows, then tighten the MCP and review rules before expanding.

One methodology lens

One useful way to read this through our methodology is the Plan step: delegate first-pass decomposition and dependency mapping, review the sequencing and assumptions, and keep ownership of scope and priorities. If that split is still fuzzy, the workflow usually is too.
