Best practices for agentic coding in real environments
An operating guide to best practices for agentic coding in real environments: rule-file precedence, scope ledgers, replay receipts, connector cards.

Someone asks in review why the agent touched this file, and the only answer lives in a chat session nobody saved. Claude Code best practices for agentic coding start in the repo for exactly this reason: a CLAUDE.md supremacy clause, scoped boundaries, and receipts a reviewer can replay. A real environment for agentic coding is a live repo where every agent run leaves a trail the team can inspect without the session. This is the receipts-first clause for agent work.
The question chat cannot answer
Review cannot reconstruct what the repo never recorded. Reviewers inheriting unfamiliar repos keep logging the same surprise: dependency MCP calls that look harmless until credentials enter the transcript.
Counter-thesis: a real environment is not where agents run hardest; it is where their runs stay inspectable after the session ends.
The wrong path: we believed velocity would compound if we parallelized agent streams. We kept parallelizing until PR archaeology replaced architecture conversation, and the hidden tax showed up as merge-queue fatigue.
Diagnosis: the local optimization trap. Each session optimizes its own finish line while the property everyone depends on, traceability, quietly decays underneath.
Thesis: make traceability easy before you make generation easy.
What a real environment writes down
A real environment is four written artifacts, each one checkable at review without replaying anything.
Claude scope fog. `.mdc` language sounds precise until reviewers argue about what it meant, and the written rules end up competing with chat memory. Named fix: Scope ledger. The parent chat carries five lines: goal, allowed paths, forbidden paths, verification command, merge owner. Review shifts from debating prompts to checking ledgers against diffs.
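Checking a ledger against a diff can be mechanical. The sketch below is one possible shape, not a prescribed tool: the `ScopeLedger` class, the `out_of_scope` helper, the glob patterns, and every path and name in the example are assumptions made for illustration. It also assumes paths are matched with Python's `fnmatch`, where `*` matches across `/`.

```python
from dataclasses import dataclass
from fnmatch import fnmatch


@dataclass
class ScopeLedger:
    """The five ledger lines, held as data (hypothetical structure)."""
    goal: str
    allowed_paths: list[str]    # glob patterns the agent may touch
    forbidden_paths: list[str]  # glob patterns that win over allowed ones
    verification_command: str
    merge_owner: str


def out_of_scope(ledger: ScopeLedger, changed_files: list[str]) -> list[str]:
    """Return changed files that are forbidden or not covered by an allowed pattern."""
    violations = []
    for path in changed_files:
        forbidden = any(fnmatch(path, pat) for pat in ledger.forbidden_paths)
        allowed = any(fnmatch(path, pat) for pat in ledger.allowed_paths)
        if forbidden or not allowed:
            violations.append(path)
    return violations


# Illustrative values only; none of these paths come from a real repo.
ledger = ScopeLedger(
    goal="Fix retry backoff in the billing client",
    allowed_paths=["src/billing/*", "tests/billing/*"],
    forbidden_paths=["src/billing/secrets*"],
    verification_command="pytest tests/billing",
    merge_owner="hypothetical-owner",
)
changed = ["src/billing/client.py", "src/auth/session.py", "src/billing/secrets.py"]
print(out_of_scope(ledger, changed))
# ['src/auth/session.py', 'src/billing/secrets.py']
```

Forbidden patterns are checked first on purpose: a path that is both allowed and forbidden is treated as a violation, so the ledger fails closed.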
Claude permission creep. On shared laptops, bash approvals become muscle memory. Named fix: CLAUDE.md supremacy clause. The top of CLAUDE.md states which hooks win, which folders require human eyes, and where temporary overrides live. Sessions stop inventing policy mid-run because precedence is written.
Codex replay gaps. Rely on Codex CLI and you will merge greens where reviewers never saw the transcript. Named fix: Replay sandwich. AGENTS.md mandates an intent line, then the command transcript, then a diff summary before the PR. Review becomes reproducible without standing behind someone's terminal.
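A replay sandwich can be checked before a human reads the PR. The following is a minimal sketch under stated assumptions: the three section labels (`Intent:`, `Transcript:`, `Diff summary:`) are hypothetical, since the format your `AGENTS.md` mandates may name them differently, and `missing_or_misordered` is an illustrative helper rather than part of any tool.

```python
# Hypothetical labels; substitute whatever your AGENTS.md actually requires.
REQUIRED_SECTIONS = ["Intent:", "Transcript:", "Diff summary:"]


def missing_or_misordered(pr_body: str) -> list[str]:
    """Return labels that are absent or that do not appear after the previous label."""
    problems = []
    cursor = 0
    for label in REQUIRED_SECTIONS:
        position = pr_body.find(label, cursor)
        if position == -1:
            problems.append(label)
        else:
            # Only search for the next label after this one, which enforces order.
            cursor = position + len(label)
    return problems


good = (
    "Intent: tighten retry backoff\n"
    "Transcript:\n$ pytest tests/billing\n"
    "Diff summary: 2 files changed\n"
)
bad = "Transcript:\n$ pytest tests/billing\nIntent: tighten retry backoff\n"

print(missing_or_misordered(good))  # []
print(missing_or_misordered(bad))   # ['Transcript:', 'Diff summary:']
```

The check only proves the three layers are present and ordered. Whether the transcript actually supports the intent is still the reviewer's call.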
MCP blast radius. Wire connectors from the Model Context Protocol spec quickly and one ends up touching data nobody listed; the OWASP LLM Top 10 explains why that keeps happening. Named fix: Connector card. One markdown card per server: allowed actions, forbidden actions, owner, rollback. Incidents shrink because operators know what "off" looks like.
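The four fields of a connector card map naturally onto a small record with a deny-by-default check. This sketch assumes nothing about the MCP protocol itself: `ConnectorCard`, `is_permitted`, `card_gaps`, the server name, and the action names are all hypothetical and stand in for whatever your cards actually list.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConnectorCard:
    """One card per server (hypothetical structure mirroring the markdown card)."""
    server: str
    allowed_actions: frozenset[str]
    forbidden_actions: frozenset[str]
    owner: str
    rollback: str  # what "off" looks like for this server


def is_permitted(card: ConnectorCard, action: str) -> bool:
    """Deny by default: the action must be listed as allowed and not as forbidden."""
    return action in card.allowed_actions and action not in card.forbidden_actions


def card_gaps(card: ConnectorCard) -> list[str]:
    """Name the fields a reviewer would find empty on the card."""
    gaps = []
    if not card.allowed_actions:
        gaps.append("allowed_actions")
    if not card.owner.strip():
        gaps.append("owner")
    if not card.rollback.strip():
        gaps.append("rollback")
    return gaps


# Illustrative card; the server and actions are made up.
card = ConnectorCard(
    server="issue-tracker",
    allowed_actions=frozenset({"read_issue", "add_comment"}),
    forbidden_actions=frozenset({"delete_issue"}),
    owner="platform-team",
    rollback="disable the server entry and revoke its token",
)
print(is_permitted(card, "read_issue"))    # True
print(is_permitted(card, "delete_issue"))  # False
print(is_permitted(card, "export_users"))  # False, because it was never listed
print(card_gaps(card))                     # []
```

An unlisted action is refused rather than allowed, which is the point of the card: data nobody listed stays untouched.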
The boundary snapshot we start every repo with:
---
description: Delegation boundary snapshot (adapt globs to your repo)
globs:
- "**/*"
alwaysApply: false
---
- Claude: keep scopes explicit in `.mdc`; forbid undeclared MCP domains.
- Claude Code: cite `CLAUDE.md` precedence before expanding bash scope.
- Codex: ensure `AGENTS.md` carries replay-friendly verification notes for CLI runs.
That snapshot routes cleanly through our methodology: treat review as the gate where parallel agent output must be inspectable without replaying sessions. The wider discipline is agentic coding governance, and the measurement half of this argument is in fast evals for better coding agent decisions.
One image: an agentic repo without receipts is a cockpit without a flight recorder. Everything works until you need to know what happened.
The merge evidence
Four questions decide whether an agent merge holds up without its operator in the room.
| Gate | Question |
|---|---|
| Replay proof | Which commands prove regression guards? |
| Receipt match | Does the PR body list scopes + verification transcript? |
| Rules precedence | Which .mdc, SKILL.md, or CLAUDE.md governed behavior? |
| Connector truth | Which MCP servers fired, and were they expected? |
If your repo cannot state boundaries plainly, agents will guess, and guessing scales poorly.
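The four gates can be held as data so that an unanswered question blocks the merge instead of being skipped. This is a sketch only: the `GATES` mapping copies the table above, while `unanswered_gates` and the sample answers are assumptions for illustration.

```python
# Gate names and questions copied from the table above.
GATES = {
    "Replay proof": "Which commands prove regression guards?",
    "Receipt match": "Does the PR body list scopes + verification transcript?",
    "Rules precedence": "Which .mdc, SKILL.md, or CLAUDE.md governed behavior?",
    "Connector truth": "Which MCP servers fired, and were they expected?",
}


def unanswered_gates(answers: dict[str, str]) -> list[str]:
    """Return gate names with no answer, or with only whitespace as an answer."""
    return [gate for gate in GATES if not answers.get(gate, "").strip()]


# Hypothetical answers pulled from a PR body.
answers = {
    "Replay proof": "pytest tests/billing",
    "Receipt match": "yes, scopes and transcript are in the PR body",
    "Rules precedence": "   ",
}
print(unanswered_gates(answers))
# ['Rules precedence', 'Connector truth']
```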
Common questions
- What are Claude Code best practices for agentic coding?
  The Claude Code best practice is a supremacy clause at the top of `CLAUDE.md`: which hooks win, which folders require human eyes, and where temporary overrides live. Written precedence stops sessions from inventing policy mid-run and stops bash approvals from becoming muscle memory.
- What makes an agentic coding environment real?
  An environment is real when agent output stays inspectable without replaying sessions: scopes in explicit files, transcripts in the PR, and connectors with owners. Treat review as the gate, because the expensive failure is PR archaeology replacing architecture conversation. Agents are relief crews; the blueprint stays with humans.
- What fixes Codex replay gaps in real repos?
  Codex replay gaps close when `AGENTS.md` mandates the replay sandwich: intent line, command transcript, and diff summary before the PR. CLI convenience otherwise hides verification theater, where commands ran but the narrative did not, and review gets cheaper once it is reproducible.
- Which checks belong in the scope receipt before merge?
  The scope receipt holds four checks: red-folder paths received explicit human acknowledgement, scopes in the PR body match the diff, primary-doc links were smoke-checked, and MCP connectors list owners. The evidence table adds one more question: which `.mdc`, `SKILL.md`, or `CLAUDE.md` governed the behavior.
Best ways to use this research
- Best for: engineering teams comparing Claude, Claude Code, and Codex operating habits while standardizing best practices for agentic coding.
- Best first artifact: turn one named fix into a shared checklist, repo rule, handoff receipt, or policy table before the next automated run.
- Best comparison angle: compare review evidence, connector scope, and handoff friction across tools; keep the path that leaves the shortest auditable trail.
Next step
Pick one live repo and rehearse this in our training: one workflow, one receipt format, enforced for real. The habit sticks faster with a coach in the loop.