Source: MachineLearningMastery.com

5 Agentic Coding Tips & Tricks
Introduction
Agentic coding only feels “smart” when it ships correct diffs, passes tests, and leaves a paper trail you can trust. The fastest way to get there is to stop asking an agent to “build a feature” and start giving it a workflow it cannot escape.
That workflow should force clarity (what changes), evidence (what passed), and containment (what it can touch). The tips below are concrete patterns you can drop into daily work with code agents, whether you are using a CLI agent, an IDE assistant, or a custom tool-using model.
1. Use A Repo Map To Prevent Blind Refactors
Agents get generic when they do not understand the topology of your codebase. They default to broad refactors because they cannot reliably locate the right seams. Give the agent a repo map that is short, opinionated, and anchored in the parts that matter.
Create a machine-readable snapshot of your project structure and key entry points. Keep it under a few hundred lines. Update it when major folders change. Then feed the map into the agent before any coding.
Here’s a simple generator you can keep in tools/repo_map.py:
```python
from pathlib import Path

INCLUDE_EXT = {".py", ".ts", ".tsx", ".go", ".java", ".rs"}
SKIP_DIRS = {"node_modules", ".git", "dist", "build", "__pycache__"}

root = Path(__file__).resolve().parents[1]
lines = []
for p in sorted(root.rglob("*")):
    # Skip vendored, generated, and VCS directories
    if any(part in SKIP_DIRS for part in p.parts):
        continue
    if p.is_file() and p.suffix in INCLUDE_EXT:
        rel = p.relative_to(root)
        lines.append(str(rel))

# Cap the output so the map stays small enough to feed to the agent
print("\n".join(lines[:600]))
```
Add a second section that names the real “hot” files, not everything. Example:
Entry Points:
- api/server.ts (HTTP routing)
- core/agent.ts (planning + tool calls)
- core/executor.ts (command runner)
- packages/ui/App.tsx (frontend shell)

Key Conventions:
- Never edit generated files in dist/
- All DB writes go through db/index.ts
- Feature flags live in config/flags.ts
This reduces the agent’s search space and stops it from “helpfully” rewriting half the repository because it got lost.
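To make the map hard to skip, prepend it to the prompt automatically at the start of every run. Here is a minimal sketch that stitches the generated listing together with your hand-written notes; it assumes the generator above lives in tools/repo_map.py, and the docs/REPO_MAP.md path for the entry points and conventions is only a placeholder, so use whatever location your team prefers:

```python
import subprocess
from pathlib import Path

def build_repo_context() -> str:
    # Run the generator from above to get the current file listing
    file_list = subprocess.run(
        ["python", "tools/repo_map.py"],
        capture_output=True, text=True, check=True,
    ).stdout

    # Hand-maintained entry points and conventions (placeholder path)
    notes_path = Path("docs/REPO_MAP.md")
    notes = notes_path.read_text() if notes_path.exists() else ""

    return (
        "## Repository map (read before editing)\n"
        + file_list
        + "\n## Entry points and conventions\n"
        + notes
    )

# Prepend build_repo_context() to the system prompt or the first user
# message of every agent run so the agent never guesses the layout.
```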
2. Force Patch-First Edits With A Diff Budget
Agents derail when they edit like a human with unlimited time. Force them to behave like a disciplined contributor: propose a patch, keep it small, and explain the intent. A practical trick is a diff budget, an explicit limit on lines changed per iteration.
Use a workflow like this:
- Agent produces a plan and a file list
- Agent produces a unified diff only
- You apply the patch
- Tests run
- Next patch only if needed
If you are building your own agent loop, make sure to enforce it mechanically. Example pseudo-logic:
```python
MAX_CHANGED_LINES = 120

def count_changed_lines(unified_diff: str) -> int:
    return sum(
        1
        for line in unified_diff.splitlines()
        if line.startswith(("+", "-")) and not line.startswith(("+++", "---"))
    )

changed = count_changed_lines(diff)
if changed > MAX_CHANGED_LINES:
    raise ValueError(f"Diff too large: {changed} changed lines")
```
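Once a diff fits the budget, the rest of the loop can be just as mechanical: validate the patch, apply it, run the tests, and hand the real output back to the agent. A rough sketch, assuming a git checkout and a pytest suite (swap in your own commands):

```python
import subprocess

def apply_and_test(diff: str) -> bool:
    # Validate the patch before touching the working tree
    check = subprocess.run(["git", "apply", "--check", "-"], input=diff, text=True)
    if check.returncode != 0:
        raise ValueError("Patch does not apply cleanly")

    # Apply the patch, then run the test suite
    subprocess.run(["git", "apply", "-"], input=diff, text=True, check=True)
    tests = subprocess.run(["pytest", "-q"], capture_output=True, text=True)

    # Feed the tail of the real test output back into the next iteration
    print(tests.stdout[-2000:])
    return tests.returncode == 0
```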
For manual workflows, bake the constraint into your prompt:
- Output only a unified diff
- Hard limit: 120 changed lines total
- No unrelated formatting or refactors
- If you need more, stop and ask for a second patch
Agents respond well to constraints that are measurable. “Keep it minimal” is vague. “120 changed lines” is enforceable.
3. Convert Requirements Into Executable Acceptance Tests
Vague requests leave an agent guessing at intent, and a guessing agent rarely produces correct code. The fastest way to make an agent concrete, regardless of its design pattern, is to translate requirements into tests before implementation. Treat tests as a contract the agent must satisfy, not a best-effort add-on.
A lightweight pattern:
- Write a failing test that captures the feature behavior
- Run the test to confirm it fails for the right reason
- Let the agent implement until the test passes
Example in Python (pytest) for a rate limiter:
```python
import time

from myapp.ratelimit import SlidingWindowLimiter


def test_allows_n_requests_per_window():
    lim = SlidingWindowLimiter(limit=3, window_seconds=1)
    assert lim.allow("u1")
    assert lim.allow("u1")
    assert lim.allow("u1")
    assert not lim.allow("u1")
    time.sleep(1.05)
    assert lim.allow("u1")
```
Now the agent has a target that is objective. If it “thinks” it is done, the test decides.
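For reference, one implementation that would satisfy this contract looks like the sketch below. In practice you let the agent write its own version; the point is that the test, not the agent's self-assessment, decides whether it counts:

```python
import time
from collections import defaultdict, deque

class SlidingWindowLimiter:
    def __init__(self, limit: int, window_seconds: float):
        self.limit = limit
        self.window = window_seconds
        self._hits = defaultdict(deque)  # key -> timestamps of allowed calls

    def allow(self, key: str) -> bool:
        now = time.monotonic()
        hits = self._hits[key]
        # Drop timestamps that have fallen out of the current window
        while hits and now - hits[0] >= self.window:
            hits.popleft()
        if len(hits) < self.limit:
            hits.append(now)
            return True
        return False
```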
Combine this with tool feedback: the agent must run the test suite and paste the command output. That one requirement kills an entire class of confident-but-wrong completions.
Prompt snippet that works well:
- Step 1: Write or refine tests
- Step 2: Run tests
- Step 3: Implement until tests pass
- Always include the exact commands you ran and the final test summary
- If tests fail, explain the failure in one paragraph, then patch
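To make that last requirement easy to satisfy and easy to verify, you can capture the evidence mechanically instead of trusting the agent's summary. A small helper along these lines, assuming a pytest suite, records both the exact command and the tail of its output:

```python
import subprocess

def run_and_record(cmd: list[str]) -> str:
    # Run the test command and keep the exact invocation plus the summary
    result = subprocess.run(cmd, capture_output=True, text=True)
    tail = "\n".join(result.stdout.splitlines()[-5:])  # pytest summary lines
    return (
        f"Command: {' '.join(cmd)}\n"
        f"Exit code: {result.returncode}\n"
        f"Output tail:\n{tail}\n"
    )

# Paste the returned block into the agent conversation or the PR,
# so "tests pass" is backed by real output rather than a claim.
print(run_and_record(["pytest", "-q"]))
```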
4. Add A “Rubber Duck” Step To Catch Hidden Assumptions
Agents make silent assumptions about data shapes, time zones, error handling, and concurrency. You can surface those assumptions with a forced “rubber duck” moment, right before coding.
Ask for three things, in order:
- The assumptions it is making
- What could break those assumptions
- How it will validate them
Keep it short and mandatory. Example:
- Before coding: list 5 assumptions
- For each: one validation step using existing code or logs
- If any assumption cannot be validated, ask one clarification question and stop
This creates a pause that often prevents bad architectural commits. It also gives you an easy review checkpoint. If you disagree with an assumption, you can correct it before the agent writes code that bakes it in.
A common win is catching data contract mismatches early. Example: the agent assumes a timestamp is ISO-8601, but the API returns epoch milliseconds. That one mismatch can cascade into “bugfix” churn. The rubber duck step flushes it out.
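Validating that kind of assumption is usually a few lines against one real payload, not a debate. A sketch, where the sample value is a hypothetical record pulled from an actual API response:

```python
from datetime import datetime

def looks_like_iso8601(value) -> bool:
    # Epoch milliseconds arrive as large numbers; ISO-8601 arrives as text
    if isinstance(value, (int, float)):
        return False
    try:
        datetime.fromisoformat(str(value).replace("Z", "+00:00"))
        return True
    except ValueError:
        return False

sample = 1718236800000  # hypothetical value taken from a real API response
if not looks_like_iso8601(sample):
    print("Assumption failed: timestamp looks like epoch ms, not ISO-8601")
```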
5. Make The Agent’s Output Reproducible With Run Recipes
Agentic coding fails in teams when nobody can reproduce what the agent did. Fix that by requiring a run recipe: the exact commands and environment notes needed to repeat the result.
Adopt a simple convention: every agent run ends with a RUN.md snippet you can paste into a PR description. It should include setup, commands, and expected outputs.
Template:
```
## Run Recipe

Environment:
- OS:
- Runtime: (node/python/go version)

Commands:
1) <command>
2) <command>

Expected:
- Tests: <summary>
- Lint: <summary>
- Manual check: <what to click or curl>
```

Example for a Node API change:

```
## Run Recipe

Environment:
- Node 20

Commands:
1) npm ci
2) npm test
3) npm run lint
4) node scripts/smoke.js

Expected:
- Tests: 142 passed
- Lint: 0 errors
- Smoke: "OK" printed
```
This makes the agent’s work portable. It also keeps autonomy honest. If the agent cannot produce a clean run recipe, it probably has not validated the change.
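If you are running an agent loop yourself, the recipe does not even have to be hand-written. Here is a sketch that executes the commands and writes the snippet for you; the command list and the RUN.md file name are placeholders:

```python
import platform
import subprocess
from pathlib import Path

COMMANDS = [["npm", "ci"], ["npm", "test"], ["npm", "run", "lint"]]  # placeholders

def write_run_recipe(path: str = "RUN.md") -> None:
    lines = [
        "## Run Recipe",
        "",
        "Environment:",
        f"- OS: {platform.platform()}",
        "",
        "Commands and results:",
    ]
    for i, cmd in enumerate(COMMANDS, start=1):
        result = subprocess.run(cmd, capture_output=True, text=True)
        status = "ok" if result.returncode == 0 else f"exit {result.returncode}"
        lines.append(f"{i}) {' '.join(cmd)}  ->  {status}")
    Path(path).write_text("\n".join(lines) + "\n")

# Call this at the end of an agent run and paste RUN.md into the PR description
write_run_recipe()
```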
Wrapping Up
Agentic coding improves fast when you treat it like engineering, not vibes. Repo maps stop blind wandering. Patch-first diffs keep changes reviewable. Executable tests turn hand-wavy requirements into objective targets. A rubber duck checkpoint exposes hidden assumptions before they harden into bugs. Run recipes make the whole process reproducible for teammates.
These tricks do not reduce the agent’s capability. They sharpen it. Autonomy becomes useful once it is bounded, measurable, and tied to real tool feedback. That is when an agent stops sounding impressive and starts shipping work you can merge.
