When to trust an agent loop, and when to pull the plug
Autonomous agents can run for hours and produce great work — or quietly burn $40 of API credit going in circles. Here is how to tell them apart.
Letting an AI agent run unattended is the highest-leverage move in this whole space — and the easiest one to get wrong. Some loops produce remarkable work overnight. Others spend hundreds of tool calls re-reading the same file. Here is how to set yourself up for the first kind.
Tasks that are agent-friendly
- Mechanical refactors with a clear endpoint and an automated test that proves correctness.
- Migrations: rename a library, update a config across many files, replace a deprecated API.
- Test backfills against well-isolated modules.
- Doc generation: rewrite README sections, generate API docs, normalize JSDoc.
Tasks that are not
- Anything where 'done' is subjective ('make this feel snappier').
- Cross-system changes that touch infra, secrets, or third-party accounts the agent has not been given access to.
- Bug investigations on flaky systems: an intermittent pass looks like a fix, so the agent will mistake noise for progress.
- Greenfield product work without a clear spec.
Signals the loop is going wrong
Watch for repeated reads of the same file, a context window that keeps ballooning, agent messages that say 'let me try a different approach' more than twice in a row, and any pattern where the agent claims success and then immediately starts editing again.
These are the cues that the loop has lost the plot. The kindest thing you can do is stop it, summarize what was accomplished, and restart with a tighter prompt.
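Two of these signals are easy to count mechanically. The sketch below is one way to do it, not a feature of any particular agent framework: `LoopMonitor`, its method names, and the read threshold are all hypothetical, and it assumes your harness can hand you each file read and each agent message as it happens.

```python
from collections import Counter
from dataclasses import dataclass, field

# Hypothetical thresholds; tune them for your own agent and tasks.
MAX_READS_PER_FILE = 5
MAX_CONSECUTIVE_RETRIES = 2  # "more than twice in a row"
RETRY_PHRASE = "let me try a different approach"


@dataclass
class LoopMonitor:
    """Counts repeated file reads and back-to-back 'different approach' messages."""

    reads: Counter = field(default_factory=Counter)
    consecutive_retries: int = 0

    def record_read(self, path: str) -> None:
        # Call this each time the agent reads a file.
        self.reads[path] += 1

    def record_message(self, text: str) -> None:
        # Any message without the retry phrase resets the streak.
        if RETRY_PHRASE in text.lower():
            self.consecutive_retries += 1
        else:
            self.consecutive_retries = 0

    def warnings(self) -> list[str]:
        # Return a human-readable line for every threshold currently exceeded.
        found = []
        for path, count in self.reads.items():
            if count > MAX_READS_PER_FILE:
                found.append(f"{path} read {count} times")
        if self.consecutive_retries > MAX_CONSECUTIVE_RETRIES:
            found.append(
                f"{self.consecutive_retries} 'different approach' messages in a row"
            )
        return found
```

A non-empty `warnings()` list is a prompt to look at the run, not proof that it has failed. Some tasks legitimately re-read a file many times.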
Guard rails that pay for themselves
- A hard wall-clock budget. 30 minutes for routine tasks, 2 hours for migrations, never overnight without a spend cap.
- A spending cap at the API level. Most providers offer one; turn it on.
- A required test target. The agent is done when a specific command passes, not when it says so.
- An approval gate before any destructive action (git reset, file deletion, schema change).
The mindset shift
Treat agents like a contractor you do not know well yet. Small, well-scoped tasks. Clear acceptance criteria. A budget. Review the output before merging. Once they have earned trust on the small things, give them bigger ones.