June 3, 2026 · 8 min read · AI Dev Review

Are AI-generated tests safe to ship?

Test suites written by AI can either lock in correct behavior or paper over bugs. Here's how to tell which you're getting.

Tags: Testing, Quality, Best Practices

Asking an AI assistant to 'write tests for this file' is one of the most popular uses of coding assistants. It is also one of the easiest ways to ship a false sense of safety. A small amount of discipline makes the difference.

The failure mode nobody talks about

Given buggy code, an assistant will often write tests that assert the buggy behavior. The suite is green, coverage looks great, and the bug is now protected by a test that will fail the moment someone fixes it.

The fix is mundane: write a one-line spec of intended behavior before asking for tests. With that anchor, the assistant tests what the code should do rather than what it currently does.

Rules that work

State the contract first, then ask for tests against the contract, not the code.
When fixing a bug, ask for one failing test that reproduces it before changing the implementation.
Always read the generated tests; do not merge on coverage numbers alone.
Forbid the assistant from mocking the system under test.
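The last rule is the easiest to miss in review, so here is a sketch of what it looks like in practice. `PriceService` and its `rates` collaborator are hypothetical names; the mocking calls come from Python's standard `unittest.mock`.

```python
from unittest import mock


class PriceService:
    """Hypothetical system under test."""

    def __init__(self, rates):
        self.rates = rates  # collaborator: any object with .get(currency)

    def convert(self, amount: float, currency: str) -> float:
        return amount * self.rates.get(currency)


def test_mocks_the_system_under_test():
    # Anti-pattern: convert() itself is patched, so the assertion only
    # checks the mock's canned return value. No production code runs.
    # With an empty rates dict the real convert() would raise TypeError,
    # yet this test still passes.
    service = PriceService(rates={})
    with mock.patch.object(PriceService, "convert", return_value=110.0):
        assert service.convert(100.0, "EUR") == 110.0


def test_stubs_only_the_collaborator():
    # Better: stub the dependency and exercise the real convert().
    rates = mock.Mock()
    rates.get.return_value = 1.5
    service = PriceService(rates)
    assert service.convert(100.0, "EUR") == 150.0
    rates.get.assert_called_once_with("EUR")
```

A quick review heuristic: if the name being patched is the same name the test asserts on, the test is probably checking the mock.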

Where AI-generated tests genuinely help

Edge-case enumeration is the sweet spot. Ask for 'twenty inputs that might break this function' and you'll get coverage you would not have written by hand.
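A sketch of how such a list can be turned into a table-driven check. The function `parse_port` and the specific inputs are assumptions chosen for illustration; the expected values come from the stated contract, not from running the code.

```python
def parse_port(text: str) -> int:
    """Spec: return a TCP port in 1..65535, or raise ValueError."""
    port = int(text.strip())  # int() raises ValueError on non-numeric text
    if not 1 <= port <= 65535:
        raise ValueError(f"port out of range: {port}")
    return port


# Inputs of the kind an assistant might enumerate, sorted by what the
# contract says should happen to them.
VALID = [("1", 1), ("65535", 65535), (" 8080 ", 8080), ("0080", 80)]
INVALID = ["", " ", "0", "65536", "-1", "80.0", "0x50", "eighty", "80 80"]


def run_edge_cases():
    """Return failure descriptions; an empty list means every case passed."""
    failures = []
    for text, expected in VALID:
        try:
            got = parse_port(text)
        except ValueError as exc:
            failures.append(f"{text!r}: unexpected ValueError ({exc})")
            continue
        if got != expected:
            failures.append(f"{text!r}: expected {expected}, got {got}")
    for text in INVALID:
        try:
            got = parse_port(text)
        except ValueError:
            continue  # rejected, as the contract requires
        failures.append(f"{text!r}: expected ValueError, got {got}")
    return failures
```

Enumeration also surfaces inputs the contract never considered. Python's `int()` accepts strings such as "+80" and "8_0", so this sketch accepts them too; whether it should is a question for the spec, not for the assistant.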

Snapshot-style tests for stable serialization formats are another good fit.
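A minimal snapshot sketch using only the standard `json` module. The serializer `serialize_user` and the sample record are hypothetical, and the snapshot is inlined here where a real suite would more likely keep it in a reviewed file next to the test.

```python
import json


def serialize_user(user: dict) -> str:
    """Hypothetical serializer with a stable format: sorted keys, compact separators."""
    return json.dumps(user, sort_keys=True, separators=(",", ":"))


# The approved snapshot. Someone has to read and accept it once; after
# that, any change to the output format shows up as a failing test.
SNAPSHOT = '{"email":"ada@example.com","id":7,"roles":["admin","dev"]}'


def test_user_snapshot():
    user = {"id": 7, "roles": ["admin", "dev"], "email": "ada@example.com"}
    assert serialize_user(user) == SNAPSHOT
```

The same caution applies as with any generated test: a snapshot recorded from buggy output locks the bug in, so review the first recording against the format you intended.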

Where to avoid them

Don't lean on AI for integration tests against systems with side effects. The assistant cannot see your environment and will confidently assume things that aren't true, such as which services are reachable or what state a database starts in.

The shipping rule

Treat AI-generated tests like any other code: review them, understand them, and reject the ones you wouldn't have written yourself. Done that way, they're a genuine win.
