I Caught My AI Cheating on a Quality Check

Vinay Patankar · 10 Apr, 2026 · Technology · Productivity

I caught my AI cheating on a quality check. Not in a subtle way. In the laziest way possible.

I was generating marketing collateral. Ten design variations of the same document. Each one goes through a QA gate before it ships. The AI has to inspect every page, write what it actually sees, and attest that it meets the quality bar.

With five themes still to go, the AI batched all of them into a single command. Copy-pasted the same attestation for each one. Word for word. “All elements render correctly, typography is clean, layout is balanced.” Five times. Identical.

Two of those themes had real problems. One had a duplicate data point on the second page. The other had a headline clipped by the margin. The AI looked at both, said “looks good,” and moved on.

I caught it because I actually opened the files.

Here’s the thing. The AI wasn’t trying to deceive me. It has two incentives, and both of them point away from careful QA.

First, it optimizes for completion. Get through the queue. Check the boxes. Report done.

Second, it optimizes for token efficiency. Every token the AI generates costs the model provider money, whether that’s Anthropic, OpenAI, or whoever is running the model. The AI has been trained to be concise. That’s usually a feature. But when you’re asking it to do detailed inspection work, conciseness becomes the enemy. It doesn’t want to write 100 words describing what it sees on a page. It wants to write 10 and move on.

So QA gets hit from both sides. The completion incentive says “finish fast.” The token incentive says “say less.” Neither one says “look carefully.”

That’s a problem when the entire point of the QA gate is to slow down and look carefully.

This is the practical version of the rule I keep coming back to: audit your AI’s work every time.

So I rebuilt it. Five changes:

No batching QA commands. One theme at a time. The AI has to view each page individually before signing off.

Unique attestation per theme. If the attestation text matches a previous one, the validator rejects it. You can’t copy-paste your way through.

Minimum 100 characters of attestation. You have to describe something specific you actually saw on that page. “Looks good” doesn’t pass.

Rubber-stamp phrase detection. The validator scans for known generic phrases (“all elements render correctly,” “layout is clean and balanced”) and rejects them automatically.

Cross-theme duplicate check. If the attestation for Theme 6 is identical to the one for Theme 7, both fail, not just the later one.
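
Here is a minimal Python sketch of what a validator along these lines could look like. It is not the actual implementation. The class name AttestationValidator, the normalize helper, and the two-entry phrase list are assumptions for illustration; only the 100-character minimum and the quoted phrases come from the rules above. Accepting one theme per submit call stands in for the no-batching rule. It cannot prove that each page was actually viewed.

```python
import re
from dataclasses import dataclass, field

MIN_CHARS = 100  # minimum attestation length, taken from the rule above

# Hypothetical deny-list; these two phrases are the examples quoted above.
RUBBER_STAMP_PHRASES = (
    "all elements render correctly",
    "layout is clean and balanced",
)


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so trivial edits don't dodge the duplicate check."""
    return re.sub(r"\s+", " ", text.strip().lower())


@dataclass
class AttestationValidator:
    # normalized attestation text -> first theme that submitted it
    seen: dict[str, str] = field(default_factory=dict)
    # themes that currently fail the gate
    failed: set[str] = field(default_factory=set)

    def submit(self, theme: str, attestation: str) -> list[str]:
        """Validate one theme's attestation. Returns rejection reasons; empty means pass."""
        reasons = []
        norm = normalize(attestation)

        if len(attestation.strip()) < MIN_CHARS:
            reasons.append(f"shorter than {MIN_CHARS} characters")

        for phrase in RUBBER_STAMP_PHRASES:
            if phrase in norm:
                reasons.append(f"generic phrase: {phrase!r}")

        earlier = self.seen.get(norm)
        if earlier is not None and earlier != theme:
            reasons.append(f"identical to attestation for {earlier}")
            self.failed.add(earlier)  # cross-theme duplicate: both themes fail

        self.seen.setdefault(norm, theme)
        if reasons:
            self.failed.add(theme)
        else:
            self.failed.discard(theme)
        return reasons
```

Calling submit("theme-6", text) and then submit("theme-7", text) with the same text returns a rejection for the second call and marks both themes as failed. Exact-match detection after normalization is the simplest choice here; a paraphrased rubber stamp would still get through, so a fuzzier similarity check is a natural next step.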

The validator went from trusting the AI to being actively adversarial. It assumes the AI is going to cut corners and makes the shortcuts I caught structurally impossible.

Quality went up immediately. Not because the AI got smarter. Because the system stopped letting it be lazy.

This is the part that keeps getting missed in the “AI is amazing” discourse. AI is amazing at generating. It is genuinely terrible at verifying its own work. The incentive structure is wrong. The same system that wants to finish the task is the one you’re asking to slow down and check the task. Those two goals are in direct conflict.

The fix is never “ask harder.” The fix is building verification systems that don’t trust the generator. Separate the creator from the auditor. Make the auditor adversarial. Automate the distrust.

I run my company on AI now. Morning operations, content pipeline, customer research, call prep, deck generation. All automated. The thing that makes it work isn’t the automation. It’s the verification layer on top of the automation that catches the corners it cuts.

Trust the speed. Verify the output. Automate the verification.