AI coding agents produce buggy code not because the models are bad, but because the input is bad. The real problem is upstream: vague, incomplete, and fragmented requirements.
Every week, a new article appears: "AI-generated code is full of bugs." "AI coding tools produce insecure code." "You still need human developers to fix AI output."
These articles are right about the observation but wrong about the cause. The bugs aren't a model quality problem. They're an input quality problem.
AI coding agents are fundamentally input-output machines. The quality of the output is bounded by the quality of the input. This isn't a limitation — it's a law.
When you give an AI agent a one-line instruction with no scope, no acceptance criteria, and no record of your team's conventions, it has to invent all of those things itself. The AI isn't hallucinating features. It's filling in the gaps you left.
After analyzing hundreds of "AI bug" reports, we've categorized them into five types. None of them are model failures:
**1. Scope bugs.** The AI builds features that weren't requested, or misses features that were implied but not stated.

Root cause: No explicit Scope and Out of Scope sections. The agent guesses what's in scope.
**2. Integration bugs.** The AI-generated code works in isolation but fails when integrated with the existing system.

Root cause: The agent doesn't know about the existing system's constraints, conventions, or interfaces. No shared context.
**3. Edge case bugs.** The happy path works, but edge cases (empty inputs, concurrent access, network failures) aren't handled.

Root cause: No acceptance criteria that specify edge case behavior. The agent implements the obvious path.
**4. Convention bugs.** The code works but doesn't follow the team's conventions (naming, architecture, error handling patterns).

Root cause: No project memory or system prompt defining team conventions.
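Project memory is just durable context the agent loads before every task. A minimal hypothetical sketch of what such a file might contain (the entries are illustrative; use whatever convention file your tooling reads):

```
# Project memory (illustrative example)
Naming: snake_case functions, PascalCase classes
Errors: raise domain exceptions (e.g. UserNotFoundError); never return None on failure
Architecture: route -> service -> repository; no DB access in route handlers
```

A few lines like these are enough to turn "the AI doesn't write code like us" into "the AI follows the rules we wrote down."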
**5. Model errors.** The AI genuinely produces incorrect logic — wrong algorithm, misunderstood API, etc.

Root cause: Actual model limitation. This is the only category that's the AI's "fault."
95% of "AI bugs" are input bugs, not model bugs. Fixing the input fixes the output.
Model improvements address type 5 bugs (5% of the total). They don't help with types 1–4 because those bugs come from missing information, not insufficient reasoning.
GPT-5, Claude 5, Gemini 3 — none of them can implement features you didn't describe. No model can guess your team's conventions if you don't provide them. No model can handle edge cases you didn't mention.
The ceiling for AI code quality is set by spec quality, not model quality.
Each bug type has a corresponding fix in the spec:
| Bug Type | Fix | Spec Section |
|----------|-----|--------------|
| Scope bugs | Explicit scope and boundaries | Scope + Out of Scope |
| Integration bugs | System context and constraints | Project Memory + Approach |
| Edge case bugs | Explicit scenarios | Acceptance Criteria (Given/When/Then) |
| Convention bugs | Team standards | Project Memory |
| Model errors | Better models | (Wait for AI labs) |
A structured spec with Project Memory and Acceptance Criteria eliminates 95% of "AI bugs" before a single line of code is written.
Without a spec (typical AI bug report):

```
"I asked the AI to add a delete button to the user profile page. It added the button, but clicking it deletes the user without confirmation. It also doesn't check permissions — any user can delete any other user."
```
The developer blames the AI. But the instruction was "add a delete button." The AI added a delete button. It worked. The "bugs" are requirements the developer didn't specify.
With a spec:

```
Scope:
- Add a "Delete Account" button to the user's own profile settings page
- Require confirmation before the account is deleted

Out of Scope:
- Deleting other users' accounts

Acceptance Criteria:

Given a user is viewing their own profile settings
When they click "Delete Account"
Then a confirmation modal appears requiring email input

Given a user is viewing another user's profile
When they look for a Delete button
Then no Delete button is visible
```
Same feature. Same AI. Dramatically different result. The spec is the fix.
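A side benefit: Given/When/Then criteria translate almost mechanically into tests. A minimal sketch, assuming hypothetical helper functions (the names and signatures are illustrative; the spec above doesn't prescribe an implementation):

```python
def can_see_delete_button(viewer_id: int, profile_owner_id: int) -> bool:
    """Criterion 2: the Delete button is visible only on your own profile."""
    return viewer_id == profile_owner_id


def confirmation_matches(entered_email: str, account_email: str) -> bool:
    """Criterion 1: deletion proceeds only if the modal's email input
    matches the account's email."""
    return entered_email == account_email
```

Each Given/When/Then scenario becomes one assertion against these helpers, so the spec doubles as the test plan.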
Q: If the spec is the problem, why do we blame the AI?

A: Because the AI is the visible agent. When code is wrong, we see that the AI wrote it. We don't see the invisible absence of a spec. It's a classic attribution error.
Q: Isn't writing detailed specs slower than just fixing AI bugs?

A: Writing a spec takes 30–60 minutes. Each rework cycle takes 2–4 hours. The math is clear.
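The arithmetic is conservative even at the extremes. Taking the worst case for the spec (60 minutes) and the best case for rework (2 hours per cycle) from the answer above:

```python
spec_minutes = 60      # spec cost, upper bound cited above (30-60 min)
rework_minutes = 120   # rework cost, lower bound cited above (2-4 hours/cycle)

# A spec that prevents even one rework cycle saves time,
# and every additional avoided cycle widens the gap.
saved_per_avoided_cycle = rework_minutes - spec_minutes
print(saved_per_avoided_cycle)  # 60
```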
Q: What about exploratory coding, where you don't know the spec upfront?

A: Vibe coding and exploratory coding are valid for prototyping. But when you move from prototype to production — when other people will work on this code — write the spec.