Why AI Can't Design Duck Hunt

I asked Claude and ChatGPT the same question:

"I want to design a hunting game for a console with a handgun controller, where you shoot moving targets on the TV. This is a production project."

I got genuinely impressive answers. Consultant-grade answers.

A full game design with core loops and difficulty curves. A software architecture with a latency budget broken down per pipeline stage. A gun controller built around an nRF52840 microcontroller, an OV2640 camera in the barrel, a 6-axis IMU, haptic recoil from a linear resonant actuator, a 4-layer PCB with antenna keep-out zones. An IR emitter bar for Wii-style triangulation. A 16-month plan through EVT, DVT, and PVT. FCC and CE certification strategy. A line-item bill of materials.

The totals:

~€86 per unit landed cost
16 months of development
~€237,000 to launch

Here is the detail that makes it interesting. Claude opened its answer with: "This is a classic and fun concept, think Duck Hunt meets modern hardware."

It knew the reference. It named the reference. And then it spent two thousand words designing a Wii.

The Answer From 1984

I spent years admiring this little toy. I implemented versions of it myself. But it took an AI failing to reach for it to show me how easy it is to miss the real invention.

In 1984, Nintendo solved the exact same problem with a photodiode, a comparator circuit, and a screen flash.

When you pulled the trigger, the NES blanked the screen to black for one frame, then drew a white rectangle over the duck's hitbox. A single photodiode in the gun barrel watched for that dark-to-bright transition within a ~16ms window. Light meant hit. Dark meant miss.
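The detection logic is small enough to sketch in a few lines. The Python below is a simulation under stated assumptions, not Nintendo's implementation: `Rect`, `photodiode_reading`, `trigger_pull`, the 0.5 threshold, and the `ambient` parameter are all names and values invented here for illustration. It models the two-frame sequence described above: one reading during the black frame, one during the white-rectangle frame, and a hit only when the first is dark and the second is bright.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Rect:
    """Axis-aligned hitbox in screen pixels (hypothetical helper type)."""
    x: int
    y: int
    w: int
    h: int

    def contains(self, px: int, py: int) -> bool:
        return self.x <= px < self.x + self.w and self.y <= py < self.y + self.h


def photodiode_reading(aim: Tuple[int, int],
                       white_rect: Optional[Rect],
                       ambient: float = 0.0) -> float:
    """Simulated light level at the aimed pixel, clamped to 0.0..1.0.

    `ambient` stands in for any constant light the sensor sees regardless
    of what the screen shows (an assumption of this model).
    """
    lit = white_rect is not None and white_rect.contains(*aim)
    return min(1.0, ambient + (1.0 if lit else 0.0))


def trigger_pull(aim: Tuple[int, int],
                 target_hitbox: Rect,
                 threshold: float = 0.5,
                 ambient: float = 0.0) -> bool:
    """Return True only for a dark-to-bright transition across two frames."""
    # Frame 1: the whole screen is black, so no white rectangle is drawn.
    dark = photodiode_reading(aim, None, ambient)
    # Frame 2: a white rectangle is drawn over the target's hitbox.
    bright = photodiode_reading(aim, target_hitbox, ambient)
    return dark < threshold <= bright


duck = Rect(x=100, y=80, w=32, h=24)
print(trigger_pull((110, 90), duck))   # aimed inside the hitbox
print(trigger_pull((10, 10), duck))    # aimed at empty screen
```

Requiring the dark reading first is what makes this a transition check rather than a brightness check: in this model, a gun pointed at a constant light source (`ambient=1.0`) fails the black-frame test and registers a miss.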

No camera. No Bluetooth. No triangulation. No GPU. The gun's entire electronics, a photodiode and a comparator wired to the controller port, fit on a postage stamp. Component cost: under $5.

The Zapper shipped to millions of homes and became one of the most iconic controllers in gaming history. Not because Nintendo had better technology. Because a human brain looked at the constraints and asked a better question:

What can I get away with not building?

"Are You Sure?"

I pushed further. I asked directly: is it possible to recreate the Duck Hunt pistol with hardware available now?

The answer was confident: the original Zapper's detection trick is "broken by modern TVs." Three reasons followed: input lag, LCD pixel response, and no direct display control. Correct-sounding, well structured, and delivered as settled fact. The recommendation circled straight back to the IR camera architecture.

So I asked one more question: "Are you sure a photodiode won't work?"

And the whole position collapsed. "Good challenge, I was too absolute." By the model's revised account, OLED pixels switch in about 0.1ms, faster than the CRT phosphors the Zapper was designed around. A startup calibration routine can measure and offset display lag. A modern transimpedance amplifier front end is far more sensitive than the original comparator circuit. The gun electronics drop to under €5. The final words of the transcript: "I was wrong to rule it out."
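The calibration idea can be sketched as follows. This is a minimal illustration, not the routine from the transcript or from my own build: `calibrate_lag`, `in_hit_window`, `simulated_delays`, the 60 Hz refresh rate, the half-frame tolerance, and the delay values are all assumptions made up for the example. The routine flashes the screen several times at startup, takes the median delay until the photodiode sees light, and later accepts a hit only if light arrives near the lag-shifted expected time.

```python
import statistics
from typing import Callable, Iterable

FRAME_MS = 1000.0 / 60.0  # one frame at an assumed 60 Hz refresh rate


def calibrate_lag(measure_flash_delay_ms: Callable[[], float],
                  trials: int = 9) -> float:
    """Median delay between requesting a flash and the photodiode seeing it.

    The median is used so a single slow frame does not skew the estimate.
    """
    delays = [measure_flash_delay_ms() for _ in range(trials)]
    return statistics.median(delays)


def in_hit_window(flash_requested_ms: float,
                  light_seen_ms: float,
                  lag_ms: float,
                  tolerance_ms: float = FRAME_MS / 2) -> bool:
    """True if light arrived within `tolerance_ms` of the expected time."""
    expected = flash_requested_ms + lag_ms
    return abs(light_seen_ms - expected) <= tolerance_ms


def simulated_delays(delays_ms: Iterable[float]) -> Callable[[], float]:
    """Stand-in for real hardware: replays a fixed list of measured delays."""
    it = iter(delays_ms)
    return lambda: next(it)


# Invented sample delays in milliseconds, including one outlier (58.0).
measure = simulated_delays([42.0, 40.5, 41.0, 58.0, 40.0, 41.5, 40.8, 41.2, 39.9])
lag = calibrate_lag(measure, trials=9)
print(lag)                                  # 41.0
print(in_hit_window(1000.0, 1041.5, lag))   # True: light arrived on schedule
print(in_hit_window(1000.0, 1000.5, lag))   # False: too early to be our flash
```

On real hardware the measurement callback would time an actual full-screen flash against the photodiode signal; the window logic stays the same.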

I have implemented one of these on OLED myself. It works. The knowledge was in the model the entire time. It just was not the most probable answer, so it never surfaced until a human refused to accept the consensus.

Three Ways the Models Failed

Both AIs failed the same way, in three distinct steps.

1. They assumed complexity

The first instinct was the most feature-rich, most modern solution, not the simplest one that works. When the training data is full of contemporary product architectures, "production project" pattern-matches to cameras, radios, RTOS firmware, and a certification budget.

2. They assumed old technology was obsolete

The Zapper approach was dismissed as broken until a single pushback flipped the assessment to "viable, and even simpler and cheaper." That is not a knowledge gap. That is a confidence distribution problem: the obsolete-tech narrative is heavily documented, so it wins by default.

3. They never recognized the problem

My prompt was framed almost exactly like Duck Hunt. One model literally said "Duck Hunt" in its opening sentence. The solution was sitting in 40 years of engineering history, attached to one of the most documented consumer devices ever made, and the dots stayed unconnected. Knowing the reference is not the same as recognizing the solution.

This Is Not a Bug

This is how pattern completion works. AI reaches for the most probable answer, and in its training data the most probable answer is modern, complex, and well documented.

The Zapper was the opposite of that. It was an engineer choosing to ignore what was possible and focus on what was sufficient. I see this brilliance all over retro games: brutal constraints forced solutions that look like magic tricks, because the engineers made most of the problem disappear before writing a line of code.

That instinct is engineering judgment. Knowing what to build is valuable. Knowing what not to build is rarer, and it may be the hardest thing to get from a system trained to predict the next most likely token.

AI can generate architectures, BOMs, and implementation plans on demand, and the ones I got were genuinely good versions of the consensus design. But engineering judgment often starts one step earlier, with the question:

What part of the problem can we make disappear?

A Strange Footnote: Smaller Sometimes Beats Bigger

I have run variations of this experiment across models and sizes, and one result keeps surprising me: smaller models sometimes give more useful engineering answers than larger, more capable ones. In some runs, Sonnet 4.5 produced more grounded designs than Opus 4.6.

I do not have a confirmed explanation. My working guess is that larger models have absorbed more of the modern-product consensus, so they commit to it harder. Either way, it is a useful reminder that "more capable" and "better at your specific judgment problem" are not the same axis.

The raw, unpolished experiments exist. If there is enough interest, I will write them up properly.

What I Took From This

When you hand a design problem to an AI, you are sampling from the consensus of what is written down. That consensus is fast, useful, and often brilliant. It is also systematically biased toward complexity, recency, and documentation density.

And when the cheap, sufficient answer does exist inside the model, it may take a human asking "are you sure?" to pull it out. The model defended the consensus until challenged, then folded in one turn.

The old, boring answer that makes most of the project unnecessary? That one still has to come from you.

Originally shared on LinkedIn.
