Published
May 16, 2026
AI
I asked Claude and ChatGPT to design a hunting game with a gun controller. I got an €86 bill of materials, a 16-month plan, and a €237,000 launch budget. In 1984, Nintendo solved the same problem with a photodiode, a comparator circuit, and a screen flash. One model even name-checked Duck Hunt in its first sentence, then designed a Wii anyway.
Author
Fadi Labib

I asked Claude and ChatGPT the same question:
"I want to design a hunting game for a console with a handgun controller, where you shoot moving targets on the TV. This is a production project."
I got genuinely impressive answers. Consultant-grade answers.
A full game design with core loops and difficulty curves. A software architecture with a latency budget broken down per pipeline stage. A gun controller built around an nRF52840 microcontroller, an OV2640 camera in the barrel, a 6-axis IMU, haptic recoil from a linear resonant actuator, a 4-layer PCB with antenna keep-out zones. An IR emitter bar for Wii-style triangulation. A 16-month plan through EVT, DVT, and PVT. FCC and CE certification strategy. A line-item bill of materials.
The totals: an €86 bill of materials, a 16-month plan, and a €237,000 launch budget.
Here is the detail that makes it interesting. Claude opened its answer with: "This is a classic and fun concept, think Duck Hunt meets modern hardware."
It knew the reference. It named the reference. And then it spent two thousand words designing a Wii.
I spent years admiring the Duck Hunt light gun. I implemented versions of it myself. But it took an AI failing to reach for it to show me how easy it is to miss the real invention.
In 1984, Nintendo solved the exact same problem with a photodiode, a comparator circuit, and a screen flash.
When you pulled the trigger, the NES blanked the screen to black for one frame, then drew a white rectangle over the duck's hitbox. A single photodiode in the gun barrel watched for that dark-to-bright transition within a ~16ms window. Light meant hit. Dark meant miss.
No camera. No Bluetooth. No triangulation. No GPU. The gun's entire electronics, a photodiode and a comparator wired to the controller port, fit on a postage stamp. Component cost: under $5.
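The hit test described above can be sketched in a few lines. This is an illustrative simulation, not Nintendo's circuit or firmware: the `Sample` type, the `detect_hit` function, the normalized 0.0 to 1.0 light level, and the 0.5 threshold are all assumptions made for the example. It treats the frame after the trigger pull as the blanked frame and the one after that as the white-rectangle frame.

```python
from dataclasses import dataclass

FRAME_MS = 1000 / 60  # roughly 16.7 ms per frame at 60 Hz


@dataclass
class Sample:
    t_ms: float   # time since the trigger pull, in milliseconds
    level: float  # assumed normalized photodiode reading: 0.0 dark, 1.0 bright


def detect_hit(samples, threshold=0.5):
    """Return True only if the sensor stayed dark during the blanked frame
    and then saw light during the white-rectangle frame."""
    black = [s.level for s in samples if 0 <= s.t_ms < FRAME_MS]
    white = [s.level for s in samples if FRAME_MS <= s.t_ms < 2 * FRAME_MS]
    if not black or not white:
        return False  # not enough readings to decide
    saw_dark = max(black) < threshold     # rejects a gun pointed at a lamp
    saw_bright = max(white) >= threshold  # the white rectangle was in view
    return saw_dark and saw_bright


aimed_at_duck = [Sample(2, 0.05), Sample(10, 0.04), Sample(20, 0.90), Sample(30, 0.85)]
aimed_at_wall = [Sample(2, 0.05), Sample(10, 0.04), Sample(20, 0.06), Sample(30, 0.05)]
aimed_at_lamp = [Sample(2, 0.95), Sample(10, 0.95), Sample(20, 0.95), Sample(30, 0.95)]

print(detect_hit(aimed_at_duck))  # True
print(detect_hit(aimed_at_wall))  # False
print(detect_hit(aimed_at_lamp))  # False
```

Requiring the dark reading first is what makes the trick robust: a constant light source fails the test because it never goes dark.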
The Zapper shipped to millions of homes and became one of the most iconic controllers in gaming history. Not because Nintendo had better technology. Because a human brain looked at the constraints and asked a better question:
What can I get away with not building?
I pushed further. I asked directly: is it possible to recreate the Duck Hunt pistol with hardware available now?
The answer was confident: the original Zapper's detection trick is "broken by modern TVs." Three reasons followed: input lag, LCD pixel response, and no direct display control. Correct-sounding, well structured, and delivered as settled fact. The recommendation circled straight back to the IR camera architecture.
So I asked one more question: "Are you sure a photodiode won't work?"
And the whole position collapsed. "Good challenge, I was too absolute." It turns out OLED pixels switch in about 0.1ms, faster than the CRT phosphors the Zapper was designed around. A startup calibration routine can measure and offset display lag. A modern transimpedance amplifier front end is far more sensitive than the original comparator circuit. The gun electronics drop to under €5. The final words of the transcript: "I was wrong to rule it out."
I have implemented one of these on OLED myself. It works. The knowledge was in the model the entire time. It just was not the most probable answer, so it never surfaced until a human refused to accept the consensus.
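The startup calibration idea mentioned above could look something like this. It is a minimal sketch under stated assumptions, not the routine from the transcript or from my own build: the function names, the timestamp lists, and the choice of the median are all hypothetical. The console flashes the screen a few times while the gun is pointed at it, records when each flash was commanded and when the photodiode saw it, and shifts the detection window by the typical delay.

```python
from statistics import median


def estimate_display_lag_ms(flash_times_ms, detect_times_ms):
    """Estimate display lag as the median delay between commanding a flash
    and the photodiode registering it. The median ignores a stray outlier."""
    if not flash_times_ms or len(flash_times_ms) != len(detect_times_ms):
        raise ValueError("need one detection time per flash")
    delays = [seen - sent for sent, seen in zip(flash_times_ms, detect_times_ms)]
    return median(delays)


def detection_window_ms(flash_time_ms, lag_ms, frame_ms=1000 / 60):
    """Return (start, end) of the window in which light counts as a hit,
    shifted by the measured lag."""
    start = flash_time_ms + lag_ms
    return (start, start + frame_ms)


# Hypothetical calibration run: five flashes, one delayed reading (80 ms).
flash_times = [0, 500, 1000, 1500, 2000]
detect_times = [42, 541, 1043, 1580, 2042]

lag = estimate_display_lag_ms(flash_times, detect_times)
print(lag)  # 42

window = detection_window_ms(3000, lag)
print(window[0])  # 3042
```

The point of the sketch is only that display lag is a fixed offset to measure once, not a reason to abandon the photodiode.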
Both AIs failed the same way, in three distinct steps.
First, the models reached for the most feature-rich, most modern solution. Not the simplest one that works. When the training data is full of contemporary product architectures, "production project" pattern-matches to cameras, radios, RTOS firmware, and a certification budget.
Second, the Zapper approach was dismissed as broken until a single pushback flipped the assessment to "viable, and even simpler and cheaper." That is not a knowledge gap. That is a confidence distribution problem: the obsolete-tech narrative is heavily documented, so it wins by default.
Third, my prompt was framed almost exactly like Duck Hunt. One model literally said "Duck Hunt" in its opening sentence. The solution was sitting in 40 years of engineering history, attached to one of the most documented consumer devices ever made, and the dots stayed unconnected. Knowing the reference is not the same as recognizing the solution.
This is how pattern completion works. AI reaches for the most probable answer, and in its training data the most probable answer is modern, complex, and well documented.
The Zapper was the opposite of that. It was an engineer choosing to ignore what was possible and focus on what was sufficient. I see this brilliance all over retro games: brutal constraints forced solutions that look like magic tricks, because the engineers made most of the problem disappear before writing a line of code.
That instinct is engineering judgment. Knowing what to build is valuable. Knowing what not to build is rarer, and it may be the hardest thing to get from a system trained to predict the next most likely token.
AI can generate architectures, BOMs, and implementation plans on demand, and the ones I got were genuinely good versions of the consensus design. But engineering judgment often starts one step earlier, with the question:
What part of the problem can we make disappear?
I have run variations of this experiment across models and sizes, and one result keeps surprising me: smaller models sometimes gave more useful engineering answers than larger, more capable ones. In some runs, Sonnet 4.5 produced more grounded designs than Opus 4.6.
I do not have a confirmed explanation. My working guess is that larger models have absorbed more of the modern-product consensus, so they commit to it harder. Either way, it is a useful reminder that "more capable" and "better at your specific judgment problem" are not the same axis.
The raw, unpolished experiments exist. If there is enough interest, I will write them up properly.
When you hand a design problem to an AI, you are sampling from the consensus of what is written down. That consensus is fast, useful, and often brilliant. It is also systematically biased toward complexity, recency, and documentation density.
And when the cheap, sufficient answer does exist inside the model, it may take a human asking "are you sure?" to pull it out. The model defended the consensus until challenged, then folded in one turn.
The old, boring answer that makes most of the project unnecessary? That one still has to come from you.
Originally shared on LinkedIn.