AI Field Tests: the Field Lab for AI tooling, tested on real engineering work
AI is very good at producing work that looks finished. I check whether it survives contact with reality. From an engineer with 15 years in automotive, robotics, and embedded systems.
I asked Claude and ChatGPT to design a hunting game with a gun controller. I got an €86 bill of materials, a 16-month plan, and a €237,000 launch budget. In 1984, Nintendo solved the same problem with a photodiode, a comparator circuit, and a screen flash. One model even name-checked Duck Hunt in its first sentence, then designed a Wii anyway.
VERDICT: OVERENGINEERED
7 min read
Before AI: same input, same output, every time. After AI: same input, different output, every time. Same quality gates. After a year of daily Claude, Codex, and Gemini CLI use, these are the six gates I run on every AI-assisted task, and the five numbers I measure to know whether they work.
6 min read
I ran Anthropic's AI-written C compiler through my novelty-scoring pipeline, expecting to confirm my public position that GenAI can't do systems programming. The data forced me to retune my own metrics. What I found instead was a sharper question for anyone running an engineering team: what's your ratio?
6 min read
Two research papers from Google and DeepSeek landed in October from completely different domains. One processes speech, the other processes documents. Neither bothers converting anything to text first. This exposes something fundamental about how we have been training perception systems for decades.
6 min read
Carmakers now call themselves "Software-Defined Vehicle" companies. Nobody calls a smartphone "software-defined". Phones were software-first from day one, so the label was never needed. The SDV prefix is automakers announcing they have to rebuild around software two decades after mobile did.
8 min read
Open source software powers 96% of all codebases and would cost $8.8 trillion to rebuild, yet just 5% of developers create 96% of its value. Google Test alone saves companies billions. Imagine 2,000 companies each burning money to build their own testing framework, then to maintain it. That's billions down the drain, solving the same problem thousands of times. Meanwhile, bugs caught early save hundreds of thousands per year, and engineers get to build actual products instead of reinventing basic tools. Tech giants aren't sharing code out of generosity; they've figured out that giving away millions in development costs them less than the alternative.
6 min read
Every carmaker chasing software-defined vehicles qualifies the same foundational tools on its own: operating systems, toolchains, LLVM. The work is duplicated across the industry and none of it is a differentiator. The fix is to qualify those shared foundations together and compete on the product instead.
2 min read
Nokia launched the first smartphone in 1996, 11 years before the iPhone, and had superior technology with a massive R&D budget, thousands of patents, and advanced features like GPS and 5MP cameras. Yet they failed. Why? Not because of technology, but because they couldn't transform from a hardware company to a software company. Developers abandoned them for platforms that took software seriously. Today's carmakers are repeating Nokia's mistake: spending billions on research but focusing on the wrong things, while Tesla and Chinese EVs play by software-first rules, just as Apple and Samsung did against Nokia.
5 min read
Open source powers 96% of all codebases and would cost $8.8 trillion to rebuild, yet 5% of developers create most of its value. That imbalance is fragile. Here is the economics of why engineers give their best work away for free, and what it would take to keep the system running.
6 min read
BlackBerry: Every architectural decision was optimized for the IT departments that bought the phones, while consumers chose the iPhone.
Boeing 737 MAX: MCAS was a single point of failure designed to save training costs. 346 people died.
Disney (Strategic Planning): The architecture served control, not storytelling. Iger killed it on day one.
Disney (Eisner Era): Katzenberg, Jobs, Roy Disney: all pushed out. 43% voted no confidence.
Kodak: The architecture generated $10 billion a year in film revenue. Pivoting meant dismantling it.
Microsoft (Ballmer Era): Stock: $58 in 2000, $37 in 2014. Fourteen years of organizational friction.
Nokia: They saw the smartphone coming years before Apple. The architecture created civil war.
Quibi: TikTok and YouTube already owned short-form. Netflix owned premium.
Theranos: 800 employees. $9 billion valuation. The business layer was fiction.
Uber (Kalanick Era): "Always be hustlin." The same values that won markets created lawsuits and mass resignations.
Volkswagen: 11 million vehicles. $30+ billion in fines. Engineers chose fraud over honesty.
Wells Fargo: The cross-sell metric was perfectly aligned on paper. In practice, it inverted the mission.
WeWork: "Community-adjusted EBITDA" was not a metric. It was a mask.
Yahoo (Mayer Era): Mayer had the right strategy in an organization that could not hear it.