The Model Is Not the Brain
Every few weeks the cycle repeats. A new model drops, Twitter erupts, benchmarks get dissected, and someone asks: “Is this finally AGI?” Meanwhile, the people actually shipping AI products barely look up. They’re too busy wiring tools together.
I think the fixation on model intelligence is looking at the wrong variable. The bottleneck right now is not smarter models. It’s better tooling around the models we already have.
A model is a neuron, not a brain
Here’s the analogy that reframed my thinking: a model is not a brain. It’s a neuron.
A single neuron in isolation doesn’t do much. It fires or it doesn’t. But wire enough neurons together into the right architecture and you get cognition, memory, reasoning. The magic is not in the individual unit. It’s in the connections.
The same is true for LLMs. A single model, responding to a single prompt, will always underperform compared to a model embedded in a system: multiple agents coordinating, tools feeding it real-time data, orchestration deciding what to do next. When we benchmark a model in isolation, we’re measuring the neuron and drawing conclusions about the brain. That’s the wrong level of analysis.
The question “do we finally have a PhD-level model in our pocket?” is less interesting than “can we build a system of mediocre models that outperforms a single brilliant one?” The answer, increasingly, is yes.
Tooling is the real multiplier
Consider the simplest example: a model can’t tell you who the current president is. Not because it’s not smart enough, but because it doesn’t have access to the information. Add web search and the problem disappears. That’s not a model intelligence problem. It’s a tooling problem. The model was always smart enough. It just didn’t have the right tools.
This pattern repeats everywhere:
- Code execution. A model asked to compute something complex will hallucinate the answer. Give it a code interpreter and it writes a script, runs it, and returns the correct result. Same model, wildly different outcome.
- MCPs (Model Context Protocol). Instead of hoping the model memorizes your API schema, you give it structured access to external systems. It can query databases, read files, call APIs. The model doesn’t get smarter. It gets more capable.
- Agent loops. A model that gives a wrong answer on the first try can self-correct if you let it iterate: run code, check the output, adjust, try again. The intelligence didn’t change. The process around it did.
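The agent-loop pattern above can be sketched in a few lines. Everything here is illustrative: `fake_model` is a hypothetical stand-in for a real LLM call, and the "code execution tool" is just Python's `exec`. The point is the shape of the loop, not the specific APIs.

```python
def fake_model(task, feedback=None):
    """Hypothetical model: returns buggy code first, a fix once it sees the error."""
    if feedback is None:
        return "result = 10 / 0"          # first attempt hallucinates a bad script
    return "result = sum(range(1, 11))"   # corrected attempt after seeing feedback

def agent_loop(task, max_iters=3):
    """Run model-generated code, feed errors back, retry until it works."""
    feedback = None
    for _ in range(max_iters):
        code = fake_model(task, feedback)
        scope = {}
        try:
            exec(code, scope)             # the "code execution" tool
            return scope["result"]        # success: return the computed value
        except Exception as e:
            feedback = str(e)             # self-correction signal for the next try
    raise RuntimeError("agent gave up after max_iters attempts")

answer = agent_loop("sum the integers 1 through 10")
print(answer)
```

The model function never changes between iterations; only the feedback flowing into it does. That is the whole argument in miniature: same intelligence, different process.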
Look at what’s happened with AI-assisted coding. Claude Code and Cursor don’t succeed because they use a fundamentally smarter model than what you’d get through a plain chat interface. They succeed because they wrap the model in tools: file system access, code execution, search, iterative feedback loops. The model is the same. The experience is transformed.
We’ve seen this before
There’s a historical parallel worth drawing, even if it’s imperfect.
Think about the explosion of SaaS and web software. It didn’t happen because CPUs got dramatically faster. By the time the SaaS boom hit, processors were already more than powerful enough for web applications. What changed was the ecosystem: frameworks like Rails and Django, package managers like npm, deployment platforms like Heroku and AWS, and the developer experience improvements that made it possible for a small team to build and ship software that previously required an enterprise IT department.
The raw compute was already there. The tooling unlocked it.
I’ll be the first to admit this analogy has limits. CPUs were genuinely “solved” for web workloads by the mid-2000s. LLMs, on the other hand, still have real capability gaps in some domains. They hallucinate, they struggle with certain types of reasoning, they have context window limitations. So I’m not claiming models are fully solved the way CPUs were.
But the directional point holds: the biggest unlocks right now are coming from the ecosystem around the models, not from the models themselves. We’re in the “Rails moment” for AI, where the tooling is what turns raw capability into practical value.
We’re just getting started
If tooling is the real bottleneck, where does that leave us? Early.
MCPs are just getting started. Most teams haven’t adopted them. Agent frameworks exist but they’re immature, fragile, and hard to debug. Multi-agent orchestration is mostly experimental. The infrastructure for building reliable AI systems (not just clever demos) is still being figured out.
This is actually good news. It means there’s a massive surface area for improvement that doesn’t require waiting for some mythical next-generation model. The models we have today are good enough for an enormous range of applications. What’s missing is the scaffolding to make them reliable, composable, and useful in production.
What this means if you’re building
If you’re building with AI right now, the practical implication is simple: invest more in tooling and orchestration than in chasing the latest model.
That means:
- Give your models tools, not just prompts. Web search, code execution, database access, file systems. A well-tooled average model will outperform a poorly tooled frontier model.
- Think in systems, not in single calls. One model call is a neuron firing. An orchestrated workflow with iteration, validation, and tool use is a brain thinking.
- Build for composability. The AI stack is going to look a lot more like software engineering than like machine learning. Modules, interfaces, error handling, observability. The skills that matter are shifting.
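The "systems, not single calls" point can be made concrete with a small composability sketch. All names here are hypothetical: `generate` stands in for a raw model call, `validate` for any checker (a schema check, a test run, a regex). The wrapper turns one unreliable call into a validated, retrying component you can plug into a larger pipeline.

```python
from typing import Callable

def with_validation(generate: Callable[[str], str],
                    validate: Callable[[str], bool],
                    retries: int = 2) -> Callable[[str], str]:
    """Wrap a raw model call in a validate-and-retry loop, returning a new callable."""
    def run(prompt: str) -> str:
        answer = generate(prompt)
        for _ in range(retries):
            if validate(answer):
                return answer
            # Feed the failure back as context and try again.
            answer = generate(prompt + "\nPrevious answer failed validation; fix it.")
        return answer  # best effort after exhausting retries
    return run

# Hypothetical stand-ins: a "model" that improves on the second attempt,
# and a validator that insists on a purely numeric answer.
attempts = iter(["forty-two, probably", "42"])
generate = lambda prompt: next(attempts)
validate = lambda ans: ans.isdigit()

pipeline = with_validation(generate, validate)
result = pipeline("What is 6 * 7?")
print(result)
```

Because `with_validation` takes callables and returns a callable, wrapped components compose: the output of one wrapper can be the `generate` of another. That is software engineering, not machine learning, which is exactly the shift the list above describes.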
The people obsessing over benchmark scores are optimizing the neuron. The people building the connective tissue around models are building the brain.
The neuron is good enough. Start building the brain.