Large Language Models (LLMs) don’t “reason.” They compute the next most likely output, given a prompt and their training. That’s it. No internal model of the world. No deliberation. No causal inference. Just a massively scaled-up autocomplete that’s really good at sounding confident.
This isn’t a nitpick. It is a foundational distinction. When people say AI “reasons,” they invite misplaced trust. Real reasoning involves weighing alternatives, constructing intermediate representations, checking consistency. LLMs do none of that.
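To make the “next most likely output” claim concrete, here’s a minimal sketch of a single prediction step. It assumes the Hugging Face transformers library and uses GPT-2 purely as a convenient stand-in for any causal language model:

```python
# A minimal sketch of next-token prediction, assuming Hugging Face transformers.
# GPT-2 is just an illustrative stand-in; any causal LM behaves the same way.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

prompt = "The capital of France is"
input_ids = tokenizer(prompt, return_tensors="pt").input_ids

with torch.no_grad():
    logits = model(input_ids).logits  # shape: (1, seq_len, vocab_size)

# The model's entire "decision" is this probability distribution over the vocabulary.
probs = torch.softmax(logits[0, -1], dim=-1)
top = torch.topk(probs, k=5)

for token_id, p in zip(top.indices, top.values):
    print(f"{tokenizer.decode(token_id)!r}: {p.item():.3f}")
```

Everything an LLM “decides” falls out of distributions like this one, token after token. There is no second process standing behind them doing the thinking.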
As Apple’s researchers put it bluntly in The Illusion of Thinking (2025), LLMs don’t reason in any meaningful human sense. They simply predict. And the more possible continuations they’re allowed to consider, the less reliable their outputs become. The illusion of intelligence breaks down when too many degrees of freedom are involved. Every “choice” becomes a branching path into potential incoherence.
That’s why LLMs often do well when the prompt is tight and the options are constrained, and why they so often fail spectacularly when asked to solve open-ended problems from scratch.
But there’s good news: we can engineer around this.
Good News
A counter-approach called ReAct (Yao et al., 2022) attempts to produce something like reasoning by scaffolding language models with structured thought and tool use. Instead of letting the model guess freely, ReAct interleaves reasoning steps with external actions, guiding the model through a sequence of constrained steps. By limiting choice at each point, it improves reliability along the way.
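Conceptually, a ReAct-style loop looks something like the sketch below. This is not the paper’s reference implementation; the llm and tools callables are hypothetical placeholders for whatever model client and tool registry you use. What matters is the structure: the model is asked for one thought and one action at a time, and an external observation is injected before it gets to “think” again.

```python
import re

def react_loop(llm, tools, question, max_steps=8):
    """A stripped-down ReAct-style loop (a sketch, not the authors' code).
    `llm` is any callable prompt -> text; `tools` maps action names to functions."""
    transcript = f"Question: {question}\n"
    for _ in range(max_steps):
        # Ask the model for only the next Thought/Action pair, not a full solution.
        step = llm(transcript + "Thought:")
        transcript += "Thought:" + step + "\n"

        # Constrain the model's output to a known Action[argument] format.
        match = re.search(r"Action:\s*(\w+)\[(.*?)\]", step)
        if match is None or match.group(1) == "finish":
            return step  # the model signalled it is done (or produced no action)

        name, arg = match.group(1), match.group(2)
        if name in tools:
            observation = tools[name](arg)  # the only place real work happens
        else:
            observation = f"Unknown action {name!r}. Valid actions: {sorted(tools)}"

        # The observation, not the model's imagination, drives the next step.
        transcript += f"Observation: {observation}\n"
    return transcript
```

The exact prompt format varies between implementations, but the discipline is the same: the model never takes more than one small, constrained step before reality gets a chance to correct it.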
Another approach, one I’ve been developing, is something I call GRAIL: Goal-Resolution, Affordance-Informed Logic. Instead of pretending the LLM can reason, GRAIL strips away unnecessary options and asks the agent to simply follow the affordances: the actions that are possible given the current state. By doing so, GRAIL shifts the burden from “thinking” to “navigating.” No more wide-open decision trees to chase down. No fragile, hallucination-prone paths to manage. Just consistent, tractable progress toward a goal, one constrained step at a time.
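GRAIL itself isn’t reproduced here, but the core move (offering the model only the actions the current state affords) can be sketched generically. The types and names below are illustrative placeholders, not GRAIL’s actual interface:

```python
from dataclasses import dataclass
from typing import Callable

# Illustrative types only: a sketch of affordance-constrained selection,
# not GRAIL's real interface.
@dataclass
class Affordance:
    name: str
    description: str
    apply: Callable[[dict], dict]      # state -> new state
    available: Callable[[dict], bool]  # is this action possible right now?

def step(llm, state: dict, affordances: list[Affordance], goal: str) -> dict:
    """One constrained step: the model never sees the full action space,
    only the handful of actions the current state actually allows."""
    allowed = [a for a in affordances if a.available(state)]
    menu = "\n".join(f"- {a.name}: {a.description}" for a in allowed)

    # The model's only job is to pick one item from a short menu.
    choice = llm(
        f"Goal: {goal}\nState: {state}\n"
        f"Pick exactly one of these actions by name:\n{menu}\nAction:"
    ).strip()

    chosen = next((a for a in allowed if a.name == choice), None)
    if chosen is None:
        return state  # an invalid pick changes nothing, so nothing can break
    return chosen.apply(state)
```

However the menu is generated, the effect is the same: the model navigates a small, well-formed graph of possibilities instead of free-associating a plan from scratch.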
I’ve started releasing some of the results of my work on GRAIL on GitHub. Be sure to follow me there for updates. — MikeA
The lesson from recent research is clear: The more choices you hand an LLM, the less likely it is to succeed. No choices? Almost guaranteed success. A dozen branching possibilities? Roll the dice.
Key Takeaway
So if you're building AI agents, don't treat LLMs like thinkers. Treat them like probability engines. Reduce their uncertainty by limiting their options. Wrap LLMs in defined workflows. Let them follow the offered affordances and get out of their own way.
Because in AI, the path to 'reasoning' isn’t through more freedom. It’s through fewer choices.