Updated 7 Jun 2026 · 7 min read

What 'agentic' actually means, and the agent-washing problem

'Agentic' was the word of the year, and Gartner reckons only about 130 of the thousands of vendors selling agents are shipping one. Here is the precise definition — a loop with tools, planning, state, and retry — the line that separates a real agent from a single LLM call wearing a cost...

AI-assisted post. Drafted with help from Claude, edited and fact-checked by Mart.

Dr. Miles Nervine, a patent-medicine advertisement — an ordinary product sold with extraordinary claims. Public domain.

The word that ate the industry

"Agentic" was Dictionary.com's 2025 word of the year, and by early 2026 CNN was writing about how the word had taken on a life of its own inside the industry — which is the linguistic equivalent of a stock chart going vertical right before someone uses the word "bubble." When a technical adjective becomes a marketing primitive, it stops meaning anything, and "agentic" is most of the way there. The word is now attached to chatbots, to scheduled scripts, to RPA macros, to a single API call with a fancier prompt. If everything is agentic, the word is doing no work, and the only honest response is to define the thing precisely and then count how many products actually clear the bar.

Gartner did the counting. Their answer is brutal: of the thousands of vendors marketing agentic AI, only about 130 are shipping something that meets the technical definition. Gartner calls the gap agent washing — rebranding assistants, chatbots, and RPA as "agents" without the substance — and in the same line of analysis forecasts that over 40% of agentic-AI projects will be cancelled by the end of 2027, citing escalating costs, unclear business value, and inadequate risk controls. Two numbers, 130 real out of thousands and 40% cancelled, are the entire emotional content of this post. The rest is just making the definition sharp enough that you can do the counting yourself.

The definition, with the fat trimmed off

Strip the marketing and "agentic" names a specific architectural shape, not a vibe. The convergent definition across the people who use the word carefully — and it is convergent — is that an agentic system autonomously tackles a goal by planning, acting through tools, observing results, and adapting. Unpacked, that is four components, and you need all four:

  • A goal, not an instruction. You give the system a desired end state ("get this test suite passing"), not a single transformation ("rewrite this function"). The system decides the steps.
  • Tool use. The system can take actions in the world (read files, run commands, call APIs, query a database) and not merely emit text. This is the leg that MCP, the Model Context Protocol, standardised, and it is necessary but nowhere near sufficient.
  • Planning. The system decomposes the goal into steps and chooses an order, rather than executing a fixed script someone else wrote.
  • A loop with state and retry. The system acts, observes the result, compares it to the goal, and decides what to do next — including trying again when something fails. State carries across iterations.

That last component is the one that does the real separating, so it deserves emphasis. The defining feature of an agent is the decide-act-observe-adapt loop: it runs, looks at what happened, and adjusts. Remove the loop and you have a single forward pass — a function call dressed up. A real agent is a control loop with a language model in the decision seat, tools as effectors, and memory as the thing that lets one iteration inform the next.

```mermaid
flowchart TD
    G["Goal / end state"]
    P["Plan: decompose into steps"]
    A["Act: invoke a tool"]
    O["Observe: read the result"]
    D{"Goal met?"}
    R["Adapt: revise plan,<br/>retry, or try a new step"]
    DONE["Done"]
    G --> P --> A --> O --> D
    D -- no --> R --> A
    D -- yes --> DONE
    M[("State / memory<br/>persists across iterations")] -.-> P
    M -.-> A
    M -.-> D
    R -.-> M
```

The single LLM call has none of this loop. Goal in, text out, done. It can be excellent — most of the value people get from these tools day to day is a single excellent forward pass — but it is not an agent, and calling it one is the washing.
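To make the contrast concrete, here is a minimal sketch of both shapes side by side. Everything in it is an assumption made for illustration: `single_call`, `run_agent`, `AgentState`, and the stub policy in `demo` are hypothetical names, and the "model" is a plain Python function rather than a real LLM client. It is one way the loop could be wired, not a reference implementation.

```python
from dataclasses import dataclass, field
from typing import Callable

# Hypothetical stand-in for a model call: prompt in, text out.
Model = Callable[[str], str]


def single_call(model: Model, instruction: str) -> str:
    """One forward pass: nothing is executed and nothing is observed."""
    return model(instruction)


@dataclass
class AgentState:
    """Memory that persists across iterations as (action, observation) pairs."""
    history: list[tuple[str, str]] = field(default_factory=list)


def run_agent(
    choose_action: Callable[[str, AgentState], str],
    tools: dict[str, Callable[[], str]],
    goal_met: Callable[[str], bool],
    goal: str,
    max_steps: int = 10,
) -> tuple[bool, AgentState]:
    """Decide, act, observe, adapt, with a hard cap on iterations."""
    state = AgentState()
    for _ in range(max_steps):
        action = choose_action(goal, state)           # plan or adapt
        observation = tools[action]()                 # act through a tool
        state.history.append((action, observation))   # observe and remember
        if goal_met(observation):                     # compare with the goal
            return True, state
    return False, state                               # give up at the cap


def demo() -> tuple[bool, list[str]]:
    """Toy world: the suite fails until the file has been edited once."""
    world = {"bug_fixed": False}

    def run_tests() -> str:
        return "PASS" if world["bug_fixed"] else "FAIL"

    def edit_file() -> str:
        world["bug_fixed"] = True
        return "edited"

    def choose_action(goal: str, state: AgentState) -> str:
        # Stub policy standing in for the model's decision.
        if state.history and state.history[-1] == ("run_tests", "FAIL"):
            return "edit_file"
        return "run_tests"

    ok, state = run_agent(
        choose_action,
        {"run_tests": run_tests, "edit_file": edit_file},
        goal_met=lambda observation: observation == "PASS",
        goal="make the suite pass",
    )
    return ok, [action for action, _ in state.history]
```

In this toy run, `demo()` takes three steps (run the tests, edit the file, run the tests again) before the goal check passes. The stub policy is the least interesting part: swapping a real model in for it changes what gets decided, but the loop, the tool table, and the state are what make the second function an agent and the first one a function call.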

The line in one example

Concrete is clearer than taxonomy. Take "fix the failing test."

The non-agent version: you paste the failing test and the source into a chat window, the model reads it, and emits a patch. One forward pass. If the patch is wrong, you notice, you paste the new error, you prompt again. The intelligence in the loop is yours; the model is a very good text transformer you are operating by hand. This is most of what "AI coding" actually is in practice, and there is nothing wrong with it. It is just not agentic.

The agentic version: you give a system the goal "make the suite pass," and it runs the suite itself, reads the failure, forms a hypothesis, edits a file, runs the suite again, sees a new failure, revises, and loops until green or until it gives up — maintaining state about what it has already tried so it does not cycle. The model is now sitting inside a control loop that closes on observed reality. The difference is not that the second model is smarter. It is the same model. The difference is the harness around it: the loop, the tool access, the state, the retry logic. That harness is the agent. The model is a component of it.
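A rough sketch of that harness, narrowed to the two details that matter here: looping until green or giving up, and remembering what has already been tried. The names (`suite_passes`, `fix_until_green`, `CANDIDATES`, `stub_propose`) are hypothetical, the "suite" is a single assertion on a function called `add`, and the proposer is a stub standing in for the model. The use of `exec` is purely for illustration; a real harness would presumably run the suite in a subprocess or sandbox.

```python
from typing import Callable, Optional


def suite_passes(source: str) -> bool:
    """Deterministic check: load the candidate and run one test against it."""
    namespace: dict = {}
    try:
        exec(source, namespace)               # illustration only, not sandboxed
        return namespace["add"](2, 3) == 5    # the stand-in "test suite"
    except Exception:
        return False


def fix_until_green(
    propose: Callable[[set[str]], Optional[str]],
    max_attempts: int = 5,
) -> tuple[Optional[str], int]:
    """Return (passing patch or None, number of distinct patches tested)."""
    tried: set[str] = set()
    for _ in range(max_attempts):
        candidate = propose(tried)
        if candidate is None or candidate in tried:
            return None, len(tried)           # give up rather than cycle
        tried.add(candidate)
        if suite_passes(candidate):           # observe the real result
            return candidate, len(tried)
    return None, len(tried)


# Hypothetical patches a model might propose, in order.
CANDIDATES = [
    "def add(a, b): return a - b",
    "def add(a, b): return a * b",
    "def add(a, b): return a + b",
]


def stub_propose(tried: set[str]) -> Optional[str]:
    """Stub for the model: offer the first patch not yet tried."""
    for candidate in CANDIDATES:
        if candidate not in tried:
            return candidate
    return None
```

With the stub above, `fix_until_green(stub_propose)` rejects the first two patches and accepts the third. A proposer that keeps returning the same wrong patch is stopped after one test, which is the "does not cycle" property the `tried` set exists to provide.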

This is exactly why the post on why every AI IDE is the same model with a different system prompt argues that the substance of a 2026 coding tool comes down to a system prompt, a tool registry, and an agent harness, and why the harness, not the model, is where the agentic-ness lives. Two products running the identical model can differ enormously in how agentic they are, depending entirely on the quality of the loop wrapped around it.

Why most "agents" are a prompt with a fancy name

Agent washing works because the costume is cheap and the substance is expensive. Adding the word "agent" to a chatbot costs a marketing meeting. Building a real decide-act-observe-adapt loop that does not spiral into nonsense costs a great deal of engineering, and most of that engineering is the unglamorous part: error handling, state management, knowing when to stop. Gartner's framing — rebranding existing assistants, RPA, and chatbots without substantial agentic capabilities — describes a rational economic response to a hot word, not a conspiracy. The incentive is to relabel, and the bar is invisible to buyers, so relabelling wins.

There is also a deeper reason the real agents are rarer than the demos suggest, and it is the same structural problem I have been circling in this series. A loop amplifies whatever the model does each iteration, including its errors. A single bad forward pass in a chat is wrong text you can dismiss. A bad forward pass inside a 20-iteration loop becomes a premise the next iteration builds on: the context is poisoned, the errors compound, and the loop confidently constructs an edifice on a hallucinated foundation. I worked through that failure mode in detail in the post on why AI cannot guardrail against AI: the loop is precisely the structure that turns a one-off mistake into a cascade, and the only reliable brakes are deterministic and external to the model, such as tests that actually run, type checkers, schema validators, and the loop's own ability to observe ground truth rather than its own narration.
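A back-of-envelope version of the compounding, under loudly stated assumptions: every iteration is independently correct with the same probability, and one bad iteration is never caught or repaired. The 95% and 99% figures are illustrative, not measurements, and the function name is hypothetical. Real loops are not independent trials (a poisoned context makes later errors more likely, which is worse), so treat this as a simplified bound on the ungrounded case rather than a model of any product.

```python
def chance_all_steps_correct(per_step_accuracy: float, steps: int) -> float:
    """Probability that every one of `steps` iterations is correct,
    assuming independent steps and no external check that catches errors."""
    return per_step_accuracy ** steps


if __name__ == "__main__":
    for accuracy in (0.99, 0.95):
        clean_run = chance_all_steps_correct(accuracy, 20)
        print(f"{accuracy:.0%} per step over 20 steps: {clean_run:.0%} clean runs")
```

Under those assumptions, a step that is right 95% of the time yields a fully clean 20-step run only about 36% of the time, and even 99% per step gives roughly 82%. That is the arithmetic case for grounding the loop in deterministic, external feedback rather than the model's own narration.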

Which is the quiet test for whether a product is genuinely agentic or just washed: what does its loop observe? If the loop closes on real, deterministic feedback — the test passed or it didn't, the query returned rows or an error, the file compiled — there is a real agent there, because the loop is grounded. If the loop closes on the model's own assessment of whether it succeeded ("I believe this looks correct"), it is theatre, because the verifier and the generator share blind spots and the loop is just the model agreeing with itself in a circle. A lot of what gets sold as agentic is the second kind, and the 40% cancellation forecast is, I suspect, mostly that second kind running into production reality.
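The two kinds of loop closure can be sketched in a few lines. Both function names are hypothetical, and the self-assessed check is deliberately crude: it stands in for any verifier that reads the model's narration instead of the world. The grounded check uses only the standard library, running the candidate code in a fresh interpreter and trusting nothing but the exit status.

```python
import subprocess
import sys


def grounded_check(code: str) -> bool:
    """Close the loop on reality: run the code and trust only the exit status."""
    result = subprocess.run(
        [sys.executable, "-c", code],
        capture_output=True,
        timeout=30,
    )
    return result.returncode == 0


def self_assessed_check(model_says: str) -> bool:
    """Close the loop on narration: the generator grades its own work."""
    return "looks correct" in model_says.lower()


if __name__ == "__main__":
    broken = "assert sorted([3, 1, 2]) == [1, 3, 2]"
    narration = "I believe this looks correct"
    print("grounded:", grounded_check(broken))
    print("self-assessed:", self_assessed_check(narration))
```

On the broken snippet the grounded check fails, because the assertion raises and the interpreter exits non-zero, while the self-assessed check happily passes on the model's own sentence. The design choice worth noticing is what each function takes as input: one takes the artefact, the other takes an opinion about the artefact.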

The team version of the same trap

Stack the washing and you get the scenario the rest of this series has been worried about. The pitch is that if one engineer plus an agent is two-engineers-fast, then twenty engineers each with an agent is forty-engineers-fast — and as I argued in twenty LLMs do not make a team, the arithmetic breaks because the agents produce diffs without producing shared comprehension, and the bus factor quietly collapses to one. That argument assumed the agents were real — actual loops doing actual work. Agent washing makes it worse, because now half the "agents" in the org chart are single-call prompts with a status dashboard, generating volume with no loop, no grounding, and no observation of whether their output is correct. The dashboard says twenty agents are working. Several of them are a POST request with a logo.

And the review side compounds it. When the output of these systems lands as a pull request, the team needs to review it as the model's output rather than a colleague's craft — distributed across reviewers, paired with a deterministic map of what actually changed, which is the practice argued for in reviewing LLM diffs as a team. A washed "agent" that emitted a 300-line diff in one ungrounded forward pass is exactly the kind of artefact that most needs that scrutiny and is least likely to get it, because the word "agent" on the PR carries an unearned air of having been handled autonomously and correctly.

A short close

"Agentic" has a real, precise meaning: a loop that pursues a goal by planning, acting through tools, observing real results, and adapting, with state carried across iterations. The load-bearing word in that sentence is observing — observing ground truth, not the model's own opinion of its work. By that definition Gartner's count of roughly 130 real vendors out of thousands is believable, and the 40%-cancelled-by-2027 forecast reads less like pessimism and more like arithmetic on a population of products that never had a grounded loop to begin with.

The practical move is to ignore the word entirely and ask three questions of anything sold to you as an agent. Does it run a loop, or is it one forward pass? Can it take real actions through tools, or does it only emit text? And the one that separates the 130 from the thousands: when it checks whether it succeeded, does it observe something deterministic in the world, or does it ask itself? A real agent is a loop plus tools plus state plus a grounded check. Most "agents" are a prompt with a fancy name, and now you can tell which is in front of you without trusting the name at all.
