Most of the AI agents demoed in 2025 never made it to production. Gartner now expects more than 40% of agentic AI projects to be scrapped by the end of 2027 — not because the models got worse, but because the things built around the models were never engineered for reality. The anatomy is wrong. The agents are trying to do too much, with too little structure, for use cases no one was actually asking for.
This is an entry in our build-in-public series, originally written in August 2025 as “Inside Anjin #24”. We've kept the original case study — a small internal agent called Snippet to Summary — because it still represents the cleanest example we have of what a useful agent looks like, and we've updated the framing for what the industry has since learned about production-grade agent design. If you've worked through our later piece on agent chaining flows, this one is the upstream primer: the single-agent anatomy that makes chaining work at all.
If you've ever built, bought, or been sold an AI agent that didn't survive contact with a real workflow, this one's for you.
What 2026 taught us about agent reliability
The defining insight of the last twelve months is the split between the model and the harness. The model is the intelligence — Claude, GPT, Gemini, whichever. The harness is everything else: the state, the tools, the guardrails, the retries, the feedback loop, the constraints on what the model is allowed to attempt. LangChain's engineering team put it bluntly: an LLM becomes an agent only when the harness gives it things like state, tool execution, feedback loops, and enforceable constraints.
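That split can be sketched in a few lines. Everything below is illustrative, not a real API: `fake_model` stands in for an LLM call, and `search_docs` for a real tool. The point is where the responsibilities live — the model only proposes; the harness owns the state, executes the tools, enforces the allow-list, and bounds the retries.

```python
# Minimal harness sketch. The model is a swappable callable; the harness
# owns state, tool execution, retries and constraints.
# All names here (fake_model, search_docs) are illustrative, not a real API.

ALLOWED_TOOLS = {"search_docs"}   # constraint: what the model may attempt
MAX_RETRIES = 1                   # short feedback loop, not an open-ended one

def search_docs(query: str) -> str:
    """Stand-in tool: a real one would hit a search index."""
    return f"3 results for '{query}'"

TOOLS = {"search_docs": search_docs}

def fake_model(state: dict) -> dict:
    """Stand-in model: proposes one tool call, then finishes with its result."""
    if not state["history"]:
        return {"action": "tool", "name": "search_docs",
                "args": {"query": state["input"]}}
    return {"action": "finish", "output": state["history"][-1]}

def run_agent(model, user_input: str) -> str:
    state = {"input": user_input, "history": []}   # state outlives each model call
    for _ in range(MAX_RETRIES + 1):
        step = model(state)
        if step["action"] == "finish":
            return step["output"]
        if step["name"] not in ALLOWED_TOOLS:      # the harness, not the model,
            state["history"].append("refused: tool not allowed")  # enforces this
            continue
        # tool result goes back into state: the feedback loop
        state["history"].append(TOOLS[step["name"]](**step["args"]))
    return state["history"][-1]

print(run_agent(fake_model, "agent harness design"))
```

Swap `fake_model` for a real model call and the harness code doesn't change — which is exactly the property that makes the split useful.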
In 2025, a lot of agent demos were really just clever prompts in a loop. Pretty to watch. Useless by week two. By 2026, the pattern that's actually shipping looks more like this:
- Probabilistic reasoning inside deterministic workflows. The agent isn't improvising the whole process from scratch. It's making a judgement call at one or two decision points inside a workflow we already know works.
- Memory that outlives the session. Large context windows aren't a substitute for persistent memory. Performance scales with what the agent remembers across runs, not how many tokens you can cram into one.
- Trajectory metrics, not just outcome metrics. “Did it work?” isn't enough. The grown-up teams are now measuring trajectory_precision and trajectory_recall — did the agent take a sensible path, not just land in a sensible place?
- Smaller surface area, tighter constraints. The agents surviving contact with production do one thing per run, with clear inputs, clear outputs, and a refusal behaviour when the input is off-piste.
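Trajectory metrics sound abstract, but one common set-based formulation is easy to state: compare the steps the agent actually took against a reference ("golden") trajectory. The step names below are hypothetical, and exact definitions vary between evaluation frameworks — this is a sketch of the idea, not a standard.

```python
# A sketch of trajectory metrics: did the agent take a sensible path,
# not just land in a sensible place? Set-based version; definitions vary.

def trajectory_precision(actual: list, expected: list) -> float:
    """Share of the agent's distinct steps that appear in the reference path."""
    if not actual:
        return 0.0
    return len(set(actual) & set(expected)) / len(set(actual))

def trajectory_recall(actual: list, expected: list) -> float:
    """Share of the reference steps the agent actually covered."""
    if not expected:
        return 0.0
    return len(set(actual) & set(expected)) / len(set(expected))

# Hypothetical run: the agent covered two reference steps but also
# wandered off into an unneeded web search.
expected = ["fetch_brief", "draft_copy", "check_brand_voice"]
actual   = ["fetch_brief", "draft_copy", "search_web", "draft_copy"]

print(trajectory_precision(actual, expected))  # 2 of 3 distinct steps sensible
print(trajectory_recall(actual, expected))     # 2 of 3 reference steps covered
```

An agent can score a perfect outcome while taking a terrible path; these two numbers are how you catch that before the path gets expensive.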
That's the backdrop. Now the specific example.
The agent: Snippet to Summary
Snippet to Summary takes a fragment of text — a sentence, a paragraph, a half-formed thought — and turns it into a structured summary block: a one-line headline, two or three sentences of context, and a suggested use case (social post, email intro, deck headline, blog opener).
That's it. No conversation. No follow-up questions. No “let me think step by step.” Input goes in. Three labelled outputs come out.
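A "three labelled outputs" contract is concrete enough to write down. The sketch below shows what such a contract might look like for a Snippet to Summary-style agent — the field names and the use-case list are our illustration, not Anjin's actual schema:

```python
# Hypothetical output contract for a Snippet to Summary-style agent:
# three labelled fields, validated before anything downstream sees them.
from dataclasses import dataclass

USE_CASES = {"social post", "email intro", "deck headline", "blog opener"}

@dataclass(frozen=True)
class SummaryBlock:
    headline: str   # one-line summary
    context: str    # two or three sentences of expansion
    use_case: str   # where the snippet would work best

    def __post_init__(self):
        # reject model output that drifts outside the contract
        if self.use_case not in USE_CASES:
            raise ValueError(f"unexpected use case: {self.use_case!r}")

block = SummaryBlock(
    headline="Agents fail on anatomy, not intelligence",
    context="Most 2025 agent demos collapsed because the harness was missing.",
    use_case="blog opener",
)
print(block.headline)
```

Because the output is a typed object rather than free text, the rest of the stack can depend on it — which is the whole point of a deterministic schema.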
It was the first agent inside Anjin that we actually used every day without being asked to test it. Which turned out to be the giveaway.
What made it work
Four things, none of them clever.
One: clear input expectations. The UI tells you what to paste in. A snippet. Not a brief, not a prompt, not a 2,000-word document. The agent's reliability comes from the fact that the input surface is narrow.
Two: structured outputs that look how a human would write them. A single-line summary, a short expansion, a use case. Not a wall of markdown. Not seventeen bullet points. The shape of the output is the shape you'd use.
Three: genuine time savings. The output is 90% usable as-is. Editing it takes seconds, not minutes. If you have to rewrite the output every time, the agent is a tax, not a tool.
Four: no configuration theatre. No tone dropdowns. No “brand voice” slider. No seventeen toggles. Sensible defaults that work for the job.
A useful agent, anatomically
If we broke Snippet to Summary down into the parts that every useful agent in 2026 seems to share, it looks like this:
- A narrow input contract. One shape of input, clearly described. The agent refuses or redirects if it gets something else.
- A tool layer the model doesn't control directly. The model decides what to attempt. The tool layer decides what's allowed. Those are two different jobs.
- A deterministic output schema. Not “write something useful.” A named, structured output the rest of your stack can rely on.
- A short feedback loop. One retry at most, with a different prompt or a cleaner input. Not five agents arguing in a loop for 90 seconds.
- An honest refusal behaviour. When the input is wrong, it says so. It doesn't hallucinate a plausible-looking answer.
- Observability baked in. Every run logs what went in, what came out, and what the model decided along the way — so when something breaks, you can see where.
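Three of those parts — the narrow input contract, the honest refusal, and the run log — compose into very little code. A sketch, with illustrative names and an assumed 600-character cap standing in for "a snippet, not a document":

```python
# Sketch: input contract + refusal behaviour + observable runs.
# MAX_SNIPPET_CHARS and the field names are assumptions for illustration.
import json
import time

MAX_SNIPPET_CHARS = 600  # "a snippet" — not a brief, not a 2,000-word doc

def validate_input(snippet: str):
    """Return a refusal reason, or None if the input fits the contract."""
    if not snippet.strip():
        return "empty input"
    if len(snippet) > MAX_SNIPPET_CHARS:
        return "too long for a snippet; paste a sentence or a paragraph"
    return None

def run(snippet: str, summarise) -> dict:
    record = {"ts": time.time(), "input": snippet}
    refusal = validate_input(snippet)
    if refusal:
        record["refused"] = refusal        # fail loudly, not plausibly
    else:
        record["output"] = summarise(snippet)
    print(json.dumps(record))              # every run leaves a trace
    return record

run("", lambda s: s.upper())               # logs a refusal, returns no output
run("small agents win", lambda s: s.upper())
```

The `summarise` callable is where the model sits; everything around it is deterministic, which is why a broken run is diagnosable from the log alone.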
None of that is new. All of it is the opposite of what most 2025-era “autonomous marketing agents” actually shipped.
What it taught us about good agent design
Reading across the agent research that's come out in the last six months — Anthropic's Building Effective Agents, LangChain's harness work, the move to graph-based orchestration in LangGraph — the lesson keeps landing in the same place:
Useful agents are boring on purpose. They're not trying to be assistants, companions, or digital employees. They're trying to do one defined job so reliably that you stop noticing them. The day Snippet to Summary became invisible inside our workflow is the day we knew it was working.
The corollary is harder to hear: most agents shouldn't exist. If a feature works as a one-shot prompt, it doesn't need to be an agent. If a workflow already runs on rails, it doesn't need autonomy. Agents earn their complexity by doing something a deterministic function can't — usually, reasoning over ambiguous inputs inside a well-bounded problem.
Why this matters for marketers right now
Marketing is getting pitched more “AI agents” per week than any other function in the business. Content agents. SEO agents. Social agents. Outreach agents. “Full-stack autonomous marketing agents.”
Most of them are prompt chains in a trench coat. They demo beautifully. They collapse the first time a real brief hits them — a real brand, a real campaign, a real deadline.
The test isn't “can it do the task.” The test is:
- Will a real marketer use this tomorrow, without being asked, because it's faster than the alternative?
- Is the output 90% usable, or will I end up rewriting it?
- When it fails, does it fail loudly — or does it hand me something confidently wrong?
- Does it get better when I give it more context over time, or does every session start from zero?
If the answer to any of those is wrong, you don't have an agent. You have a demo. Once each individual agent passes that bar, the next question is how you chain them together without the whole flow collapsing — which is its own discipline.
Anjin: The Marketing Operating System built on useful agents
Anjin is the Marketing Operating System. That means a shared home for briefs, brand, content, campaigns and reporting — with agents embedded at the points where they actually earn their keep.
We don't ship an “autonomous marketing agent” that tries to run your whole funnel while you watch. Anjin is a platform where the boring stuff — the briefs, the assets, the tracking, the reporting — is handled, and where small, constrained agents like Snippet to Summary do the specific bits that benefit from reasoning over ambiguous input.
That's the bet: a Marketing OS, composed of useful agents, rather than one big agent pretending to be an OS.
Agencies are our launch audience because they feel the pain first — twenty clients, twenty brand systems, twenty sets of briefs, and a team that's drowning in coordination. But the same logic applies to any founder, in-house marketer or ops lead who's tired of stitching together fifteen tools and pretending that counts as a workflow.
The £888 Lifetime License — Offer Closing Soon
Lifetime access to Anjin for a one-time payment of £888. Not a subscription. Not a seat. Not a trial. One payment, unlimited use, for as long as Anjin exists.
The average marketing team spends £888 in about three working days on tooling, freelancers and coordination software. You're buying the platform that replaces most of it — once.
This price will not be offered again once we close our early-access cohort.
Claim your £888 Anjin lifetime license →

Founders, agency owners and in-house marketers — this is how you run marketing at AI speed without the team, the burn, or another year of waiting.
Sources: Kore.ai — AI Agents in 2026, LangChain — The Anatomy of an Agent Harness, Anthropic — Building Effective Agents, orq.ai — AI Agent Architecture, MarTech — AI agents in marketing 2026, Machinebrief — State of AI Agents 2026, MetaComp — AI agent governance framework