Holo3 Beats GPT-5.4 at Computer Use — for 1/10th the Cost


When we first covered Hugging Face's Open Computer Agent last year, it was a scrappy demo — an open alternative to OpenAI's Operator and Anthropic's Claude computer use, interesting mostly as a statement of intent. It clicked buttons, it filled forms, it fell over a lot. The headline then was “the open community is trying.” The headline in April 2026 is very different: the open community just won the benchmark.

On 1 April 2026, Hcompany released Holo3-35B-A3B on Hugging Face. Within days, it posted 78.9% on OSWorld-Verified — ahead of GPT-5.4 (around 75%) and Claude Opus 4.6 — at roughly one-tenth the cost per run. The weights are open. The API is live. And the browser extension built on top of it, HoloTab, puts a state-of-the-art computer-use agent inside Chrome for anyone who can install an extension.

That is the story. Everything else in this post is the context around it.

The Open-Weight Upset

For most of 2025, computer-use agents were a closed-lab flex. OpenAI had Operator. Anthropic had Claude computer use. Both were expensive, rate-limited, and only partially reliable on real tasks — but they were the state of the art, and the assumption was that the frontier would keep belonging to the labs with the biggest GPUs and the biggest moats.

Holo3-35B-A3B broke that assumption in a weekend. An open-weight model, released on Hugging Face, scored 78.9% on OSWorld-Verified, the benchmark the frontier labs have been publicly competing on, at a fraction of the cost. The cost delta is the part most people are missing: a GPT-5.4 OSWorld run that costs roughly $X costs roughly $X/10 on Holo3. Multiply that across a production workload and the maths stops being close, and independent analyses suggest the roughly one-tenth ratio holds up per run at scale.
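
To make the "multiply that across a production workload" step concrete, here is a minimal Python sketch of the arithmetic. The article only gives the ratio ($X versus $X/10), so the per-run price is left as a parameter, and the function names and the figures in the usage example are hypothetical placeholders rather than published prices.

```python
def workload_cost(runs: int, cost_per_run: float) -> float:
    """Total cost of a workload made of `runs` agent runs."""
    return runs * cost_per_run


def compare_costs(runs: int, closed_cost_per_run: float, ratio: float = 0.1) -> dict:
    """Compare a closed-model workload with an open-model one.

    `ratio` is the open/closed cost ratio per run. The default of 0.1
    mirrors the article's "roughly one-tenth" claim and is an assumption,
    not a measured price.
    """
    open_cost_per_run = closed_cost_per_run * ratio
    closed_total = workload_cost(runs, closed_cost_per_run)
    open_total = workload_cost(runs, open_cost_per_run)
    return {
        "closed_total": closed_total,
        "open_total": open_total,
        "savings": closed_total - open_total,
    }


if __name__ == "__main__":
    # Hypothetical inputs: 10,000 runs a month at a placeholder price of 1.00 per run.
    print(compare_costs(runs=10_000, closed_cost_per_run=1.00))
```

Whatever the real per-run price turns out to be, the savings scale linearly with run count, which is why the ratio matters more at production volume than in a one-off demo.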

This is the inflection the open-source AI community has been predicting since Llama 2. The question was never whether an open model would catch the frontier — it was when, and in which capability. The answer, for computer use, is now.

What Holo3-35B-A3B Actually Is

Holo3 is a vision-language model (VLM) specifically optimised for GUI agents. That phrasing matters. It isn't a general-purpose chat model bolted onto a screenshot loop. It's a model designed from the ground up to look at a rendered interface — web, desktop, mobile — and decide what to click, type or drag to complete a task.

The practical implications:

  • It reads screens the way a human does. Rendered pixels, not DOM trees.
  • It generalises across operating systems and sites. Web forms, desktop apps, mobile UIs — same model, same behaviour pattern.
  • It has 35B total parameters in a mixture-of-experts (MoE) configuration with about 3B active at a time, which is why inference is cheap. You don't fire the whole model on every click.
  • The weights are on Hugging Face. You can run it yourself, fine-tune it, or hit the hosted API.
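
The "look at the screen, then decide what to click or type" pattern described above can be sketched as a simple loop. This is an illustrative outline only, not Holo3's actual interface: the `Action` type and the `capture_screen`, `choose_action` and `apply_action` callables are hypothetical stand-ins for a screenshot tool, a call to the model, and an input driver.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Action:
    """One GUI step. `kind` is "click", "type", or "done" (hypothetical schema)."""
    kind: str
    x: int = 0
    y: int = 0
    text: str = ""


def run_agent(
    task: str,
    capture_screen: Callable[[], bytes],
    choose_action: Callable[[str, bytes, List[Action]], Action],
    apply_action: Callable[[Action], None],
    max_steps: int = 20,
) -> List[Action]:
    """Observe rendered pixels, ask the policy for one action, apply it, repeat."""
    history: List[Action] = []
    for _ in range(max_steps):
        pixels = capture_screen()                      # rendered pixels, not a DOM tree
        action = choose_action(task, pixels, history)  # the model call would go here
        history.append(action)
        if action.kind == "done":                      # policy reports the task finished
            break
        apply_action(action)                           # perform the click or keystroke
    return history


if __name__ == "__main__":
    # A scripted stand-in for the model so the loop can run without one.
    script = [Action("click", x=120, y=48), Action("type", text="hello"), Action("done")]
    steps = run_agent(
        task="fill in the form",
        capture_screen=lambda: b"",                    # placeholder screenshot bytes
        choose_action=lambda task, pixels, history: script[len(history)],
        apply_action=lambda action: None,              # no real input device here
    )
    print([step.kind for step in steps])
```

The `max_steps` cap is a design choice worth keeping in any real loop: an agent that never emits "done" should stop rather than click forever.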

For context, this is the capability that justified Operator's $200/month tier in 2025. It now runs on open weights.

HoloTab: The Chrome Extension That Ships It to Anyone

Benchmarks are one thing. Distribution is another. Hcompany shipped both.

HoloTab is a Chrome extension built directly on Holo3. It navigates live websites, automates multi-step tasks, and — critically — works for non-technical users with zero setup. You install it the way you'd install a password manager. You type what you want done. It does it.

Hcompany described HoloTab as the proving ground for Holo3: a real product that demonstrates the model's capabilities on real sites, not just curated benchmark environments. In practice, it's the thing that turns an open-weight research release into something a marketing coordinator can actually use on a Tuesday morning.

This is the bit that should worry every SaaS company whose moat was “we have an API and a dashboard.” When an agent can drive any website on your behalf, the surface area of what counts as a “tool” collapses into a single conversational interface.

Why OSWorld Matters

OSWorld-Verified is the benchmark that's come to define computer-use progress. It's a standardised suite of real computer tasks — file management, web navigation, multi-app workflows, spreadsheet work — graded on whether the agent actually completes them. No multiple choice. No synthetic scoring. The task either gets done or it doesn't.
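
As described here, the headline number is just a completion rate: each task counts as done or not done, and the score is the share that were done. The sketch below assumes that strict pass/fail view; the function name is hypothetical and the real benchmark harness is more involved than this.

```python
from typing import Sequence


def success_rate(results: Sequence[bool]) -> float:
    """Percentage of tasks completed, given one pass/fail flag per task."""
    if not results:
        raise ValueError("need at least one task result")
    return 100.0 * sum(results) / len(results)


if __name__ == "__main__":
    # Hypothetical outcomes for four tasks: three completed, one failed.
    print(success_rate([True, False, True, True]))
```

Read this way, a score of 78.9% means roughly four out of five tasks were completed end to end, with no partial credit for getting most of the way there.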

That's why Holo3's result matters more than a lot of AI benchmark news. Beating GPT-5.4 on MMLU is an academic brag. Beating GPT-5.4 on OSWorld-Verified means the open model is more likely to successfully book your travel, update your CRM, or run your weekly reporting pipeline. These are the tasks knowledge workers actually care about.

The End of Proprietary Computer-Use Moats

The strategic implication is blunt: computer-use agents are no longer a proprietary moat. A year ago, if you wanted a reliable agent that could drive a browser and complete multi-step work, you paid OpenAI or Anthropic and accepted their limits. Today, you can run an equivalent-or-better model on open weights, at roughly 10% of the cost, with no per-seat licensing and, if you self-host, no data egress concerns.

That doesn't mean OpenAI and Anthropic are in trouble — they'll keep pushing the frontier, and enterprise deals will keep flowing to them for reasons that have nothing to do with raw benchmark scores. But the “only proprietary frontier models can drive a computer” narrative is over, as the Hcompany announcement made clear. The next 12 months of computer-use competition will happen across open and closed models roughly at parity on capability, with the battleground shifting to orchestration, reliability, and integration — which is a different fight.

What This Means for Marketing Teams

Here's where this lands for anyone running a marketing function.

In 2025, “have an AI agent do your marketing ops” was a vision statement. The models were capable but expensive, the infra was flaky, and the people who could wire it all together were rare and cost £180k a year. Most teams stayed with humans plus SaaS plus spreadsheets.

In 2026, with Holo3-class open models, the maths changes:

  • You can have an agent log into your CMS, pull last week's posts, run them through an SEO checker, and refresh metadata — for cents per job, not dollars.
  • You can point an agent at your ad dashboards every morning and have it flag anomalies before your coffee.
  • You can automate the cross-platform publishing dance — LinkedIn, X, Meta, YouTube — without a Zapier tax or a VA.
  • You can do all of this with models you can self-host if you need to, which matters the moment your legal team asks about data residency.
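
For the "flag anomalies" job in particular, the check an agent runs after reading the dashboards can be very simple. Below is a minimal sketch, assuming the agent has already extracted recent daily values per metric; the metric names, the numbers and the three-standard-deviation threshold are all illustrative assumptions, not part of any tool mentioned in this post.

```python
from statistics import mean, stdev
from typing import Dict, List


def flag_anomalies(
    history: Dict[str, List[float]],
    today: Dict[str, float],
    threshold: float = 3.0,
) -> List[str]:
    """Return metrics whose value today is far from their recent average.

    A metric is flagged when it sits more than `threshold` sample standard
    deviations from the mean of its history.
    """
    flagged = []
    for metric, values in history.items():
        if metric not in today or len(values) < 2:
            continue                      # not enough data to judge this metric
        mu = mean(values)
        sigma = stdev(values)
        if sigma == 0:
            if today[metric] != mu:       # perfectly flat history: any change stands out
                flagged.append(metric)
            continue
        if abs(today[metric] - mu) / sigma > threshold:
            flagged.append(metric)
    return flagged


if __name__ == "__main__":
    # Hypothetical dashboard numbers for two ad metrics.
    history = {"cpc": [1.0, 1.1, 0.9, 1.0, 1.0], "ctr": [2.0, 2.1, 1.9, 2.0, 2.0]}
    today = {"cpc": 3.0, "ctr": 2.05}
    print(flag_anomalies(history, today))
```

The hard part was never this rule. It was getting the numbers out of the dashboards every morning, which is the step a computer-use agent takes over.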

The teams that win in this shift won't be the ones with the biggest model budget. They'll be the ones who orchestrate these agents properly — who have a single place where brand voice, content pipelines, distribution channels and performance data live, so an agent doesn't just do the work but does it on brand and against a plan.

Cheap, capable computer-use agents are now a commodity. What isn't a commodity is the operating system that coordinates them — and that's exactly where Anjin sits.

Anjin: The Marketing Operating System for an Open-Agent World

Anjin is the Marketing Operating System — a single platform where your brand, content, campaigns, distribution and performance data live, with an agent layer that operates across all of them. When a model like Holo3 can drive any website, the question stops being “what can the agent do?” and becomes “what should the agent do next, on brand, against our plan?” That's the gap Anjin fills.

What Anjin replaces in an open-agent world:

  • The scattered stack of “AI-powered” point tools that don't talk to each other
  • The content agency that can't ship fast enough to compete with an agent loop
  • The SEO consultant, the paid media planner, the analytics dashboard babysitter
  • The £8–15k/month you're currently spending to coordinate all of the above

What Anjin does that a raw open model can't:

  • Understands your brand voice, tone and rules, and enforces them across every asset
  • Plans campaigns, not just tasks — so agents work against objectives, not prompts
  • Measures what worked and feeds it back into the next cycle automatically
  • Runs 24/7 with the consistency a freelancer or agency can't match

The point isn't that Anjin replaces Holo3 or GPT-5.4 or Claude. The point is that Anjin is the layer that turns any capable agent into a marketing team that ships on Monday morning. When the underlying agents get cheaper and better — as they just did — Anjin gets more valuable, not less.

The £888 Lifetime License — Offer Closing Soon

Lifetime access to Anjin for a one-time payment of £888. Not a subscription. Not a seat. Not a trial. One payment, unlimited use, for as long as Anjin exists.

The average marketing team spends £888 in about three working days on tooling, freelancers and coordination software. You're buying the platform that replaces most of it — once.

This price will not be offered again once we close our early-access cohort.

Claim your £888 Anjin lifetime license →

Founders, agency owners and in-house marketers — this is how you run marketing at AI speed without the team, the burn, or another year of waiting.

Sources: Hugging Face — Holo3 announcement, Hugging Face — HoloTab, Holo3-35B-A3B model card, The New Stack, AIToolly, Hcompany on X.
