When we first wrote about GPT-5 last year, it was a leak. A set of rumoured specs, a whispered context window, a release date nobody at OpenAI would confirm. The internet — us included — ran with it. A year on, the speculation has resolved into a shipped product, a priced API, and a set of capabilities that are quietly rewriting how serious teams build on top of large language models. GPT-5.4 is here, and the interesting story is no longer whether a million-token context window is coming. It's what happens to your marketing, your agents and your workflows now that it's the default.
From Leak to Launch
The original version of this post speculated that OpenAI would ship a 1M-token context window, a native computer-use capability, and tighter reasoning controls within twelve months. All three landed. GPT-5.4 was officially announced by OpenAI on 5 March 2026, with the full launch notes published on the company's own site. The standard context window ships at 272K tokens, double the previous generation's default, and Codex users can configure the model up to the full 1M-token ceiling. The "leak" framing was directionally right, but the reality is more useful than the rumour: it's documented, priced, and already running in production at companies that rebuilt their stacks around it.
What changed between the leak and the launch isn't really the headline numbers. It's the operational surface. GPT-5.4 ships with five discrete reasoning-effort levels, a native Computer Use API, a new Tool Search feature that loads tool definitions on demand, and full-resolution vision integrated into the same model. You no longer route between a "reasoning" model and a "fast" model. You dial a slider. That change alone collapses half the orchestration code most teams were writing in 2025.
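The orchestration collapse is easiest to see in code. A minimal sketch, assuming a hypothetical `reasoning_effort` request field and illustrative level names (neither is taken from OpenAI's documentation): instead of routing between a fast model and a reasoning model, a single dispatch maps task types to one of five effort levels on the same model.

```python
# Hypothetical sketch: one model, five effort levels, no model routing.
# The field name "reasoning_effort" and the level names are illustrative.

EFFORT_LEVELS = ["minimal", "low", "medium", "high", "maximum"]

# 2025-style orchestration picked a model; now we pick an effort level.
TASK_EFFORT = {
    "social_caption": "minimal",
    "email_draft": "low",
    "blog_post": "medium",
    "campaign_plan": "high",
    "codebase_refactor": "maximum",
}

def build_request(task_type: str, prompt: str) -> dict:
    """One request shape for every workload; effort is just a field."""
    effort = TASK_EFFORT.get(task_type, "medium")
    assert effort in EFFORT_LEVELS
    return {
        "model": "gpt-5.4",          # single model for all traffic
        "reasoning_effort": effort,  # replaces fast-vs-reasoning routing
        "input": prompt,
    }

req = build_request("social_caption", "Write a caption for our launch post.")
```

The point is the shape, not the field names: the branch that used to pick between two endpoints becomes a lookup table feeding one endpoint.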
The 1M Context Window in Practice
Context windows are one of those specs that sound abstract until you try to use one. A million tokens is roughly 750,000 words — about ten full-length novels, or the entire content library of a mid-sized B2B brand. For developers, OpenAI documented the 1M-token ceiling as configurable specifically inside Codex, with 272K as the default everywhere else. That split matters. The company clearly learned from 2025: a million-token call is expensive, latency-heavy, and overkill for 90% of consumer traffic. Making it opt-in for agentic coding workloads is the right product decision.
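The back-of-envelope conversion above is easy to check directly, assuming the common heuristic of roughly 0.75 English words per token and ~80,000 words per full-length novel:

```python
# Rough capacity arithmetic for a 1M-token window.
TOKENS = 1_000_000
WORDS_PER_TOKEN = 0.75        # common English-text heuristic
WORDS_PER_NOVEL = 80_000      # typical full-length novel

words = int(TOKENS * WORDS_PER_TOKEN)      # 750,000 words
novels = words / WORDS_PER_NOVEL           # roughly nine to ten novels

default_window = 272_000
headroom = TOKENS / default_window         # the Codex ceiling is ~3.7x the default
```

Real token counts vary with language and content (code tokenises denser than prose), so treat these as order-of-magnitude figures.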
In practice, the 1M window does three things that the old 128K/200K windows couldn't. It lets a single agent hold an entire codebase, brand book or legal corpus in working memory without retrieval-augmented generation gymnastics. It lets a content team feed a full editorial archive into a single prompt and ask "find every contradiction we've ever published about this topic." And it lets long-running agents maintain continuity across sessions that used to require external memory stores.
That capacity isn't free (more on pricing below), but the architectural simplification is significant. Half the RAG pipelines built in 2024 are now just "put it in the prompt."
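What "put it in the prompt" means mechanically: skip the chunk-embed-retrieve loop and concatenate the corpus directly, guarded by a token budget. A toy sketch (the four-characters-per-token estimate is a crude heuristic, and the prompt shape is illustrative):

```python
def rough_tokens(text: str) -> int:
    """Crude token estimate: ~4 characters per English token."""
    return len(text) // 4

def stuff_prompt(docs: list[str], question: str, budget: int = 1_000_000) -> str:
    """Replace a retrieval pipeline with direct concatenation,
    falling back only when the corpus genuinely exceeds the window."""
    corpus = "\n\n---\n\n".join(docs)
    prompt = f"{corpus}\n\nQuestion: {question}"
    if rough_tokens(prompt) > budget:
        raise ValueError("Corpus exceeds context window; fall back to retrieval.")
    return prompt

prompt = stuff_prompt(
    ["Brand book v3 ...", "Q3 campaign brief ..."],
    "Find every contradiction we've published about pricing.",
)
```

Retrieval still matters above the window size or for latency-sensitive paths; the change is that it becomes the fallback, not the default architecture.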
Native Computer Use: The Agent Unlock
The Computer Use API is, quietly, the bigger story. GPT-5.4 scored 75% on the OSWorld benchmark — the industry-standard test of whether a model can actually operate a desktop operating system, navigate GUIs, click buttons and complete real-world tasks. That's a step-change from the 2025 numbers, and it's the first time a frontier model has been credibly positioned as a general-purpose computer operator rather than a chat interface.
Native Computer Use means the model can, via API, control a browser or a virtual desktop: fill forms, download reports, reconcile two SaaS tools, pull a CRM export and cross-reference it against a paid media dashboard. The things interns do. The things marketing operations teams have been trying to automate with brittle Zapier chains and Playwright scripts for years. OpenAI's own launch notes and third-party analyses at DataCamp and Applying AI walk through agentic workflows that would have required a bespoke RPA vendor twelve months ago.
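The shape of an agent run on top of a Computer Use API is a plain observe-act loop. A hedged sketch with a mock environment and a stub policy; the action vocabulary (`click`, `type`, `done`) and every name here are illustrative, not OpenAI's actual schema:

```python
from dataclasses import dataclass

@dataclass
class Action:
    kind: str            # one of "click", "type", "done" (illustrative vocabulary)
    target: str = ""
    text: str = ""

def mock_policy(screenshot: str, goal: str) -> Action:
    """Stand-in for the model. A real run would send the screenshot to a
    computer-use endpoint and receive the next action back."""
    if "export_ready" in screenshot:
        return Action("done")
    if "login" in screenshot:
        return Action("type", target="#user", text="ops@example.com")
    return Action("click", target="#export-csv")

def run_agent(goal: str, max_steps: int = 10) -> list:
    screenshot, trace = "login page", []
    for _ in range(max_steps):
        action = mock_policy(screenshot, goal)
        trace.append(action)
        if action.kind == "done":
            break
        # A real harness would execute the action in a browser/VM and
        # capture a fresh screenshot; here we fake the state transition.
        screenshot = {"type": "crm dashboard", "click": "export_ready"}[action.kind]
    return trace

trace = run_agent("pull the CRM export")
```

The `max_steps` cap and the explicit trace are the parts worth keeping in a real harness: agent loops need a budget and an audit log before they touch production SaaS accounts.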
For marketing teams, this is the capability that makes "an AI that runs your campaign end-to-end" a practical claim rather than a pitch-deck one.
Accuracy and Reasoning Controls — The Quiet Upgrade
The upgrade nobody put on a billboard is the accuracy one. OpenAI's own figures claim GPT-5.4 responses are 33% less likely to contain a false statement compared to GPT-5.2, and full responses are 18% less likely to contain errors overall. That's not a model that writes better. It's a model that lies less. For anyone running regulated content — financial services, pharma, legal, health — the error-rate delta is a bigger deal than the context window.
Paired with that: five-level reasoning-effort control. You can now dial the same model from "answer in 400ms" to "think hard for thirty seconds before replying," which means the trade-off between cost, latency and quality becomes an API parameter rather than a separate product. Tool Search, shipped alongside, loads tool definitions on demand, so agents with dozens of tools no longer pay the token cost of bloated schemas on every call. Small thing. Huge bill impact at scale.
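The bill impact of on-demand tool loading is easy to estimate. The numbers below are purely illustrative (schema sizes and call volumes vary widely), but the shape of the saving holds: with 40 tools at ~600 tokens of schema each, eager loading burns 24K input tokens per call before the user has said a word.

```python
# Illustrative token-cost comparison: eager vs on-demand tool schemas.
NUM_TOOLS = 40
TOKENS_PER_SCHEMA = 600        # rough size of one tool's JSON schema
TOOLS_USED_PER_CALL = 3        # agents rarely touch more than a few
INPUT_PRICE_PER_MTOK = 2.50    # GPT-5.4 input price, USD

eager = NUM_TOOLS * TOKENS_PER_SCHEMA                 # 24,000 tokens/call
on_demand = TOOLS_USED_PER_CALL * TOKENS_PER_SCHEMA   # 1,800 tokens/call
saved_per_call = eager - on_demand                    # 22,200 tokens/call

calls_per_day = 100_000
daily_savings_usd = saved_per_call * calls_per_day / 1_000_000 * INPUT_PRICE_PER_MTOK
```

At these assumed volumes that's roughly $5,500 a day in input tokens that were buying nothing, which is why "loads tool definitions on demand" is a bigger line item than it sounds.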
Pricing and the Flagship Race
$2.50 per million tokens. That's the input price OpenAI set for GPT-5.4, and it's competitive enough to reset the flagship market. For reference, that's below Claude Sonnet 4.6/4.7's standard rate and in the same neighbourhood as Gemini 3.1 Pro. The price tells you what OpenAI thinks the rest of 2026 looks like: a three-way race where model quality is increasingly fungible and the winner is decided by distribution, tooling and total cost of workflow — not raw intelligence.
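At $2.50 per million input tokens, window size dominates the bill. A quick comparison of a single full-window call at the 272K default versus the 1M Codex ceiling, input side only (output tokens and any cached-input discounts ignored):

```python
INPUT_PRICE_PER_MTOK = 2.50   # USD, GPT-5.4 input price

def input_cost(tokens: int) -> float:
    """Input-side cost of a single call, in USD."""
    return tokens / 1_000_000 * INPUT_PRICE_PER_MTOK

default_call = input_cost(272_000)     # $0.68 for a full 272K-token prompt
max_call = input_cost(1_000_000)       # $2.50 for a full 1M-token prompt
```

This is why the 1M window is opt-in for Codex rather than the default: at roughly 3.7x the per-call input cost, you want it for agentic coding sessions, not consumer chat traffic.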
Speculatively, OpenAI is already briefing partners on GPT-5.5 — codename "Spud" — rumoured for an April 2026 release. Treat that as rumour, not roadmap. But the cadence tells you something: these capabilities are now shipping in quarterly cycles, not annually. Whatever you're building this month assumes a better model exists by the time you ship it.
What This Means for Marketing Teams
Strip away the benchmarks and here's the operational reality. A million-token context window means an AI can hold your entire brand book, every piece of content you've ever published, every customer-service transcript, every campaign brief and every performance report in a single working session. That is not a better chatbot. That is a memory substrate for marketing itself.
Native Computer Use means the same AI can then go log into your CMS, pull draft posts, cross-reference them against the brand book it just read, open your analytics dashboard, identify underperforming pages, rewrite them with consistent voice, and schedule them back — in one agent run, without the sixteen-tool Zapier graph you built in 2024.
The 33% accuracy improvement means you can finally trust a draft to ship with less human review on low-stakes surfaces. The five-level reasoning dial means you can pay cheap for social captions and pay premium for pillar content in the same API. And the $2.50/Mtok pricing means the unit economics of "AI writes and optimises most of our marketing" have crossed into obviously-profitable territory for any brand spending more than £5k/month on content operations.
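The unit-economics claim can be sanity-checked with rough numbers. Everything below is an illustrative assumption (the content mix, token counts per piece, and the flat $2.50/Mtok input rate; output pricing is ignored for simplicity):

```python
INPUT_PRICE_PER_MTOK = 2.50   # USD, GPT-5.4 input price

# Illustrative monthly content mix for a mid-sized brand.
workload = {
    # task: (pieces per month, input tokens per piece: context + prompt)
    "social_caption": (600, 5_000),     # minimal effort, tiny context
    "blog_post":      (40, 150_000),    # medium effort, partial brand corpus
    "pillar_page":    (4, 1_000_000),   # maximum effort, full-window call
}

monthly_tokens = sum(n * t for n, t in workload.values())   # 13M tokens
monthly_cost_usd = monthly_tokens / 1_000_000 * INPUT_PRICE_PER_MTOK
```

Even with generous multipliers for retries, revisions and output tokens, model spend under these assumptions sits two orders of magnitude below a £5k/month content-operations budget, which is the point.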
The teams moving fastest are the ones that stopped treating this as a content-generation problem and started treating it as an operating-system problem. Which is exactly the shift Anjin was built for.
Anjin: The Marketing Operating System Built for a Million-Token World
Anjin is the Marketing Operating System — a single platform that runs your marketing end-to-end on top of frontier models like GPT-5.4. Not a wrapper around a chat window. An OS. It holds your entire brand context — every asset, every campaign, every tone-of-voice decision — inside the model's working memory, and it pairs that with agents that can actually operate your CMS, your ads platforms, your analytics and your distribution channels.
What a million-token context window unlocks for Anjin specifically: every piece of content the platform generates is produced against your entire brand history, not a 2,000-word style guide excerpt. Every campaign is reasoned about with full performance context. Every agent action is decided with the whole operational picture in view. That's the difference between an AI that writes and an AI that runs marketing.
Anjin replaces the content agency, the SEO consultant, the paid media planner, the distribution workflow and most of the coordination software underneath them — and it gets materially better every time OpenAI, Anthropic or Google ships a new model, because Anjin is the operating layer, not the intelligence layer. GPT-5.4 made the platform faster, cheaper and more accurate overnight. GPT-5.5, whenever it lands, will do it again.
Sources: OpenAI — Introducing GPT-5.4, NxCode, ApiYi, Applying AI, DataCamp, TrendingTopics.