Claude Sonnet 5: What's New, Pricing, Benchmarks

Facebook X

Claude Sonnet 5 landed on June 30, 2026, and Anthropic is positioning it as the most agentic Sonnet yet: near-flagship performance on coding, terminal work, and computer use, delivered at the mid-tier Sonnet price rather than the Opus one. It ships with a 1 million token context window, arrives as the new default model for Claude’s Free and Pro plans, and is available everywhere Anthropic sells models, from the Claude API to Claude Code, AWS Bedrock, Google Cloud, and GitHub Copilot.

The short version: Sonnet 5 is a drop-in upgrade to Sonnet 4.6 that closes much of the gap to Opus 4.8 on the benchmarks that matter for agents, at the same per-token list price. That is the headline, and it is a real one. But there are two catches worth knowing before you switch every workload over: Sonnet 5 uses a new tokenizer that produces roughly 30 percent more tokens for the same text, so per-token price parity does not mean per-task cost parity, and it introduces a few breaking API behavior changes that will surprise anyone who was setting sampling parameters or requesting extended thinking manually. This piece covers what Sonnet 5 is, the benchmark picture and how much to trust it, the pricing and the tokenizer catch, what changed for developers, where it sits against Sonnet 4.6 and Opus 4.8, and what it means for teams building on Claude.

What Claude Sonnet 5 is

Sonnet 5 is the mid-tier model in Anthropic’s current lineup, sitting below the Opus flagship and above the small, fast Haiku tier. Anthropic describes it as the most agentic Sonnet to date, with what it calls top-tier intelligence for coding and everyday professional work. In practice the pitch is that agentic capability, the ability to plan a task, execute it, check its own output, and correct course without being told, is now the baseline expectation at the mid tier rather than a flagship-only feature.

It is a genuine release across the whole ecosystem rather than an API-only preview. Sonnet 5 is the new default model for Free and Pro users in the Claude apps, and the new default in Claude Code for Pro users. It is available on the Max, Team, and Enterprise plans, through the Claude API with the model ID claude-sonnet-5, on Amazon Bedrock and Google Cloud’s Vertex AI, and it went generally available for GitHub Copilot the same day. The feature surface matches Sonnet 4.6: vision input, tool calling, extended and adaptive reasoning, prompt caching, web search, structured JSON-schema outputs, and computer use.

For readers new to the family, our coverage of Claude Fable 5 explains the "5" generation Anthropic has been rolling out through 2026, and our Claude Opus 4.8 explainer covers the flagship that Sonnet 5 is measured against.

The benchmark story, and how much to trust it

Anthropic’s core claim is that Sonnet 5 brings near-Opus performance to the Sonnet price point, and the reported numbers support that framing, with the usual caveats about vendor-published benchmarks.

The most consistent and striking result across sources is on knowledge work. On GDPval, Anthropic’s professional-work evaluation, Sonnet 5 reportedly scores around 1,618, edging just past Opus 4.8’s roughly 1,615. A mid-tier model matching the flagship on a broad professional-work benchmark is the kind of result that gets attention, and it is the clearest evidence for the "near-Opus" positioning.

On computer use, the figures line up cleanly too: Sonnet 5 posts 81.2 percent on OSWorld-Verified, up from Sonnet 4.6’s 78.5 percent. Terminal work shows one of the largest jumps of the release, with Terminal-bench scores climbing well into the high 70s or low 80s depending on the harness, a double-digit gain over Sonnet 4.6. On agentic coding, Sonnet 5 closes much of the distance to Opus 4.8 without quite matching it; Anthropic and independent write-ups place it in the low-to-mid 80s on the classic SWE-bench Verified set and in the low 60s on the harder SWE-bench Pro, against Opus 4.8’s high 60s on the same Pro set.

A word of caution that DM applies to every model launch: these are vendor-reported or vendor-adjacent figures, and the exact numbers vary noticeably between sources depending on the harness, the prompt scaffolding, and which variant of a benchmark is measured. The SWE-bench Verified figure in particular is reported anywhere from the low 70s to the mid 80s across different write-ups. Treat the specific decimals as directional rather than definitive. The safe read is the shape of the story, not any single number: Sonnet 5 is a clear step up from Sonnet 4.6 on agentic coding, terminal, and computer-use tasks, and it narrows the gap to the flagship enough that many teams will not feel they are giving much up by staying on the mid tier.

Pricing, and the tokenizer catch

Sonnet 5 launches with introductory pricing of 2 dollars per million input tokens and 10 dollars per million output tokens, in effect through August 31, 2026. From September 1, 2026, standard pricing takes over at 3 dollars per million input and 15 dollars per million output, which is the same list price Sonnet 4.6 carried. The full 1 million token context window is available at standard pricing, with up to 128,000 output tokens per response.

Here is the catch that the per-token numbers hide. Sonnet 5 uses a new tokenizer, and the same input text produces roughly 30 percent more tokens than it did on Sonnet 4.6. Because you are billed per token, a task that cost a certain amount on Sonnet 4.6 can cost meaningfully more on Sonnet 5 even at identical per-token rates, simply because the same words now count as more tokens. If you budget by tokens, or if you have contracts and dashboards tuned to Sonnet 4.6’s token counts, re-measure on real workloads before assuming cost parity. Our primer on tokens explains why tokenizer changes like this ripple through both cost and context budgeting.

The practical upshot: the introductory window makes Sonnet 5 genuinely cheap to trial through the end of August, but model the real per-task cost, tokenizer included, before you commit production budgets.

What changed for developers

Anthropic frames Sonnet 5 as a drop-in upgrade for Sonnet 4.6, and for most callers it is. But there are three behavior changes that can break existing integrations, and they are worth flagging clearly because they fail loudly rather than silently.

First, adaptive thinking is on by default. The model decides when to spend reasoning effort rather than requiring you to toggle it. Second, requesting manual extended thinking now returns a 400 error rather than being honored, because adaptive thinking supersedes it. Third, setting sampling parameters such as temperature to non-default values also returns a 400 error. If your code pins a temperature or top-p, or explicitly requests extended thinking, those calls will fail until you remove the overrides.

None of these is hard to fix, but they are the kind of change that surprises teams who upgrade a model ID expecting pure backward compatibility. Test your integration against claude-sonnet-5 before flipping production traffic, especially any code path that customizes sampling.

Where it sits: Sonnet 5, Sonnet 4.6, and Opus 4.8

The clean way to think about the current tiers: Opus 4.8 remains the flagship for the hardest long-horizon work and still leads on the toughest coding benchmarks; Sonnet 5 is the new workhorse that gets most of the way there on agentic tasks at a fraction of the flagship cost; and Haiku remains the fast, cheap option for high-volume, lighter work.

Against Sonnet 4.6 specifically, Sonnet 5 is a straightforward upgrade: better at planning and completing multi-step agent tasks, stronger on terminal and computer use, and better at catching and correcting its own mistakes mid-task. The main reasons to stay on Sonnet 4.6 are the tokenizer cost consideration above and the behavior changes, both of which are migration friction rather than capability regressions. For a sense of how fast this line is moving, our look at the Opus 4.6-to-4.7 cycle shows the same pattern of frequent, incremental upgrades across the family.

The deeper signal in this release is strategic. By putting near-flagship agentic performance at the mid tier, Anthropic is making agentic capability the default expectation at every price point, and pricing agents to run cheaply at scale. Our AI agents pillar covers why that matters for the economics of building agentic products.

Availability

Sonnet 5 is available immediately across Anthropic’s surfaces and its cloud partners:

Claude apps: the new default for Free and Pro; available on Max, Team, and Enterprise.
Claude Code: the new default for Pro users.
Claude API: model ID claude-sonnet-5.
Cloud platforms: Amazon Bedrock and Google Cloud’s Vertex AI.
Third-party tools: generally available for GitHub Copilot as of launch day.

What it means for builders

For most teams already on Sonnet 4.6, the move to Sonnet 5 is worth making, but do it deliberately. The wins are real on agentic coding, terminal automation, and computer-use workflows, and the introductory pricing makes the trial cheap through August. Before you migrate production, do three things: test your integration against the new behavior changes, re-measure real per-task cost with the new tokenizer rather than trusting per-token parity, and confirm on your own evaluations that the quality gain holds for your specific workloads rather than relying on the published benchmarks.

For teams choosing a tier, Sonnet 5 raises the floor. The cases where you genuinely need Opus are narrower than they were a week ago, mostly the longest-horizon, highest-stakes reasoning and the very hardest coding problems. For the large middle of agentic work, running Sonnet 5 and reserving Opus for the hard cases is now a defensible default rather than a compromise.

Frequently Asked Questions

What is Claude Sonnet 5?

Claude Sonnet 5 is Anthropic’s mid-tier model, released June 30, 2026, positioned as the most agentic Sonnet to date. It offers near-flagship performance on coding, terminal work, and computer use at the Sonnet price point, ships with a 1 million token context window, and is the new default model for Claude’s Free and Pro plans. The API model ID is `claude-sonnet-5`.

When was Claude Sonnet 5 released?

June 30, 2026. It launched simultaneously across the Claude apps, Claude Code, the Claude API, Amazon Bedrock, Google Cloud’s Vertex AI, and GitHub Copilot.

How much does Claude Sonnet 5 cost?

Introductory pricing runs through August 31, 2026 at 2 dollars per million input tokens and 10 dollars per million output tokens. From September 1, 2026, standard pricing is 3 dollars per million input and 15 dollars per million output, the same list price as Sonnet 4.6. Note that a new tokenizer produces roughly 30 percent more tokens for the same text, so real per-task cost can be higher than the per-token parity suggests.

How does Claude Sonnet 5 compare to Sonnet 4.6?

Sonnet 5 is a drop-in upgrade that is clearly stronger on agentic coding, terminal work, and computer use, and better at planning, self-checking, and correcting its own output. The tradeoffs are migration friction rather than regressions: the new tokenizer changes cost math, and a few API behavior changes can break existing integrations.

How does Claude Sonnet 5 compare to Opus 4.8?

Sonnet 5 closes much of the gap to Opus 4.8 on agentic benchmarks and reportedly edges past it on the GDPval professional-work evaluation, while Opus 4.8 still leads on the hardest coding benchmarks and the longest-horizon reasoning. For most agentic work, Sonnet 5 at Sonnet pricing is now a strong default, with Opus reserved for the hardest cases. Treat the specific benchmark figures as vendor-reported and directional.

What API behavior changes should developers know about?

Three. Adaptive thinking is on by default; requesting manual extended thinking now returns a 400 error; and setting sampling parameters like temperature to non-default values also returns a 400 error. If your code pins sampling parameters or requests extended thinking explicitly, update it before switching to `claude-sonnet-5`.

Why does the new tokenizer matter?

Because you pay per token. Sonnet 5’s tokenizer produces about 30 percent more tokens for the same input text than Sonnet 4.6’s, so a task can cost more on Sonnet 5 even at identical per-token rates. It also affects how much content fits in the context window. Re-measure real workloads rather than assuming cost or context parity with Sonnet 4.6.

Where can I use Claude Sonnet 5?

In the Claude apps (default for Free and Pro; available on Max, Team, and Enterprise), in Claude Code (default for Pro), through the Claude API as `claude-sonnet-5`, on Amazon Bedrock and Google Cloud’s Vertex AI, and in GitHub Copilot, which made it generally available on launch day.

Should I switch my production workloads to Sonnet 5?

For most teams on Sonnet 4.6, yes, but deliberately. Test against the new API behavior changes, re-measure real per-task cost with the new tokenizer, and validate the quality gain on your own evaluations before migrating production traffic. The introductory pricing through August makes a trial inexpensive.

Claude Sonnet 5: Anthropic’s Most Agentic Sonnet, at Sonnet Prices

What Claude Sonnet 5 is

The benchmark story, and how much to trust it

Pricing, and the tokenizer catch

What changed for developers

Where it sits: Sonnet 5, Sonnet 4.6, and Opus 4.8

Availability

What it means for builders

Frequently Asked Questions

One Sharp Round-Up, Once a Month

Related Reading

ChatGPT Bidi 1: What the Leaked OpenAI Bidirectional Voice Model Apparently Does

The 2026 Agentic Browser Landscape: ChatGPT Atlas, Perplexity Comet, Claude for Chrome, and the Rest

What Is Llama? Meta’s Open-Weight AI Model Family That Became the De Facto Default