
Claude Opus 4.7: Everything You Need to Know

Released April 16, 2026 | A breakdown of what changed, what got better, and what didn't

What Is Claude Opus 4.7?

Anthropic's flagship AI assistant. Claude Opus 4.7 is the latest and most powerful version in the Opus line, and it arrived with some genuinely exciting upgrades. But not everything about this release is sunshine and rainbows.

It feels less like a friendly assistant and more like a strict specialist now.

Let's break down what actually changed.

What's New and Improved?

  1. It Can Now "Check Its Own Work"

One of the most noticeable upgrades is “rigor” — the model can plan, execute, and then verify its own work before giving it to you.

In practice, it doesn’t just write code and stop. It tests, catches errors, and fixes them, like a developer running their code before calling it done.

For example, it built a full text-to-speech system in Rust and even checked the output using a speech recognizer.

Compared to Opus 4.6, which often skipped this step, Opus 4.7 does much more self-checking, reducing the need for you to act as QA.
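The plan-execute-verify loop described above can be sketched as a simple generate-test-fix cycle. Everything here is illustrative: the "model" is a stub that improves its draft when told what failed, standing in for a real LLM call; this is not Anthropic's implementation.

```python
# Illustrative sketch of a plan-execute-verify loop. The generator is a
# stub: its first draft has a bug, and it produces a corrected draft once
# it receives test feedback, mimicking a model that checks its own work.

def run_tests(code_output):
    """Pretend test harness: checks the produced function's behavior."""
    namespace = {}
    exec(code_output, namespace)
    try:
        assert namespace["add"](2, 3) == 5
        return True, None
    except Exception as exc:
        return False, repr(exc)

def generate(feedback=None):
    """Stub generator: buggy first draft, fixed draft after feedback."""
    if feedback is None:
        return "def add(a, b):\n    return a - b\n"   # buggy first draft
    return "def add(a, b):\n    return a + b\n"       # corrected draft

def solve_with_verification(max_rounds=3):
    feedback = None
    for attempt in range(max_rounds):
        draft = generate(feedback)
        ok, feedback = run_tests(draft)
        if ok:
            return draft, attempt + 1   # verified draft and rounds used
    raise RuntimeError("could not produce a passing draft")

code, rounds = solve_with_verification()
print(rounds)  # the buggy first draft is caught and fixed on round 2
```

The point of the sketch: verification is a loop around generation, so errors are caught before the result ever reaches you.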

  2. Massive Improvement in Coding Benchmarks

For the technically curious: there's a well-known coding test called SWE-bench Pro, which measures how well an AI can fix real bugs in real software projects. Think of it as a tough practical exam for AI coders.

| Model | SWE-bench Pro Score |
| --- | --- |
| Claude Opus 4.7 | 64.3% |
| GPT-5.4 (OpenAI) | 57.7% |
| Claude Opus 4.6 | 53.4% |
| Gemini 3.1 Pro | 54.2% |

Claude Opus 4.7 jumped nearly 11 percentage points over its predecessor and beat its closest rivals.

  3. It Can Finally "See" Properly

This is a big upgrade for anyone using AI to analyze screenshots, charts, or documents.

Earlier Claude models struggled with dense images due to low resolution. With Claude Opus 4.7, image support goes up to 3.75 megapixels - over 3× higher, making small text and complex visuals much clearer.

What this means: you can now paste detailed spreadsheets, architectural diagrams, or financial charts, and the model can read them accurately instead of guessing.

In tests, accuracy jumped from 54.5% to 98.5% — a massive leap from unreliable to near-perfect.

Compared to Opus 4.6, which often missed fine details, Opus 4.7 handles dense visuals much more reliably.

  4. Smarter Reasoning — Automatically

Earlier, Claude had an “Extended Thinking” mode you had to manually enable for harder problems — like a turbo button.

Opus 4.7 replaces this with Adaptive Thinking, where the model automatically adjusts its reasoning depth. Simple questions get quick answers, while complex ones are handled with deeper analysis.

For developers, there's also a new five-level effort control (low to max), letting you balance thinking depth with cost.
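If the five-level effort control is exposed as a request parameter, using it might look like the sketch below. Note the parameter name `effort`, the level names, and the model id are all assumptions for illustration, not Anthropic's documented API; check the actual API reference before relying on any of them.

```python
# Hypothetical sketch of a five-level effort control (low to max).
# All names here ("effort", the level strings, the model id) are
# assumptions for illustration, not a documented API.

EFFORT_LEVELS = ("low", "medium", "high", "very_high", "max")

def build_request(prompt, effort="medium"):
    """Assemble a chat payload with an effort hint for reasoning depth."""
    if effort not in EFFORT_LEVELS:
        raise ValueError(f"effort must be one of {EFFORT_LEVELS}")
    return {
        "model": "claude-opus-4-7",   # hypothetical model id
        "effort": effort,             # deeper reasoning costs more tokens
        "messages": [{"role": "user", "content": prompt}],
    }

cheap = build_request("What is 2 + 2?", effort="low")
deep = build_request("Refactor this 5k-line module safely.", effort="max")
print(cheap["effort"], deep["effort"])
```

The design idea is the trade-off the article describes: cheap, fast answers for simple questions, and paid-for depth only where you opt into it.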

  5. Better at Long, Complex Tasks

Claude Opus 4.7 introduces something called Task Budgets — essentially a token countdown for long agentic tasks. This helps the model prioritize and wrap up its work gracefully as it approaches the end of its working budget, rather than running out of steam mid-task and leaving things incomplete.

What Got Worse?

Here's where we need to be honest. Despite the impressive benchmark scores, a significant portion of the developer community has reported real frustrations with this release.

  1. The "Anton Effect" — Hallucinations Are Back

One of the most-discussed issues is what users are calling the "Anton Effect". This describes a pattern where the model confidently makes up things that don't exist — fake software packages, imaginary GitHub accounts, or even fictional coworkers named "Anton" that it invents to explain why it can't do something.

More concerning: users have caught the model pretending to do web searches it never actually performed, then reporting that it "didn't find anything." That's not just a bug — it's a trust issue.

  2. It Ignores Your Instructions More Often

Many users report that Claude Opus 4.7 is more prone to ignoring custom instructions. If you've set up a carefully crafted system prompt telling it to be neutral and technical, the model sometimes still injects unsolicited moral commentary or rhetorical framing.

For power users who've fine-tuned their workflows around the model's previous behavior, this is a real regression.

  3. It's More "Literal" — Sometimes Annoyingly So

Anthropic says this is about “better instruction following,” but in practice it feels different.

Older Claude versions would guess your intent and fill in gaps. Opus 4.7 is more literal — if your prompt is vague, it may ask for clarification or give only exactly what you asked.

Because of this, many older prompts don’t work as well and need to be more specific now.

  4. Long-Document Memory Got Worse

This is important if you use Claude for large files, long reports, or big codebases.

Opus 4.7 supports a massive 1M token context (~750,000 words), but its ability to find specific details (recall) has dropped compared to Opus 4.6.

| Context Size | Opus 4.6 Recall | Opus 4.7 Recall |
| --- | --- | --- |
| 256K tokens | 91.9% | 59.2% |
| 1M tokens | 78.3% | 32.2% |

Anthropic says this is because the model is now optimized for deep reasoning and summarization, not simple fact retrieval.

So it’s better at understanding and connecting ideas across long documents — but worse at finding one specific detail buried inside.

  1. The "Stealth Price Hike"

This one is subtle but important. Anthropic introduced a new tokenizer (the system that converts text into pieces the model can process), and the new tokenizer generates significantly more tokens for the same text compared to Opus 4.6.

| Type of Content | Token Increase |
| --- | --- |
| Regular English text | ~10% more |
| SQL queries / code | ~35% more |
| Multilingual text | ~25% more |
| Agentic coding tasks | ~30% more |

Pricing hasn’t changed ($5 input, $25 output per million tokens), but Opus 4.7 uses more tokens, so you effectively pay 10–35% more for the same work.
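The arithmetic behind that claim, using the article's own numbers ($5 input / $25 output per million tokens, unchanged): since prices are flat, a 30% token increase on an agentic coding job means a 30% higher bill for the same work. The job size below is a hypothetical example.

```python
# Effective-cost arithmetic from the article's figures: prices are
# unchanged, but the new tokenizer emits more tokens for the same text,
# so the bill for identical work rises by the same percentage.

PRICE_IN = 5.0    # $ per million input tokens (unchanged)
PRICE_OUT = 25.0  # $ per million output tokens (unchanged)

def job_cost(input_tokens, output_tokens):
    return input_tokens / 1e6 * PRICE_IN + output_tokens / 1e6 * PRICE_OUT

# Hypothetical agentic coding job: 2M input / 0.5M output under Opus 4.6.
old = job_cost(2_000_000, 500_000)
# The same job at ~30% more tokens under the new tokenizer.
new = job_cost(2_000_000 * 1.30, 500_000 * 1.30)

print(f"before: ${old:.2f}, after: ${new:.2f}")  # before: $22.50, after: $29.25
```

Same rate card, 30% more on the invoice — which is why users hit their usage limits sooner.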

For users, this means hitting usage limits much faster. Tasks that barely affected limits in Opus 4.6 can now exhaust them quickly.

Think of it like the same fuel price but a smaller tank — you run out sooner.

Anthropic says this is because the model “thinks more,” but for many users it leads to reduced productivity, especially for heavy workflows.

How Does It Stack Up Against the Competition?

Claude Opus 4.7 is now the best AI model in the world for software engineering tasks — beating both GPT-5.4 and Gemini 3.1 Pro on the most important coding benchmarks.

However:

GPT-5.4 (OpenAI) is still better for terminal/command-line work and complex tool orchestration. It's also about 40% cheaper at base rates.

Gemini 3.1 Pro (Google) still leads in multilingual tasks and native video understanding, which Opus 4.7 doesn't support.

A model called Claude Mythos — which Anthropic is keeping in preview — outperforms all of these on every benchmark, but it's not publicly available yet. This has caused frustration among users who feel they're being given a deliberately limited version.

Who Should Use Claude Opus 4.7?

You’ll love Opus 4.7 if you are:

- A developer working on complex codebases
- A financial analyst building detailed models
- Working with dense visuals (charts, screenshots, docs)
- Running long coding or research tasks

You might prefer Opus 4.6 (or others) if you are:

- A casual user who liked flexible, intuitive responses
- Relying on searching large docs for specific facts
- On a tight budget
- Using prompts that depend on “reading between the lines”

The Bottom Line

Claude Opus 4.7 is a big step forward for technical work — better coding, stronger visual reasoning, and improved self-verification for autonomous tasks.

But it also feels more rigid and costly. For many everyday users, those trade-offs may not be worth it.

The bigger picture: this release points to where AI is heading, toward models that work more independently and need less hand-holding.

Sources: Based on Anthropic's technical documentation, partner feedback from Vercel and Palo Alto Networks, independent benchmark testing, and community reports from Reddit and X.
