April 17, 2026
Anthropic released Claude Opus 4.7 on April 16, 2026, making it the company’s most capable generally available model. Opus 4.7 scores 87.6% on SWE-bench Verified (up from 80.8% on Opus 4.6), processes images at up to 3.75 megapixels (more than three times the limit of previous Claude models), and introduces a new xhigh effort level for fine-grained reasoning control. The model is available now across the Claude API, Amazon Bedrock, Google Cloud Vertex AI, and Microsoft Foundry at unchanged pricing: $5 per million input tokens and $25 per million output tokens.
What is Claude Opus 4.7?
Claude Opus 4.7 is the latest iteration of Anthropic’s flagship model line, positioned as a meaningful upgrade over Opus 4.6 across coding, instruction following, and long-context reasoning. The model retains the 1 million token context window and follows the same API identifier format (claude-opus-4-7), making it a near drop-in replacement for teams already on Opus 4.6: only the model string changes.
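A minimal sketch of the switch, assuming the standard anthropic Python SDK (the prior model string "claude-opus-4-6" is inferred from the shared naming scheme):

```python
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

# The only change from an existing Opus 4.6 integration is the model string.
response = client.messages.create(
    model="claude-opus-4-7",  # was "claude-opus-4-6" under the same naming scheme
    max_tokens=1024,
    messages=[{"role": "user", "content": "Review this function for bugs."}],
)
print(response.content[0].text)
```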
Anthropic describes Opus 4.7 as a model that “handles complex, long-running tasks with rigor and consistency, pays precise attention to instructions, and devises ways to verify its own outputs before reporting back.” The release comes alongside two new features: a public beta of task budgets (which let developers set a token target for entire agentic loops) and an xhigh effort level that sits between the existing high and max settings.
Claude Opus 4.7 benchmarks and technical specs
On SWE-bench Verified, Claude Opus 4.7 scores 87.6%, a nearly 7-point improvement over Opus 4.6’s 80.8% and ahead of Gemini 3.1 Pro (80.6%). SWE-bench Pro, which tests harder multi-language engineering tasks, jumps from 53.4% to 64.3%. On GPQA Diamond, the model scores 94.2%, up from 91.3% on Opus 4.6, though GPT-5.4 (94.4%) and Gemini 3.1 Pro (94.3%) hold a slight edge on this particular benchmark.
In computer use tasks measured by OSWorld, Opus 4.7 reaches 78.0% (up from 72.7%). Visual reasoning on CharXiv improves from 69.1% to 82.1%, reflecting the model’s expanded vision capabilities. The model also scores 77.3% on MCP-Atlas, making it the best-in-class option for multi-tool orchestration workflows. Financial analysis performance rises from 60.1% to 64.4%, and Anthropic reports 21% fewer errors on OfficeQA Pro compared to Opus 4.6.
Not every metric improved. On Terminal-Bench 2.0, Opus 4.7 scores 69.4%, trailing GPT-5.4’s 75.1%. On BrowseComp, its 79.3% sits well behind GPT-5.4’s 89.3%. One trade-off to note: Opus 4.7 uses a new tokenizer that produces 1.0 to 1.35x more tokens for equivalent input, so effective costs may rise for some workloads even though per-token pricing is unchanged.
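To size that tokenizer effect, here is a back-of-the-envelope check using the published per-token rates and the stated 1.0–1.35x expansion range; the monthly volumes are illustrative placeholders, not Anthropic figures:

```python
# Rough cost check for the new tokenizer's 1.0-1.35x expansion. Prices are
# from the announcement; the monthly volumes below are made-up examples.
# The announcement quantifies expansion for input, so output is held fixed.
INPUT_PRICE_PER_TOKEN = 5.00 / 1_000_000    # $5 per million input tokens
OUTPUT_PRICE_PER_TOKEN = 25.00 / 1_000_000  # $25 per million output tokens

def monthly_cost(input_tokens: int, output_tokens: int, expansion: float) -> float:
    """Estimated monthly spend if input token counts grow by `expansion`."""
    return (input_tokens * expansion * INPUT_PRICE_PER_TOKEN
            + output_tokens * OUTPUT_PRICE_PER_TOKEN)

base_input, base_output = 500_000_000, 100_000_000  # tokens/month, hypothetical
for expansion in (1.0, 1.35):
    print(f"{expansion:.2f}x input expansion: "
          f"${monthly_cost(base_input, base_output, expansion):,.2f}/month")
# 1.00x -> $5,000.00/month; 1.35x -> $5,875.00/month (a 17.5% increase)
```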
How does Claude Opus 4.7 compare to GPT-5.4 and Gemini 3.1 Pro?
Claude Opus 4.7 leads in agentic coding tasks. Its 64.3% on SWE-bench Pro is ahead of GPT-5.4 (57.7%) and Gemini 3.1 Pro (54.2%). On multi-tool orchestration (MCP-Atlas at 77.3%), Opus 4.7 is currently the top-performing model. The 87.6% SWE-bench Verified score also puts it above both competitors in standard software engineering benchmarks.
GPT-5.4 retains advantages in terminal-based tasks (75.1% vs 69.4% on Terminal-Bench 2.0) and web browsing comprehension (89.3% vs 79.3% on BrowseComp). On GPQA Diamond, all three models cluster closely: GPT-5.4 at 94.4%, Gemini 3.1 Pro at 94.3%, and Opus 4.7 at 94.2%. The practical difference for most knowledge work is negligible at that level.
Notably, Anthropic’s own unreleased Mythos Preview model scores 77.8% on SWE-bench Pro, well above Opus 4.7’s 64.3%. Anthropic has stated that Mythos will not be publicly released due to cybersecurity concerns: the model demonstrated the ability to find zero-day vulnerabilities at scale.
Claude Opus 4.7 vision and new features
Opus 4.7 is the first Claude model with high-resolution image support. Maximum image resolution increases from 1,568 pixels on the long edge (roughly 1.15 megapixels) to 2,576 pixels (roughly 3.75 megapixels). Anthropic reports that one early-access partner testing computer vision for autonomous penetration testing saw visual acuity jump from 54.5% on Opus 4.6 to 98.5% on Opus 4.7.
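Teams that preprocess images client-side may want to downscale to the new ceiling before upload. A minimal sketch with Pillow, treating the announced 2,576-pixel long edge as the binding limit:

```python
from PIL import Image

MAX_LONG_EDGE = 2576  # Opus 4.7's announced long-edge maximum (was 1,568)

def fit_to_limit(path: str) -> Image.Image:
    """Downscale so the long edge is at most MAX_LONG_EDGE, preserving
    aspect ratio; images already within the limit pass through unchanged."""
    img = Image.open(path)
    if max(img.size) > MAX_LONG_EDGE:
        scale = MAX_LONG_EDGE / max(img.size)
        img = img.resize(
            (round(img.width * scale), round(img.height * scale)),
            Image.LANCZOS,
        )
    return img
```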
The new xhigh effort level gives developers more granular control over reasoning depth and latency. Anthropic recommends starting with xhigh for coding and agentic use cases, and using at least high for most intelligence-sensitive tasks. Task budgets, now in public beta, allow developers to set a rough token target for an entire agentic loop. The model sees a running countdown and uses it to prioritize work and wrap up gracefully as the budget runs out.
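Anthropic’s announcement names both features but not their request fields, so the sketch below passes hypothetical field names (effort, task_budget_tokens) through the SDK’s extra_body passthrough for undocumented parameters; consult the API reference for the real names and shapes:

```python
import anthropic

client = anthropic.Anthropic()

# HYPOTHETICAL field names: "effort" and "task_budget_tokens" are stand-ins,
# not confirmed API fields. extra_body forwards extra keys in the request JSON.
response = client.messages.create(
    model="claude-opus-4-7",
    max_tokens=8192,
    messages=[{"role": "user", "content": "Migrate the test suite to pytest."}],
    extra_body={
        "effort": "xhigh",              # per the announcement, sits between high and max
        "task_budget_tokens": 200_000,  # rough target for the entire agentic loop
    },
)
```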
Claude Opus 4.7 availability and pricing
Claude Opus 4.7 is available now across all Claude products and the Claude API, Amazon Bedrock, Google Cloud Vertex AI, and Microsoft Foundry. The API model identifier is claude-opus-4-7. Pricing remains at $5 per million input tokens and $25 per million output tokens, with no long-context premium on the 1 million token context window.
Anthropic has also released an /ultrareview slash command in Claude Code for dedicated code review sessions, and extended Auto mode to Max plan users. Opus 4.7 ships with built-in cybersecurity safeguards that automatically detect and block requests indicating prohibited or high-risk cybersecurity uses; security professionals who need access to the model’s full capabilities can apply for the Cyber Verification Program.
For full details, see Anthropic’s official announcement at anthropic.com/news/claude-opus-4-7.