Moonshot AI Releases Kimi K2.6

22-04-2026

Moonshot AI releases Kimi K2.6, a 1T parameter open-weight model scoring 58.6% on SWE-Bench Pro and 54.0 on HLE with tools.

Written by:

Jorick van Weelie

Marketing Lead at DataNorth | AI Enthusiast & Tech Storyteller

Published: April 22, 2026

Moonshot AI released Kimi K2.6 on April 21, 2026, an open-weight large language model with 1 trillion parameters that outperforms both GPT-5.4 and Claude Opus 4.6 on several major coding and agentic benchmarks. Kimi K2.6 uses a Mixture-of-Experts architecture with 32 billion active parameters per token and a 256K token context window. The model is available on Hugging Face, the Moonshot API, and Ollama, with input pricing at $0.60 per million tokens.

What is Kimi K2.6 and what makes it different?

Kimi K2.6 is Moonshot AI’s latest flagship model and the successor to Kimi K2.5. It is built on a Mixture-of-Experts (MoE) architecture containing 1 trillion total parameters, of which 32 billion are activated per forward pass. The model routes each token through 8 of its 384 available experts, plus one shared expert that is always active. This design allows Kimi K2.6 to deliver frontier-level performance at a fraction of the compute cost of dense models of comparable quality.

Kimi K2.6 natively supports text, image, and video input through its integrated MoonViT vision encoder, a 400-million-parameter module that processes visual data alongside language. The model uses Multi-head Latent Attention (MLA) as its attention mechanism, SwiGLU as its activation function, and operates with a vocabulary of 160,000 tokens. Its 256K token context window places it among the longer-context open-weight models currently available.

Kimi K2.6 benchmarks and how it compares to GPT-5.4 and Claude Opus 4.6

Moonshot AI reports that Kimi K2.6 leads on five of eight major benchmarks when compared to GPT-5.4, Claude Opus 4.6, and Gemini 3.1 Pro. The most notable results are in agentic and coding tasks.
On Humanity’s Last Exam (HLE) with tools, Kimi K2.6 scores 54.0, ahead of Claude Opus 4.6 at 53.0 and GPT-5.4 at 52.1.
On SWE-Bench Pro, which measures real-world GitHub issue resolution, Kimi K2.6 scores 58.6% compared to GPT-5.4 at 57.7% and Claude Opus 4.6 at 53.4%. On SWE-Bench Verified, it reaches 80.2%.

On pure reasoning benchmarks, GPT-5.4 retains the lead. It scores 99.2% on AIME 2026 versus Kimi K2.6’s 96.4%, and 92.8% on GPQA-Diamond versus 90.5% for Kimi K2.6. In multimodal understanding (MMMU-Pro), Gemini 3.1 Pro leads at 83.0%, followed by GPT-5.4 at 81.2% and Kimi K2.6 at 79.4%. These results position Kimi K2.6 as the strongest open-weight model available, competitive with the top closed-source systems across most categories.

Kimi K2.6 Agent Swarm: scaling to 300 sub-agents

One of the most distinctive features of Kimi K2.6 is its Agent Swarm system, which allows the model to distribute complex tasks across up to 300 independent sub-agents executing up to 4,000 coordinated steps simultaneously. This is a significant upgrade from Kimi K2.5, which was limited to 100 sub-agents and 1,500 steps. Each sub-agent runs its own tool-call chain and can handle a different skill: one might analyze a flame graph, another rewrites a hot path, and a third runs benchmarks and reports results.

In practice, this means Kimi K2.6 can operate continuously for over twelve hours on long-horizon coding tasks in languages including Rust, Go, and Python. Moonshot AI offers four deployment variants: Instant (low-latency responses), Thinking (extended reasoning), Agent (single-agent tool use), and Agent Swarm (multi-agent orchestration). All four are available through Kimi Chat and the Moonshot API.

Kimi K2.6 availability, pricing, and how to access it

Kimi K2.6 is available immediately through multiple channels. The open weights are published on Hugging Face at moonshotai/Kimi-K2.6. Developers can access the model through the Moonshot API at api.moonshot.ai, which is OpenAI-compatible, as well as through Kimi.com, the Kimi App, the Kimi Code CLI, and Ollama for local deployment. The model is also available on Cloudflare Workers AI.

Pricing on the official Moonshot API is $0.60 per million input tokens and $4.00 per million output tokens, with cached tokens at $0.16 per million. For comparison, Claude Opus 4.7 costs $5.00 per million input tokens, making Kimi K2.6 roughly 88% cheaper for input. The model is also available through third-party providers on OpenRouter.

Kimi K2.6 represents a notable step forward for open-weight AI models, narrowing the gap with closed-source systems on coding and agentic tasks.

The official announcement and technical details are available on the Kimi K2.6 blog post and the model weights can be downloaded from Hugging Face.