MiniMax releases M2.7: a self-evolving model that automates part of its own training

20-03-2026

MiniMax has launched M2.7, a model featuring a native self-evolution mechanism that allows it to analyze failure trajectories and plan changes to its own scaffolding code. In internal testing, this autonomous iterative loop improved performance on MiniMax's benchmarks by approximately 30 percent.

Written by: Jorick van Weelie


MiniMax has released M2.7, a large language model with a distinct self-evolution mechanism that allows it to participate in its own reinforcement learning and iterative optimization. The company states that the model can handle 30 to 50 percent of the workload in certain research and development scenarios autonomously. M2.7 is available immediately through the MiniMax Agent platform and the MiniMax API.

What M2.7 does

M2.7 is built for complex agentic workflows. It natively supports Agent Teams, a multi-agent architecture in which multiple model instances collaborate on a task, and includes a dynamic tool search capability that allows it to identify and integrate relevant tools during task execution. The model targets three primary domains: software engineering, professional office work, and multi-agent coordination.
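The dynamic tool search described above can be pictured as a retrieval step that runs during task execution: the model scores candidate tools against the current task and pulls in the best match. The sketch below uses a naive keyword-overlap heuristic; the tool names and scoring function are illustrative assumptions, not MiniMax's actual mechanism.

```python
# Hypothetical sketch of a dynamic tool-search step: rank candidate tools
# against a task description by keyword overlap and pick the best match.
# The tool registry and the scoring heuristic are illustrative only.

def search_tools(task: str, tools: dict[str, str]) -> str:
    """Return the name of the tool whose description best overlaps the task."""
    task_words = set(task.lower().split())

    def score(desc: str) -> int:
        return len(task_words & set(desc.lower().split()))

    return max(tools, key=lambda name: score(tools[name]))

tools = {
    "code_runner": "execute python code in a sandbox",
    "web_search": "search the web for documents and pages",
    "file_editor": "open and edit files on disk",
}
print(search_tools("search the web for recent papers", tools))  # prints "web_search"
```

A production system would likely use embedding similarity rather than word overlap, but the selection loop has the same shape: score, rank, integrate.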

Its self-evolution capability is the most distinctive feature. M2.7 can run iterative loops in which it analyzes failure trajectories, plans changes to its own scaffolding code, executes those changes, and reruns evaluations. In internal testing, MiniMax reports that the model completed over 100 of these autonomous cycles, achieving performance improvements of roughly 30 percent on internal benchmarks.
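The cycle described above (analyze failures, patch the scaffold, re-evaluate, keep what improves) can be sketched as a hill-climbing loop. Everything here is a toy stand-in: `evaluate`, `propose_patch`, and the "retry budget" scaffold are hypothetical, since the real mechanism inside M2.7 is not publicly documented.

```python
# Minimal sketch of a self-evolution loop: evaluate the current scaffold,
# propose a patch based on the failing cases, and keep the patch only if
# the score improves. All names and the toy task set are assumptions.

def self_evolve(scaffold, evaluate, propose_patch, max_cycles=100):
    """Iteratively patch the scaffold, keeping only improving changes."""
    score, failures = evaluate(scaffold)
    for _ in range(max_cycles):
        if not failures:
            break  # nothing left to fix
        candidate = propose_patch(scaffold, failures)
        new_score, new_failures = evaluate(candidate)
        if new_score > score:  # accept only strict improvements
            scaffold, score, failures = candidate, new_score, new_failures
    return scaffold, score

# Toy demo: the "scaffold" is a retry budget; a task fails if its
# difficulty exceeds the budget.
tasks = [1, 2, 3, 5]

def evaluate(budget):
    failures = [t for t in tasks if t > budget]
    return len(tasks) - len(failures), failures

def propose_patch(budget, failures):
    return max(failures)  # naive fix: cover the hardest failing task

final, score = self_evolve(1, evaluate, propose_patch)
print(final, score)  # prints: 5 4
```

The reported 100-plus autonomous cycles map onto `max_cycles` here; the interesting engineering is presumably in `propose_patch`, where the model reasons over failure trajectories to write an actual code change.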

Technical performance

On the SWE-Pro software engineering benchmark, M2.7 scored 56.22 percent, matching GPT-5.3-Codex. On Terminal Bench 2, a benchmark for system comprehension in live coding environments, it scored 57.0 percent. For agentic tasks, M2.7 achieved 62.7 percent on MM Claw and 46.3 percent on Toolathon. On machine learning research tasks (MLE Bench Lite), it averaged a medal rate of 66.6 percent across three separate 24-hour trials, placing second after Claude Opus 4.6.

For office and document work, M2.7 earned an ELO score of 1495 on GDPval-AA, placing second among the 45 models tested in that evaluation. On the AA-Omniscience Index, a hallucination metric, M2.7 scored +1, a marked improvement over M2.5, which scored -40 on the same measure.

Comparison with M2.5 and competitors

The hallucination reduction from M2.5 to M2.7 is the clearest numerical improvement. The jump from -40 to +1 on the AA-Omniscience Index represents a meaningful shift in factual reliability, which matters for professional document generation and research assistant use cases.

Compared to frontier models, M2.7 sits slightly below Claude Opus 4.6 on most benchmarks, but above GPT-5.3-Codex in software engineering tasks. The self-evolution feature has no direct equivalent in currently available models from OpenAI, Anthropic, or Google, though those companies conduct their own automated optimization internally before releasing models. What makes M2.7 different is that the self-evolution loop is part of the deployed model’s runtime behavior, not just a training-time feature.

Use cases and availability

M2.7 is suited for teams running automated software development pipelines, research workflows that involve repeated evaluation and iteration, and document-heavy office tasks. The Agent Teams feature makes it viable for organizations building multi-agent systems where different instances handle distinct parts of a workflow.

The model is accessible at agent.minimax.io and through the MiniMax API at platform.minimax.io. Coding-specific plans are available separately. MiniMax has not announced specific pricing in the public release notes.
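For API access, a request would presumably follow the common chat-completions pattern. The endpoint path, payload schema, and model identifier below are assumptions made for illustration; consult MiniMax's official API documentation for the actual values.

```python
# Hypothetical sketch of calling the model over the MiniMax API. The URL,
# payload fields, and model name are assumed, not confirmed by MiniMax.
import json
import urllib.request

API_URL = "https://platform.minimax.io/v1/chat/completions"  # assumed path

def build_payload(prompt: str) -> dict:
    """Assemble an OpenAI-style chat payload (assumed schema)."""
    return {
        "model": "MiniMax-M2.7",  # assumed model identifier
        "messages": [{"role": "user", "content": prompt}],
    }

def ask_m27(prompt: str, api_key: str) -> str:
    """Send the prompt and return the first completion's text."""
    req = urllib.request.Request(
        API_URL,
        data=json.dumps(build_payload(prompt)).encode(),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]
```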

The full release announcement is available on the MiniMax news page.