Microsoft Launches MAI-Thinking-1 and MAI-Code-1-Flash

03-06-2026

MAI-Thinking-1 and MAI-Code-1-Flash together signal that Microsoft is building its own model stack across reasoning, code, image, voice, and transcription, giving developers an alternative to OpenAI within the Microsoft ecosystem.

Written by:

Jorick van Weelie

Marketing Lead at DataNorth | AI Enthusiast & Tech Storyteller

microsoft launches mai thinking 1 and mai code 1 flash

Published: June 3, 2026

Microsoft announced MAI-Thinking-1 and MAI-Code-1-Flash at its Build 2026 developer conference on June 2, marking the company’s first major in-house foundation models built entirely without OpenAI technology. MAI-Thinking-1 is a 35-billion active parameter reasoning model trained on commercially licensed data, while MAI-Code-1-Flash is a 5-billion parameter coding model now rolling out inside GitHub Copilot. Together, the two models represent Microsoft’s clearest step yet toward reducing its reliance on OpenAI for core AI capabilities.

What are MAI-Thinking-1 and MAI-Code-1-Flash?

MAI-Thinking-1 is a mid-sized sparse Mixture of Experts model with 35 billion active parameters and a 256,000-token context window. Microsoft trained the model from scratch on enterprise-grade, commercially licensed data, without distillation from any third-party model. The model is designed for complex multi-step instructions, long-context reasoning, and code generation, with built-in support for function calling and developer-defined instructions.

MAI-Code-1-Flash is a smaller, faster model at 5 billion parameters, purpose-built for coding tasks inside GitHub Copilot. It was trained directly on production Copilot harnesses and licensed data, which means the model learned to interact with the surrounding tools and systems used in real agentic coding workflows. Microsoft describes it as having adaptive thinking: it stays concise for simple requests and spends more reasoning budget on complex tasks.

MAI-Thinking-1 and MAI-Code-1-Flash benchmark results

MAI-Thinking-1 scores 97.0% on AIME 2025 and 94.5% on AIME 2026, both benchmarks that test mathematical and multi-step scientific reasoning.

On SWE-Bench Pro, Microsoft says the model matches Claude Opus 4.6 on coding tasks. In blind side-by-side evaluations conducted by Surge, Microsoft’s independent human rating partner, MAI-Thinking-1 was preferred over Claude Sonnet 4.6.

MAI-Code-1-Flash outperforms Claude Haiku 4.5 across all four core coding benchmarks tested, including a 16-point lead on SWE-Bench Pro (51.2% vs. 35.2%).

The model can solve harder coding tasks with up to 60% fewer tokens on SWE-Bench Verified.

Microsoft also claims that after fine-tuning the model for consulting firm McKinsey, it outperformed OpenAI’s GPT-5.5 with 10 times better cost efficiency.

How do Microsoft’s MAI models compare to OpenAI and Anthropic?

The MAI model family positions Microsoft as a direct competitor to its own partner OpenAI, as well as Anthropic and Google. MAI-Thinking-1 targets the same reasoning and coding niche occupied by Claude Opus 4.6, OpenAI’s o3, and Google’s Gemini 3.5 Flash. The key differentiator is cost: Microsoft built MAI-Thinking-1 as a medium-sized model focused on high efficiency at low token costs, rather than competing on raw parameter count.

MAI-Code-1-Flash occupies a different segment. At 5 billion parameters, it sits well below frontier models in size but is optimized specifically for the Copilot workflow. Its direct competitor is Claude Haiku 4.5, which currently powers many lightweight coding assistant tasks. The SWE-Bench Pro results suggest MAI-Code-1-Flash outperforms Haiku in this specific context, though real-world Copilot user feedback will be the true test.

MAI-Thinking-1 and MAI-Code-1-Flash availability and pricing

MAI-Thinking-1 is available in private preview through Microsoft Foundry. It supports the Chat Completions API and will become available through third-party inference providers including Fireworks AI, Baseten, and OpenRouter. Public pricing has not been finalized.

MAI-Code-1-Flash began rolling out to GitHub Copilot individual users in Visual Studio Code on June 2, with availability expanding across Copilot Free, Pro, Pro+, and Max plans. GitHub’s pricing page lists MAI-Code-1-Flash at $0.75 per million input tokens, $0.075 per million cached input tokens, and $4.50 per million output tokens, though the model card notes pricing is still being finalized.

The broader MAI model family at build 2026

MAI-Thinking-1 and MAI-Code-1-Flash are part of a broader family of seven MAI models announced at Build 2026. The full lineup includes MAI-Image-2.5, an updated image generation model that debuted at third place on the Arena.ai leaderboard for image generation; MAI-Image-2.5 Flash, a faster variant; MAI-Transcribe-1.5, which supports 43 languages and holds the top spot on the FLEURS speech benchmark; and MAI-Voice-2, covering voice cloning and prompting in more than 15 languages. These models already power features across Copilot, Bing, PowerPoint, and Azure Speech.

Microsoft’s full announcement of MAI-Thinking-1 is available on the Microsoft AI blog, and the MAI-Code-1-Flash introduction can be found on the Microsoft AI news page.