Published: April 15, 2026
OpenRouter has published Elephant Alpha, a 100 billion parameter text model from an undisclosed lab, available through its API since April 13. The model is pitched on efficiency: strong reasoning at low token usage across a 256K token context window, offered free during the alpha period.
What Elephant Alpha does
Elephant Alpha is a text-only large language model positioned as a daily driver for code completion, document processing and agent workflows. According to the listing on OpenRouter and an introductory post on the Kilo Code blog, the model focuses on producing accurate answers using fewer tokens than comparable 100B class systems, which lowers latency and cost when the model is wired into developer tools or long-running agents.
The model is described as a stealth release from a prominent open model lab. The organisation behind the weights has not been disclosed, which mirrors a pattern seen in several recent stealth launches on OpenRouter, where labs gather community feedback before announcing the model under their own brand.
Key technical specifications
Elephant Alpha has roughly 100 billion parameters; whether the architecture is dense or mixture-of-experts has not been published. The model exposes a 262,144 token context window and supports up to 32,768 output tokens per request. It accepts text input and returns text output, with no native image, audio or video support.
The model supports function calling, structured output and prompt caching, which positions it for use in tool-using agents and structured data pipelines. Pricing through OpenRouter is set at zero dollars per million input and output tokens during the alpha. Prompts and completions may be logged by the provider and used to improve the model, which is relevant for teams handling sensitive data.
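OpenRouter exposes an OpenAI-compatible chat completions API, so a function-calling request to Elephant Alpha can be sketched as a standard tool-use payload. This is a minimal sketch, not an official example: the `get_weather` tool is hypothetical, and actually sending the request requires an OpenRouter API key.

```python
import json

# Sketch of an OpenAI-style chat completions payload for Elephant Alpha
# via OpenRouter. The get_weather tool is a hypothetical illustration.
payload = {
    "model": "openrouter/elephant-alpha",
    "max_tokens": 32768,  # per-request output cap listed for the model
    "messages": [
        {"role": "user", "content": "What is the weather in Berlin?"}
    ],
    "tools": [
        {
            "type": "function",
            "function": {
                "name": "get_weather",  # hypothetical tool
                "description": "Look up current weather for a city.",
                "parameters": {
                    "type": "object",
                    "properties": {"city": {"type": "string"}},
                    "required": ["city"],
                },
            },
        }
    ],
}

print(json.dumps(payload, indent=2))
```

If the model decides to call the tool, the response carries a `tool_calls` entry whose arguments the client executes before sending the result back in a follow-up message.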
How it compares to recent releases
At 100B parameters, Elephant Alpha sits between mid-sized open models such as Llama 4 Scout and larger frontier systems like MiniMax M2.7 and Claude Mythos. The headline differentiator is the 256K context window combined with the explicit efficiency framing: the model trades peak benchmark scores for shorter, cheaper responses on routine work.
Independent benchmark scores were not published with the release. Early community reports point to competitive results on coding and document processing tasks, but no official model card, training details or evaluation table has been shared. That makes direct comparison with Gemma 4, Llama 4 and the recent Qwen and DeepSeek models difficult at this stage.
Availability and use cases
The model is available immediately through OpenRouter at the model ID openrouter/elephant-alpha, and is also accessible inside the Kilo Code editor where it has been wired in as a free option. Typical workloads identified by the provider include rapid code completion and debugging, processing long documents in a single pass, and lightweight agent loops where token budget matters.
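A completion request against the model ID above can be sketched with nothing but the standard library. This assumes OpenRouter's documented chat completions endpoint and an `OPENROUTER_API_KEY` environment variable; the script only builds the request and leaves the network call commented out.

```python
import json
import os
import urllib.request

# Minimal sketch of a request to Elephant Alpha through OpenRouter's
# chat completions endpoint. An API key is required to actually send it.
API_URL = "https://openrouter.ai/api/v1/chat/completions"

body = json.dumps({
    "model": "openrouter/elephant-alpha",
    "messages": [{"role": "user", "content": "Explain prompt caching in one paragraph."}],
}).encode("utf-8")

req = urllib.request.Request(
    API_URL,
    data=body,
    headers={
        "Authorization": f"Bearer {os.environ.get('OPENROUTER_API_KEY', '')}",
        "Content-Type": "application/json",
    },
    method="POST",
)

# Uncomment once a key is configured:
# with urllib.request.urlopen(req) as resp:
#     print(json.load(resp)["choices"][0]["message"]["content"])
print(req.full_url, req.get_method())
```

Because alpha pricing is zero, the same request shape also works as a drop-in baseline when benchmarking against paid models.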
Because Elephant Alpha is a stealth release with free pricing and provider-side logging enabled, teams building production systems should treat it as an evaluation channel rather than a stable dependency. The model is well suited to prototypes, internal tools and benchmarking against current proprietary baselines.
Full specifications are available on the model page at https://openrouter.ai/openrouter/elephant-alpha and in the Kilo Code introduction post at https://blog.kilo.ai/p/introducing-elephant-a-new-stealth.