OpenAI unveils Jalapeño: AI inference chip

25-06-2026

Jalapeño marks OpenAI's expansion from models and products into custom silicon, giving the company a purpose-built inference chip co-developed with Broadcom and targeted for deployment by the end of 2026.

Written by: Jorick van Weelie

Marketing Lead at DataNorth | AI Enthusiast & Tech Storyteller

Published: 25 June 2026

OpenAI and Broadcom have unveiled Jalapeño, OpenAI’s first custom AI chip and an inference processor designed in-house. Announced on 24 June 2026, Jalapeño is an ASIC purpose-built for large language model inference rather than training, and OpenAI says it went from initial design to manufacturing tape-out in nine months. The companies are aiming for initial deployment by the end of 2026, as the first step in a multi-generation compute platform that OpenAI is building with Broadcom and Celestica.

What is OpenAI’s Jalapeño chip?

Jalapeño is what OpenAI calls its first Intelligence Processor, a custom accelerator designed from scratch around large language model inference. Unlike a general-purpose GPU or a training accelerator adapted for inference, Jalapeño is a blank-slate ASIC tuned for the serving patterns OpenAI runs every day across ChatGPT, Codex, and its API. OpenAI designed the chip architecture, kernels, memory systems, and networking itself, while Broadcom handled silicon implementation and networking and Celestica contributed board, rack, and system integration.

The design goal is to combine the throughput of today’s leading AI accelerators with latency closer to that of specialised inference systems, which would make the chip well suited to interactive LLM products at scale. OpenAI says engineering samples are already running machine learning workloads, including its GPT-5.3-Codex-Spark model, in the lab at production target frequency and power.

Jalapeño specifications, performance, and the nine-month tape-out

Jalapeño is a large, reticle-sized ASIC. OpenAI and Broadcom say it went from initial design to manufacturing tape-out in nine months, which they describe as the fastest ASIC development cycle they are aware of in high-performance advanced semiconductors. The architecture is built to reduce data movement and to balance compute, memory, and networking resources so that realised utilisation sits closer to theoretical peak performance, and it uses Broadcom’s Tomahawk networking silicon to scale across many chips.

OpenAI is still measuring final performance numbers and has promised a detailed technical report in the coming months. Its headline claim is that the first-generation accelerator will deliver performance per watt substantially better than current state-of-the-art hardware. Because Jalapeño is an ASIC rather than a flexible GPU, it can be tuned for a narrow set of tasks and is expected to be cheaper to produce. Some outlets reported that the design could cut inference costs by roughly 50 percent compared with conventional GPUs, although OpenAI has not published a specific cost figure.
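To make the two metrics in this section concrete, the short Python sketch below shows how performance per watt and cost per million tokens can be computed and compared. Every figure in it is a hypothetical placeholder chosen to reproduce the roughly 50 percent reduction that some outlets reported; none of the numbers come from OpenAI or Broadcom, and the `Accelerator` class and its fields are illustrative names, not part of any published specification.

```python
from dataclasses import dataclass


@dataclass
class Accelerator:
    """Hypothetical accelerator profile; every number used below is a placeholder."""
    name: str
    tokens_per_second: float  # sustained serving throughput
    power_watts: float        # average power draw while serving
    hourly_cost_usd: float    # assumed amortised hardware plus energy cost per hour


def tokens_per_joule(chip: Accelerator) -> float:
    # Performance per watt for inference: (tokens/s) / (joules/s) = tokens per joule.
    return chip.tokens_per_second / chip.power_watts


def cost_per_million_tokens(chip: Accelerator) -> float:
    tokens_per_hour = chip.tokens_per_second * 3600
    return chip.hourly_cost_usd / tokens_per_hour * 1_000_000


# Illustrative values only, not measurements of any real chip.
gpu = Accelerator("baseline GPU (hypothetical)", 10_000, 1_000, 4.00)
asic = Accelerator("inference ASIC (hypothetical)", 10_000, 500, 2.00)

for chip in (gpu, asic):
    print(f"{chip.name}: {tokens_per_joule(chip):.1f} tokens/J, "
          f"${cost_per_million_tokens(chip):.3f} per million tokens")

saving = 1 - cost_per_million_tokens(asic) / cost_per_million_tokens(gpu)
print(f"Relative cost reduction: {saving:.0%}")
```

In this sketch the hypothetical ASIC matches the baseline's throughput at half the power and half the hourly cost, so its tokens per joule double and its cost per million tokens halves. Real comparisons will depend on the measured figures OpenAI has promised in its technical report.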

How OpenAI used its own models to design Jalapeño

One detail OpenAI highlights is that its own models helped accelerate parts of the chip’s design and optimisation. The company frames this as a flywheel: better infrastructure improves compute efficiency, which enables better training and serving, which produces more capable models, which in turn help design the next generation of hardware. Richard Ho, who leads OpenAI’s hardware program, said the architecture was optimised around the kernels, memory movement, networking, and serving patterns that matter most for frontier models, and that early testing suggests Jalapeño will run OpenAI’s most important workloads close to the hardware’s theoretical limits.

How does Jalapeño compare to Nvidia GPUs?

Jalapeño is widely read as OpenAI’s move to reduce its reliance on Nvidia, whose GPUs currently dominate AI training and inference. An ASIC like Jalapeño is less flexible than a general-purpose Nvidia GPU, but it is typically less expensive and can be designed for a specific job, in this case LLM inference. The approach mirrors the custom-silicon strategies of other large AI operators such as Google, which builds its own TPUs, and Amazon, which builds Trainium and Inferentia chips. For OpenAI, owning the chip layer means it can optimise the full stack, from silicon and kernels up to the ChatGPT and Codex products that run on top.

Jalapeño availability, deployment, and roadmap

Jalapeño is being built for OpenAI’s own infrastructure rather than sold as a standalone product. The companies are aiming for initial deployment by the end of 2026. Broadcom President and CEO Hock Tan said the collaboration will enable gigawatt-scale data centers with Microsoft and other partners, beginning in 2026 and expanding over multiple generations in the years ahead. Greg Brockman, OpenAI’s President and co-founder, described Jalapeño as part of a long-term, full-stack infrastructure strategy intended to make compute more abundant and AI more affordable for people and businesses.

Full details and quotes are available in OpenAI’s official announcement on Jalapeño.