Home » OpenAI Jalapeño Explained: From Years of Rumors to the First Official OpenAI AI Chip

IT Infrastructure

OpenAI Jalapeño Explained: From Years of Rumors to the First Official OpenAI AI Chip

6 hours agoby 14 min read

OpenAI Jalapeño unveiled on June 24 2026 the first OpenAI-designed AI inference chip co-developed with Broadcom and manufactured by TSMC on the N3 3-nanometer process node featuring a reticle-class compute chiplet flanked by 6 HBM memory stacks and a separate I/O chiplet on a single package developed in a 9-month design-to-tape-out cycle that OpenAI says was accelerated by using its own AI models in the design process projected by Broadcom CEO Hock Tan to deliver approximately 50 percent lower inference cost per token versus current NVIDIA GPUs with Microsoft as the anchor customer deploying through Azure infrastructure as part of the previously-announced 10-gigawatt OpenAI custom accelerator commitment that scales from prototype deployments in late 2026 through full rollout by late 2029.

OpenAI has been rumored to be building its own AI chip for almost as long as ChatGPT has been a household name. Today, June 24, 2026, that effort finally has a real product name, real silicon, and a real launch announcement. The chip is called Jalapeño. It is co-designed with Broadcom and manufactured by TSMC on the N3 3-nanometer process. It is a large-language-model inference ASIC, the first piece of OpenAI-designed silicon ever announced, and the production form of a development program that has been the subject of speculation, partial confirmations, and trade-press reporting for the better part of two years.

This piece is a one-on-one explainer organized around three questions: What did we hear over the rumor cycle that led here? What is OpenAI officially confirming today? And what do we still not know? The goal is to give you the full picture in one read, separating verified information from the speculation that surrounded it as the project came together. We will update as more detail becomes public in the days and weeks ahead.

The short version is that Jalapeño matches most of what the rumors had been telling us for the past 18 months. It is real silicon. It targets inference, not training. It involves Broadcom as a co-design partner and TSMC for manufacturing. Microsoft is the anchor customer. The cost-per-token claim is genuinely meaningful (Broadcom says approximately 50 percent lower than current NVIDIA GPUs). And the development timeline OpenAI describes (a 9-month design cycle aided by OpenAI’s own models) is faster than the industry’s typical 18-to-24-month silicon program, which would be a notable result in its own right if independently verified.

What we’d been hearing

The rumor cycle around OpenAI’s chip ambitions has been running since at least 2023 and has produced a fairly consistent set of expectations that today’s announcement either confirms or sharpens. The major beats of that cycle:

Late 2023 to mid-2024: the rumor begins. Reports from outlets including Reuters, Bloomberg, and The Information began citing sources familiar with OpenAI’s strategic discussions about reducing dependence on NVIDIA. The reports were directionally consistent (OpenAI exploring custom silicon) but light on specifics. Sam Altman was reportedly involved in conversations with chip designers and foundry executives. The motivation, according to multiple reports, was the combination of the enormous and growing cost of NVIDIA GPUs at OpenAI’s scale and the supply-constrained availability of those GPUs through 2023 and 2024.

October 2024: the Broadcom co-design agreement. This was the first major partial-confirmation of the rumor cycle. OpenAI and Broadcom publicly announced a co-design partnership for custom AI accelerators. The agreement did not include a product name, did not include a launch timeline, and did not include any specifications. It did confirm that OpenAI was indeed pursuing custom silicon, that Broadcom would be the design partner, and that the program would target inference (rather than competing with NVIDIA on training). Sam Altman characterized the partnership at the time as part of OpenAI’s broader effort to build out the compute stack it would need for its next several years of growth.

Late 2024 to early 2025: TSMC and the manufacturing question. Reports throughout this period indicated TSMC was the chosen foundry for the OpenAI-Broadcom chip, with N3 (3-nanometer) as the likely process node. The reporting was based on sourcing from inside the foundry-customer ecosystem and was treated as well-substantiated by the trade press. TSMC’s existing relationships with NVIDIA, AMD, Apple, and the major hyperscalers meant the OpenAI program was joining a queue with other heavyweight customers competing for advanced-node capacity.

Through 2025: the codename rumors. Multiple codenames circulated through 2025 without ever being officially confirmed. "Trinity" appeared in some reporting. "Atlas" in others. Some early reports referenced "Project Rainier" but that turned out to be the Anthropic-AWS Trainium deployment program, not the OpenAI chip. None of these earlier codenames were Jalapeño, which appears to have been introduced (or at least surfaced publicly) much closer to the actual launch.

January 2026: TrendForce confirms TSMC N3. Industry-analyst firm TrendForce published research notes specifically identifying the OpenAI-Broadcom chip as being manufactured on TSMC N3. This was the first time a specific process node had been publicly confirmed (as opposed to widely speculated). The TrendForce note also indicated production volume targets that, if accurate, would put OpenAI’s chip in a meaningful share of TSMC’s advanced-node CoWoS packaging capacity.

February 2026: the $30 billion NVIDIA investment. Just as the rumor cycle was building toward an expected Jalapeño launch, OpenAI announced a $30 billion investment in NVIDIA and a commitment to deploy 10 gigawatts of NVIDIA’s Vera Rubin platform. The simultaneous custom-chip pursuit and NVIDIA recommitment was confusing to some observers, but the underlying logic became clearer as the year progressed: training stays on NVIDIA, where the workload favors the GPU’s flexibility, while inference moves toward custom silicon, where the workload economics justify the design investment.

Spring 2026: hand-wave indicators from Microsoft. Microsoft’s commentary on capital expenditure through the spring of 2026 included references to expanded OpenAI-designed accelerator capacity through 2029. This was the first public indication that Microsoft would be a meaningful customer of the OpenAI chip rather than just a hosting partner. Industry sources put Microsoft’s anticipated share of the initial Jalapeño output at approximately 40 percent, though this figure has not been officially confirmed.

June 24, 2026: the official Jalapeño unveiling. Which is today.

What OpenAI is officially confirming today

The announcement comes through three coordinated channels. OpenAI’s announcement page at openai.com/index/openai-broadcom-jalapeno-inference-chip carries the official product description. Broadcom released a coordinated investor announcement emphasizing the financial and partnership dimensions. The major tech press (Tom’s Hardware, TechCrunch, CNBC, Bloomberg, Axios, VentureBeat, The Decoder) has detailed coverage from briefings held in advance of the launch.

The officially-confirmed details:

The product. Jalapeño is OpenAI’s first custom-designed AI chip. It is internally codenamed Jalapeño and officially branded an "Intelligence Processor." It is positioned for LLM inference, not training. Training continues to run on NVIDIA’s Vera Rubin platform under the February 2026 commitment.

The physical chip. Compute chiplet measures approximately 25.46 by 33 millimeters (about 840 square millimeters), which is near the EUV reticle limit of 858 square millimeters. The package combines the compute chiplet with six HBM memory stacks, a separate I/O chiplet, and two structural dummy dies for mechanical balance. The architecture is described as a systolic array.

The process. TSMC N3 (3-nanometer) for this generation. OpenAI has indicated a second-generation Jalapeño is planned for TSMC’s A16 process when it becomes available.

The performance claim. Approximately 50 percent lower inference cost per token compared to current NVIDIA GPUs, per Broadcom CEO Hock Tan. The specific TOPS, memory bandwidth, and power consumption figures have not been included in the launch materials.

The development cycle. Nine months from design start to tape-out, which OpenAI says is the fastest high-performance ASIC cycle ever publicly reported. Greg Brockman attributes the cycle time to using OpenAI’s own AI models in the design process, an unusually concrete claim about AI-aided engineering productivity from a frontier-AI lab.

The partnership structure. Broadcom is the co-design partner, covering chip architecture, networking integration, and deployment support. TSMC is the manufacturing partner. Microsoft is the anchor customer, deploying Jalapeño in Azure infrastructure as part of the previously-announced 10-gigawatt OpenAI custom accelerator commitment. Hock Tan and Broadcom executive Charlie Kawwas reportedly hand-delivered the first physical silicon to Sam Altman and Greg Brockman.

The deployment trajectory. Prototype and small-scale production beginning late 2026, scaling through 2027, "full tilt" deployment in the first half of 2028, full rollout completed by late 2029.

The NVIDIA relationship. Jalapeño does not replace NVIDIA in OpenAI’s stack. NVIDIA continues to handle training across OpenAI’s compute fleet, and OpenAI’s $30 billion February 2026 investment in NVIDIA and the 10-gigawatt Vera Rubin commitment remain in force. The framing in the announcement materials is "full-stack" rather than "NVIDIA replacement."

What we still don’t know

Several pieces of information are not yet public and would significantly inform a deeper analysis:

The specific TOPS figures and peak performance numbers. Hock Tan’s 50-percent-lower-cost claim is the only quantitative performance assertion in the launch materials. The detailed performance breakdown (peak TOPS, sustained TOPS, INT8/FP8/FP16 throughput, memory-bandwidth-utilization characteristics) has not been published.
Memory bandwidth from the six HBM stacks. The HBM generation (HBM3E vs HBM4) and the per-stack bandwidth would let analysts model the chip’s effective inference throughput, but neither has been disclosed.
Power consumption per chip and per rack. Critical for the operational economics of a gigawatt-scale deployment but not in the launch materials.
The full software stack. OpenAI has indicated standard inference framework support but has not detailed the toolchain, the supported model formats, or how compatibility with the existing OpenAI API stack is preserved.
NVIDIA-comparison benchmarks beyond the 50-percent-cost claim. Independent benchmark publications against H100, H200, B100, B200, and the GB200 generation will probably surface in the next two to four weeks as analyst briefings produce reports.
The full customer roster. Microsoft is confirmed as the anchor customer. Whether any other customers will have access to Jalapeño through Azure or other channels has not been confirmed. The announcement materials read as Microsoft-only for now, but the structure could expand.
Export-control treatment. The chip’s classification under United States and allied export-control regimes will affect which regions it can be deployed in. This has not been addressed in the launch materials and is likely to come out in subsequent regulatory filings.
The relationship between Jalapeño’s deployment and OpenAI’s data-residency commitments. OpenAI has data-residency commitments to specific enterprise and government customers. How those commitments are served as the inference fleet transitions from NVIDIA to Jalapeño-based has not been clarified.

We will update this piece and write follow-up coverage as these details become public. The first follow-up will be when independent analysts publish detailed assessments of the chip specifications and the benchmark claims. We expect this within the next two weeks based on typical analyst cycle times for major silicon launches.

What this means

The honest reading is that Jalapeño is meaningful for OpenAI’s own cost structure and for Broadcom’s AI accelerator business, while being a smaller direct threat to NVIDIA’s training dominance than the headline framing suggests. NVIDIA’s stock dipped slightly on the announcement but the structural position is largely unchanged: training stays on NVIDIA for OpenAI and for most of the major AI labs, and the inference-side custom-silicon trend has been visible for over a year.

For developers and customers consuming OpenAI’s API or ChatGPT products, the long-term consequence is downward pressure on per-token pricing as OpenAI’s cost-per-token improves with Jalapeño deployment. The pricing changes will lag the chip deployment by some months, and the changes will probably show up first in tier-restructuring rather than headline rate cuts. Expect price-per-token improvements through 2027 and 2028.

For the broader semiconductor industry, the AI accelerator market is increasingly bifurcated between NVIDIA (training plus the high end of inference) and a multi-vendor custom-silicon ecosystem (the rest of inference). Broadcom is the dominant design partner in the custom-silicon ecosystem, with design wins now spanning Google (TPU), Meta (MTIA), and OpenAI (Jalapeño). TSMC is the manufacturing bottleneck. Both are sitting in good positions across the structural shift.

For OpenAI, Jalapeño completes a gap in the company’s compute strategy. As of yesterday, OpenAI was the only major frontier-AI lab without its own custom silicon. As of today, it has one. The competitive logic alone made the launch close to inevitable; the specific timing and specifications were what was uncertain, and now both are known.

Frequently asked questions

What does Jalapeño mean for the price of GPT-5.5 API calls? Eventually downward, but the timing is several quarters out at minimum. The chip needs to be deployed at scale and the cost savings have to flow through to operating margins before they translate to pricing changes. Expect price-per-token improvements through 2027 and 2028 as deployment ramps.

Is Jalapeño a training chip or an inference chip? Inference. OpenAI’s training workloads continue to run on NVIDIA Vera Rubin and the broader NVIDIA stack. Jalapeño is purpose-built for the inference workload that runs ChatGPT, the API, and OpenAI’s products.

Can other companies buy Jalapeño? Not directly as of the launch announcement. The chip is for OpenAI’s own deployment with Microsoft as the anchor infrastructure customer. There has been no indication that Broadcom or OpenAI intends to sell Jalapeño to third-party AI companies.

How does Jalapeño compare to Google’s TPU v7 (Ironwood)? Both are custom inference-leaning silicon at similar process nodes. Detailed comparisons require benchmark publication, which has not yet happened for Jalapeño. The architectural differences suggest different workload tradeoffs that will become clearer with benchmark data.

Does this mean OpenAI is leaving NVIDIA? No. OpenAI invested $30 billion in NVIDIA in February 2026 and committed to deploy 10 gigawatts of NVIDIA’s Vera Rubin platform. Jalapeño supplements NVIDIA for inference; it does not replace NVIDIA across OpenAI’s compute stack.

Is this the chip that the October 2024 Broadcom announcement was about? Yes. The 2024 Broadcom-OpenAI co-design agreement is the foundation for Jalapeño. The 9-month design cycle that OpenAI cites started after that agreement matured into an actual production project.

Will Jalapeño affect Microsoft Azure pricing for AI workloads? Eventually yes for workloads that route through OpenAI models on Microsoft infrastructure. The direct Azure pricing model and the Microsoft-OpenAI commercial relationship are separate from the chip economics, so the flow-through will be partial.

What is the relationship between Jalapeño and OpenAI’s broader compute commitments? OpenAI’s total compute commitments through 2029 are now in the hundreds of billions of dollars across NVIDIA, Microsoft, Broadcom, and TSMC. Jalapeño is the OpenAI-designed component of that stack, with the cost-per-token economics designed to make the total commitment more financially sustainable over time.

What is the next thing to watch for? Three things in order of likely timing: (1) detailed technical analysis from Tom’s Hardware, AnandTech, SemiAnalysis, and similar in the next two weeks; (2) Broadcom’s Q3 FY2026 earnings call where Hock Tan will provide more guidance on Jalapeño revenue; (3) the first independent benchmark publications showing Jalapeño against NVIDIA H200 and B200 on realistic inference workloads, likely later in 2026 once early production chips are in the hands of independent reviewers.

This is a developing story. We will update with additional detail as more information becomes public.

Tagged asAI Inference, Frontier Models, Large Language Models (LLMs), NVIDIA, OpenAI

Facebook X

OpenAI Jalapeño Explained: From Years of Rumors to the First Official OpenAI AI Chip

What we’d been hearing

What OpenAI is officially confirming today

What we still don’t know

What this means

Frequently asked questions

What Is Astro? The Content-First Web Framework Cloudflare Acquired in 2026

Mistral OCR 4: The Self-Hosted Document Recognition Model Released June 2026

What Is Sakana Fugu? The Tokyo-Based Multi-Agent Orchestration Model Released June 2026

OpenAI Releases GPT-5.5-Cyber: The Daybreak Cybersecurity Model Now Generally Available to Vetted Defenders

What Is a Frontier Model? Defining the Term That Shapes AI Policy, Procurement, and Architecture in 2026

WordPress 7.0 AI Foundations and Connectors: What the New AI Integration Layer Actually Does

Menu

Instagram

Search

OpenAI Jalapeño Explained: From Years of Rumors to the First Official OpenAI AI Chip

What we’d been hearing

What OpenAI is officially confirming today

What we still don’t know

What this means

Frequently asked questions

Further reading

What Is Astro? The Content-First Web Framework Cloudflare Acquired in 2026

Mistral OCR 4: The Self-Hosted Document Recognition Model Released June 2026

What Is Sakana Fugu? The Tokyo-Based Multi-Agent Orchestration Model Released June 2026

OpenAI Releases GPT-5.5-Cyber: The Daybreak Cybersecurity Model Now Generally Available to Vetted Defenders

What Is a Frontier Model? Defining the Term That Shapes AI Policy, Procurement, and Architecture in 2026

WordPress 7.0 AI Foundations and Connectors: What the New AI Integration Layer Actually Does

Menu

Instagram