Quick answer

Nvidia still wins on training. But on inference, which is now 80% of AI compute spend, the field is genuinely competitive in 2026. AMD MI400, Google TPU v7, AWS Trainium 3, and Cerebras WSE-3 each hold a credible share of the market. Hyperscaler-managed inference prices have dropped 40% YoY. The Nvidia monopoly era is ending.

For four years Nvidia GPUs were the only serious option for training frontier AI. In 2026 that's still mostly true for training. But inference — running the model after it's trained, which is where most actual compute spend now sits — has become genuinely contested.

The four serious Nvidia challengers

  • AMD MI400 — finally got the software stack right with ROCm 7. Roughly 90% of H200 inference performance at 70% of the price.
  • Google TPU v7 — Google's flagship internal chip. Powers Gemini 3.5 Pro. In-house TPU economics let Google undercut OpenAI on price.
  • AWS Trainium 3 — Amazon's in-house chip. Anthropic now serves most Claude traffic on Trainium clusters.
  • Cerebras WSE-3 — wafer-scale specialty chip. Dominates a small but high-value niche: low-latency inference for high-stakes apps.
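
The AMD figures above imply a performance-per-dollar advantage that is easy to work out. A minimal sketch of that arithmetic, taking the quoted 90% and 70% at face value and normalising the H200 to 1.0 on both axes (the function name is illustrative, not from any vendor tool):

```python
def perf_per_dollar(relative_perf: float, relative_price: float) -> float:
    """Performance per unit of spend, relative to a baseline chip at 1.0 / 1.0."""
    if relative_price <= 0:
        raise ValueError("relative_price must be positive")
    return relative_perf / relative_price


# Figures quoted in the list above for MI400 against H200:
# roughly 90% of the inference performance at 70% of the price.
mi400_vs_h200 = perf_per_dollar(relative_perf=0.90, relative_price=0.70)

print(f"MI400 performance per dollar vs H200: {mi400_vs_h200:.2f}x")
```

On those numbers a dollar spent on MI400 buys roughly 1.29 times the inference performance of a dollar spent on H200, before any allowance for porting effort or utilisation differences.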

What it means for prices

AI API prices have dropped 40% on average over the last 12 months. This is mostly because hyperscalers can now serve inference on their own silicon and skip Nvidia's margin. Expect another 20-30% drop in the next 12 months as MI400 capacity scales and Trainium 3 ramps.
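
Successive percentage drops compound rather than add. A short sketch of where prices would land if the 40% drop is followed by a further 20-30%, using a hypothetical baseline price of 100 from 12 months ago:

```python
def compound_price(start: float, drops: list[float]) -> float:
    """Apply successive fractional price drops (0.40 means 40%) to a starting price."""
    price = start
    for drop in drops:
        if not 0 <= drop < 1:
            raise ValueError("each drop must be a fraction in [0, 1)")
        price *= 1 - drop
    return price


baseline = 100.0  # hypothetical price index 12 months ago

# 40% drop already observed, then the forecast 20-30% drop.
optimistic = compound_price(baseline, [0.40, 0.30])
conservative = compound_price(baseline, [0.40, 0.20])

print(f"Price index after both drops: {optimistic:.0f} to {conservative:.0f}")
print(f"Cumulative drop: {100 - conservative:.0f}% to {100 - optimistic:.0f}%")
```

If the forecast holds, prices a year from now would sit at roughly 42-48% of their level two years earlier, a cumulative drop of about 52-58%.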

What it means for Nvidia

Stock is still up YoY but growth has slowed dramatically. The duopoly with TSMC means revenue is enormous either way — but the gross-margin premium of "the only AI chip" is gone. Margins are compressing. Wall Street is still figuring out what that means.

What it means for builders

  • Cheaper APIs: budget for another 25-30% drop in input costs over the next 12 months.
  • More backends: model providers will be more willing to do on-prem deals.
  • Lock-in still risky: backend-specific quantisation is more advanced now; switching providers loses some efficiency.
  • Local inference more viable: consumer Nvidia, AMD, and Apple Silicon are all running quantised frontier models in 2026.

If you're negotiating multi-year inference contracts, don't lock in fixed pricing. Prices will drop. Negotiate 90-day renewable terms with re-pricing clauses.
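
To see what is at stake, here is a rough sketch comparing a fixed-price contract with one that re-prices every 90 days. Everything in it is an assumption for illustration: constant usage, a 24-month horizon, a monthly spend of 1,000, a market price that declines smoothly at 25% a year, and a re-pricing clause that fully tracks the market at each renewal. Real contracts will differ on all of these.

```python
def locked_cost(monthly_spend: float, months: int) -> float:
    """Total cost when the price is fixed for the whole term."""
    return monthly_spend * months


def repriced_cost(
    monthly_spend: float,
    months: int,
    annual_drop: float,
    term_months: int = 3,  # 90-day terms, approximated as 3 months
) -> float:
    """Total cost when the price resets to the market rate at each renewal.

    Assumes the market price falls smoothly by `annual_drop` per year.
    """
    per_term_factor = (1 - annual_drop) ** (term_months / 12)
    total = 0.0
    price = monthly_spend
    for month in range(months):
        # Re-price at the start of every term after the first.
        if month > 0 and month % term_months == 0:
            price *= per_term_factor
        total += price
    return total


# Hypothetical inputs, not figures from any real contract.
fixed = locked_cost(monthly_spend=1_000.0, months=24)
flexible = repriced_cost(monthly_spend=1_000.0, months=24, annual_drop=0.25)

print(f"Fixed price over 24 months:  {fixed:,.0f}")
print(f"90-day re-pricing, 24 months: {flexible:,.0f}")
print(f"Saving from re-pricing:       {1 - flexible / fixed:.0%}")
```

The gap widens with the length of the term and the speed of the price decline, which is why the risk is concentrated in multi-year deals.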

Bottom line

Nvidia still dominates training. Inference is now genuinely contested. Prices are falling, and they'll fall more. The Nvidia monopoly era is ending — and AI builders are the winners.