Quick answer
Nvidia still wins on training. But inference, which now accounts for 80% of AI compute spend, is genuinely competitive in 2026. AMD MI400, Google TPU v7, AWS Trainium 3, and Cerebras WSE-3 each hold a credible share of the market. Hyperscaler-managed inference prices have dropped 40% YoY. The Nvidia monopoly era is ending.
For four years Nvidia GPUs were the only serious option for training frontier AI. In 2026 that's still mostly true for training. But inference — running the model after it's trained, which is where most actual compute spend now sits — has become genuinely contested.
The four serious Nvidia challengers
- AMD MI400: finally got the software stack right with ROCm 7. Roughly 90% of H200 inference performance at 70% of the price.
- Google TPU v7: Google's flagship in-house chip. Powers Gemini 3.5 Pro. Serving on its own silicon lets Google undercut OpenAI on price.
- AWS Trainium 3: Amazon's in-house chip. Anthropic now serves most Claude traffic on Trainium clusters.
- Cerebras WSE-3: wafer-scale specialty chip. Dominates a small but high-value niche: low-latency inference for high-stakes apps.
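To make the MI400 claim concrete, here is a minimal sketch of the price-performance arithmetic. It uses only the two relative figures quoted above (90% of H200 performance at 70% of the price); the function name is hypothetical, and real deployments would also need to account for utilisation, power, and software overhead.

```python
def perf_per_dollar(relative_perf: float, relative_price: float) -> float:
    """Performance per unit of price, relative to a baseline chip at 1.0 / 1.0."""
    if relative_price <= 0:
        raise ValueError("relative_price must be positive")
    return relative_perf / relative_price


# Figures from the article: MI400 vs an H200 baseline (assumed to be 1.0 / 1.0).
mi400_ratio = perf_per_dollar(relative_perf=0.90, relative_price=0.70)

# Cost to deliver the same amount of inference work, relative to the baseline.
mi400_cost_per_unit = 1.0 / mi400_ratio

print(f"Perf per dollar vs H200: {mi400_ratio:.2f}x")
print(f"Cost per unit of work vs H200: {mi400_cost_per_unit:.2f}")
```

On those numbers the MI400 delivers a bit under 1.3x the inference per dollar, which is the same as saying each unit of work costs a little over three quarters of what it does on the baseline.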
What it means for prices
AI API prices have dropped 40% on average over the last 12 months. This is mostly because hyperscalers can now serve inference on their own silicon and skip Nvidia's margin. Expect another 20-30% drop in the next 12 months as MI400 capacity scales and Trainium 3 ramps.
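Successive percentage drops compound rather than add, so it helps to work the numbers through. The sketch below applies the 40% drop already seen and then the forecast 20-30% drop to a hypothetical price index of 100; the index and the function name are illustrative assumptions, not market data.

```python
def price_after_drops(start_price: float, drops: list[float]) -> float:
    """Apply a sequence of fractional price drops (0.40 means -40%) in order."""
    price = start_price
    for drop in drops:
        if not 0.0 <= drop < 1.0:
            raise ValueError("each drop must be in [0, 1)")
        price *= 1.0 - drop
    return price


baseline = 100.0  # hypothetical price index 12 months ago

# 40% drop so far, then the low and high ends of the forecast range.
low_end = price_after_drops(baseline, [0.40, 0.20])
high_end = price_after_drops(baseline, [0.40, 0.30])

print(f"Index after a further 20% drop: {low_end:.0f}")
print(f"Index after a further 30% drop: {high_end:.0f}")
```

If the forecast holds, prices a year from now would sit roughly 52-58% below where they were a year ago, not 60-70% as simple addition would suggest.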
What it means for Nvidia
The stock is still up YoY, but growth has slowed dramatically. Nvidia's tight coupling with TSMC means revenue stays enormous either way, but the gross-margin premium of being "the only AI chip" is gone. Margins are compressing, and Wall Street is still figuring out what that means.
What it means for builders
- Cheaper APIs: budget for another 25-30% drop in input costs over the next 12 months.
- More backends: model providers will be more willing to do on-prem deals.
- Lock-in still risky: backend-specific quantisation is more advanced now; switching providers loses some efficiency.
- Local inference more viable: consumer Nvidia, AMD, and Apple Silicon are all running quantised frontier models in 2026.
If you're negotiating multi-year inference contracts, don't lock in. Prices will drop. Negotiate 90-day renewable terms with re-pricing clauses.
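A rough way to size the difference is to compare a price locked for the full term against one that resets every 90 days. This is a sketch under stated assumptions: a 25% annual price decline (picked from the range discussed in this article), spread evenly across quarters, constant usage, and no commitment discount on the locked deal. All names and inputs are hypothetical.

```python
def locked_cost(unit_price: float, units_per_quarter: float, quarters: int) -> float:
    """Total spend when the starting price is fixed for the whole term."""
    return unit_price * units_per_quarter * quarters


def repriced_cost(unit_price: float, units_per_quarter: float,
                  quarters: int, annual_drop: float) -> float:
    """Total spend when the price is reset to market at each 90-day renewal."""
    # Assumption: the annual decline is spread evenly over four quarters.
    quarterly_factor = (1.0 - annual_drop) ** 0.25
    total = 0.0
    price = unit_price
    for _ in range(quarters):
        total += price * units_per_quarter
        price *= quarterly_factor
    return total


# Hypothetical inputs: normalised price and volume over a two-year term.
locked = locked_cost(unit_price=1.0, units_per_quarter=1.0, quarters=8)
repriced = repriced_cost(unit_price=1.0, units_per_quarter=1.0,
                         quarters=8, annual_drop=0.25)
savings = 1.0 - repriced / locked

print(f"Locked: {locked:.2f}  Re-priced: {repriced:.2f}  Savings: {savings:.0%}")
```

Under these assumptions the renewable contract comes out roughly a fifth cheaper over two years. A real comparison should also net off any discount a vendor offers in exchange for the longer commitment.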
Bottom line
Nvidia still dominates training. Inference is now genuinely contested. Prices are falling, and they'll fall more. The Nvidia monopoly era is ending — and AI builders are the winners.

