Quick answer

In June 2026 the gap between open-source frontier models and closed ones is the narrowest it has ever been. Llama 4 Behemoth, Qwen3 405B, and DeepSeek V3 all sit within 5 percentage points of GPT-5 and Claude Opus 4.8 on most published benchmarks. You can self-host any of them. The ecosystem matters more than the gap now.

In 2024 the gap between open and closed AI was visible to anyone who used both. By mid-2026 it has effectively closed for most practical purposes: the leading open models now match closed ones on coding, reasoning, multilingual tasks, and tool use, within margins that don't matter for 90% of real applications.

The three leaders

  • Llama 4 Behemoth (405B MoE): Meta's flagship. Strong on reasoning, with the cleanest licence (modified Apache 2.0) and the widest ecosystem support.
  • Qwen3 (405B MoE): Alibaba's flagship. Tops Chinese-language benchmarks, is strong on code, and has the best tool use of any open model.
  • DeepSeek V3 (671B MoE): DeepSeek's flagship. Cheapest to serve, best at raw maths, and surprisingly strong in English.

Where open still loses

  • Multimodal: GPT-5 and Gemini 3.5 Pro still have a clear lead on integrated vision and image generation.
  • Long agentic workflows: Opus 4.8 with extended thinking outperforms every open model on multi-hour task reliability.
  • Tooling ecosystem: Cursor, Devin, Cline, and Lindy all default to closed models. Open models work but require extra setup.
  • Inference UX: closed APIs are battle-tested. Self-hosted open models still require DevOps effort to run reliably at scale.

Where open clearly wins

  • Cost at scale: $0.20/M tokens for Qwen3 vs $12.50/M for Opus 4.8, roughly 60× cheaper.
  • Privacy and data residency: critical for healthcare, legal, financial, and EU enterprises.
  • Customisation: you can fine-tune Llama 4 on your data without negotiating with a lab.
  • No deprecation: closed APIs change behaviour over time, while open weights you have downloaded stay frozen, which matters for reproducibility.
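To see what the cost gap means for a budget, here is a minimal sketch that plugs the two per-token prices quoted above into a monthly bill. The workload of 2 billion tokens per month is a made-up figure for illustration, and the function names are hypothetical. It also ignores the hardware and staffing costs of self-hosting, which the prices above may or may not include.

```python
# Prices per million tokens, taken from the comparison above.
OPEN_PRICE_PER_M = 0.20     # Qwen3, USD per 1M tokens
CLOSED_PRICE_PER_M = 12.50  # Opus 4.8, USD per 1M tokens


def monthly_cost(tokens_per_month: int, price_per_million: float) -> float:
    """Return the USD cost of processing `tokens_per_month` tokens."""
    return tokens_per_month / 1_000_000 * price_per_million


def price_ratio(closed: float, open_: float) -> float:
    """Return how many times more expensive the closed price is."""
    return closed / open_


if __name__ == "__main__":
    volume = 2_000_000_000  # hypothetical workload: 2B tokens per month
    ratio = price_ratio(CLOSED_PRICE_PER_M, OPEN_PRICE_PER_M)
    print(f"price ratio: {ratio:.1f}x")
    print(f"open:   ${monthly_cost(volume, OPEN_PRICE_PER_M):,.2f} per month")
    print(f"closed: ${monthly_cost(volume, CLOSED_PRICE_PER_M):,.2f} per month")
```

At these list prices the exact ratio works out to 62.5, which is where the "roughly 60×" figure comes from.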

Most serious AI teams in 2026 run a hybrid: closed APIs (Opus 4.8, GPT-5) for cutting-edge tasks, open models (Llama 4, Qwen3, DeepSeek V3) for high-volume cheap inference and private data.
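The hybrid split described above can be written down as a small routing rule. This is only a sketch: the `Task` fields, the `route` function, and the order of the checks (private data first, then frontier quality, then volume) are assumptions made for illustration, not a policy any particular team is known to use.

```python
from dataclasses import dataclass


@dataclass
class Task:
    """A unit of work to be sent to a model (hypothetical fields)."""
    name: str
    contains_private_data: bool = False
    needs_frontier_quality: bool = False  # e.g. long agentic or multimodal work
    high_volume: bool = False


def route(task: Task) -> str:
    """Return "open", "closed", or "either" for a task."""
    # Assumption: private data always stays on self-hosted open weights,
    # even when the task would otherwise go to a closed API.
    if task.contains_private_data:
        return "open"
    # Cutting-edge tasks go to a closed API.
    if task.needs_frontier_quality:
        return "closed"
    # High-volume inference goes to open models for cost reasons.
    if task.high_volume:
        return "open"
    # Nothing forces the choice, so either side is acceptable.
    return "either"


if __name__ == "__main__":
    tasks = [
        Task("summarise patient notes", contains_private_data=True),
        Task("multi-hour refactoring agent", needs_frontier_quality=True),
        Task("classify support tickets", high_volume=True),
        Task("one-off draft email"),
    ]
    for t in tasks:
        print(f"{t.name}: {route(t)}")
```

Putting the privacy check first is a deliberate choice in this sketch: it treats data residency as a hard constraint and quality as a preference.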

What changed in 2026

Three things. First, Mixture-of-Experts architectures finally became reproducible: open teams figured out how to train big MoE models without exploding training cost. Second, post-training matured: open models now get the same quality of RLHF and constitutional refinement that closed ones do. Third, deployment got cheap: H100 prices crashed, and Llama 4 Behemoth now runs on 4x H100s, Qwen3 on 2x.
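Whether a model fits on a given number of GPUs can be roughly checked from its parameter count. The sketch below counts weight memory only and assumes 80 GB of memory per H100. It ignores the KV cache, activations, and runtime overhead, so real requirements are higher. The bit widths are hypothetical quantisation levels, since the precision behind the GPU counts above is not stated.

```python
H100_MEMORY_GB = 80  # assumed memory per H100 (80 GB variant)


def weight_memory_gb(params_billions: float, bits_per_param: int) -> float:
    """Return memory for the weights alone, in GB (1 GB = 1e9 bytes)."""
    # params_billions * 1e9 params * (bits / 8) bytes, divided by 1e9 bytes per GB
    return params_billions * bits_per_param / 8


def fits(params_billions: float, bits_per_param: int, num_gpus: int) -> bool:
    """Return True if the weights alone fit in the combined GPU memory."""
    needed = weight_memory_gb(params_billions, bits_per_param)
    return needed <= num_gpus * H100_MEMORY_GB


if __name__ == "__main__":
    # Check a 405B-parameter model at several hypothetical precisions.
    for bits in (16, 8, 4):
        for gpus in (2, 4):
            needed = weight_memory_gb(405, bits)
            verdict = "fits" if fits(405, bits, gpus) else "does not fit"
            print(f"405B at {bits}-bit on {gpus}x H100: {needed:.1f} GB, {verdict}")
```

Under these assumptions a 405B model needs about 202.5 GB for 4-bit weights, which fits in the 320 GB of a 4x H100 node but not at 8-bit. A 2x H100 setup with 160 GB would need a smaller footprint still, so GPU counts like these most likely imply aggressive quantisation.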

Bottom line

Open-source AI is no longer the second-best option for most workloads. It's a genuine alternative on quality and a clear winner on cost, privacy, and customisation. The smart 2026 question isn't "open or closed?" but "which task goes where?"