Global AI Forum Edition 01 · 2026

The Enterprise Field Guide

The Buyer's
Atlas

How a CEO, CFO, CTO, CIO, or compliance lead actually chooses a foundation model. Every major model on the board. Seven parameters. One decision you can defend in a boardroom.

A long read · ~30 min
Mapped to task, function & industry
Updated 27 June 2026

Start here

There is no best model.
There is only the right one.

The most expensive mistake in enterprise AI is not picking the wrong vendor. It is believing one model should win every job. In 2026 the frontier is a portfolio, not a podium. The leaders trade places by the week. A model that writes the cleanest board memo may be the wrong one to run a ten-hour migration. A model that aces a reasoning benchmark may quietly leak data your regulator will ask about.

This guide does one thing well. It hands the buyer a vocabulary. Seven parameters that decide everything. A full atlas of the models that matter. Five lenses, one for each seat at the table. And a way to walk into the room with a choice you can explain in plain words and still defend under audit.

The one rule to keep

Pick by elimination, not reputation. Strike every model that fails a non-negotiable first: data residency, latency ceiling, cost cap. Then match what survives to the intelligence the task actually needs. Most teams run three to five models at once and route each job to the one that leads it.

Model families now compete, up from 3 in 2023

100×: Price gap between the cheapest and priciest mainstream model

~5%: Share of sessions where a top model's safety layer reroutes the request

The Vocabulary

Seven parameters
that decide it all.

Forget the leaderboards for a moment. Every model choice in the enterprise comes down to seven dials. Learn these and you can read any spec sheet, cut through any sales deck, and ask the one question the vendor hoped you would not.

Capability tier

How smart, for this kind of work

Capability is not one number. A model can sit at the frontier for coding and mid-pack for long multimodal reasoning. The honest way to read it is per task type: agentic coding, deep reasoning, writing and tone, multimodal, multilingual. The benchmarks that matter in 2026 are SWE-Bench Pro and Terminal-Bench for coding agents, GPQA Diamond and FrontierMath for reasoning, and human-preference arenas for writing.

In plain words: Do not ask "is it smart." Ask "is it smart at the one thing I will make it do all day." A top-of-class coder can be an average lawyer.

Coding · Reasoning · Writing · Multimodal · Multilingual

Cost & token economics

What it really bills, not the sticker

Price is quoted per million input and output tokens, and the spread is enormous: from roughly ten cents per million on a value model to ten dollars on a top tier, a hundred-fold gap. But the sticker lies. Frontier models fan a single prompt into dozens of internal calls, so "one" request can bill like fifty. Output tokens cost three to five times as much as input tokens. Long-context requests often jump to a higher rate above a threshold.

In plain words: The bill is set by how the model works, not by the headline rate. Budget it, cap it, and route only the hard, high-value work to the expensive tier. Many teams have been blindsided by exactly this.

Input/output split · Fan-out · Long-context surcharge · Caching
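
The billing mechanics above can be sketched as a small estimator. This is a minimal illustration, not any provider's billing formula: the `Pricing` fields, the `fan_out` multiplier, and the long-context threshold and surcharge are all assumptions chosen for the example, and the prices are placeholders in the range the text describes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pricing:
    """Hypothetical price card, in USD per 1M tokens."""
    input_per_m: float
    output_per_m: float
    long_context_threshold: int = 200_000   # assumed threshold, in input tokens
    long_context_multiplier: float = 2.0    # assumed surcharge above the threshold


def estimate_request_cost(pricing: Pricing, input_tokens: int,
                          output_tokens: int, fan_out: int = 1) -> float:
    """Estimate the bill for one user-visible request.

    fan_out is the assumed number of internal model calls the request
    triggers; each call is billed as if it used the same token counts.
    """
    rate_multiplier = (pricing.long_context_multiplier
                       if input_tokens > pricing.long_context_threshold else 1.0)
    per_call = (input_tokens * pricing.input_per_m
                + output_tokens * pricing.output_per_m) / 1_000_000
    return per_call * rate_multiplier * fan_out


frontier = Pricing(input_per_m=5.0, output_per_m=30.0)
single = estimate_request_cost(frontier, input_tokens=20_000, output_tokens=2_000)
fanned = estimate_request_cost(frontier, input_tokens=20_000, output_tokens=2_000,
                               fan_out=50)
print(f"sticker: ${single:.2f}  with fan-out: ${fanned:.2f}")
```

Under these assumed numbers the same request moves from cents to dollars once fan-out is counted, which is why a spend cap is better set per task than per token.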

Context window

How much it can hold at once

The context window is how much text a model considers in a single request, measured in tokens, each roughly three-quarters of a word. In 2026 a million tokens is table stakes: most frontier models hold around 1M, and some reach far higher. That lets you drop an entire codebase, a full contract set, or a research corpus into one pass. But watch the gap between the advertised window and the effective window. Providers differ sharply in how well a model actually uses the back half of a long prompt.

In plain words: Big window equals fewer "it forgot what I told it" moments. But a model that holds a million tokens and only reads the first hundred thousand well is selling you shelf space it does not use.

1M standard · Effective vs advertised · Output cap
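
The three-quarters-of-a-word rule of thumb can be turned into a rough pre-flight check before a large document set is sent in one pass. This is a sketch under stated assumptions: real tokenizers vary by model and language, and `effective_fraction` is a hypothetical discount you would have to measure yourself, not a figure any provider publishes.

```python
import math

WORDS_PER_TOKEN = 0.75  # rule of thumb from the text; real tokenizers vary


def estimate_tokens(word_count: int) -> int:
    """Rough token estimate from a word count."""
    return math.ceil(word_count / WORDS_PER_TOKEN)


def fits_window(word_count: int, advertised_window: int,
                effective_fraction: float = 1.0,
                reserved_output_tokens: int = 0) -> bool:
    """Check a document against the usable part of a context window.

    effective_fraction is an assumed discount for how much of the
    advertised window the model uses well.
    """
    usable = int(advertised_window * effective_fraction) - reserved_output_tokens
    return estimate_tokens(word_count) <= usable


contract_set_words = 600_000  # illustrative corpus size
print(estimate_tokens(contract_set_words))
print(fits_window(contract_set_words, 1_000_000))
print(fits_window(contract_set_words, 1_000_000, effective_fraction=0.5))
```

The same corpus that fits the advertised window can fail once the window is discounted to what the model reads well, which is the gap the paragraph above warns about.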

Deployment & data residency

Where your data physically goes

The first fork in the road is proprietary versus open weight. Proprietary models reach peak capability through an API, with no infrastructure to run, but your data leaves the building and you depend on the provider's roadmap and uptime. Open-weight models can be downloaded and run on your own hardware, giving full control, privacy, and zero per-token cost, in exchange for the burden of running them. For regulated work in healthcare, government, and financial services, self-hosting is now a legitimate path, not a capability sacrifice.

In plain words: If a regulator can ask "where did that data go," you need an answer before you need a benchmark. Open weights on your own servers is the cleanest answer. Hosted API is the fastest start.

Open weight · Proprietary API · Private cloud · Air-gapped

Latency & throughput

How fast, and how much at once

Two numbers matter: time to first token, which is how responsive it feels, and output throughput, how fast it finishes. A customer-facing assistant lives or dies on the first; a nightly batch job on the second. The trick is that the smartest model is rarely the fastest. Reasoning models that "think" before answering pay for depth with delay. For high-volume, latency-sensitive work, a smaller distilled model often wins on experience even though it loses on paper.

In plain words: For a chat box, fast and good-enough beats slow and brilliant. For an overnight pipeline, nobody is watching the clock. Match the speed to who is waiting.

Time to first token · Throughput · Reasoning delay
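
The two numbers combine into a simple estimate of how long a user waits for a complete answer. The sketch below assumes a constant streaming rate after the first token; the model profiles and all figures are hypothetical, chosen only to show how a reasoning delay dominates a short chat reply.

```python
def response_time_seconds(ttft_s: float, output_tokens: int,
                          tokens_per_second: float) -> float:
    """Total wall-clock time: wait for the first token, then stream the rest."""
    if tokens_per_second <= 0:
        raise ValueError("tokens_per_second must be positive")
    return ttft_s + output_tokens / tokens_per_second


# Hypothetical profiles: (time to first token in seconds, tokens per second).
profiles = {
    "reasoning-model": (8.0, 60.0),
    "distilled-model": (0.4, 150.0),
}

reply_tokens = 300  # a typical short chat answer, assumed
for name, (ttft, tps) in profiles.items():
    total = response_time_seconds(ttft, reply_tokens, tps)
    print(f"{name}: {total:.1f}s for {reply_tokens} tokens")
```

For a chat box, the time to first token is most of what the user feels; for a batch job, only the throughput term matters at scale.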

Agentic & tool use

Can it act, not just answer

The defining shift of 2026 is from assistant to agent. The question is no longer "can it write the answer" but "can it run the whole job": call tools, hit your systems, chain hundreds of steps, and keep working through ambiguity without a human touching each one. The leaders now sustain multi-hour autonomous runs and hundreds of tool calls in a single chain. Persistent memory is the new differentiator: models that use notes, logs, and stored context across a task that spans days.

In plain words: An assistant drafts the email. An agent finds the contact, drafts it, checks the calendar, and books the meeting while you review the result, not the process. If you want work done, not just words, this is the dial.

Long-horizon runs · Tool chaining · Persistent memory · Computer use

Safety, governance & lock-in

What the compliance lead asks

The least glamorous parameter quietly decides the most. Does the model retain your prompts, and for how long? The most capable tiers are starting to require 30-day data retention with no zero-retention option, even for enterprises that previously negotiated one. Do safety classifiers reroute or refuse some requests, and how often? Can you fine-tune, or are you frozen on the provider's roadmap? Per-token API pricing is a form of lock-in; open weights are insurance against a provider changing prices or deprecating a model you depend on.

In plain words: Read the data-retention clause before the benchmark table. The strongest model in the world is the wrong choice if its terms break your audit. This is the slide that gets a deal killed in legal.

Data retention · Classifier reroute · Fine-tune rights · Vendor lock-in

Strike on the non-negotiables first. Then compete on capability. A team that scores models before eliminating them ends up buying the cleverest model that fails their own rules.
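
The elimination-first rule can be sketched in a few lines. This is an illustration, not a procurement tool: the `Candidate` fields, the three rules, the placeholder model names, and every number are assumptions, and `task_score` stands in for whatever evaluation you run on your own workload.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    name: str
    self_hostable: bool          # can run inside your own infrastructure
    ttft_seconds: float          # measured time to first token
    output_price_per_m: float    # USD per 1M output tokens
    task_score: float            # your own eval score on the target task, 0..1


@dataclass(frozen=True)
class NonNegotiables:
    must_self_host: bool
    max_ttft_seconds: float
    max_output_price_per_m: float


def shortlist(candidates, rules):
    """Strike every candidate that fails a rule, then rank survivors by task score."""
    survivors = [
        c for c in candidates
        if (c.self_hostable or not rules.must_self_host)
        and c.ttft_seconds <= rules.max_ttft_seconds
        and c.output_price_per_m <= rules.max_output_price_per_m
    ]
    return sorted(survivors, key=lambda c: c.task_score, reverse=True)


candidates = [
    Candidate("model-a", self_hostable=False, ttft_seconds=1.2,
              output_price_per_m=30.0, task_score=0.92),
    Candidate("model-b", self_hostable=True, ttft_seconds=0.8,
              output_price_per_m=1.10, task_score=0.81),
    Candidate("model-c", self_hostable=True, ttft_seconds=0.5,
              output_price_per_m=0.28, task_score=0.66),
]
rules = NonNegotiables(must_self_host=True, max_ttft_seconds=1.0,
                       max_output_price_per_m=5.0)
print([c.name for c in shortlist(candidates, rules)])
```

In this made-up example the highest-scoring model never reaches the ranking step, because it fails the residency rule first. Scoring before filtering would have put it at the top of the list.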

The Atlas

Every model
worth knowing.

The board, laid out. Eight families, five tiers, and the open-weight challengers rewriting the price floor. Figures are mid-2026 and move fast; treat them as a map, not a contract.

Tiers: Mythos-class · Frontier · Value / fast · Open weight

Side by side

The numbers, on one screen.

Highlighted cells mark the current leader in that column. No single model owns the table. That is the whole point.

SWE-Bench Pro

The coding exam

It hands a model real, unsolved bug reports from actual open-source software and checks whether its fix makes the project's own test suite pass. A score of 80% means it correctly resolved 80 of 100 real engineering tickets, the closest thing to "can it do a junior engineer's day job." The catch: scores swing hard with the scaffold around the model. The same model scores differently inside a purpose-built coding harness than in a raw setup, so read coding numbers as directional, never absolute.

GPQA Diamond

The reasoning exam

"Graduate-level Google-Proof Q&A." PhD-level science questions written so you cannot simply search the answer. It measures genuine multi-step reasoning, not recall or memorized facts. The catch: the frontier now clusters in the mid-90s, which means the test is nearly saturated. When every leader scores 94 to 95, the benchmark has stopped telling you who is actually better. Treat a near-perfect GPQA as table stakes, not a tiebreaker.

Other names you will see: Terminal-Bench (can it operate a real command line), FrontierMath (the hardest unsolved math, still far from saturated), GDPval (economically valuable knowledge work), and human-preference arenas for writing and tone. No single score tells the whole story, which is exactly why the table below has more than one column.

Key: Column leader · Strong. Prices = USD per 1M tokens (input / output).

Model	Maker	Released	Tier	Coding (SWE-Bench Pro)	Reasoning (GPQA)	Context	Price in / out	Best at
GPT-5.6 Sol	OpenAI	Jun 26 2026	Frontier	leader*	~95%	1.5M	$5 / $30	Agentic coding, cyber, biology
GPT-5.6 Terra	OpenAI	Jun 26 2026	Frontier	strong	high-90s	1.5M	$2.50 / $15	GPT-5.5 class at half the cost
GPT-5.6 Luna	OpenAI	Jun 26 2026	Value	good	high	1.5M	$1 / $6	Fast, cheap, high-volume
Claude Fable 5	Anthropic	Jun 9 2026	Mythos	~80%	~94%	1M	$10 / $50	Long-horizon agents, hardest work
Claude Opus 4.8	Anthropic	May 28 2026	Frontier	~69%	~94%	1M	$5 / $25	Coding, high-stakes writing
Claude Sonnet 4.6	Anthropic	Mar 2026	Value	~58%	high-80s	1M	$3 / $15	Near-Opus quality at value price
GPT-5.5	OpenAI	Apr 23 2026	Frontier	~59%	~95%	1M	$5 / $30	All-round knowledge work, research
Gemini 3.1 Pro	Google	early 2026	Frontier	~54%	~94%	1M+	$2 / $12	Multimodal, long context, value
Gemini 3.5 Flash	Google	2026	Value	good	high	1M	$1.50 / $9	Best price-per-intelligence
Grok 4.3	xAI	Apr 17 2026	Frontier	~55%	competitive	2M	$2 / $15	Live data, real-time web/X search
DeepSeek V4-Pro	DeepSeek	Apr 24 2026	Open	~58%	strong	1M	$0.27 / $1.10	Frontier-ish quality, lowest cost
DeepSeek V4-Flash	DeepSeek	Apr 24 2026	Open	good	solid	1M	$0.14 / $0.28	Cheapest 1M-context model
Llama 4	Meta	2025	Open	good	solid	10M*	self-host	Self-host, data never leaves
GLM-5.2	Z.AI	Jun 16 2026	Open	~58%	~91%	200K	self-host	Open-weight reasoning leader
Qwen 3.7 Max	Alibaba	2026	Open	strong	high	256K	$1.25 / $3.75	Cheapest top-10 reasoner, math
Kimi K2.7	Moonshot	Jun 12 2026	Open	~59%	solid	256K	self-host	Long tool-call chains, agents
MiniMax M3	MiniMax	2026	Open	~59%	solid	1M	$0.60	Cheapest self-host frontier coder
Mistral Large 3	Mistral	Dec 2025	Open	good	solid	256K	$0.50 / $1.50	EU sovereign, Apache 2.0, on-prem
Command A+	Cohere	May 20 2026	Open	fair	strong	256K	self-host	Enterprise RAG & search, citations
Amazon Nova 2 Pro	Amazon	2026	Value	fair	solid	300K	low	Native to AWS Bedrock, video
Sarvam 105B (Indus)	Sarvam AI	Feb 2026	Open	fair	solid	128K	self-host	22 Indian languages, sovereign

*GPT-5.6 Sol leads Terminal-Bench 2.1 (command-line agentic coding) at 91.9% in Ultra mode, edging Claude Mythos 5; its SWE-Bench Pro figure was not broken out at preview. Sol, Terra, and Luna launched June 26 2026 under a US-government-coordinated limited preview, with broad availability expected within weeks. The 10M context figure for Llama 4 refers to Llama 4 Scout, which advertises up to 10M tokens. Coding figures use SWE-Bench Pro where available; scaffolding changes scores materially, so read them as directional. Pricing, benchmarks, and dates were verified in late June 2026 and change frequently. Confirm against provider docs before production.

Five seats, five questions

The same choice
looks different
from each chair.

A foundation model is bought by a committee that does not share a vocabulary. Here is what each seat is really asking, and the model traits that answer it.

CEO

Does this move the business?

Frame as a portfolio, not a vendor bet
Ask for spend-per-task, not per-token
Insist on a routing strategy in the plan
Avoid single-model lock-in to one provider

CFO

What does it really cost?

Model fan-out before signing any cap
Route volume to value tiers, hard work up
Open weights remove per-token lock-in
Watch long-context price jumps

CTO

Will it ship and scale?

Capability is per task type, not global
Agentic runs & tool chains for real work
Benchmark scores depend on scaffold
Latency for users, throughput for batch

CIO

Does it fit the estate?

Cloud alignment: Bedrock, Vertex, Foundry
One gateway to route across models
Plan for model deprecation cycles
Blend hosted + self-host by workload

Compliance

Can I defend it in audit?

Read the data-retention clause first
Top tiers may force 30-day retention
Know the classifier reroute rate
Self-host for data residency mandates

Map the model to the job

Task first.
Brand last.

The fastest way to a defensible choice is to start from the workflow and work backwards. A few common enterprise jobs and where the strength sits today.

Function / workflow	What it needs most	Lead choices today	Value alternative
Software engineering & migrations	Agentic coding, long runs	GPT-5.6 Sol, Claude Fable 5	DeepSeek V4-Pro, Sonnet 4.6
Financial modeling & analysis	Step-by-step reasoning	GPT-5.6 Sol, Opus 4.8	Gemini 3.1 Pro
Legal redlines & contract review	Long context, careful tone	Claude Opus 4.8, Fable 5	Gemini 3.1 Pro (1M)
Customer support at scale	Low latency, low cost	Gemini 3.5 Flash, Haiku 4.5	DeepSeek V4-Flash
Market & competitive research	Multi-step, live data	GPT-5.6, Grok 4.3	Gemini 3.1 Pro + search
Board materials & long-form writing	Prose rhythm, subtext	Claude Opus 4.8	GPT-5.5, Sonnet 4.6
Document-heavy / multimodal ops	Vision, video, audio	Gemini 3.1 Pro	Gemini 3.5 Flash
High-volume first drafts	Cheap, fast, good-enough	DeepSeek V4-Flash	GPT-5.4 mini, Haiku 4.5
Regulated / air-gapped workloads	Data never leaves	Llama 4, Qwen 3.5 (self-host)	GLM-5.2, DeepSeek (self-host)
Indian-language & sovereign service	22 languages, local infra	Sarvam 105B (Indus)	Krutrim, BharatGen Param 2
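
Working backwards from the workflow amounts to a routing table. The sketch below encodes a few rows of the table above, taking the first lead choice and the first value alternative from each row. The task keys, the `route` function, and the fallback model are assumptions made for the example; a production gateway would also handle failures, spend caps, and data-residency rules.

```python
# (lead choice, value alternative) per workflow, copied from the table above.
ROUTES = {
    "software_engineering": ("GPT-5.6 Sol", "DeepSeek V4-Pro"),
    "legal_review": ("Claude Opus 4.8", "Gemini 3.1 Pro"),
    "customer_support": ("Gemini 3.5 Flash", "DeepSeek V4-Flash"),
    "long_form_writing": ("Claude Opus 4.8", "GPT-5.5"),
    "multimodal_ops": ("Gemini 3.1 Pro", "Gemini 3.5 Flash"),
}

DEFAULT_MODEL = "Claude Sonnet 4.6"  # arbitrary fallback chosen for this example


def route(task: str, prefer_value: bool = False) -> str:
    """Return the model name for a task, falling back to DEFAULT_MODEL."""
    lead, value = ROUTES.get(task, (DEFAULT_MODEL, DEFAULT_MODEL))
    return value if prefer_value else lead


print(route("legal_review"))
print(route("customer_support", prefer_value=True))
print(route("unlisted_task"))
```

Keeping the table as data rather than code means a model change is a one-line edit, which matters when the leaders trade places as often as the text describes.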

Beyond text

Models don't only write.
They draw, film,
and speak.

Three more markets, each a real buying decision with its own leaders and its own compliance traps. Verified mid-2026, and faster-moving than any other corner of this guide.

A text model is one purchase. The enterprise that stops there misses the image, video, and voice models behind the products its teams are already signing up for. Here is the full board.

Image generation & vision

GPT Image 2

OpenAI · replaced DALL-E

Best overall default. World knowledge and complex-prompt fidelity; multilingual text in images.

Best at: realistic publishing, complex prompts

Imagen 4

Google · Vertex AI

Photorealism leader, especially human faces and natural scenes. Google-native workflow.

Best at: photorealistic humans & nature

Nano Banana Pro

Google · Gemini Image

Strong editing and character consistency across generations. Reliable Google infrastructure.

Best at: editing, consistent characters

Midjourney V8

Midjourney

The aesthetic-quality king. Distinctive, art-directed look. No real API; web and Discord.

Best at: stylized, artistic imagery

FLUX.2 (Open)

Black Forest Labs

Open-weight champion. Top-tier photorealism, skin and lighting; unmatched fine-tune ecosystem.

Best at: photorealism, custom fine-tunes

Seedream 4.5

ByteDance

Renders text better than almost anything, native 4K, excels at product and commercial looks.

Best at: product shots, text, 4K

Ideogram 3

Ideogram

The typography specialist. If you need readable text, logos, or posters, it is unmatched.

Best at: text in images, logos, posters

Adobe Firefly 4 (Safe)

Adobe

Commercially safe, trained on licensed data, Photoshop-native. The brand-workflow choice.

Best at: commercial safety, brand workflow

Recraft V4

Recraft

Brand-asset powerhouse: vectors, batch style consistency, logo integration. MCP support.

Best at: vectors, brand asset systems

Qwen Image 2 (Open)

Alibaba

Open-source value, custom LoRA training. Strong multilingual and Asian-script rendering.

Best at: open-weight value, custom training

Video generation

Veo 3.1

Google · Flow

Best all-rounder. The only model doing one-pass 48kHz lip-synced dialogue. 4K, cinematic.

Best at: spoken dialogue, cinematic clips

Kling 3.0

Kuaishou

Native 4K, up to 2-min clips, AI Director shot control. Best hand rendering. Data in China.

Best at: 4K social, long clips, value

Seedance 2.0

ByteDance

Tops the arena with audio. Flexible input: images, clips, audio per generation. Safe default.

Best at: top quality, flexible inputs

Runway Gen-4.5

Runway

The production workstation. Motion brush, camera control, reference-driven consistency.

Best at: creative control, ad workflows

Luma Ray3

Luma AI

The only HDR option, color-managed pipelines. Atmospheric, environment-heavy image-to-video.

Best at: HDR, cinematic mood

Hailuo 2.3

MiniMax

Most output per dollar. Expressive human motion and faces. Note an active IP lawsuit.

Best at: cheap volume, human subjects

Sora 2 (Sunsetting)

OpenAI

Best physics simulation, but the app retired and the API shuts down Sep 2026. Do not build new.

Best at: physics realism (migrate off)

Wan 2.7 (Open)

Alibaba

The serious open-weight video slot. Self-host for custom pipelines and full data control.

Best at: self-hosted, custom pipelines

PixVerse V4.5

PixVerse

The anime and stylized specialist. Handles non-photoreal styles others cannot.

Best at: anime, stylized motion

HeyGen / Synthesia

Avatar tools

Avatar-first: talking heads, corporate training, multilingual lip-sync localization.

Best at: avatars, training, localization

Voice, speech & retrieval

ElevenLabs

The production standard for text-to-speech and voice agents. Broadest language and voice library.

Best at: production TTS, voice agents

Voxtral TTS (Open)

Mistral · 4B

First credible open TTS at production quality. Beats ElevenLabs in most blind tests. Voice cloning from 3s.

Best at: open-weight voice, cloning

Cohere Transcribe

Cohere

Enterprise-grade speech-to-text. Pairs with Cohere's RAG stack for call and meeting workflows.

Best at: enterprise transcription

Saaras V3 (Open)

Sarvam · India

Speech across many Indian languages. The voice layer for South Asian sovereign deployments.

Best at: Indian-language speech

Cohere Embed & Rerank

Cohere · retrieval

Not generators, the plumbing of enterprise search. They decide which documents the model even sees.

Best at: RAG accuracy, search relevance

Tiny Aya (Open)

Cohere · 70+ langs

3.35B open models in regional variants, runs offline on a laptop. Strongest small multilingual story.

Best at: offline, edge, 70+ languages

The two compliance traps in generative media

Data residency: Kling, Hailuo, and Seedance, among the strongest video models, process your prompts and assets on servers in China. That is fine for personal creative work, but a problem for client work under NDA or sensitive brand content.

IP indemnification: most paid plans grant commercial rights, but only Adobe Firefly will legally cover you if an output is claimed to infringe. For brand-facing work, that distinction decides the vendor.

The third option

Buy, self-host,
or build your own.

Most guides present a binary: rent a proprietary API, or self-host an open model. There is a third path that matters most to regulated industries, and it is new. Train a frontier-grade model on your own data. Mistral's Forge platform supports the full training lifecycle, pre-training, post-training, and reinforcement learning, on a company's internal datasets, going well beyond fine-tuning. An insurer can train a model from scratch on its own claims and contracts. Early adopters include ASML, Ericsson, and the European Space Agency. Cohere's Model Vault deploys inside your own private cloud so sensitive data never leaves the network.

The build-vs-buy ladder

Prompt the model and you change nothing. RAG feeds it your documents at query time. Fine-tune nudges a small slice of the weights toward your domain. Full custom training (Forge-style) builds the model around your data from the ground up. Cost and control rise at every rung. Most enterprises never need the top rung, but the ones with proprietary data and a hard residency mandate increasingly do, and it is no longer science fiction to reach for it.

The Decision Compass

Answer four questions.
Get a defensible pick.

Not a verdict, a starting shortlist you can take into the room. It eliminates on your non-negotiables first, exactly as you should.

Build your shortlist

Pick one option per row.

1 · The primary job: Coding & agents · Reasoning & analysis · Writing & comms · High-volume support · Multimodal / documents

2 · Can data leave your infrastructure? Yes, API is fine · No, must self-host

3 · Budget posture: Strict, cost rules · Moderate · Pay for the best

4 · Language & region focus: Global / English · Indian languages · Needs live web data

The compass strikes models that fail your non-negotiables, then ranks what survives against your primary job.

The new top tier

What is this
whole Mythos?

Mythos is a class, not a model. In April 2026 Anthropic introduced a tier that sits above its Opus line, with capabilities it judged too strong to put in everyone's hands at once. The first member, Claude Mythos Preview, went out to roughly fifty vetted cyber-defenders and infrastructure providers through a program called Project Glasswing, run in collaboration with the US government. It was never offered to the public.

Then on June 9, the tier reached everyone, through a clever split. Mythos and Fable are the same underlying model. The difference is the wrapper. Mythos 5 is the raw model with safeguards lifted in some areas, still reserved for Glasswing partners and trusted defenders. Fable 5 is that same model made safe for general use: safety classifiers watch for high-risk requests in cybersecurity, biology, chemistry, and model distillation, and quietly reroute those to the safer Opus 4.8. Anthropic expects that to touch under 5% of sessions.

Mythos is the raw model.
Fable is its version made safe for the public.

The name tells the strategy. A myth is the dangerous original. A fable is the version with a moral attached, the one you can hand to anyone. For a buyer, the practical fact is that Anthropic's lineup now spans five tiers, and that creates a real routing decision rather than a single default.

Haiku

Fast and cheap. High-volume, latency-sensitive work where good-enough wins.

value · fastest

Sonnet

The everyday workhorse. Near-flagship quality at a fraction of flagship price.

$3 / $15

Opus

Complex work that does not need long-horizon stamina. Also the safety fallback for Fable.

$5 / $25

Fable

Mythos-class, public. The longer and harder the task, the larger its lead grows. Built-in classifiers.

$10 / $50

Mythos

Same model, safeguards lifted. Glasswing partners only. Strongest cyber capability of any model.

restricted

The compliance footnote that matters

Mythos-class traffic carries a mandatory 30-day data retention policy, with no zero-retention option, even for enterprises that previously negotiated one. Anthropic says the data is not used for training, only to catch novel jailbreaks and reduce false positives. If you hold a zero-retention agreement for regulatory reasons, it does not apply to Fable or Mythos. Factor it into your data review before routing anything sensitive through the top tier.

There is one more wrinkle a buyer should track. Access to Mythos-class models has become entangled with export policy. The same US authority that gates advanced chips has issued directives touching this tier, and access has been adjusted in response to export-control rules. The lesson is not the specific directive. It is that the most powerful models now sit close enough to national security that geopolitics, not just price, can decide what you are allowed to run.

This is now a pattern, not a one-off

On June 26, 2026, OpenAI launched its next frontier family, GPT-5.6 Sol, Terra, and Luna, and shipped it the same way: a limited preview to roughly 20 government-cleared partners, while US agencies run a security review of up to 30 days under a June executive order. Sol tops the agentic-coding benchmarks, edging Claude Mythos 5, with a 1.5M-token context window. The takeaway for a buyer is structural, not about one lab: the very top tier from both leaders now ships through a government access gate first. EU, UK, India, and APAC teams could not touch GPT-5.6 on normal tiers at launch. Plan procurement around the broadly available tier (Terra, Opus, Fable) and treat frontier access as a roadmap item, not a given.

What comes next

The map is
about to redraw.

If Mythos and Fable show where the frontier is heading, four forces will decide who reaches it, and from where. The next models will not just be smarter. They will be sovereign, specialized, and shaped as much by policy as by research.

Force 01

Tiers above the tier

Mythos opened a class above Opus. Expect every major lab to ship a "too powerful for default release" tier, gated by safeguards, trusted-access programs, and government partnership. The frontier becomes a velvet rope, and capability is metered by trust earned, not money paid.

Force 02

The price floor falls out

Open weights now deliver near-frontier quality at a fraction of the cost. A task that bills fifteen dollars on a flagship can cost cents on an open model. This does not just save money. It changes what is economically worth automating, which pulls AI into industries that could never justify the price before.
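
The gap can be checked against the list prices in the comparison table. The workload below (1M input tokens, 400K output tokens) is an assumption picked so the flagship run lands at fifteen dollars; the open-weight figure uses the hosted API price and ignores the infrastructure cost of self-hosting.

```python
# (input, output) in USD per 1M tokens, taken from the comparison table.
PRICES_PER_M = {
    "Claude Opus 4.8": (5.00, 25.00),
    "DeepSeek V4-Flash": (0.14, 0.28),
}


def task_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """List-price cost of one task, ignoring fan-out, caching, and surcharges."""
    price_in, price_out = PRICES_PER_M[model]
    return (input_tokens * price_in + output_tokens * price_out) / 1_000_000


# Assumed workload: 1M input tokens and 400K output tokens.
for model in PRICES_PER_M:
    print(f"{model}: ${task_cost(model, 1_000_000, 400_000):.2f}")
```

At list price the same assumed workload costs dollars on one model and cents on the other, which is the shift in what is worth automating that the paragraph describes.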

Force 03

Specialized over general

The next wave is vertical. Smaller models tuned for one domain, one language, one regulatory context, beating giant generalists on their home turf. Procurement stops asking "which model is best" and starts asking "which model is best at my Tuesday-morning job."

Force 04

Sovereignty as strategy

Nations and regions are building their own models so intelligence does not arrive only through someone else's API. Europe's answer is led by Mistral: Apache 2.0 open weights plus a Forge platform to train private models, backed by 13,800 GPUs near Paris. Alongside it, Cohere's merger with Germany's Aleph Alpha has produced a transatlantic sovereign stack. The contest is shifting from raw IQ to ecosystem control: who owns the infrastructure, who sets the default rails, whose chips you depend on. Geopolitics is now a model parameter.

The world map

Eight regions.
Every nation wants
its own model.

The frontier is no longer one zip code in California. In February 2026 over a hundred nations signed the Bangkok Declaration committing to AI sovereignty. The reasons repeat everywhere: a global model does not natively understand local dialects, legal frameworks, or culture; sending citizen data to a foreign API raises law and security questions; and intelligence is now infrastructure, like electricity, that nations do not want to rent forever. Here is the world a buyer actually chooses from.

North America

The frontier

Sets the ceiling. English-first.

Still the capability peak: the Mythos and Opus tiers, GPT-5.5, Gemini 3.1. The strategy is closed, API-first, and roadmap-driven. Canada plays a quieter "Switzerland of AI" role: talent-rich and neutral, home to Cohere's enterprise-RAG stack.

Anthropic Fable/Opus · OpenAI GPT-5.5 · Google Gemini 3.1 · Meta Llama 4 · Cohere Command (CA)

China

The open surge

Open weights, frugal cost.

The most crowded open-weight ecosystem on earth, and it reset the global price floor. DeepSeek crossed 80% on coding benchmarks with downloadable weights; GLM, Qwen, and Kimi trade the open-source lead month to month. Strong on Asian languages and cultural nuance. The catch for buyers: hosted APIs route through servers in China, so regulated work means self-hosting.

DeepSeek V4 · Alibaba Qwen 3.7 · Z.AI GLM-5.2 · Moonshot Kimi K2.7 · Tencent Hunyuan · Xiaomi MiMo

Europe

Regulated & sovereign

Open weights, EU data law.

The regulation-first bloc, shaped by the EU AI Act. France's Mistral is the clear champion: Apache 2.0 open weights, a Forge platform to train private models, and 13,800 GPUs going live near Paris. Cohere's merger with Germany's Aleph Alpha created a transatlantic sovereign stack; Switzerland proved a fully in-house national model is feasible.

Mistral Large 3 (FR) · Aleph Alpha (DE) · Cohere EU stack · Swiss national LLM

Middle East

Compute as oil

Arabic-first, capital-rich.

The Gulf is buying its way to the frontier, treating compute as the new oil. Abu Dhabi's Falcon scales to 180B with permissive licensing; Jais delivers strong bilingual Arabic-English with dialect switching; Saudi Arabia's ALLaM powers the national HUMAIN Chat assistant with deep Arabic cultural nuance.

Falcon (UAE) · Jais bilingual · ALLaM (Saudi)

South Asia

The frugal stack

22 languages, public rails.

India is not waiting for the frontier to be handed down. Sarvam open-sourced a 30B and a 105B model on government compute under the IndiaAI Mission; the flagship ships as the Indus chatbot, fluent in 22 Indian languages, and Sarvam is a founding member of NVIDIA's Nemotron coalition. The edge is not raw scale but frugal architecture plus Digital Public Infrastructure that reaches a billion people through rails that already exist.

Sarvam Indus 105B · Krutrim (Ola) · BharatGen Param 2 · CoRover BharatGPT

East Asia

Hardware & language

Korean, Japanese fluency.

Backed by a Korean government sovereign-AI fund, SK Telecom's A.X processes Korean a third more efficiently than Western models; Upstage's Solar Pro packs frontier performance into a compact 31B. Japan focuses on Japanese-language fluency through models like Fujitsu's Takane and lightweight on-prem options.

SK Telecom A.X (KR) · Upstage Solar Pro · Fujitsu Takane (JP)

Latin America

A public good

Spanish & Portuguese.

Launched February 2026 by Chile's CENIA with 60+ institutions across 15 countries, Latam-GPT is the region's first collaborative model, around 50B parameters trained on 8TB of Spanish, Portuguese, and regional data: Buenos Aires court rulings, Colombian textbooks, Peruvian library records. Built for $550K, it is a public good for citizen services and education, with indigenous languages planned.

Latam-GPT (Chile/CENIA) · 15-country coalition

Africa & SE Asia

Lightweight inclusion

Low-compute, local languages.

Inclusion-first and built for constraint. Africa's InkubaLM is a compact 0.4B model spanning Hausa, Swahili, isiXhosa, isiZulu, and Yoruba; Kenya's UlizaLlama delivers Swahili health services. Singapore's SEA-LION covers Southeast Asian languages. These prove a model does not need to be huge to matter where no giant ever bothered to look.

InkubaLM (Africa) · UlizaLlama Swahili · SEA-LION (SG)

Sources: provider disclosures, regional launches & the Bangkok Declaration, 2026

What this means for a buyer

For broad capability today, the North American frontier still leads and is available now. But for legal analysis in a regional language, government service automation, healthcare in vernacular tongues, or any workload that cannot sit on foreign cloud, a sovereign or regional model is no longer a compromise. The smartest 2026 architecture blends them: a global frontier model for the hardest reasoning, a regional model for language and residency, routed by the job.

The next great models will not all speak English first. Some are being built to speak to billions who were never spoken to before.

A buyer's horizon

What to watch, and when

NOW · 2026

Routing is the skill

The teams that win are not the ones with the single smartest model. They are the ones who route each job to the model that leads it, and cap spend with a gateway. Build that muscle first.

NEAR · 12 months

Tiers split further

Expect more "above-the-flagship" classes gated by trusted access, and more value models that erase the quality gap on routine work. The middle gets crowded; the top gets exclusive.

MID · 2027

Sovereign goes mainstream

National and regional models reach enterprise-grade for language and residency workloads. Procurement checklists start asking where the model was trained and whose hardware it ran on.

FAR · beyond

Policy is a parameter

Export controls, retention mandates, and security gating decide access to the very top as much as capability or price. The compliance lead becomes the most important seat in the room.

Do not buy the smartest model.
Buy the one you can route, afford, and defend, and keep the freedom to change your mind.

Global AI Forum · The Buyer's Atlas · Edition 01
