For two years, the entire AI infrastructure conversation has sounded like a broken record with a very expensive NVIDIA logo on it.
GPUs. More GPUs. Bigger clusters. Bigger power contracts. Bigger cooling bills. Bigger everything.
And then Meta signs a deal with AWS for tens of millions of Graviton CPU cores.
Not GPUs. CPUs.
Tiny sentence. Big shift.
Why this matters
AWS announced on April 24 that Meta will deploy Graviton processors at scale to support its next generation of AI. The deployment starts at tens of millions of cores, with room to expand.
The obvious reaction is: wait, CPUs? For AI?
Yes. Because training large models and running agentic workloads are not the same problem.
GPUs are still essential for training and heavy inference. Nobody serious is saying otherwise. But once you have models running inside products, the surrounding work becomes messy and orchestration-heavy: search, retrieval, tool calls, code execution, routing, state management, long-running workflows, real-time reasoning, and coordinating agents across steps.
That work loves general-purpose compute.
The model may be the star. The CPU is backstage moving furniture every thirty seconds so the show does not collapse.
Agentic AI changes the infrastructure shape
This is the part people keep missing.
A chatbot interaction is relatively simple. User sends prompt. Model responds. Maybe some retrieval. Maybe a tool call. Done.
An agentic workflow is different. The model plans, calls tools, waits, checks, branches, writes files, asks another service, retries, executes code, summarizes state, invokes another model, and keeps the loop alive.
That creates a massive amount of non-GPU work.
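To make that concrete, here is a minimal Python sketch of such a loop. Everything in it is illustrative: call_model, run_tool, and the JSON action format are assumptions for the sake of the example, not Meta's or AWS's actual stack. The point is how little of each iteration touches a GPU.

```python
import json

def call_model(messages: list[dict]) -> str:
    """The one accelerator-bound step: send context to a hosted model."""
    raise NotImplementedError("point this at your model endpoint")

def run_tool(name: str, args: dict) -> dict:
    """Search, retrieval, code execution: plain CPU, network, and storage."""
    raise NotImplementedError("point this at your tool services")

def agent_loop(task: str, max_steps: int = 20) -> str:
    messages = [{"role": "user", "content": task}]
    for _ in range(max_steps):
        # GPU time: one model call per step.
        reply = call_model(messages)
        messages.append({"role": "assistant", "content": reply})

        # Everything below is orchestration: parse, branch, dispatch,
        # retry, track state. None of it wants an accelerator.
        action = json.loads(reply)  # assume the model emits JSON actions
        if "answer" in action:
            return action["answer"]

        result = run_tool(action["tool"], action["args"])
        messages.append({"role": "tool", "content": json.dumps(result)})

    raise RuntimeError("agent did not finish within max_steps")
```

One GPU-bound line, wrapped in a pile of general-purpose work.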
AWS is saying Graviton5 is built for exactly those CPU-heavy workloads. Meta is saying, with money, that this matters at its scale.
When a company with billions of users starts buying CPU capacity for AI agents, that is not a footnote. That is a market signal.
The GPU story is not dead
No, this does not mean GPUs are over. Please. That would be the dumbest possible takeaway.
It means the AI stack is getting more layered.
Training wants accelerators. Frontier inference wants accelerators. But the product layer around AI increasingly wants orchestration compute, storage, networking, security, queues, databases, and CPU-heavy coordination.
In other words: AI infrastructure is starting to look less like one giant GPU bonfire and more like cloud computing again.
That is good news for AWS, which has spent years trying to prove its custom silicon is not just a side quest. It is also a reminder that Amazon does not need to beat NVIDIA at NVIDIA’s game to win. It can win by owning the infrastructure around the model.
The quiet lesson
Every time the industry gets obsessed with one bottleneck, the bottleneck moves.
First it was model quality. Then data. Then GPUs. Then power. Now orchestration.
Meta’s Graviton deal is not glamorous. Nobody is going to make a hype video about CPUs moving packets and coordinating tool calls.
But that is exactly why it matters.
The future does not run only on the chips that train intelligence. It also runs on the boring machines that let intelligence do work.
And boring machines, at scale, are where the money hides.
Sources: Amazon Press Center, TechCrunch