Carlos KiK

DeepSeek V4 Is the Boring Kind of Terrifying: Cheap, Huge, and Almost There

DeepSeek came back with exactly the kind of release that makes executives pretend they are calm.

Not because V4 is obviously better than every frontier model. It is not. That is almost the point. The scary version of DeepSeek is not “China instantly leapfrogged everyone”. The scary version is much more boring.

It is: cheap, huge, open-weight, text-only, good enough, and maybe three to six months behind the best closed models.

That is the kind of thing that breaks pricing tables.

What shipped

DeepSeek previewed two V4 models on April 24: V4 Flash and V4 Pro.

Both are mixture-of-experts models with 1 million token context windows. V4 Pro has 1.6 trillion total parameters, with 49 billion active per token (roughly 3 percent of the total). V4 Flash is smaller at 284 billion total parameters, with 13 billion active (about 5 percent).

That matters because mixture-of-experts is the “do not activate the whole model for every token” architecture. A router sends each token to a small subset of expert sub-networks, so only a fraction of the weights do work on any given token. Lower cost. Better inference economics. Less waste.
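To make the routing idea concrete, here is a minimal sketch of a top-k MoE layer in PyTorch. The expert count, hidden size, and k=2 are toy values for illustration, not DeepSeek’s actual configuration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TinyMoE(nn.Module):
    """Toy top-k mixture-of-experts layer (illustrative only)."""

    def __init__(self, d_model=64, n_experts=8, k=2):
        super().__init__()
        self.k = k
        # Each expert is a small feed-forward block.
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, 4 * d_model), nn.GELU(),
                          nn.Linear(4 * d_model, d_model))
            for _ in range(n_experts)
        )
        # The router scores every expert for every token.
        self.router = nn.Linear(d_model, n_experts)

    def forward(self, x):  # x: (tokens, d_model)
        scores = self.router(x)                     # (tokens, n_experts)
        weights, idx = scores.topk(self.k, dim=-1)  # keep only the top k experts
        weights = F.softmax(weights, dim=-1)
        out = torch.zeros_like(x)
        # Only the selected experts run -- this is where "active parameters" come from.
        for slot in range(self.k):
            for e, expert in enumerate(self.experts):
                mask = idx[:, slot] == e
                if mask.any():
                    out[mask] += weights[mask, slot, None] * expert(x[mask])
        return out

x = torch.randn(10, 64)
print(TinyMoE()(x).shape)  # torch.Size([10, 64]); each token touched 2 of 8 experts
```

With 8 experts and k=2, only a quarter of the expert weights run per token. Scale the same trick up and you get V4 Pro’s roughly 3 percent active fraction.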

The V4 Pro number is absurd on paper. Biggest open-weight model available, according to TechCrunch. Bigger than Kimi, bigger than MiniMax, more than double the size of DeepSeek V3.2.

And yet the model is not trying to win every benchmark. DeepSeek says it has nearly closed the gap with current leading models on reasoning benchmarks, while still trailing GPT-5.4 and Gemini 3.1 Pro on knowledge tests.

Translation: not the best brain in the room. Maybe the best deal in the room.

The price is the weapon

The technical story is interesting. The pricing story is the knife.

DeepSeek’s V4 Flash pricing undercuts the small frontier models. V4 Pro undercuts larger frontier models like GPT-5.5, Claude Opus 4.7, and Gemini 3.1 Pro. The exact comparison depends on input/output mix, but the direction is clear: DeepSeek is attacking the margin structure of closed AI.
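The input/output caveat is easy to make concrete. Here is a blended-cost sketch; the per-million-token prices are placeholders made up for illustration, not real published rates for any of these models.

```python
def blended_cost_per_mtok(price_in: float, price_out: float, output_share: float) -> float:
    """Effective price per million tokens, given what fraction of traffic is output."""
    return price_in * (1 - output_share) + price_out * output_share

# Placeholder prices in USD per million tokens -- NOT anyone's real rate card.
open_model   = blended_cost_per_mtok(price_in=0.3, price_out=1.2, output_share=0.25)
closed_model = blended_cost_per_mtok(price_in=3.0, price_out=15.0, output_share=0.25)

print(f"open-weight model: ${open_model:.2f} per million tokens")
print(f"closed model:      ${closed_model:.2f} per million tokens")
# The ratio is the point: shift output_share and the gap moves, but it does not close.
```

Run that with your own workload’s mix and “the exact comparison depends” turns into a number on an invoice.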

This is what made R1 so disruptive in the first place. Not just “look, another good model”. More like: “why exactly are we paying that much?”

That question is radioactive.

Because once a cheaper model is good enough for a real workload, procurement people do not care about your launch keynote. They care about the invoice. Developers care about throughput. Startups care about whether they can survive inference bills without selling a kidney to a cloud provider.

Open weights change the psychology

Closed models ask you to trust the vendor.

Open-weight models let you inspect, host, fine-tune, benchmark, compress, and build around them. Not always easily. Not always cheaply. But the relationship is different.
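Concretely, “build around them” can be as simple as pulling the checkpoint and running it yourself. A minimal sketch with Hugging Face transformers; the repo id below is a placeholder guess, and whether the full model fits on your hardware is a separate (large) problem.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Placeholder repo id -- check the actual release for the real one.
MODEL_ID = "deepseek-ai/DeepSeek-V4"

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID,
    torch_dtype="auto",   # keep the checkpoint's native precision
    device_map="auto",    # shard across whatever GPUs are available
)

prompt = "Summarize the key obligations in this contract clause: ..."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

No vendor API key, no usage dashboard, no rate limit you did not choose. That is the psychological shift.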

DeepSeek V4 is text-only, so this is not a magical everything model. No audio. No image generation. No video. No shiny multimodal circus.

Good.

Sometimes boring is exactly what developers need. A massive text model with a long context window and sane economics is not sexy, but it is useful. Codebases. Legal documents. Research corpora. Enterprise archives. The kind of workloads where “good enough and cheap enough” beats “amazing but financially ridiculous”.

The uncomfortable part

The industry keeps pretending the model race is a simple ladder: better model on top, weaker model below.

It is not.

It is a market with segments. Frontier models for the hardest work. Cheap open models for volume. Small models for local tasks. Specialized models for narrow workflows. The future is not one model winning everything. That framing is too simple.

DeepSeek V4 is another reminder that the bottom keeps rising.

And when the bottom rises, the top has to justify its price every single day.

Sources: TechCrunch, VentureBeat

