Tag: reasoning
All the articles with the tag "reasoning".
-
OpenAI's Geometry Proof Is the Research Shock
OpenAI says an internal general-purpose reasoning model disproved a central conjecture in discrete geometry. The important part is not the headline; it is the kind of work that survived expert scrutiny.
-
Every Frontier AI Model Just Scored Below 1% on a Reasoning Test. Humans Score 100%.
ARC-AGI-3 is the first interactive reasoning benchmark for AI agents. Gemini scored 0.37%. GPT-5.4 scored 0.26%. Claude scored 0.25%. Humans solve every single one. The gap is not closing.