AI struggles with advanced math, scoring a barely passing grade

Scientific American 10 Jun 2026 2 min read

AI Insight

AI systems were tested on a second set of "First Proof" problems designed to assess their capability in research-level mathematics. The top-performing model got six or seven of the ten problems basically right, which amounts to a C-minus grade. The benchmark is described as the hardest math test given to AI systems to date.

Why it matters

The results indicate that while AI is making progress in advanced mathematical reasoning, it still falls short of the level needed to independently conduct cutting-edge mathematical research. This assessment helps researchers understand current limitations and guides future development of AI systems intended to assist or collaborate with mathematicians in solving complex problems.

Confidence

6/10 · Peer-reviewed · Interdisciplinary

The second batch of “First Proof” problems is meant to evaluate AI’s usefulness for research-level math. The best model got six or seven of the 10 questions basically right.

Source: AI scores a ‘C–’ on its hardest math test yet
