AI Insight
This study investigates why large language models struggle with basic arithmetic despite their advanced capabilities. By analyzing residual stream geometry during multi-operand addition, the researchers identified a structure in the models' internal representations that they call the Iso-Raw-Sum Trajectory (IRST). In their account, arithmetic errors are "geometric slippages": internal neural noise pushes a continuous, latent carry signal across the quantization thresholds that separate discrete outputs. They also developed a geometric consistency check that detects and corrects these quantization failures during inference.
Why it matters
The findings offer a geometric framework for understanding, and potentially correcting, a basic weakness of large language models in numerical reasoning. The geometric consistency check is a practical tool for making arithmetic output more reliable, which matters for any application that depends on accurate numerical computation.
arXiv:2606.03645v1 Announce Type: cross
Abstract: Large Language Models exhibit paradoxical fragility in fundamental arithmetic, implying a disconnect between internal computation and discrete output. By analyzing the residual stream geometry during multi-operand addition, we identify the Iso-Raw-Sum Trajectory (IRST), a geometric structure where representations are anchored by semantic digits and modulated by continuous carry fibers. We propose the Noisy Quantization Model to explain this geometry, framing arithmetic errors as Geometric Slippages caused by internal neural noise pushing a continuous, latent Carry Potential across quantization thresholds. This geometric framework further elucidates Probe Versatility, explaining how lightweight probes can disentangle coexisting latent signals (such as ground truth versus hallucination) from a single activation vector. Finally, we validate these insights through a geometric consistency check method that effectively detects and corrects these quantization failures during inference. Our code is available at https://github.com/RL-MIND/Shape-of-Addition.
Source: The Shape of Addition: Geometric Structures of Arithmetic in Large Language Models
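The Noisy Quantization Model in the abstract can be illustrated with a toy simulation. This is only a sketch under stated assumptions, not the authors' method or code: it assumes the latent Carry Potential is a single real number, that the discrete carry is obtained by rounding to the nearest integer (so thresholds sit at half-integers), and that the internal noise is Gaussian. The names `quantize_carry` and `slippage_rate` are hypothetical and do not come from the paper or its repository.

```python
import math
import random


def quantize_carry(potential: float) -> int:
    """Map a continuous carry potential to a discrete carry.

    Assumption: quantization is nearest-integer rounding, so the
    thresholds sit at half-integers (0.5, 1.5, 2.5, ...).
    """
    return math.floor(potential + 0.5)


def slippage_rate(potential: float, noise_std: float,
                  trials: int = 10_000, seed: int = 0) -> float:
    """Estimate how often Gaussian noise flips the quantized carry.

    A "slippage" here means the noisy potential rounds to a different
    integer than the noise-free potential does.
    """
    rng = random.Random(seed)  # fixed seed keeps the estimate reproducible
    clean = quantize_carry(potential)
    slips = 0
    for _ in range(trials):
        noisy = potential + rng.gauss(0.0, noise_std)
        if quantize_carry(noisy) != clean:
            slips += 1
    return slips / trials


if __name__ == "__main__":
    # Same noise level, two potentials: one far from a threshold,
    # one just below the threshold at 1.5.
    for potential in (1.0, 1.45):
        rate = slippage_rate(potential, noise_std=0.1)
        print(f"potential={potential:.2f}  slippage rate={rate:.3f}")
```

Under these assumptions, a potential that sits just below a threshold slips far more often than one centered on an integer, even though the noise level is identical. That is the qualitative behavior the abstract attributes to Geometric Slippages; the actual geometry and noise statistics in a real model may differ.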