How Neural Networks Learn to Add Up Information Over Time

arXiv 9 Jun 2026 2 min read

AI Insight

This study develops a mathematical theory explaining how linear recurrent neural networks (RNNs) learn to maintain information over long timescales through gradient-based training. The researchers demonstrate that when RNNs are trained to integrate white noise signals, the learning process can be described by tracking a single outlier eigenvalue in the network's recurrent weights, revealing the precise mechanism by which networks acquire the ability to remember information over extended periods. The framework is extended to networks learning damped oscillatory filters, showing how pairs of eigenvalues evolve during training.

Why it matters

This work provides fundamental theoretical insights into how neural networks—both artificial and biological—learn to develop memory capabilities. The mathematical framework could inform the design of more efficient machine learning architectures and help neuroscientists understand how biological brains develop circuits for temporal integration and working memory.

Confidence

6/10Peer-reviewedBiology

Understand the Science

Gradient descent Concept coming soon Recurrent neural network Concept coming soon White noise Concept coming soon

arXiv:2503.18754v2 Announce Type: replace
Abstract: Learning recurrent connectivity that supports memory over long intrinsic timescales is a basic problem in the theory of dynamical computation. While continuous attractor and integrator models describe how tuned recurrent circuits can maintain information, less is known about how such slow modes are acquired by gradient-based learning. Here we study this question in an analytically tractable setting: we build a mathematical theory of the learning dynamics of linear RNNs trained to integrate white noise. We show that when the initial recurrent weights are small, the dynamics of learning are described by a low-dimensional system that tracks a single outlier eigenvalue of the recurrent weights. This reveals the precise manner in which the long timescale associated with white noise integration is learned. We extend our analyses to RNNs learning a damped oscillatory filter, and find low-dimensional effective dynamical equations for the evolution of a conjugate pair of outlier eigenvalues. Taken together, our analyses build a rich mathematical framework for studying dynamical learning problems relevant to both machine learning and neuroscience.

Source: Dynamics of learning to integrate in linear recurrent neural networks

Continue Exploring

Key Concepts

Gradient descent Recurrent neural network White noise

Latest Research

New technique pinpoints human DNA inherited from 'ghost' ancestors 30 Jul 2026 European colonization carried deadly smallpox to the Americas, ancient genomes show 30 Jul 2026 Ancient mammals crossed between the Americas before land bridge formed 30 Jul 2026 Plants adapt to climate by silencing sugar transport genes with RNA 30 Jul 2026

Source
arXiv