AI Insight
This study develops a mathematical theory of how linear recurrent neural networks (RNNs) learn to maintain information over long timescales through gradient-based training. The authors show that when an RNN with small initial recurrent weights is trained to integrate white noise, learning is described by a low-dimensional system that tracks a single outlier eigenvalue of the recurrent weight matrix, which makes explicit how the long timescale needed for integration is acquired. The framework is extended to networks learning a damped oscillatory filter, where low-dimensional effective equations describe the evolution of a conjugate pair of outlier eigenvalues during training.
Why it matters
This work gives a theoretical account of how neural networks, both artificial and biological, can acquire memory over long timescales through learning. The mathematical framework could inform the design of more efficient machine learning architectures and help neuroscientists understand how biological brains develop circuits for temporal integration and working memory.
arXiv:2503.18754v2
Abstract: Learning recurrent connectivity that supports memory over long intrinsic timescales is a basic problem in the theory of dynamical computation. While continuous attractor and integrator models describe how tuned recurrent circuits can maintain information, less is known about how such slow modes are acquired by gradient-based learning. Here we study this question in an analytically tractable setting: we build a mathematical theory of the learning dynamics of linear RNNs trained to integrate white noise. We show that when the initial recurrent weights are small, the dynamics of learning are described by a low-dimensional system that tracks a single outlier eigenvalue of the recurrent weights. This reveals the precise manner in which the long timescale associated with white noise integration is learned. We extend our analyses to RNNs learning a damped oscillatory filter, and find low-dimensional effective dynamical equations for the evolution of a conjugate pair of outlier eigenvalues. Taken together, our analyses build a rich mathematical framework for studying dynamical learning problems relevant to both machine learning and neuroscience.
Source: Dynamics of learning to integrate in linear recurrent neural networks
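As a toy illustration of the single-eigenvalue picture in the abstract, the sketch below reduces the network to one unit, h_t = w * h_{t-1} + x_t, so that the recurrent weight w is itself the only eigenvalue. This is not the paper's N-dimensional analysis or its derived effective equations; the scalar reduction, the discrete-time form, the finite horizon, the learning rate, and all function names are assumptions made for illustration. For unit-variance white noise x_t and an integrator target z_t = z_{t-1} + x_t, the expected squared error after T steps is the sum over k = 1..T-1 of (w^k - 1)^2, and plain gradient descent on that expression from a small initial w should move w toward 1.

```python
import math


def expected_loss_grad(w, horizon):
    """Gradient in w of L(w) = sum_{k=1}^{horizon-1} (w**k - 1)**2.

    L(w) is the expected squared error between the scalar RNN state and a
    perfect running sum after `horizon` steps of unit-variance white noise
    (an illustrative assumption, not the paper's loss).
    """
    return sum(2.0 * (w**k - 1.0) * k * w ** (k - 1) for k in range(1, horizon))


def train_scalar_integrator(w0=0.1, horizon=20, lr=1e-4, steps=10000):
    """Run plain gradient descent on w and return every visited value."""
    w = w0
    trajectory = [w]
    for _ in range(steps):
        w -= lr * expected_loss_grad(w, horizon)
        trajectory.append(w)
    return trajectory


def memory_timescale(w):
    """Decay time (in steps) of the impulse response w**k; infinite at w >= 1."""
    if w >= 1.0:
        return math.inf
    return -1.0 / math.log(w)


if __name__ == "__main__":
    trajectory = train_scalar_integrator()
    for step in (0, 500, 1000, 2000, 5000, 10000):
        w = trajectory[step]
        print(f"step {step:5d}  w = {w:.6f}  timescale = {memory_timescale(w):.2f}")
```

In this toy the impulse response w^k decays over roughly -1/ln(w) steps, so the memory timescale grows without bound as w approaches 1. The learning rate is chosen small relative to the curvature of the loss near w = 1 for this horizon; a longer horizon would need a smaller step.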