TY - UNPB
T1 - Weight-Space Linear Recurrent Neural Networks
AU - Nzoyem, Roussel Desmond
AU - Keshtmand, Nawid
AU - Tsayem, Idriss
AU - Barton, David A W
AU - Deakin, Tom
PY - 2025/6/1
Y1 - 2025/6/1
N2 - We introduce WARP (Weight-space Adaptive Recurrent Prediction), a simple yet powerful framework that unifies weight-space learning with linear recurrence to redefine sequence modeling. Unlike conventional recurrent neural networks (RNNs) which collapse temporal dynamics into fixed-dimensional hidden states, WARP explicitly parametrizes the hidden state as the weights of a distinct root neural network. This formulation promotes higher-resolution memory, gradient-free adaptation at test-time, and seamless integration of domain-specific physical priors. Empirical validation shows that WARP matches or surpasses state-of-the-art baselines on diverse classification tasks, spanning synthetic benchmarks to real-world datasets. Furthermore, extensive experiments across sequential image completion, dynamical system reconstruction, and multivariate time series forecasting demonstrate its expressiveness and generalization capabilities. Critically, WARP's weight trajectories offer valuable insights into the model's inner workings. Ablation studies confirm the architectural necessity of key components, solidifying weight-space linear RNNs as a transformative paradigm for adaptive machine intelligence.
AB - We introduce WARP (Weight-space Adaptive Recurrent Prediction), a simple yet powerful framework that unifies weight-space learning with linear recurrence to redefine sequence modeling. Unlike conventional recurrent neural networks (RNNs) which collapse temporal dynamics into fixed-dimensional hidden states, WARP explicitly parametrizes the hidden state as the weights of a distinct root neural network. This formulation promotes higher-resolution memory, gradient-free adaptation at test-time, and seamless integration of domain-specific physical priors. Empirical validation shows that WARP matches or surpasses state-of-the-art baselines on diverse classification tasks, spanning synthetic benchmarks to real-world datasets. Furthermore, extensive experiments across sequential image completion, dynamical system reconstruction, and multivariate time series forecasting demonstrate its expressiveness and generalization capabilities. Critically, WARP's weight trajectories offer valuable insights into the model's inner workings. Ablation studies confirm the architectural necessity of key components, solidifying weight-space linear RNNs as a transformative paradigm for adaptive machine intelligence.
U2 - 10.48550/arXiv.2506.01153
DO - 10.48550/arXiv.2506.01153
M3 - Preprint
BT - Weight-Space Linear Recurrent Neural Networks
PB - arXiv.org
ER -