Computer Science > Machine Learning
Title: StableSSM: Alleviating the Curse of Memory in State-space Models through Stable Reparameterization
(Submitted on 24 Nov 2023 (v1), last revised 16 May 2024 (this version, v3))
Abstract: In this paper, we investigate the long-term memory learning capabilities of state-space models (SSMs) from the perspective of parameterization. We prove that state-space models without any reparameterization exhibit a memory limitation similar to that of traditional RNNs: the target relationships that can be stably approximated by state-space models must have an exponentially decaying memory. Our analysis identifies this "curse of memory" as a result of the recurrent weights converging to a stability boundary, suggesting that a reparameterization technique can be effective. To this end, we introduce a class of reparameterization techniques for SSMs that effectively lift their memory limitations. Besides improving approximation capabilities, we further illustrate that a principled choice of reparameterization scheme can also enhance optimization stability. We validate our findings using synthetic datasets, language models, and image classification.
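The stability-boundary argument in the abstract can be illustrated with a small numerical sketch. It is not taken from the paper: it assumes a scalar continuous-time linear SSM h'(t) = lam * h(t) + x(t), uses the exponential map lam = -exp(w) as one example of a stable reparameterization, and picks a hypothetical perturbation size. The function names are illustrative only.

```python
import numpy as np

def direct(w):
    # No reparameterization: the trainable weight is the eigenvalue itself.
    return w

def exp_reparam(w):
    # Illustrative stable map (an assumption, not necessarily the paper's choice):
    # the eigenvalue -exp(w) is negative for every real w.
    return -np.exp(w)

def memory_kernel(lam, t):
    # Impulse response of the scalar system h'(t) = lam * h(t) + x(t).
    return np.exp(lam * t)

t = np.linspace(0.0, 50.0, 501)
perturbation = 0.02  # hypothetical size of one noisy weight update

# Both parameterizations start from the same slow-decay eigenvalue -0.01,
# which sits close to the stability boundary lam = 0.
w_direct = -0.01
w_exp = np.log(0.01)

# Apply the same additive perturbation to the trainable weight.
lam_direct = direct(w_direct + perturbation)   # crosses the boundary
lam_exp = exp_reparam(w_exp + perturbation)    # stays strictly negative

kernel_direct = memory_kernel(lam_direct, t)   # grows with t
kernel_exp = memory_kernel(lam_exp, t)         # still decays with t

print("direct eigenvalue:", lam_direct)
print("reparameterized eigenvalue:", lam_exp)
```

Under these assumptions, the same weight perturbation pushes the directly parameterized eigenvalue across the boundary, while the reparameterized one remains stable and only slightly changes its decay rate.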
Submission history
From: Shida Wang
[v1] Fri, 24 Nov 2023 14:08:31 GMT (1954kb,D)
[v2] Thu, 2 May 2024 15:02:10 GMT (1641kb,D)
[v3] Thu, 16 May 2024 22:23:13 GMT (1963kb,D)