Discovering multiscale dynamical features with hierarchical Echo State Networks
Many time series of practical relevance have multi-scale characteristics. Prime examples include speech, text, writing, and gestures. If one wishes to learn models of such systems, the models must be capable of representing dynamical features on different temporal and/or spatial scales. One natural approach to this end is to use hierarchical models, where higher processing layers are responsible for longer-range (slower, coarser) dynamical features of the input signal. This report introduces a hierarchical architecture in which the core ingredient of each layer is an echo state network. In a bottom-up flow of information, increasingly coarse features are extracted from the input signal as it passes up through the architecture. In a top-down flow of information, feature expectations are passed down. The architecture as a whole is trained on a one-step input prediction task by stochastic gradient descent on the prediction error. The report presents a formal specification of these hierarchical systems and illustrates important aspects of their functioning in a case study with synthetic data.
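For orientation, the sketch below shows a single leaky-integrator echo state network layer in NumPy, the generic building block the architecture stacks into a hierarchy. This is a minimal illustration of a standard ESN, not the report's exact formulation; the class name `ESNLayer`, the parameter values, and the suggestion that the readout is the part adapted by gradient descent are illustrative assumptions.

```python
import numpy as np

class ESNLayer:
    """Minimal leaky-integrator echo state network layer (standard ESN sketch,
    not the report's exact formulation): a fixed random reservoir driven by the
    input, with a trainable linear readout."""

    def __init__(self, n_in, n_res, n_out, spectral_radius=0.9, leak=0.3, seed=0):
        rng = np.random.default_rng(seed)
        self.W_in = rng.uniform(-0.5, 0.5, (n_res, n_in))
        W = rng.uniform(-0.5, 0.5, (n_res, n_res))
        # Rescale recurrent weights to the desired spectral radius
        # (common heuristic for obtaining the echo state property).
        W *= spectral_radius / np.max(np.abs(np.linalg.eigvals(W)))
        self.W = W
        # Readout weights: the trainable part, e.g. adapted by gradient descent
        # on a one-step prediction error (assumption for illustration).
        self.W_out = np.zeros((n_out, n_res))
        self.leak = leak
        self.x = np.zeros(n_res)

    def step(self, u):
        """Advance the reservoir state with input u and return the readout."""
        pre = self.W @ self.x + self.W_in @ u
        self.x = (1 - self.leak) * self.x + self.leak * np.tanh(pre)
        return self.W_out @ self.x
```

In a hierarchical setup of the kind the report describes, several such layers would be stacked, with higher layers receiving progressively coarser features from below and passing expectations back down; the details of that coupling are specified in the report itself.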