
Architectures and algorithms for stable and constructive learning in discrete time recurrent neural networks.

Date

1997

Authors

Sivakumar, Shyamala C.

Publisher

Dalhousie University

Description

This thesis deals with a discrete time recurrent neural network (DTRNN) with a block-diagonal feedback weight matrix, called the block-diagonal recurrent neural network (BDRNN), that allows a simplified approach to on-line training and addresses stability issues.

It is known that backpropagation-through-time (BPTT) is the algorithm of choice for training DTRNN owing to the exact and local nature of its gradient computation. However, the BPTT algorithm stores the network state variables at each time instant and hence requires large storage for long training sequences. Here, the block-diagonal structure of the BDRNN is exploited to modify the BPTT algorithm so that its storage requirements are reduced while the exactness and locality of the gradient computation are preserved. To achieve this, a numerically stable method for recomputing the state variables in the backward pass of the BPTT algorithm is proposed.

It is also known that the local or global stability of a DTRNN during training is guaranteed if a suitably defined norm of the updated weight matrix is less than a bound determined by the slope of the sigmoidal limiter. Determining this norm at each weight update requires eigenvalue computations and is computationally expensive. In linear systems, this is overcome by using special sparse structures that allow stability to be monitored directly during weight updates by examining appropriate matrix elements. In this thesis, that approach is extended by exploiting the sparse structure of the BDRNN to monitor and maintain stability. This is addressed, first, by developing a suitable stability function that provides a measure of the system eigenvalues with reference to the unit circle; next, a penalty term based on this stability function is incorporated into the cost function minimized during training. It is shown that the stability function can be suitably tailored and formulated as a constrained feedforward neural network.
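As an illustration only, not the thesis's actual formulation, the storage-saving idea above can be sketched as follows: with a block-diagonal feedback matrix (assumed here to have 2x2 blocks and a tanh limiter), the backward pass can recompute each earlier state from the next one block by block instead of storing the whole trajectory. All function and variable names are assumptions.

```python
import numpy as np

def bdrnn_forward(W, V, x, u):
    # One discrete-time state update x_{t+1} = tanh(W x_t + V u_t),
    # where W is block-diagonal (2x2 blocks along the diagonal).
    return np.tanh(W @ x + V @ u)

def bdrnn_backward_state(blocks, V, x_next, u, eps=1e-7):
    # Recompute x_t from x_{t+1} instead of storing the whole trajectory:
    # invert the tanh limiter (arctanh clipped for numerical safety), then
    # solve each independent 2x2 block -- cheap because W is block-diagonal.
    z = np.arctanh(np.clip(x_next, -1.0 + eps, 1.0 - eps)) - V @ u
    x = np.empty_like(z)
    for i, B in enumerate(blocks):
        x[2 * i:2 * i + 2] = np.linalg.solve(B, z[2 * i:2 * i + 2])
    return x
```

Round-tripping a state through the forward step and this backward recomputation recovers it closely while the limiter output stays away from ±1; the thesis's proposed method treats the numerical stability of this inversion with more care than this sketch does.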
This allows the stabilization to be addressed in a feedforward BDRNN framework that can be trained using conventional gradient descent techniques. Finally, a modular construction method is presented that is suitable for modelling dynamic trajectories which can be decomposed into several subdynamics. The performance of the feedforward BDRNN (FF-BDRNN) architecture, the new learning algorithm, the stabilization technique, and the construction method is demonstrated through several simulation examples.
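To illustrate why a block-diagonal structure makes stability cheap to monitor (a sketch under the assumption of 2x2 blocks, not the thesis's actual stability function): the eigenvalue magnitudes of a 2x2 block follow directly from its trace and determinant, so a penalty keyed to the unit circle can be computed from matrix elements alone, with no general eigensolver. The `margin` parameter is a hypothetical safety factor.

```python
import numpy as np

def block_radius(B):
    # Largest eigenvalue magnitude of a 2x2 block, read off from its
    # trace and determinant via the characteristic polynomial.
    tr = B[0, 0] + B[1, 1]
    det = B[0, 0] * B[1, 1] - B[0, 1] * B[1, 0]
    disc = tr * tr - 4.0 * det
    if disc >= 0.0:                      # real eigenvalue pair
        s = np.sqrt(disc)
        return max(abs((tr + s) / 2.0), abs((tr - s) / 2.0))
    return np.sqrt(det)                  # complex pair: |lambda| = sqrt(det)

def stability_penalty(blocks, margin=0.95):
    # Term added to the training cost; zero while every block's
    # eigenvalues stay inside the (slightly shrunken) unit circle.
    return sum(max(0.0, block_radius(B) - margin) ** 2 for B in blocks)
```

A gradient-based trainer could then minimize the task error plus a weighted copy of this penalty, nudging any drifting block back inside the unit circle during weight updates.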
Thesis (Ph.D.)--DalTech - Dalhousie University (Canada), 1997.

Keywords

Engineering, Electronics and Electrical; Artificial Intelligence
