Deleting all recurrent links in a partial recurrent network leaves a simple feedforward network. The context units now act as input units, so the total network input consists of two components. The first component is the pattern vector, which was the only input to the partial recurrent network. The second component is a state vector, which is supplied by the next-state function in every step. In this way the behavior of a partial recurrent network can be simulated with a simple feedforward network that receives the state not implicitly through recurrent links, but as an explicit part of the input vector. In this sense, backpropagation algorithms can easily be modified for the training of partial recurrent networks in the following way:
In this manner, the following learning functions have been adapted for the training of partial recurrent networks such as Jordan and Elman networks:
The parameters for these learning functions are the same as for the regular feedforward versions of these algorithms (see section ) plus one special parameter.
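The feedforward view described at the start of this section, in which the state enters as an explicit part of the input vector, can be illustrated with a minimal sketch. It is not taken from the simulator itself: the function names, the tanh hidden activation, the linear output layer, and the Elman-style next-state function (copying the hidden activations) are all assumptions chosen for illustration.

```python
import math

def forward(pattern, state, w_hidden, w_out):
    """Plain feedforward pass; the state vector is simply extra input."""
    # Total network input = pattern vector followed by state vector.
    total_input = list(pattern) + list(state)
    # Hidden layer (tanh activation is an assumption of this sketch).
    hidden = [math.tanh(sum(w * x for w, x in zip(row, total_input)))
              for row in w_hidden]
    # Linear output layer (also an assumption).
    output = [sum(w * h for w, h in zip(row, hidden)) for row in w_out]
    return hidden, output

def next_state(hidden):
    """Hypothetical Elman-style next-state function: copy hidden activations."""
    return list(hidden)

def run_sequence(patterns, w_hidden, w_out, n_state):
    """Process a pattern sequence, supplying the state explicitly each step."""
    state = [0.0] * n_state          # context units start at zero (assumed)
    outputs = []
    for pattern in patterns:
        hidden, output = forward(pattern, state, w_hidden, w_out)
        outputs.append(output)
        state = next_state(hidden)   # state for the following step
    return outputs

# One input unit, two hidden units, two context units, one output unit.
w_hidden = [[1.0, 0.5, 0.0],
            [0.0, 0.0, 1.0]]
w_out = [[1.0, 1.0]]
outs = run_sequence([[1.0], [1.0]], w_hidden, w_out, n_state=2)
```

Although the same pattern is presented twice, the two outputs differ because the second step receives a non-zero state vector. Since each step is an ordinary feedforward pass, a standard backpropagation update could be applied to it unchanged.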
To train a network with one of these functions, a method called teacher forcing can be used. Teacher forcing means that during the training phase the output units propagate the teaching output instead of their produced output to successor units (if there are any). The additional parameter mentioned above is used to enable or disable teacher forcing. If its value is less than or equal to 0.0, only the teaching output is used; if it is greater than or equal to 1.0, the real output is propagated. Values between 0.0 and 1.0 yield a weighted sum of the teaching output and the real output.
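The effect of the teacher forcing parameter can be sketched as follows. The text only states that intermediate values yield a weighted sum, so the linear interpolation used here is an assumption, as are the function and argument names.

```python
def propagated_output(real_output, teaching_output, forcing):
    """Value that the output units pass on to their successor units.

    forcing <= 0.0 : only the teaching output is propagated
    forcing >= 1.0 : only the real (produced) output is propagated
    in between     : weighted sum (linear interpolation is assumed here)
    """
    lam = min(1.0, max(0.0, forcing))  # clamp the parameter to [0, 1]
    return [(1.0 - lam) * t + lam * r
            for r, t in zip(real_output, teaching_output)]

real = [0.2, 0.8]       # what the network actually produced
teaching = [0.0, 1.0]   # target values from the training pattern

full_forcing = propagated_output(real, teaching, 0.0)
no_forcing = propagated_output(real, teaching, 1.0)
half_forcing = propagated_output(real, teaching, 0.5)
```

With these example values, `full_forcing` equals the teaching output, `no_forcing` equals the real output, and `half_forcing` lies midway between the two.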