net.trainFcn = 'trainrp' sets the network trainFcn property.
trainrp is a network training function that updates weight and bias
values according to the resilient backpropagation algorithm (Rprop).
Training occurs according to
trainrp training parameters, shown here
with their default values:
net.trainParam.epochs— Maximum number of epochs to train. The default value is
net.trainParam.show— Epochs between displays (NaN for no displays). The default value is
net.trainParam.showCommandLine— Generate command-line output. The default value is
net.trainParam.showWindow— Show training GUI. The default value is
net.trainParam.goal— Performance goal. The default value is
net.trainParam.time— Maximum time to train in seconds. The default value is
net.trainParam.min_grad— Minimum performance gradient. The default value is
net.trainParam.max_fail— Maximum validation failures. The default value is
net.trainParam.lr— Learning rate. The default value is
net.trainParam.delt_inc— Increment to weight change. The default value is
net.trainParam.delt_dec— Decrement to weight change. The default value is
net.trainParam.delta0— Initial weight change. The default value is
net.trainParam.deltamax— Maximum weight change. The default value is
Solve Problems with Network Trained with trainrp
This example shows how to train a feed-forward network with the trainrp training function to solve a problem with inputs p and targets t.
Create the inputs p and the targets t that you want to solve with a network.
p = [0 1 2 3 4 5]; t = [0 0 0 1 1 1];
Create a two-layer feed-forward network with two hidden neurons and this training function.
net = feedforwardnet(2,'trainrp');
Train and test the network.
net.trainParam.epochs = 50;
net.trainParam.show = 10;
net.trainParam.goal = 0.1;
net = train(net,p,t);
a = net(p)
For more examples, see
help feedforwardnet and
trainedNet — Trained network
Trained network, returned as a
tr — Training record
Training record (perf), returned as a structure whose fields depend on the network training function (net.NET.trainFcn). It can include fields such as:
Training, data division, and performance functions and parameters
Data division indices for training, validation and test sets
Data division masks for training, validation and test sets
Number of epochs (
num_epochs) and the best epoch (
A list of training state names (
Fields for each state name recording its value throughout training
Performances of the best network (
You can create a standard network that uses
To prepare a custom network to be trained with
'trainrp'. This sets
trainrp’s default parameters.
Set net.trainParam properties to desired values.
In either case, calling
train with the resulting network trains the
Multilayer networks typically use sigmoid transfer functions in the hidden layers. These functions are often called “squashing” functions, because they compress an infinite input range into a finite output range. Sigmoid functions are characterized by the fact that their slopes must approach zero as the input gets large. This causes a problem when you use steepest descent to train a multilayer network with sigmoid functions, because the gradient can have a very small magnitude and, therefore, cause small changes in the weights and biases, even though the weights and biases are far from their optimal values.
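The saturation effect described above can be seen numerically. The following is a minimal sketch in base MATLAB, assuming the logistic sigmoid as the squashing function (written out explicitly so that no toolbox function is needed). The sample inputs in n are illustrative only.

```matlab
% Logistic sigmoid and its slope at a few sample net inputs.
n = [0 2 5 10];                 % illustrative net input values
a = 1 ./ (1 + exp(-n));         % sigmoid output, bounded in (0, 1)
slope = a .* (1 - a);           % derivative of the logistic sigmoid

% Rows: input, output, slope. The slope is 0.25 at n = 0 and
% shrinks to about 4.5e-5 at n = 10.
disp([n; a; slope]);
```

With slopes this small, a steepest descent step proportional to the gradient barely moves the weights, which is the problem Rprop is designed to avoid.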
The purpose of the resilient backpropagation (Rprop) training algorithm is to eliminate
these harmful effects of the magnitudes of the partial derivatives. Only the sign of the
derivative can determine the direction of the weight update; the magnitude of the derivative
has no effect on the weight update. The size of the weight change is determined by a
separate update value. The update value for each weight and bias is increased by a factor
delt_inc whenever the derivative of the performance function with
respect to that weight has the same sign for two successive iterations. The update value is
decreased by a factor
delt_dec whenever the derivative with respect to
that weight changes sign from the previous iteration. If the derivative is zero, the update
value remains the same. Whenever the weights are oscillating, the weight change is reduced.
If the weight continues to change in the same direction for several iterations, the
magnitude of the weight change increases. A complete description of the Rprop algorithm is
given in [RiBr93].
The following code recreates the previous network and trains it using the Rprop algorithm. The training parameters for trainrp are epochs, show, goal, time, min_grad, max_fail, delt_inc, delt_dec, delta0, and deltamax. The first eight parameters have been previously discussed. The last two are the initial step size and the maximum step size, respectively. The performance of Rprop is not very sensitive to the settings of the training parameters. For the example below, the training parameters are left at the default values:
p = [-1 -1 2 2;0 5 0 5];
t = [-1 -1 1 1];
net = feedforwardnet(3,'trainrp');
net = train(net,p,t);
y = net(p)
Rprop is generally much faster than the standard steepest descent
algorithm. It also requires only a modest increase in memory:
you need to store the update values for each weight and bias, which is
equivalent to storage of the gradient.
trainrp can train any network as long as its weight, net input, and
transfer functions have derivative functions.
Backpropagation is used to calculate derivatives of performance
with respect to the weight and bias variables
X. Each variable is adjusted
according to the following:
dX = deltaX.*sign(gX);
where the elements of deltaX are all initialized to delta0, and gX is the gradient. At each iteration the
elements of deltaX are modified. If an element of gX
changes sign from one iteration to the next, then the corresponding element of
deltaX is decreased by delt_dec. If an element of
gX maintains the same sign from one iteration to the next, then the
corresponding element of deltaX is increased by delt_inc.
See Riedmiller, M., and H. Braun, “A direct adaptive
method for faster backpropagation learning: The RPROP algorithm,” Proceedings
of the IEEE International Conference on Neural Networks, 1993, pp. 586–591.
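The update rule above can be illustrated on a toy problem. The following is a minimal sketch in base MATLAB, not the toolbox implementation. It assumes a simple quadratic objective sum((X - c).^2) in place of a network performance function, takes gX to be the plain gradient (so the step is subtracted), and uses illustrative parameter values rather than the trainrp defaults. The variable c and the iteration count are hypothetical.

```matlab
% Illustrative settings (hypothetical values, not the toolbox defaults).
delt_inc = 1.2;  delt_dec = 0.5;  delta0 = 0.1;  deltamax = 5;

c = [3; -2];                         % minimizer of the toy objective
X = [0; 0];                          % starting point
deltaX = delta0 * ones(size(X));     % one update value per variable
gXprev = zeros(size(X));             % no previous gradient at the start

for k = 1:100
    gX = 2 * (X - c);                % gradient of sum((X - c).^2)
    same = gX .* gXprev > 0;         % sign kept since last iteration
    flip = gX .* gXprev < 0;         % sign changed since last iteration
    deltaX(same) = min(deltaX(same) * delt_inc, deltamax);
    deltaX(flip) = deltaX(flip) * delt_dec;
    X = X - deltaX .* sign(gX);      % only the sign of gX is used
    gXprev = gX;
end

disp(X');                            % ends close to c'
```

Only sign(gX) enters the step, so rescaling the objective by any positive constant leaves the trajectory unchanged. That is the sense in which the magnitude of the derivative has no effect on the weight update.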
Training stops when any of these conditions occurs:
The maximum number of epochs (repetitions) is reached.
The maximum amount of time is exceeded.
Performance is minimized to the goal.
The performance gradient falls below min_grad.
Validation performance (validation error) has increased more than
max_fail times since the last time it decreased (when using validation).
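The stopping conditions listed above can be sketched as a set of checks. This is a hedged illustration in base MATLAB, not the toolbox's internal logic. The state structure and its field names (epoch, elapsed, perf, grad, val_fail) are hypothetical. The epochs and goal values repeat the earlier example, and the remaining parameter values are illustrative rather than defaults.

```matlab
% Training parameters (names as in net.trainParam; values illustrative).
param = struct('epochs', 50, 'time', Inf, 'goal', 0.1, ...
               'min_grad', 1e-5, 'max_fail', 6);

% Hypothetical snapshot of the training state after some epoch.
state = struct('epoch', 12, 'elapsed', 3.2, 'perf', 0.08, ...
               'grad', 2e-3, 'val_fail', 1);

reasons = {};
if state.epoch >= param.epochs,     reasons{end+1} = 'maximum epochs reached'; end
if state.elapsed >= param.time,     reasons{end+1} = 'maximum time exceeded'; end
if state.perf <= param.goal,        reasons{end+1} = 'performance goal met'; end
if state.grad < param.min_grad,     reasons{end+1} = 'gradient below min_grad'; end
% Follows the wording "more than max_fail times"; the exact boundary
% used by the toolbox is an assumption here.
if state.val_fail > param.max_fail, reasons{end+1} = 'validation failures'; end

stop = ~isempty(reasons);            % training stops if any condition holds
```

For this snapshot only the performance goal check fires, since perf of 0.08 is below the goal of 0.1.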
[RiBr93] Riedmiller, M., and H. Braun, “A direct adaptive method for faster backpropagation learning: The RPROP algorithm,” Proceedings of the IEEE International Conference on Neural Networks, 1993, pp. 586–591.