Why does modifying the weights of a recurrent neural network not change its output when predicting on the same data?

I consider the following recurrent neural network (RNN):
h(t) = tanh(W*h(t-1) + U*x(t)); y(t) = V*h(t)
where x is the input (a vector of reals), h is the hidden state vector, and y is the output vector. I trained the network "net" in MATLAB on some data x and obtained the matrices W, V, and U. I assumed that MATLAB uses equations of the form:
h(t) = tansig(net.LW{1,1} * h(t-1) + net.IW{1,1} * x(t)); y(t) = net.LW{2,1} * h(t)
However, this does not seem to be the case: after changing the matrix W to W' while keeping U and V the same, the output y of the RNN that uses W is identical to the output y' of the RNN that uses W' when both predict on the same data x. Judging by the equations above, those two outputs should differ (and indeed, when I modify V or U, the output does change). How can I fix the code so that the outputs y and y' are different, as they should be?
There is a relevant post here, but it was inconclusive since the suggested equations did not match the matrices the OP was getting. That is why, after a few days of searching, I decided to ask a very similar question on this site.
The relevant code is shown below:
[x,t] = simplefit_dataset; % x: input data ; t: targets
net = newelm(x,t,5); % Recurrent neural net with 1 hidden layer (5 nodes) and 1 output layer (1 node)
net.layers{1}.transferFcn = 'tansig'; % 'tansig' is equivalent to tanh; activation function of the hidden layer
net.biasConnect = [0;0]; % remove the bias connections for easier experimenting
net.derivFcn = 'defaultderiv'; % 'defaultderiv': let MATLAB pick whatever derivative scheme works best for this net
view(net) % displays the network topology
net = train(net,x,t); % trains the network
W = net.LW{1,1}; U = net.IW{1,1}; V = net.LW{2,1}; % network matrices
Y = net(x); % Y: output when predicting on data x using W
net.LW{1,1} = rand(5,5); % This is the modified matrix W, W'
Y_prime = net(x) % Y_prime: output when predicting on data x using W'
max(abs(Y - Y_prime)) % The difference between the two outputs is 0 when it probably shouldn't be.
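For reference, here is a minimal hand-rolled simulation of the recurrence I assumed above, using the trained W, U, V saved before the modification. It is only a sketch: it assumes a zero initial hidden state, a purely linear output layer, and it ignores any input/output processing functions (such as 'mapminmax') that newelm may attach by default, so it may not match net(x) exactly.
h = zeros(5,1); % assumed initial hidden state h(0) = 0
Y_manual = zeros(1, size(x,2)); % outputs computed from the assumed equations
for k = 1:size(x,2)
    h = tansig(W*h + U*x(:,k)); % h(t) = tansig(W*h(t-1) + U*x(t))
    Y_manual(k) = V*h;          % y(t) = V*h(t)
end
max(abs(Y_manual - Y)) % how far the assumed equations are from the network's output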
There is a chance that MATLAB's "newelm" simply ignores W once training is complete for some reason, but that does not seem to be the case either: if the data {x,t} are fed to the net as cell arrays instead, then the matrix W does have an effect on the output, although I am not sure why:
[x,t] = simplefit_dataset; % x: input data ; t: targets
x = num2cell(x); t = num2cell(t); % convert x and t to type cell
net = newelm(x,t,5); % Recurrent neural net with 1 hidden layer (5 nodes) and 1 output layer (1 node)
net.layers{1}.transferFcn = 'tansig'; % 'tansig' is equivalent to tanh; activation function of the hidden layer
net.biasConnect = [0;0]; % remove the bias connections for easier experimenting
net.derivFcn = 'defaultderiv'; % 'defaultderiv': let MATLAB pick whatever derivative scheme works best for this net
view(net) % displays the network topology
net = train(net,x,t); % trains the network
W = net.LW{1,1}; U = net.IW{1,1}; V = net.LW{2,1}; % network matrices
Y = net(x); % Y: output when predicting on data x using W
net.LW{1,1} = rand(5,5); % This is the modified matrix W, W'
Y_prime = net(x) % Y_prime: output when predicting on data x using W'
max(abs(cell2mat(Y) - cell2mat(Y_prime))) % The difference between the two outputs is nonzero in this case, as it should be.
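Out of curiosity, the same hand simulation can be run against the cell-array output. Again, this is only a sketch under the same assumptions (zero initial hidden state, no biases, default input/output processing ignored); W_prime is just a name I introduce for the modified recurrent matrix.
xm = cell2mat(x); % back to a 1-by-N row vector of inputs
W_prime = net.LW{1,1}; % the modified recurrent matrix W'
h = zeros(5,1); % assumed initial hidden state h(0) = 0
Y_manual = zeros(1, numel(x)); % outputs computed from the assumed equations
for k = 1:numel(x)
    h = tansig(W_prime*h + U*xm(k)); % h(t) = tansig(W'*h(t-1) + U*x(t))
    Y_manual(k) = V*h;               % y(t) = V*h(t)
end
max(abs(Y_manual - cell2mat(Y_prime))) % compare with the net's sequential output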
Edit: minor corrections.

Answers (0)
