How to change a subset of ANN weights while keeping the other weights unchanged?
Hello folks,
I am using the Neural Network Toolbox 2012a in my project. I have created a feedforward net with 2 layers (inputs are not counted as a layer, following the convention in the user's guide), and I want to update some of the input weights (IW{1,1}) while keeping the other input weights in IW{1,1} and the first-to-second-layer weights (LW{2,1}) fixed. In short, I want to change a subset of IW{1,1} while all the other weights remain fixed. Let me refer to this as my optimal goal.
If the optimal goal is impossible, a sub-optimal goal is also acceptable: update the entire IW{1,1} and keep the whole LW{2,1} fixed.
I have already figured out how to achieve the sub-optimal goal. My solution is to use the command 'adapt' and set the learning rate to 0 for LW{2,1}. But I do not like this solution, since 'adapt' is an over-simplified function that lacks the parameters and features (e.g. min_grad, plotperform, etc.) of other training functions/algorithms (e.g. trainlm, traingd, etc.). It is therefore harder to control the training process and check the results.
So, first, I want to know if there is a way to achieve the optimal goal rather than the sub-optimal.
Second, if the optimal goal is not possible (short of writing everything from scratch), I wonder if I can achieve the sub-optimal goal by taking advantage of some training function instead of using 'adapt'. I have already looked through 'trainlm' and 'traingd', but I do not think they help with either of my goals.
I will really appreciate it if anyone can help me with this issue.
Jason Lee
Accepted Answer
More Answers (7)
Jai
on 7 Jul 2016
3 votes
You can set net.biases{i}.learn = 0 and net.inputWeights{i,j}.learn = 0 to fix some of the weights.
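A minimal sketch of how these flags might be used for the sub-optimal goal in the question (train IW{1,1} with trainlm while LW{2,1} stays fixed). It assumes the toolbox's simplefit_dataset as example data and a hidden layer of 4 nodes, neither of which comes from the question. Note that the learn flag applies to a whole weight matrix or bias vector, so on its own it does not freeze individual elements of IW{1,1}.

```matlab
% Sketch: freeze the hidden-to-output weights, train everything else.
[x, t] = simplefit_dataset;            % example data (assumption)
net = feedforwardnet(4);               % 4 hidden nodes (assumption), trainlm by default
net = configure(net, x, t);            % size the weight matrices before touching them

net.layerWeights{2,1}.learn = false;   % keep LW{2,1} fixed during training
net.biases{2}.learn = false;           % optionally keep the output bias fixed too

LW_before = net.LW{2,1};               % remember the frozen values
net.trainParam.showWindow = false;     % suppress the training GUI
[net, tr] = train(net, x, t);

% If the flag behaves as documented, this comparison should hold.
frozenUnchanged = isequal(LW_before, net.LW{2,1});
disp(frozenUnchanged)
```

Because train is used rather than adapt, the usual training parameters (net.trainParam.min_grad, net.trainParam.epochs) and the training record tr remain available.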
Greg Heath
on 1 Nov 2012
0 votes
You can directly assign any combination of weights that you want after calling the obsolete functions newpr, newfit or newff. However, if you use the updated functions patternnet, fitnet or feedforwardnet, you first have to call configure, init or train.
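net.IW{1,1} = IW; % input-to-hidden weights
net.LW{2,1} = LW; % hidden-to-output weights
net.b{1} = b1; % hidden-layer biases
net.b{2} = b2; % output-layer biases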
Hope this helps.
Thank you for formally accepting my answer.
Greg
jason
on 6 Nov 2012
0 votes
1 Comment
Greg Heath
on 7 Nov 2012
PATTERNNET was explicitly designed for classification and pattern recognition.
FITNET was explicitly designed for regression and curvefitting.
BOTH call FEEDFORWARDNET.
If you compare source codes via
type fitnet
type feedforwardnet
you will see that the only difference is that fitnet automatically uses PLOTFIT whereas feedforwardnet does not.
So, if you want to use feedforwardnet, you have to explicitly call plotfit afterward as demonstrated in
help plotfit
Greg
Greg Heath
on 8 Nov 2012
I guess I do not understand exactly what you want to do. My original point was that if you use one of the obsolete functions, you can change the value of any combination of weights before or during training and then continue training.
However, if you use one of the current functions,
1. You have to use configure or init if you want to use a specific subset of initial weights. Direct assignment is not allowed before configure, init or train is called.
2. If you want to interrupt training, specify a specific subset of weights and then continue training, training will not continue smoothly from where you interrupted. Instead, your training parameters will be automatically reinitialized.
To make it clearer, suppose you wanted to interrupt training, then do nothing before continuing to train. You will end up with a different result than if you trained continuously. In particular
net.trainParam.epochs = 10;
rng(0)
for i = 1:10
[net, tr] = train(net, x, t);
end
will have a different result than
net.trainParam.epochs = 100;
rng(0)
[net, tr] = train(net, x, t);
If you can figure out how to obtain the same results, it is worth starting a new thread to share the discovery.
Greg
Greg Heath
on 10 Nov 2012
0 votes
I don't know how many times you want to change the first layer weights during training. However, if you have 2 hidden layers and want to fix the first layer of weights, you can switch between that net and a double net configuration:
1. Use [x;t] to train net1 I-H1-H2-O
h1 = ...
h2 = ...
y = ...
2. Initialize net2 I-H1 and net3 H1-H2-O with weights from net1
3. Use net2 to create the new input matrix h1 = tansig(b1+IW*x)
4. Use [h1;t] to train net3. Since it has a hidden layer, it is a universal approximator.
5. Use the weights from net3 to initialize the last 2 layers of net1
6. etc
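The steps above can be sketched roughly as follows. This is only an illustration under several assumptions that are not in the original post: simplefit_dataset as data, H1 = H2 = 3, and the default input/output processing functions removed so that the formula h1 = tansig(b1+IW*x) applies to the raw inputs. net2 is not built as a separate object here; its output is computed directly from the formula in step 3.

```matlab
% Sketch of steps 1-5: train the tail of a net while the first layer stays fixed.
[x, t] = simplefit_dataset;                 % example data (assumption)
H1 = 3; H2 = 3;                             % hidden layer sizes (assumption)

% Step 1: train net1 (I-H1-H2-O) without pre/post-processing
net1 = feedforwardnet([H1 H2]);
net1.inputs{1}.processFcns  = {};
net1.outputs{3}.processFcns = {};
net1.trainParam.showWindow  = false;
net1 = train(net1, x, t);

% Step 3: first hidden layer output, using the fixed first-layer weights
N  = size(x, 2);
h1 = tansig(net1.IW{1,1} * x + repmat(net1.b{1}, 1, N));

% Step 2: net3 (H1-H2-O) initialized with the last two layers of net1
net3 = feedforwardnet(H2);
net3.inputs{1}.processFcns  = {};
net3.outputs{2}.processFcns = {};
net3 = configure(net3, h1, t);
net3.IW{1,1} = net1.LW{2,1};   net3.b{1} = net1.b{2};
net3.LW{2,1} = net1.LW{3,2};   net3.b{2} = net1.b{3};

% Step 4: train net3 on [h1; t]; the first layer of net1 is not involved
net3.trainParam.showWindow = false;
net3 = train(net3, h1, t);

% Step 5: copy the retrained weights back into net1
net1.LW{2,1} = net3.IW{1,1};   net1.b{2} = net3.b{1};
net1.LW{3,2} = net3.LW{2,1};   net1.b{3} = net3.b{2};

% After the copy, net1(x) and net3(h1) should agree up to rounding.
maxDiff = max(abs(net1(x) - net3(h1)));
disp(maxDiff)
```

If the default mapminmax processing is kept instead, h1 would have to be computed from the normalized inputs rather than from x directly.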
The fact that retraining net1 and/or net3 reinitializes the state of TRAIN is not a problem.
If your data set is not large, your toughest problem may be choosing a suitable pair of values for the numbers of hidden nodes H1 and H2 to prevent overtraining an overfit net (i.e., one where the number of training equations is not sufficiently larger than the number of unknown weights).
Hope this helps.
Thank you for formally accepting my answer.
Greg
5 Comments
jason
on 11 Nov 2012
jason
on 11 Nov 2012
jason
on 11 Nov 2012
Greg Heath
on 13 Nov 2012
>Thank you Greg. This is a good trick. But this cannot be used in the 2 cases below.
Easily Modified:
>1. If I do not want to fix the first layer weights completely, but only some of them. That is, I also want to fix some elements of IW{1,1} while changing other elements of IW{1,1} and elements of LW.
This can be accomplished by changing the weights of the first net and generating a new h1.
>2. If I want to do the opposite, that is, fix the layer weights LW while changing the input weights IW{1,1}.
In this case you can use pseudo-inversion to obtain h2 from b2+LW2*h2 = t
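A small sketch of that pseudo-inversion step, assuming a linear (purelin) output layer with no output post-processing. The values of LW2, b2 and t below are hypothetical and serve only to show the shapes involved.

```matlab
% Sketch: solve b2 + LW2*h2 = t for the hidden-layer targets h2.
LW2 = [0.5 -1.2 0.8];          % 1 x H fixed output weights (hypothetical)
b2  = 0.1;                     % fixed output bias (hypothetical)
t   = [0.2 0.7 -0.3 1.1];      % 1 x N targets (hypothetical)

h2 = pinv(LW2) * (t - b2);     % H x N minimum-norm solution

% How well the recovered h2 reproduces the targets
residual = max(abs(b2 + LW2 * h2 - t));

% tansig outputs lie in (-1, 1), so values outside that range
% cannot be produced exactly by a tansig hidden layer.
attainable = all(abs(h2(:)) < 1);
disp([residual, attainable])
```

The recovered h2 then serves as the target for training the earlier layers while LW2 and b2 stay fixed.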
jason
on 14 Nov 2012
Greg Heath
on 13 Nov 2012
0 votes
I have performed 40 experiments using MATLAB's simplefit_dataset. There were 10 random weight initializations of 1-4-1 nets for each of the following 4 scenarios:
1. NEWFIT (calls NEWFF) continuous training with the default net.trainParam.epochs = 1000
2. NEWFIT WHILE-LOOP training with net.trainParam.epochs = 1
3. FITNET (calls FEEDFORWARDNET) continuous.
4. FITNET WHILE-LOOP
The 4 MSE results for each of the 10 random weight initializations were in agreement.
This is because the first 9 initializations achieved the training goal of R2trna >= 0.99, where R2trna is the adjusted coefficient of determination (AKA degree-of-freedom adjusted R^2 ... see Wikipedia). The last initialization terminated before reaching the goal because the specified minimum gradient of MSE (1e-10) was reached.
However, when the weights of the continuous and interrupted training designs are compared, only 50% of the designs achieved the same weights.
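For readers unfamiliar with the adjusted coefficient of determination, here is a minimal sketch using the standard textbook definition. The exact formula used in these experiments is not given in the post, so this is an assumption; the targets t and outputs y are hypothetical, and Nw counts the weights and biases of a 1-4-1 net.

```matlab
% Sketch: ordinary and degree-of-freedom adjusted R^2 for a 1-4-1 net.
N  = 20;                                     % number of training equations (hypothetical)
t  = sin(linspace(0, 2*pi, N));              % hypothetical targets
y  = t + 0.01 * cos(linspace(0, 6*pi, N));   % hypothetical net outputs

I = 1; H = 4; O = 1;
Nw = (I + 1) * H + (H + 1) * O;              % unknown weights and biases: 13

SSE = sum((t - y).^2);                       % residual sum of squares
SST = sum((t - mean(t)).^2);                 % total sum of squares

R2  = 1 - SSE / SST;                               % ordinary R^2
R2a = 1 - (SSE / (N - Nw)) / (SST / (N - 1));      % adjusted R^2
disp([R2, R2a])
```

The adjustment penalizes the fit for the number of estimated weights, so R2a never exceeds R2 when N > Nw.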
I do not intend to pursue the reason why the other 50% did not beyond looking at the 1-3-1 case where many of the designs did not achieve the training goal.
Greg Heath
on 13 Nov 2012
These are the results using FITNET for the 1-3-1 design.
1. All 20 cases terminated via tr.stop = 'Minimum gradient reached.' before achieving the goal of R2trna >= 0.99.
2. Continuous training took 6.8 sec, interrupted training took 32.0 sec
3. The differences in tr.mu were either 1e-3 or 0.9999e-3.
4. The differences in R2trn and R2trna were less than 1e-8.
5. The differences in R2val and R2tst were less than 1e-5.
6. The differences in number of epochs were
dNepochs = -5 0 -5 -4 0 -3 -3 0 -6 -5
7. Nevertheless, in both cases runs 2-5 and 7-9 obtained the EXACT same set of weights. In run 1 there was a sign change in IW(2), b1(2) and LW(2), which caused no change in output because the hidden node activation function is odd. Adjusting for these 3 sign changes (*), the differences between the continuous and interrupted training weight estimates were
dWB(:, [1 2 6 10]) =
        1     [2-5,7-9]        6          10
 ===========================================
   0.0017     -0.0000          0     -0.0007
 * 0.0001     -0.0000     0.0001     -0.0022
  -0.0074      0.0414     1.0558     -0.0015
   0.0014      0.0000          0      0.0003
 * 0.0000      0.0000    -0.0000     -0.0011
  -0.0071      0.0387     0.7752     -0.0013
   0.0002      0.0000     0.2949     -0.0000
 * 0.0000     -0.0000     0.0000     -0.0000
   0.0000     -0.0002    -0.5898      0.0001
  -0.0002     -0.0002    -0.2949      0.0001
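The sign symmetry mentioned in item 7 can be checked with a few lines of base MATLAB. The weights below are hypothetical values for a 1-3-1 net, and tanh is used in place of tansig (the two are mathematically equivalent) so that no toolbox is needed.

```matlab
% Sketch: flipping the signs of IW(2), b1(2) and LW(2) leaves the output unchanged.
IW = [0.7; -1.3; 0.4];   b1 = [0.1; 0.5; -0.2];   % hypothetical hidden layer
LW = [1.1 -0.6 0.9];     b2 = 0.3;                % hypothetical output layer
x  = linspace(-1, 1, 5);                          % sample inputs
N  = numel(x);

% Output of a 1-3-1 net with a tanh hidden layer and a linear output
netOut = @(IW, b1, LW) LW * tanh(IW * x + repmat(b1, 1, N)) + b2;

y1 = netOut(IW, b1, LW);

% Flip the signs of everything attached to hidden node 2
IWf = IW;  b1f = b1;  LWf = LW;
IWf(2) = -IWf(2);  b1f(2) = -b1f(2);  LWf(2) = -LWf(2);

y2 = netOut(IWf, b1f, LWf);

% Because tanh(-z) = -tanh(z), the two outputs agree up to rounding.
maxDiff = max(abs(y1 - y2));
disp(maxDiff)
```

This is why two training runs can end with different weight vectors and still represent the same function.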