Performance of neural network on test data

Hi, I'm using Greg Heath's code for k-fold cross validation (CV) of a neural network from this thread: <http://www.mathworks.com/matlabcentral/newsreader/view_thread/340857>. I used the Bayesian regularization training function and 3 folds for the CV, then took the average of the 3 fold results to represent the performance of the network. I repeated this procedure 10 times and compared the results of the 10 networks to choose the one with the best performance. I expected the network with the best training performance to also produce the best testing performance, because with k-fold cross validation all of the data is both trained on and tested, so the average of the fold results should represent all of the training and testing sets. Unfortunately, the results were not what I expected: sometimes the network with the best training performance produced the worst testing performance. I also repeated the code for many numbers of hidden nodes, from 3 to 10, but the same problem persisted and I couldn't find a network that produced both the best training and the best testing performance. Is there anything wrong in my code that could affect the performance results? Can anyone help?
This is the code.
k = 3 X=x';Y=y';
[ I N ] = size(X) %[ 5 1792 ]
[O N ] = size(Y) %[ 1 1792 ]
MSE00 = mean(var(Y',1)) % 30.5996 Biased Reference
MSE00a = mean(var(Y')) % 30.6167 Unbiased Reference
rng(0)
ind0 = randperm(N);
M = floor(N/k) % length(valind & tstind)
Ntrn = N-2*M % length(trnind)
Ntrneq = Ntrn*O % No. training equations
H =3 % default No. hidden nodes
Nw = (I+1)*H+(H+1)*O % No. unknown weights
Ndof = Ntrneq-Nw % No. of estimation degrees of freedom
MSEgoal = 0.01*Ndof*MSE00a/Ntrneq
MinGrad = MSEgoal/10
nuNN=10; netoutp=cell(1,nuNN);
for j=1:nuNN clear perform Aveperfor net nets net = feedforwardnet(H);
net=init(net);
nets = cell(1,k);
for i = 1:k
nets{i}.trainParam.goal = MSEgoal;
nets{i}.trainParam.min_grad = MinGrad;
net.divideFcn = 'divideind';
valind = 1 + M*(i-1) : M*i;
if i==k
tstind = 1:M;
trnind = [ M+1:M*(k-1) , M*k+1:N ];
else
tstind = valind + M;
trnind = [ 1:valind(1)-1 , tstind(end)+1:N ];
end
trnInd = ind0(trnind); %Note upper & lower case "i"
valInd = ind0(valind);
tstInd = ind0(tstind);
net.divideParam.trainInd = trnInd;
net.divideParam.valInd = valInd;
net.divideParam.testInd = tstInd;
[ nets{i} tr{i} yy e ] = train( net, X,Y);
yt = nets{i}(X);
e = gsubtract(Y,yt);
bestepoch(i,1) = tr{i}.best_epoch;
R2trn(i,1) = 1 - tr{i}.best_perf/MSE00;
R2trna(i,1) = 1 -(Ntrneq/Ndof)* tr{i}.best_perf/MSE00a;
R2val(i,1) = 1 - tr{i}.best_vperf/MSE00;
R2tst(i,1) = 1 - tr{i}.best_tperf/MSE00;
performance(i,1) = perform(nets{i},Y,yt) trainTargets = Y .* tr{i}.trainMask{1}; testTargets = Y .* tr{i}.testMask{1}; trainPerformance(i,1) = perform(nets{i},trainTargets,yt) testPerformance(i,1) = perform(nets{i},testTargets,yt)
end
perform=[performance trainPerformance testPerformance];
Aveperfor=[mean(performance,1) mean(trainPerformance,1) mean(testPerformance,1)];
netoutp{j}=Aveperfor;
end D=cell2mat(netoutp');
Many thanks in advance.

Accepted Answer

Greg Heath
Greg Heath on 4 Jul 2015
k = 3 X=x'; Y=y';
GEH1: 3 X syntax ERROR
GEH2: Preferred Notation
a. Use T & t for target; Y & y for output
b. UC (X,T,Y) for cells; LC (x,t,y) for matrices
[ I N ] = size(X) %[ 5 1792 ]
[ O N ] = size(Y) %[ 1 1792 ]
GEH3: XVAL is meant for small N where sufficiently large Ntrn results in insufficiently large Nval and Ntest. N = 1792 is not small. The default 70/15/15 with Ntrials = 10 would probably suffice for each candidate value of H.
MSE00 = mean(var(Y',1)) % 30.5996 Biased Reference
MSE00a = mean(var(Y')) % 30.6167 Unbiased Reference
rng(0)
ind0 = randperm(N);
M = floor(N/k) % length(valind & tstind) 597
Ntrn = N-2*M % length(trnind) 598
Ntrneq = Ntrn*O % No. training equations 598
H = 3 % default No. hidden nodes
GEH4: ERROR! The MATLAB default on all training functions is H = 10
Nw = (I+1)*H+(H+1)*O % No. unknown weights 22
Ndof = Ntrneq-Nw % No. of estimation degrees of freedom 576
MSEgoal = 0.01*Ndof*MSE00a/Ntrneq % 0.2949
MinGrad = MSEgoal/10 % 0.02949
GEH5: Reduce MSEgoal by 10 and MinGrad by 100. MATLAB Defaults are 0 and 1e-7, respectively.
nuNN=10; netoutp=cell(1,nuNN);
GEH 5.5 Why is this a cell?
for j=1:nuNN % clear perform Aveperfor net nets net = feedforwardnet(H);
GEH6: Messed up format. I am assuming you meant
clear perform Aveperfor net nets
net = feedforwardnet(H);
GEH7: Use specialized FITNET for regression & specialized PATTERNNET for classification. Both automatically call the generic FEEDFORWARDNET
net = init(net);
GEH8: CONFIGURE preferred. It takes the input and target data into account
nets = cell(1,k);
for i = 1:k
nets{i}.trainParam.goal = MSEgoal;
nets{i}.trainParam.min_grad = MinGrad;
net.divideFcn = 'divideind';
valind = 1 + M*(i-1) : M*i;
if i==k
tstind = 1:M;
trnind = [ M+1:M*(k-1) , M*k+1:N ];
else
tstind = valind + M;
trnind = [ 1:valind(1)-1 , tstind(end)+1:N ];
end
trnInd = ind0(trnind); % Note upper & lower case "i"
valInd = ind0(valind);
tstInd = ind0(tstind);
net.divideParam.trainInd = trnInd;
net.divideParam.valInd = valInd;
net.divideParam.testInd = tstInd;
[ nets{i} tr{i} yy e ] = train( net, X,Y);
yt = nets{i}(X);
e = gsubtract(Y,yt);
GEH9: yt is just yy and e has already been calculated by train
bestepoch(i,1) = tr{i}.best_epoch;
R2trn(i,1) = 1 - tr{i}.best_perf/MSE00;
R2trna(i,1) = 1 -(Ntrneq/Ndof)* tr{i}.best_perf/MSE00a;
R2val(i,1) = 1 - tr{i}.best_vperf/MSE00;
R2tst(i,1) = 1 - tr{i}.best_tperf/MSE00;
GEH10: I Recommend adding stopcrit{i,1} = tr{i}.stop;
GEH11: Formatting screwed up. I assume you mean:
performance(i,1) = perform(nets{i},Y,yt)
trainTargets = Y .* tr{i}.trainMask{1};
testTargets = Y .* tr{i}.testMask{1};
trainPerformance(i,1) = perform(nets{i},trainTargets,yt)
testPerformance(i,1) = perform(nets{i},testTargets,yt)
GEH12: Where is validation set performance?
end
perform =[performance trainPerformance testPerformance];
Aveperfor =[mean(performance,1) mean(trainPerformance,1) mean(testPerformance,1)];
netoutp{j}=Aveperfor;
end
D=cell2mat(netoutp');
GEH13: I assumed D = ... should be after the end
GEH recommendation: Use dividetrain to determine the minimum value of H before deciding on k
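A minimal sketch of that recommendation, assuming fitnet and Greg's x/t notation from GEH2; the search range Hmax and the R-squared goal R2goal are illustrative assumptions, not values stated in the answer:

```matlab
% Sketch: use dividetrain (no val/test split) to find the smallest H
% that reaches a training R^2 target. Hmax and R2goal are assumed values.
MSE00  = mean(var(t', 1));   % biased reference MSE, as in the posted code
R2goal = 0.99;               % assumed adequacy target
Hmax   = 10;
for H = 1:Hmax
    net = fitnet(H);                 % specialized regression net (GEH7)
    net.divideFcn = 'dividetrain';   % all data used for training
    net = configure(net, x, t);      % configure, not init (GEH8)
    [net, tr] = train(net, x, t);
    R2trn = 1 - tr.best_perf/MSE00;
    fprintf('H = %2d  R2trn = %.4f\n', H, R2trn)
    if R2trn >= R2goal
        Hmin = H;                    % smallest adequate H found
        break
    end
end
```

Because initial weights are random, each H would normally be tried several times before declaring it inadequate.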
Hope this helps.
Thank you for formally accepting my answer
Greg
  2 Comments
sherien al-azerji
sherien al-azerji on 4 Jul 2015
Thanks for answering my question. About your recommendation to use dividetrain to determine the minimum value of H: I wonder if you have code, or follow a procedure, to predict the optimum number of hidden nodes, because I also have a problem with the method that should produce the optimum nodes. I'll explain it briefly, if you don't mind. The following figure shows that the mean square error of the training curve decreases as the number of hidden nodes increases, while the MSE of the testing curve is concave, with the optimum number of nodes marked by the blue straight line. However, when I adopt the above code and train the network for different numbers of nodes (2-10), the training curve was similar to the figure but the testing curve kept varying up and down, so I think I'm doing something wrong, but I don't know what it is because this is the first time I'm using a NN. By the way, the reason I don't report the validation performance is that I'm using the trainbr function, but I forgot to include that function in the posted code. Many thanks.
Greg Heath
Greg Heath on 5 Jul 2015
Thanks for answering my question. About your recommendation to use dividetrain to determine the minimum value of H,
%WARNING: I typically only recommend standard XVAL when N is not large enough to SIMULTANEOUSLY have sufficiently large Ntrn, Nval & Ntst subsets (your example doesn't fit this criterion). It is desirable to have Ntrn large enough to obtain reliable weight estimates, with Nval and Ntst large enough to obtain reliable error estimates. In this case, I prefer to find a minimum value for H using dividetrain.
%COMMENT: In your case, N is large enough to use the default 0.7/0.15/0.15 split and my double loop approach (search GROUP & ANSWERS with greg hmin:dH:Hmax Ntrials for ZILLIONS of examples).
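The double-loop approach reads roughly like this; a sketch in which the search range Hmin:dH:Hmax and Ntrials are placeholder values (search the newsgroup as suggested above for Greg's full versions):

```matlab
% Sketch: outer loop over candidate H, inner loop over random weight
% initializations, with the default 0.7/0.15/0.15 data division.
Hmin = 1; dH = 1; Hmax = 10;   % assumed search range
Ntrials = 10;                  % random initializations per candidate H
MSE00 = mean(var(t', 1));      % biased reference MSE
j = 0;
for h = Hmin:dH:Hmax
    j = j + 1;
    for i = 1:Ntrials
        net = fitnet(h);               % default dividerand split
        net = configure(net, x, t);    % fresh random weights each trial
        [net, tr] = train(net, x, t);
        R2trn(i,j) = 1 - tr.best_perf /MSE00;
        R2val(i,j) = 1 - tr.best_vperf/MSE00;
        R2tst(i,j) = 1 - tr.best_tperf/MSE00;
    end
end
% Choose H (column) and trial (row) by validation R^2, then report the
% corresponding test R^2 as the unbiased estimate on unseen data.
```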
I wonder if you have code, or follow a procedure, to predict the optimum number of hidden nodes, because I also have a problem with the method that should produce the optimum nodes,
%SEE MY ABOVE COMMENT
I'll explain it briefly, if you don't mind. The following figure shows that the mean square error of the training curve decreases as the number of hidden nodes increases, while the MSE of the testing curve is concave, with the optimum number of nodes marked by the blue straight line. However, when I adopt the above code and train the network for different numbers of nodes (2-10), the training curve was similar to the figure but the testing curve kept varying up and down, so I think I'm doing something wrong, but I don't know what it is because this is the first time I'm using a NN.
%STANDARD APPROACH: Stop when the VALIDATION ERROR curve reaches a local min. Then the test error is used to obtain an UNBIASED estimate on unseen (e.g., operational) data.
By the way, the reason I don't report the validation performance is that I'm using the trainbr function, but I forgot to include that function in the posted code. Many thanks.
%ADVICE: It is better to USE A VALIDATION SET WITH TRAINBR!!! However, I don't think MATLAB allows that for older versions.
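Requesting a validation/test split with trainbr looks roughly like this; a sketch under the assumption of a recent MATLAB release, where whether the validation stop is actually honored by trainbr depends on the version:

```matlab
% Sketch: explicitly re-enable a data split when training with trainbr.
% H is an assumed hidden-layer size; version behavior varies (see above
% caveat about older MATLAB releases).
H = 10;
net = fitnet(H, 'trainbr');            % Bayesian regularization training
net.divideFcn = 'dividerand';          % re-enable random data division
net.divideParam.trainRatio = 0.70;
net.divideParam.valRatio   = 0.15;
net.divideParam.testRatio  = 0.15;
[net, tr] = train(net, x, t);
```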
Hope this helps
Greg
