Performance of neural network on test data

Hi, I'm using Greg Heath's code for k-fold cross validation (CV) of a neural network from this thread: <http://www.mathworks.com/matlabcentral/newsreader/view_thread/340857>. I used the Bayesian regularization training function and 3 folds for the CV, then took the average of the 3 fold results to represent the performance of the network. I repeated this procedure 10 times and compared the results of the 10 networks to choose the one with the best performance. I expected the network with the best training performance to also produce the best testing performance, because with k-fold cross validation all of the data is both trained on and tested, so the average of the fold results should represent all of the training and testing sets. Unfortunately, the results were not what I expected: sometimes the network with the best training performance produced the worst testing performance. I also repeated the code for many numbers of hidden nodes, from 3 to 10, but the same problem persisted and I couldn't find a network that produced both the best training and the best testing performance. Is there anything wrong in my code that could affect the performance results? Can anyone help?
This is the code.
k = 3 X=x';Y=y';
[ I N ] = size(X) %[ 5 1792 ]
[O N ] = size(Y) %[ 1 1792 ]
MSE00 = mean(var(Y',1)) % 30.5996 Biased Reference
MSE00a = mean(var(Y')) % 30.6167 Unbiased Reference
rng(0)
ind0 = randperm(N);
M = floor(N/k) % length(valind & tstind)
Ntrn = N-2*M % length(trnind)
Ntrneq = Ntrn*O % No. training equations
H =3 % default No. hidden nodes
Nw = (I+1)*H+(H+1)*O % No. unknown weights
Ndof = Ntrneq-Nw % No. of estimation degrees of freedom
MSEgoal = 0.01*Ndof*MSE00a/Ntrneq
MinGrad = MSEgoal/10
nuNN=10; netoutp=cell(1,nuNN);
for j=1:nuNN clear perform Aveperfor net nets net = feedforwardnet(H);
net=init(net);
nets = cell(1,k);
for i = 1:k
nets{i}.trainParam.goal = MSEgoal;
nets{i}.trainParam.min_grad = MinGrad;
net.divideFcn = 'divideind';
valind = 1 + M*(i-1) : M*i;
if i==k
tstind = 1:M;
trnind = [ M+1:M*(k-1) , M*k+1:N ];
else
tstind = valind + M;
trnind = [ 1:valind(1)-1 , tstind(end)+1:N ];
end
trnInd = ind0(trnind); %Note upper & lower case "i"
valInd = ind0(valind);
tstInd = ind0(tstind);
net.divideParam.trainInd = trnInd;
net.divideParam.valInd = valInd;
net.divideParam.testInd = tstInd;
[ nets{i} tr{i} yy e ] = train( net, X,Y);
yt = nets{i}(X);
e = gsubtract(Y,yt);
bestepoch(i,1) = tr{i}.best_epoch;
R2trn(i,1) = 1 - tr{i}.best_perf/MSE00;
R2trna(i,1) = 1 -(Ntrneq/Ndof)* tr{i}.best_perf/MSE00a;
R2val(i,1) = 1 - tr{i}.best_vperf/MSE00;
R2tst(i,1) = 1 - tr{i}.best_tperf/MSE00;
performance(i,1) = perform(nets{i},Y,yt) trainTargets = Y .* tr{i}.trainMask{1}; testTargets = Y .* tr{i}.testMask{1}; trainPerformance(i,1) = perform(nets{i},trainTargets,yt) testPerformance(i,1) = perform(nets{i},testTargets,yt)
end
perform=[performance trainPerformance testPerformance];
Aveperfor=[mean(performance,1) mean(trainPerformance,1) mean(testPerformance,1)];
netoutp{j}=Aveperfor;
end D=cell2mat(netoutp');
Many thanks in advance.

Accepted Answer

Greg Heath
Greg Heath on 4 Jul 2015
k = 3 X=x'; Y=y';
GEH1: 3 X syntax ERROR
GEH2: Preferred Notation
a. Use T & t for target; Y & y for output
b. UC (X,T,Y) for cells; LC (x,t,y) for matrices
[ I N ] = size(X) %[ 5 1792 ]
[ O N ] = size(Y) %[ 1 1792 ]
GEH3: XVAL is meant for small N where sufficiently large Ntrn results in insufficiently large Nval and Ntest. N = 1792 is not small. The default 70/15/15 with Ntrials = 10 would probably suffice for each candidate value of H.
MSE00 = mean(var(Y',1)) % 30.5996 Biased Reference
MSE00a = mean(var(Y')) % 30.6167 Unbiased Reference
rng(0)
ind0 = randperm(N);
M = floor(N/k) % length(valind & tstind) 597
Ntrn = N-2*M % length(trnind) 598
Ntrneq = Ntrn*O % No. training equations 598
H = 3 % default No. hidden nodes
GEH4: ERROR! The MATLAB default on all training functions is H = 10
Nw = (I+1)*H+(H+1)*O % No. unknown weights 22
Ndof = Ntrneq-Nw % No. of estimation degrees of freedom 576
MSEgoal = 0.01*Ndof*MSE00a/Ntrneq % 0.2949
MinGrad = MSEgoal/10 % 0.02949
GEH5: Reduce MSEgoal by 10 and MinGrad by 100. MATLAB Defaults are 0 and 1e-7, respectively.
nuNN=10; netoutp=cell(1,nuNN);
GEH 5.5 Why is this a cell?
for j=1:nuNN % clear perform Aveperfor net nets net = feedforwardnet(H);
GEH6: Messed up format. I am assuming you meant
clear perform Aveperfor net nets
net = feedforwardnet(H);
GEH7: Use specialized FITNET for regression & specialized PATTERNNET for classification. Both automatically call the generic FEEDFORWARDNET
net = init(net);
GEH8: CONFIGURE preferred. It takes the input and target data into account
nets = cell(1,k);
for i = 1:k
nets{i}.trainParam.goal = MSEgoal;
nets{i}.trainParam.min_grad = MinGrad;
net.divideFcn = 'divideind';
valind = 1 + M*(i-1) : M*i;
if i==k
tstind = 1:M;
trnind = [ M+1:M*(k-1) , M*k+1:N ];
else
tstind = valind + M;
trnind = [ 1:valind(1)-1 , tstind(end)+1:N ];
end
trnInd = ind0(trnind); % Note upper & lower case "i"
valInd = ind0(valind);
tstInd = ind0(tstind);
net.divideParam.trainInd = trnInd;
net.divideParam.valInd = valInd;
net.divideParam.testInd = tstInd;
[ nets{i} tr{i} yy e ] = train( net, X,Y);
yt = nets{i}(X);
e = gsubtract(Y,yt);
GEH9: yt is just yy and e has already been calculated by train
bestepoch(i,1) = tr{i}.best_epoch;
R2trn(i,1) = 1 - tr{i}.best_perf/MSE00;
R2trna(i,1) = 1 -(Ntrneq/Ndof)* tr{i}.best_perf/MSE00a;
R2val(i,1) = 1 - tr{i}.best_vperf/MSE00;
R2tst(i,1) = 1 - tr{i}.best_tperf/MSE00;
GEH10: I Recommend adding stopcrit{i,1} = tr{i}.stop;
GEH11: Formatting screwed up. I assume you mean:
performance(i,1) = perform(nets{i},Y,yt)
trainTargets = Y .* tr{i}.trainMask{1};
testTargets = Y .* tr{i}.testMask{1};
trainPerformance(i,1) = perform(nets{i},trainTargets,yt)
testPerformance(i,1) = perform(nets{i},testTargets,yt)
GEH12: Where is validation set performance?
end
perform =[performance trainPerformance testPerformance];
Aveperfor =[mean(performance,1) mean(trainPerformance,1) mean(testPerformance,1)];
netoutp{j}=Aveperfor;
end
D=cell2mat(netoutp');
GEH13: I assumed D = ... should be after the end
GEH recommendation: Use dividetrain to determine the minimum value of H before deciding on k
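A minimal sketch of that recommendation, assuming fitnet and Greg's x/t notation from GEH2; the search range Hmax and the R-squared goal R2goal are illustrative assumptions, not values stated in the answer:

```matlab
% Sketch: use dividetrain (no val/test split) to find the smallest H
% that reaches a training R^2 target. Hmax and R2goal are assumed values.
MSE00  = mean(var(t', 1));   % biased reference MSE, as in the posted code
R2goal = 0.99;               % assumed adequacy target
Hmax   = 10;
for H = 1:Hmax
    net = fitnet(H);                 % specialized regression net (GEH7)
    net.divideFcn = 'dividetrain';   % all data used for training
    net = configure(net, x, t);      % configure, not init (GEH8)
    [net, tr] = train(net, x, t);
    R2trn = 1 - tr.best_perf/MSE00;
    fprintf('H = %2d  R2trn = %.4f\n', H, R2trn)
    if R2trn >= R2goal
        Hmin = H;                    % smallest adequate H found
        break
    end
end
```

Because initial weights are random, each H would normally be tried several times before declaring it inadequate.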
Hope this helps.
Thank you for formally accepting my answer
Greg
  2 Comments
sherien al-azerji
sherien al-azerji on 4 Jul 2015
Thanks for answering my question. About your recommendation to use dividetrain to determine the minimum value of H: I wonder if you have code, or follow a procedure, to predict the optimum number of hidden nodes, because I also have a problem with the method that should produce the optimum nodes. I'll explain it briefly, if you don't mind. The following figure shows that the mean square error of the training curve decreases as the number of hidden nodes increases, while the MSE of the testing curve is concave, with the optimum number of nodes marked by the blue straight line. However, when I adopt the above code and train the network for different numbers of nodes (2-10), the training curve was similar to the figure but the testing curve kept varying up and down, so I think I'm doing something wrong, but I don't know what it is because this is the first time I'm using a NN. By the way, the reason I don't report the validation performance is that I'm using the trainbr function, but I forgot to include that function in the posted code. Many thanks.
Greg Heath
Greg Heath on 5 Jul 2015
Thanks for answering my question. About your recommendation to use dividetrain to determine the minimum value of H,
%WARNING: I typically only recommend standard XVAL when N is not large enough to SIMULTANEOUSLY have sufficiently large Ntrn, Nval & Ntst subsets (your example doesn't fit this criterion). It is desirable to have Ntrn large enough to obtain reliable weight estimates, with Nval and Ntst large enough to obtain reliable error estimates. In this case, I prefer to find a minimum value for H using dividetrain.
%COMMENT: In your case, N is large enough to use the default 0.7/0.15/0.15 split and my double loop approach (search GROUP & ANSWERS with greg hmin:dH:Hmax Ntrials for ZILLIONS of examples).
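The double-loop approach reads roughly like this; a sketch in which the search range Hmin:dH:Hmax and Ntrials are placeholder values (search the newsgroup as suggested above for Greg's full versions):

```matlab
% Sketch: outer loop over candidate H, inner loop over random weight
% initializations, with the default 0.7/0.15/0.15 data division.
Hmin = 1; dH = 1; Hmax = 10;   % assumed search range
Ntrials = 10;                  % random initializations per candidate H
MSE00 = mean(var(t', 1));      % biased reference MSE
j = 0;
for h = Hmin:dH:Hmax
    j = j + 1;
    for i = 1:Ntrials
        net = fitnet(h);               % default dividerand split
        net = configure(net, x, t);    % fresh random weights each trial
        [net, tr] = train(net, x, t);
        R2trn(i,j) = 1 - tr.best_perf /MSE00;
        R2val(i,j) = 1 - tr.best_vperf/MSE00;
        R2tst(i,j) = 1 - tr.best_tperf/MSE00;
    end
end
% Choose H (column) and trial (row) by validation R^2, then report the
% corresponding test R^2 as the unbiased estimate on unseen data.
```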
I wonder if you have code, or follow a procedure, to predict the optimum number of hidden nodes, because I also have a problem with the method that should produce the optimum nodes,
%SEE MY ABOVE COMMENT
I'll explain it briefly, if you don't mind. The following figure shows that the mean square error of the training curve decreases as the number of hidden nodes increases, while the MSE of the testing curve is concave, with the optimum number of nodes marked by the blue straight line. However, when I adopt the above code and train the network for different numbers of nodes (2-10), the training curve was similar to the figure but the testing curve kept varying up and down, so I think I'm doing something wrong, but I don't know what it is because this is the first time I'm using a NN.
%STANDARD APPROACH: Stop when the VALIDATION ERROR curve reaches a local min. Then the test error is used to obtain an UNBIASED estimate on unseen (e.g., operational) data.
By the way, the reason I don't report the validation performance is that I'm using the trainbr function, but I forgot to include that function in the posted code. Many thanks.
%ADVICE: It is better to USE A VALIDATION SET WITH TRAINBR!!! However, I don't think MATLAB allows that for older versions.
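Requesting a validation/test split with trainbr looks roughly like this; a sketch under the assumption of a recent MATLAB release, where whether the validation stop is actually honored by trainbr depends on the version:

```matlab
% Sketch: explicitly re-enable a data split when training with trainbr.
% H is an assumed hidden-layer size; version behavior varies (see above
% caveat about older MATLAB releases).
H = 10;
net = fitnet(H, 'trainbr');            % Bayesian regularization training
net.divideFcn = 'dividerand';          % re-enable random data division
net.divideParam.trainRatio = 0.70;
net.divideParam.valRatio   = 0.15;
net.divideParam.testRatio  = 0.15;
[net, tr] = train(net, x, t);
```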
Hope this helps
Greg
