Feedforward Network and Backpropagation

Hi,
I'm very new to MATLAB and neural networks. I've done a fair amount of reading (the neural network FAQ, the MATLAB user guide, LeCun, Hagan, various others) and feel like I have some grasp of the concepts; now I'm trying to get the practical side down. I am working through a neural net example/tutorial very similar to the Cancer Detection MATLAB example (http://www.mathworks.co.uk/help/nnet/examples/cancer-detection.html?prodcode=NN&language=en). In my case I am trying to achieve binary classification on a 16-feature set, and am evaluating the effect of varying the number of nodes in the single hidden layer on training and generalisation. For reference below, x (double) is my feature-set variable and t is my target vector (binary); the training sample size is 200 and the test sample size is approximately 3700.
My questions are: 1) I'm using the patternnet default 'tansig' in both the hidden and output layers with 'mapminmax' and 'trainlm'. I'm interpreting the output by thresholding at y >= 0.5. The MATLAB user guide suggests using 'logsig' for output constrained to [0 1]. Should I change the output layer transfer function to 'logsig' or not? I've read some conflicting suggestions about doing this, and 'softmax' is sometimes suggested, but it can't be used for training without configuring your own derivative function (which I don't feel confident doing).
2) The tutorial provides a training and a test dataset, directing the use of the full training set in training (i.e. dividetrain) and at the same time directing that training stop once the network achieves x% success in classifying patterns. a) Is this an achievable goal without a validation set, or are these conflicting directions? b) If achievable, how do I set trainParam.goal to evaluate at x% success? Webcrawling has led me to the answer of setting performFcn = 'mse' and trainParam.goal = (1-x%)*var(t); does this make sense (it seems to rely on mse = var(err))? c) Assuming my intuition above is correct, is there an automated way of applying cross-validation to a neural net in MATLAB, or will I effectively have to program a loop? d) Is there any point to this, or would a simple dividerand(200, 0.8, 0.2, 0.0) achieve the same thing?
3) Is there an automated way in the nntoolbox of establishing the optimum number of nodes in the hidden layer?
Thanks in advance for any and all help

Accepted Answer

Greg Heath
Greg Heath on 15 Apr 2013
Use the default trn/val/tst ratio 0.7/0.15/0.15 or choose 0.6 <= Ntrn/N <= 0.7 with Ntst = Nval.
[ I N ] = size( x ) % [ 16 3900 ]
[ O N ] = size( t ) % [ 1 3900 ]
Ntst = round(0.15*N) % 585
Nval = Ntst % 585
Ntrn = N-Nval-Ntst % 2730
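A minimal sketch of setting these ratios explicitly (the hidden-layer size of 10 is a placeholder, not a recommendation from the answer):

net = patternnet(10);                 % placeholder hidden-layer size
net.divideFcn = 'dividerand';         % random division (the default)
net.divideParam.trainRatio = 0.70;
net.divideParam.valRatio = 0.15;
net.divideParam.testRatio = 0.15;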
% My questions are: 1) I'm using the patternnet default 'tansig' in both the hidden and output layers with 'mapminmax' and 'trainlm'. I'm interpreting the output by thresholding at y >= 0.5.
UGH.
1. That doesn't make any sense since the range of tansig is (-1,1).
2. The advice given by help, doc, and type for patternnet conflicts.
3. Use { 'tansig' 'logsig' } with mapstd or mapminmax (the default) input normalization.
4. Use 'trainscg' for unipolar binary targets {0,1}
5. class = 1 + round(net(x)). This can be modified if you have unequal priors and/or
unequal misclassification costs.
6. You can use softmax for more than 2 classes; MATLAB now has the
derivative for softmax.
7. Use a val set with round(0.15*N) <= Nval = Ntst <= round(0.2*N)
8. Use MSEgoal = max( 0, 0.01*Ndof*MSE00a/Ntrneq ) (see the first sketch after this list), where:
a. MSE00a = mean(var(t'))
b. Ntrneq = round(0.7*prodsize(t)) % ~ No. of training equations
c. Ndof = Ntrneq-Nw % No. of estimation degrees-of-freedom (see Wikipedia)
d. Nw = (I+1)*H+(H+1)*O % No. of unknown weights to estimate
9. Stopping on any misclassification rate cannot be done unless
a. either the training is broken up into a loop of a few epochs at a time,
with breaks to check the classification rate,
b. or patternnet is modified.
c. It's not worth the time (a) or effort (b).
d. If you disagree, please send me a copy of your code.
10. Dividetrain is only useful if Ntrn >> Nw and the generalization error
is estimated using the DOF-adjusted value Ntrneq*MSE/Ndof.
Unfortunately, MATLAB does not allow Nval = 0 with Ntst > 0. The closest
fudge that I can think of is Nval = 1 (ratio = 1/N), max_fail = inf
(see the second sketch after this list).
11. I find a good value for the number of hidden nodes, H, by using an
outer loop over j = Hmin:dH:Hmax and an inner loop over random weight
initializations i = 1:Ntrials, with Ntrials ~ 10 and
Hmax <= Hub = -1+ceil( (Ntrneq-O)/(I+O+1) )
(see the first sketch after this list).
12. Nw > Ntrneq and Ndof < 0 when H > Hub.
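A minimal sketch of the H-search in points 8 and 11, combined with the configuration in points 3-5, assuming x is the 16xN input matrix and t is the 1xN unipolar {0,1} target vector (Hmin, dH, and Ntrials = 10 are illustrative choices, not prescriptions):

[I, N] = size(x);  [O, ~] = size(t);
Ntrneq = round(0.7*prod(size(t)));             % approx. no. of training equations (8b)
MSE00a = mean(var(t'));                        % average target variance (8a)
Hub = -1 + ceil((Ntrneq - O)/(I + O + 1));     % upper bound on H (11)
Hmin = 1;  dH = 1;  Hmax = Hub;  Ntrials = 10; % illustrative search settings
bestPerf = Inf;
for j = Hmin:dH:Hmax
    Nw = (I+1)*j + (j+1)*O;                    % no. of unknown weights (8d)
    Ndof = Ntrneq - Nw;                        % estimation degrees of freedom (8c)
    MSEgoal = max(0, 0.01*Ndof*MSE00a/Ntrneq); % training goal (8)
    for i = 1:Ntrials                          % random weight initializations
        net = patternnet(j, 'trainscg');       % fresh net => fresh random weights (4)
        net.layers{2}.transferFcn = 'logsig';  % 'tansig' hidden, 'logsig' output (3)
        net.trainParam.goal = MSEgoal;
        [net, tr] = train(net, x, t);
        if tr.best_vperf < bestPerf            % keep the best validation result
            bestPerf = tr.best_vperf;  bestNet = net;  bestH = j;
        end
    end
end
class = 1 + round(bestNet(x));                 % class labels {1,2} (5)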
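And a minimal sketch of the point-10 fudge, again assuming x and t as above; the testRatio of 0.15 is an illustrative choice:

N = size(x, 2);
net = patternnet(bestH);                    % H as found above
net.divideFcn = 'dividerand';
net.divideParam.valRatio = 1/N;             % Nval = 1: Nval = 0 with Ntst > 0 is not allowed
net.divideParam.testRatio = 0.15;           % illustrative
net.divideParam.trainRatio = 1 - 1/N - 0.15;
net.trainParam.max_fail = Inf;              % the lone validation point never stops training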
Hope this helps.
Thank you for formally accepting my answer
Greg
PS I have many examples in comp.ai.neural-nets and comp.soft-sys.matlab.
  2 Comments
Dink
Dink on 15 Apr 2013
Edited: Dink on 15 Apr 2013
Hi Greg - thank you very much for your insight. I realise that the way I've phrased the question was somewhat confusing and incomplete.
  • the given dataset for creating the network is an n=200 sample (the 3700-sample dataset is for 'real world' classification testing after creation of the network, i.e. the sim or y = net(x) part of the experiment)
  • as part of the experiment I will be varying the number of hidden nodes, but let's assume for this case it's 5
  • the instruction given is to use the full 200-sample set for training to a success rate of x% on the training sample
  • my interpretation of this was, in the absence of a validation set, to target R^2 as you suggested in a previous post: http://www.mathworks.co.uk/matlabcentral/answers/63738
  • as I understand your answer, point 8 details how to correctly set MSEgoal in terms of R^2, and this will then (in this case 99%) stop the training once a 99% classification rate has been reached on the training data?
  • I'm unfamiliar with prodsize() - what does this mean?
Greg Heath
Greg Heath on 16 Apr 2013
1. There is no analytic relationship between discontinuous classification error rates and continuous training objectives. Therefore you have to calculate PctErr after the design to see if the specified goal has been reached.
The number of training equations is the product of the number of output nodes and the number of training vectors:
Ntrneq = Ntrn*O = prod(size(ttrn))
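For example, assuming ttrn is the poster's 1x200 training target vector:

Ntrneq = prod(size(ttrn))   % = Ntrn*O = 200*1 = 200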


More Answers (2)

mangood UK
mangood UK on 14 Apr 2013
3) Is there an automated way in the nntoolbox of establishing the optimum number of nodes in the hidden layer?
No, it depends on you. You could use a genetic algorithm to find it, but that's too complex; start from a minimum number and then increase as needed.

Greg Heath
Greg Heath on 16 Apr 2013
There is no MATLAB neural-network training algorithm that directly minimizes discontinuous classification percentage error rate (PctErr). Furthermore, there is no known analytic relationship between PctErr and the typical training goals ( MSE, MSEREG, MAE ) of the MATLAB training algorithms. Therefore, if a classification error rate goal is specified, train with a standard minimization function using an inner loop over random weight initializations and an outer loop over increasing number of hidden nodes. Calculate PctErr at the end of each inner loop.
You can either
1. Choose from an Ntrials x numH tabulation of results.
2. Break at the end of an H = constant loop if the PctErr goal is achieved
within that inner loop.
3. Break as soon as the PctErr goal is achieved.
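A minimal sketch of option 3, assuming xtrn and ttrn are the training data; PctErrGoal and the search settings are illustrative, not prescribed values:

PctErrGoal = 1;                            % e.g. stop at 1% training error
Hmin = 1;  dH = 1;  Hmax = 10;  Ntrials = 10;
done = false;
for H = Hmin:dH:Hmax                       % outer loop: hidden nodes
    for i = 1:Ntrials                      % inner loop: random weight initializations
        net = patternnet(H, 'trainscg');
        net.layers{2}.transferFcn = 'logsig';
        net = train(net, xtrn, ttrn);
        y = net(xtrn) >= 0.5;              % threshold unipolar outputs
        PctErr = 100*mean(y ~= ttrn);      % classification error rate
        if PctErr <= PctErrGoal
            done = true;  break            % option 3: stop immediately
        end
    end
    if done, break, end
end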
Hope this helps.
Thank you for formally accepting my answer
Greg
  2 Comments
Greg Heath
Greg Heath on 16 Apr 2013
I don't think the powers that be want you to post the same question in the NEWSGROUP and ANSWERS.
Greg
Dink
Dink on 16 Apr 2013
Yep - newbie mistake - sorry, it won't happen again

