How can I choose the parameters of my network?

Hi, I'm a beginner in neural networks, so to justify using them in my doctoral research I want to run some comparisons against other tools to show what this method can do. I want to create a network whose input is x = [ -1 1 -1 1 -1 1 -1 1; -1 -1 1 1 -1 -1 1 1; -1 -1 -1 -1 1 1 1 1] and whose output is, for example, y = [ 9 6 4 3 8 1 3 4], knowing that, for example, 9 = a*(-1)+b*(-1)+c*(-1)*a1*(-1)*(-1)+b1*(-1)*(-1)+c1*(-1)*(-1)+d*(-1)*(-1)*(-1), and the same way for the others. How can I choose the parameters of my network?

Accepted Answer

Greg Heath on 20 Feb 2013
close all, clear all, clc
x0 = [ -1 1 -1 1 -1 1 -1 1; -1 -1 1 1 -1 -1 1 1; -1 -1 -1 -1 1 1 1 1]
x4 =[ 9 6 4 3 8 1 3 4]
x = [ x0; x4]
t = [ 23.72, 19.84, -8.91, 4.41, -7.53, 1.41, 5.16, 0.66 ]
[I N] = size(x) % [ 4 8 ]
[O N ] = size(t) % [ 1 8 ]
Neq = N*O % 8
% Obviously, you have Neq = 8 equations.
meant2 = mean(t,2) % 4.845
vart12 = var(t,1,2) % 119.13 Biased
vart02 = var(t,0,2) % 136.15 Unbiased
% Therefore, if you try a constant model you get
y00 = repmat(mean(t,2),1,N)
MSE00 = vart12
MSE00a = vart02
% If you try a linear model solution (see my previous code) you get
W0 = t/[x;ones(1,N)] % 3.3242 -2.9258 -3.9665 1.2714 -1.194
Nw0 = numel(W0) %5
Ndof0 = Neq-Nw0 %3
y0 = W0*[x;ones(1,N)]
e0 = t-y0
SSE0 = sse(e0) % 536.7
MSE0 = SSE0/Neq % 67.1
MSE0a =SSE0/Ndof0 % 178.9
NMSE0 = MSE0/MSE00 % 0.563
NMSE0a = MSE0a/MSE00a % 1.314
R20 = 1-NMSE0 % 0.437
R20a = 1-NMSE0a % -0.314
% R20 = 0.44 means that the linear model appears to account for 44% of the
% target variance and, therefore, is better than the constant model. However,
% R20a < 0 means that when the bias of testing with training data is taken
% into account, the linear model is probably worse, and little if any
% confidence should be put in the R2 estimate for future data.
% To obtain a better model, higher-order polynomial or neural network
% models can be tried. In general, however, this decreases the estimation
% degrees of freedom, which can even become negative (more unknowns than
% equations).
% Apparently you have deduced a reduced-term 3rd-order polynomial model for
% the I/O relation. Was this done via underdetermined linear least squares,
% with the coefficients of the 12 missing terms turning out negligible? If
% so, was regularization used?
% Neural Network Solutions:
% H = 0 hidden nodes corresponds to the linear model.
% H = 1 hidden node may be little, if any, better than H = 0.
% H = 2 overfits the model (Nw > Neq), so R2a < 0. Nevertheless, it is
% interesting to see the results of Ntrials = 20 multiple designs for
% H = 1 and H = 2:
% H=1 result =
% Trial Nepochs R2 R2a
% 1 11 0.242 -4.309
% 2 10 0.014 -5.901
% 3 10 0.038 -5.736
% 4 11 0.260 -4.183
% 5 11 => 0.837 -0.144
% 6 8 0.427 -3.010
% 7 10 => 0.837 -0.144
% 8 33 => 0.851 -0.047
% 9 7 0.178 -4.752
% 10 17 => 0.851 -0.075
% 11 458 => 0.867 0.061
% 12 20 => 0.851 -0.047
% 13 146 0.293 -3.948
% 14 24 => 0.851 -0.047
% 15 84 0.293 -3.948
% 16 11 0.290 -3.971
% 17 16 0.293 -3.948
% 18 255 0.298 -3.914
% 19 4 0.000 -5.998
% 20 317 => 0.866 0.061
%
% maxresult =
% 20 458 0.866 0.061
% THERE ARE 7/20 R^2 RESULTS IN [ 0.837 0.867 ]
% H=2 OVERFITTING (Nw > Neq) result =
%
% Trial Nepochs R2
% 1 391 0.791
% 2 749 0.896
% 3 456 0.896
% 4 247 ==>0.998
% 5 672 0.896
% 6 7 0.046
% 7 863 0.910
% 8 19 0.860
% 9 40 ==>0.990
% 10 669 0.913
% 11 7 ==>0.992
% 12 18 0.858
% 13 14 ==>0.993
% 14 526 0.910
% 15 625 0.579
% 16 16 ==>0.991
% 17 1000 0.911
% 18 952 0.896
% 19 9 0.036
% 20 132 ==>0.993
%
% maxresult =
%
% 20 1000 0.998
%
% THERE ARE 6/20 R^2 RESULTS IN [ 0.990 0.998 ]
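For reference, here is a minimal sketch of the kind of multiple-design loop that could produce tables like those above (the exact settings are not shown in the answer, so the seed, the dividetrain choice, and the result layout are assumptions):
rng(0) % assumed seed for reproducibility
Ntrials = 20, H = 1 % repeat with H = 2
net = fitnet(H);
net.divideFcn = 'dividetrain';
Nw = (I+1)*H+(H+1)*O
for i = 1:Ntrials
net = configure(net,x,t); % fresh random initial weights
[net tr y e ] = train(net,x,t);
result(i,:) = [ i tr.epoch(end) ...
1-(sse(e)/Neq)/vart12 1-(sse(e)/(Neq-Nw))/vart02 ];
end
result
maxresult = max(result) % columnwise maxima, as above
% For H = 2, Nw = 13 > Neq = 8 (Ndof < 0), so the R2a column is dropped.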
Hope this helps.
Thank you for formally accepting my answer
Greg

More Answers (6)

Greg Heath on 16 Feb 2013
The example below demonstrates how I design NNs. Quite a few of the steps are for understanding the data and the design procedure. I removed all command-ending semicolons so you can cut, paste, and study the printout.
The model parameters that you are looking for are the network weights net.IW, net.LW and net.b from the best of several (or many) designs.
I use the degree-of-freedom-adjusted coefficient of determination R2a (see Wikipedia) to rank the nets. The DOF adjustment reduces the bias caused by evaluating the model with the same data that was used to design it.
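In symbols, with Neq = N*O equations, Nw estimated weights, and SSE the sum of squared errors, the two statistics computed in the code below are
R2 = 1 - (SSE/Neq)/var(t,1,2) % biased: trained and tested on the same data
R2a = 1 - (SSE/(Neq-Nw))/var(t,0,2) % adjusted for estimation degrees of freedom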
Out of 10 designs I obtained two with R2a = 0.677, interpreted as a model that accounts for ~68% of the variance of the probability distribution from which the design data were randomly selected. Without the DOF adjustment I obtained R2 = 0.908, a biased estimate of ~91%. The large difference comes from using only 8 data-point equations from a noncontinuous data distribution to estimate 6 weights of a continuous nonlinear model.
For large differences between R2 and R2a, it is useful to divide the data into 3 trn/val/tst subsets for
1. Estimating weights (training subset)
2. Choosing nonweight design parameters and evaluating different designs (validation subset)
3. Obtaining unbiased estimates of performance (test subset)
However, since this data set is too small for a statistically significant data division (where all 3 subsets have approximately the same means and covariances), it would be better to use a regularized objective function like 'msereg'. I will let you investigate that on your own with the better-chosen examples listed in the MATLAB documentation (help nndata).
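A minimal sketch of that alternative (syntax from the Neural Network Toolbox releases of this era; check help msereg in your version, and note the ratio value here is just an assumed starting point):
net = fitnet(1);
net.divideFcn = 'dividetrain' % no data division, as below
net.performFcn = 'msereg' % regularized objective: ratio*mse + (1-ratio)*msw
net.performParam.ratio = 0.5 % assumed weighting; tune for your data
[net tr] = train(net,x,t);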
The goal is to design sum-of-tanh neural network models to approximate your target data t, given input x. The model output y has error e = t-y, whose mean-squared error is to be minimized.
close all, clear all, clc, plt = 0
tic
% Ending semicolons removed for clarity of results
x=[ -1 1 -1 1 -1 1 -1 1; -1 -1 1 1 -1 -1 1 1; -1 -1 -1 -1 1 1 1 1]
% x =
%
% -1 1 -1 1 -1 1 -1 1
% -1 -1 1 1 -1 -1 1 1
% -1 -1 -1 -1 1 1 1 1
t = [ 9 6 4 3 8 1 3 4]
[ I N ] =size(x) % [ 3 8 ]
[O N ] = size(t) % [ 1 8 ]
Neq = N*O % 8
plt = plt+1, figure(plt)
subplot(311)
plot( x(1,:), t, 'ko', 'LineWidth', 2)
xlim( [-2 2 ] )
subplot(312)
plot(x(2,:), t, 'ko', 'LineWidth', 2)
xlim( [-2 2 ] )
subplot(313)
plot(x(3,:), t, 'ko', 'LineWidth', 2)
xlim( [-2 2 ] )
% NAIVE CONSTANT MODEL
y00 = mean(t) % 4.7500
MSE00 = mean(var(t',1)) % 6.4375
MSE00a= mean(var(t')) % 7.3571
% LINEAR MODEL
W0 = t/[x ; ones(1,N)]
% -1.2500 -1.2500 -0.7500 4.7500
Nw0 = numel(W0) % 4
Ndof = Neq-Nw0 % 4
y0 = W0*[ x ;ones(1,N)]
% y0 = [ 8 5.5 5.5 3 6.5 4 4 1.5 ]
e0 = t-y0
% e0 = [ 1 0.5 -1.5 0 1.5 -3 -1 2.5 ]
SSE0 = sse(e0) % 22
MSE0 = SSE0/Neq % 2.75
NMSE0 = MSE0/MSE00 % 0.4272
R2 = 1-MSE0/MSE00 % 0.5728
MSE0a = SSE0/Ndof % 5.5
NMSE0a = MSE0a/MSE00a % 0.7476
R2a = 1-MSE0a/MSE00a % 0.2524
% NEURAL MODEL
% Nw = (I+1)*H+(H+1)*O
% H =< Hub ==> Neq >= Nw
Hub = -1 + ceil( (Neq-O)/(I+O+1)) % 1
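% i.e., Hub is the largest integer H with Nw = (I+1)*H+(H+1)*O <= Neq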
H = 1
Nw = (I+1)*H+(H+1)*O % 6
Ndof = Neq-Nw % 2
Ntrials = 10
net = fitnet(H);
net.divideFcn = 'dividetrain' % N too low for trn/val/tst
rng(0)
for i = 1:Ntrials
net = configure(net,x,t)
view(net)
[net tr y e ] = train(net,x,t)
Nepochs(i,1) = tr.epoch(end)
SSE = sse(e)
MSE = SSE/Neq
NMSE = MSE/MSE00
R2(i,1) = 1-NMSE
MSEa = SSE/Ndof
NMSEa = MSEa/MSE00a
R2a(i,1) = 1-NMSEa
end
disp( ' Nepochs R2 R2a ' )
disp ([ Nepochs R2 R2a ] )
% Nepochs R2 R2a
% 6 0.0874 -2.1942
% 36 0.9078 0.6772 % Best
% 2 0.2427 -1.6505
% 4 0.4008 -1.0971
% 15 0.2492 -1.6278
% 26 0.2492 -1.6278
% 7 0.2492 -1.6278
% 9 0.3424 -1.3016
% 11 0.0906 -2.1828
% 16 0.9078 0.6772 % Best
bestresults = [ t' y' e' ] % last design
% target output error
% 9 9 0
% 6 6 0
% 4 4 0
% 3 2.75 0.25
% 8 8 0
% 1 2.75 -1.75
% 3 2.75 0.25
% 4 2.75 1.25
%
subplot(311)
hold on
plot( x(1,:), y, 'r*', 'LineWidth', 2)
subplot(312)
hold on
plot( x(2,:), y, 'r*', 'LineWidth', 2)
subplot(313)
hold on
plot( x(3,:), y, 'r*', 'LineWidth', 2)
time = toc % 16.7882 sec
Hope this helps.
Thank you for formally accepting my answer!
Greg
  1 Comment
DemoiselX on 16 Feb 2013
Thank you a lot for the answer! But really I didn't understand all of this calculation. My idea is to use the neural network toolbox nftool: I train the network with the 8 inputs, but then when I test with other values I don't get the same answer as with the regression function. I have retrained the network many times but I don't get a good response. Thank you for your help.



Greg Heath on 16 Feb 2013
This is not the type of problem for a neural network. For the typical NN design, an approximation function of the type
y = b2 + LW*tanh(b1+IW*x);
is desired given input matrix x and target matrix t with dimensions
[ I N ] = size(x)
[ O N ] = size(t).
The desired solution is the weight matrices b1, IW, b2 and LW with dimensions
[ H I ] = size(IW)
[ H 1 ] = size(b1)
[ O H ] = size(LW)
[ O 1 ] = size(b2)
obtained by minimizing the mean-squared error
MSE = sumsqr(t-y)/(N*O)
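For example, after training a one-hidden-layer fitnet you can pull the weights out and evaluate the formula above by hand. This is a sketch, not code from the original thread; the processing functions are disabled here so that net(x) matches the bare formula (otherwise mapminmax normalization is wrapped around it):
x = [ -1 1 -1 1 -1 1 -1 1; -1 -1 1 1 -1 -1 1 1; -1 -1 -1 -1 1 1 1 1];
t = [ 9 6 4 3 8 1 3 4];
net = fitnet(1);
net.divideFcn = 'dividetrain';
net.inputs{1}.processFcns = {}; % drop input normalization
net.outputs{2}.processFcns = {}; % drop output normalization
net = train(net,x,t);
IW = net.IW{1,1}; b1 = net.b{1}; % hidden layer: [H I] and [H 1]
LW = net.LW{2,1}; b2 = net.b{2}; % output layer: [O H] and [O 1]
N = size(x,2);
y = repmat(b2,1,N) + LW*tanh(repmat(b1,1,N) + IW*x);
max(abs(y - net(x))) % ~0: the formula reproduces the network output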
Hope this helps.
Thank you for formally accepting my answer.
Greg

Greg Heath on 17 Feb 2013
I was confused. I thought that either the eight corners of the 3D cube represented the domain of x, or the domain of x was the interior of the cube. Correspondingly, I thought you wanted a NN model to replace the equation. That is the answer I gave, and it is the best for H = 1 (I checked with Ntrials = 100).
You mentioned testing with other values. I assume they are in the interior of the cube. Is that right? Exactly what kind of numbers are you getting?
Now it seems that you want the NN model to yield the coefficients of the analytic model. Is that right?
Well, NNs don't work that way. Typically, you get a sum of tanh model that approximates the I/O relationship between the numerical design inputs and corresponding targets.
However, a solution for the coefficients is available. You would have to write the equation in terms of the components x1, x2 and x3. Then you would have 8 linear equations for the 6 unknowns a, b, c*a1, b1, c1 and d (as written, c and a1 appear only as a product and cannot be separated).
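As a sketch of that linear-equations route (assuming the '*' before a1 in the original question was a typo for '+', in which case all seven coefficients a, b, c, a1, b1, c1, d separate):
x = [ -1 1 -1 1 -1 1 -1 1; -1 -1 1 1 -1 -1 1 1; -1 -1 -1 -1 1 1 1 1];
y = [ 9 6 4 3 8 1 3 4];
x1 = x(1,:); x2 = x(2,:); x3 = x(3,:);
A = [ x1; x2; x3; x1.*x2; x2.*x3; x1.*x3; x1.*x2.*x3 ]'; % 8x7 regressor matrix
coef = A\y(:) % least-squares estimates of [a b c a1 b1 c1 d]'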
Hope this helps.
Thank you for formally accepting my answer.
Greg

Greg Heath on 19 Feb 2013
OK, let's try this:
You have 8 4-dimensional input vectors of the form
input = [ x1 x2 x3 x4 x5 x6 x7 x8 ; y1 y2 y3 y4 y5 y6 y7 y8 ] ;
size(input) = [ 4 8 ]
You have 8 8-dimensional output vectors which are all equal
output = repmat( C, 1, 8 );
size(output) = [ 8 8 ]
where C is the 8-dimensional column vector containing the polynomial coefficients.
help fitnet
will allow you to find network weights that will solve the problem.
In fact, if you initialize the random number generator, and use a double do loop over number of hidden nodes and weight initializations, you may get many solutions.
My previous code should give you a hint.
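A minimal sketch of such a double loop (Hmax, Ntrials, and the fitnet defaults are assumptions; input and output are the matrices described above):
rng(0) % initialize the random number generator
Hmax = 4, Ntrials = 10
for H = 1:Hmax
net = fitnet(H);
net.divideFcn = 'dividetrain';
for i = 1:Ntrials
net = configure(net,input,output); % new random initial weights
[net tr y e ] = train(net,input,output);
R2(H,i) = 1 - sse(e)/sse(output-repmat(mean(output,2),1,size(output,2)));
end
end
R2 % pick the (H,trial) with the best fit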
Hope this helps.
Thank you for formally accepting my answer
Greg

DemoiselX on 18 Feb 2013
Sorry, maybe my question was not obvious at first. OK: I have a 2^3 design of experiments with interactions between factors, and I have already guessed the mathematical equation, which is
y = 23.72 + 19.84*X1 - 8.91*X2 + 4.41*X3 - 7.53*X1*X2 + 1.41*X2*X3 + 5.16*X1*X3 + 0.66*X1*X2*X3
Now I want to create an ANN and train this network just with the values of X1, X2, X3 and the corresponding y:
input (one row per run, columns X1 X2 X3):
-1 -1 -1
1 -1 -1
-1 1 -1
1 1 -1
-1 -1 1
1 -1 1
-1 1 1
1 1 1
output y = [ 6.75 52.5 2.5 15.5 3.75 67.5 2.5 38.75 ]
Can I do this using nntool or nftool? How? Thank you a lot, but really I'm stuck on this.
  1 Comment
Greg Heath on 19 Feb 2013
But how did you know that the other 12 terms (xi^2, xi^3, xi^2*xj) were missing?



DemoiselX on 21 Feb 2013
OK! Thank you for everything, it really helps.
