how can i choose the parameters of my network?
Hi, I'm a beginner in neural networks. To justify using them in my doctoral research, I want to run some comparisons with other tools to show what this method can do. I want to create a network whose input is x = [ -1 1 -1 1 -1 1 -1 1; -1 -1 1 1 -1 -1 1 1; -1 -1 -1 -1 1 1 1 1] and whose output is, for example, y = [ 9 6 4 3 8 1 3 4], knowing that, for example, 9 = a*(-1)+b*(-1)+c*(-1)*a1*(-1)*(-1)+b1*(-1)*(-1)+c1*(-1)*(-1)+d*(-1)*(-1)*(-1), and the same way for the others. How can I choose the parameters of my network?
Greg Heath
on 16 Feb 2013
The example below demonstrates how I design NNs. Quite a few of the steps are there for understanding the data and the design procedure. I removed all command-ending semicolons so you can cut, paste, and study the printout.
The model parameters that you are looking for are the network weights net.IW, net.LW and net.b from the best of several (or many) designs.
I use the degree-of-freedom-adjusted coefficient of determination R2a (see Wikipedia) to rank the nets. The DOF adjustment reduces the bias caused by evaluating the model with the same data that was used to design it.
Out of 10 designs I obtained two with R2a = 0.677, interpreted as a model that accounts for ~68% of the variance of the probability distribution from which the design data were randomly drawn. Without the DOF adjustment I obtained R2 = 0.908, a biased estimate of ~91%. The large difference comes from using only 8 data-point equations from a noncontinuous data distribution to estimate the 6 weights of a continuous nonlinear model.
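In NumPy terms, the adjustment is a sketch of the standard adjusted R^2: divide SSE by N-Nw and SST by N-1 instead of dividing both by N. The numbers below are the linear-model values from the listing further down:

```python
# DOF-adjusted R^2 sketch: SSE/(N - Nw) against SST/(N - 1), instead of
# the biased SSE/N against SST/N.

def r2_plain(sse, sst):
    # ordinary coefficient of determination
    return 1 - sse / sst

def r2_adjusted(sse, sst, N, Nw):
    # penalize for the Nw weights estimated from only N equations
    return 1 - (sse / (N - Nw)) / (sst / (N - 1))

sse, sst, N, Nw = 22.0, 51.5, 8, 4   # linear-model values from the listing
print(round(r2_plain(sse, sst), 4))            # 0.5728
print(round(r2_adjusted(sse, sst, N, Nw), 4))  # 0.2524
```

With only 2 to 4 degrees of freedom left, the gap between the two statistics is large, which is exactly the warning sign discussed here.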
For large differences between R2 and R2a, it is useful to divide the data into 3 trn/val/tst subsets for
1. Estimating weights (training subset)
2. Choosing nonweight design parameters and evaluating different designs (validation subset)
3. Obtaining unbiased estimates of performance (test subset)
However, since this data set is too small for a statistically significant data division (one where all 3 subsets have approximately the same means and covariances), it would be better to use a regularized objective function like 'msereg'. I will let you investigate that on your own with the better-chosen examples listed in the MATLAB documentation (help nndata).
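For the linear part of the model, the effect of a weight-penalty objective like 'msereg' can be sketched in NumPy as plain ridge regression. The penalty strength lam below is an arbitrary illustration, not a recommendation:

```python
import numpy as np

x = np.array([[-1, 1, -1, 1, -1, 1, -1, 1],
              [-1, -1, 1, 1, -1, -1, 1, 1],
              [-1, -1, -1, -1, 1, 1, 1, 1]], dtype=float)
t = np.array([9, 6, 4, 3, 8, 1, 3, 4], dtype=float)

A = np.vstack([x, np.ones(8)]).T               # 8 equations, 4 unknowns
w_ols = np.linalg.lstsq(A, t, rcond=None)[0]   # plain least squares

lam = 0.5                                      # arbitrary penalty strength
# minimizing SSE + lam*||w||^2 has the closed form below
w_ridge = np.linalg.solve(A.T @ A + lam * np.eye(4), A.T @ t)

# penalizing the squared weights shrinks the solution toward zero
print(np.linalg.norm(w_ridge) < np.linalg.norm(w_ols))   # True
```

Shrinking the weights trades a little training-set fit for less variance, which is the point of regularizing when the data are too few to hold out validation and test subsets.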
The goal is to design a sum-of-tanh neural network model that approximates your target data t, given input x. The model output y has error e = t-y, whose mean-squared error is to be minimized.
close all, clear all, clc, plt = 0
tic
% Ending semicolons removed for clarity of results
x=[ -1 1 -1 1 -1 1 -1 1; -1 -1 1 1 -1 -1 1 1; -1 -1 -1 -1 1 1 1 1]
% x =
%
% -1 1 -1 1 -1 1 -1 1
% -1 -1 1 1 -1 -1 1 1
% -1 -1 -1 -1 1 1 1 1
t = [ 9 6 4 3 8 1 3 4]
[ I N ] =size(x) % [ 3 8 ]
[O N ] = size(t) % [ 1 8 ]
Neq = N*O % 8
plt = plt+1, figure(plt)
subplot(311)
plot( x(1,:), t, 'ko', 'LineWidth', 2)
xlim( [-2 2 ] )
subplot(312)
plot(x(2,:), t, 'ko', 'LineWidth', 2)
xlim( [-2 2 ] )
subplot(313)
plot(x(3,:), t, 'ko', 'LineWidth', 2)
xlim( [-2 2 ] )
% NAIVE CONSTANT MODEL
y00 = mean(t) % 4.7500
MSE00 = mean(var(t',1)) % 6.4375
MSE00a= mean(var(t')) % 7.3571
% LINEAR MODEL
W0 = t/[x ; ones(1,N)]
% -1.2500 -1.2500 -0.7500 4.7500
Nw0 = numel(W0) % 4
Ndof = Neq-Nw0 % 4
y0 = W0*[ x ;ones(1,N)]
% y0 = [ 8 5.5 5.5 3 6.5 4 4 1.5 ]
e0 = t-y0
% e0 = [ 1 0.5 -1.5 0 1.5 -3 -1 2.5 ]
SSE0 = sse(e0) % 22
MSE0 = SSE0/Neq % 2.75
NMSE0 = MSE0/MSE00 % 0.4272
R2 = 1-MSE0/MSE00 % 0.5728
MSE0a = SSE0/Ndof % 5.5
NMSE0a = MSE0a/MSE00a % 0.7476
R2a = 1-MSE0a/MSE00a % 0.2524
% NEURAL MODEL
% Nw = (I+1)*H+(H+1)*O
% H <= Hub ==> Neq >= Nw
Hub = -1 + ceil( (Neq-O)/(I+O+1)) % 1
H = 1
Nw = (I+1)*H+(H+1)*O % 6
Ndof = Neq-Nw % 2
Ntrials = 10
net = fitnet(H);
net.divideFcn = 'dividetrain' % N too low for trn/val/tst
rng(0)
for i = 1:Ntrials
net = configure(net,x,t)
view(net)
[net tr y e ] = train(net,x,t)
Nepochs(i,1) = tr.epoch(end)
SSE = sse(e)
MSE = SSE/Neq
NMSE = MSE/MSE00
R2(i,1) = 1-NMSE
MSEa = SSE/Ndof
NMSEa = MSEa/MSE00a
R2a(i,1) = 1-NMSEa
end
disp( ' Nepochs R2 R2a ' )
disp ([ Nepochs R2 R2a ] )
% Nepochs R2 R2a
% 6 0.0874 -2.1942
% 36 0.9078 0.6772 % Best
% 2 0.2427 -1.6505
% 4 0.4008 -1.0971
% 15 0.2492 -1.6278
% 26 0.2492 -1.6278
% 7 0.2492 -1.6278
% 9 0.3424 -1.3016
% 11 0.0906 -2.1828
% 16 0.9078 0.6772 % Best
bestresults = [ t' y' e' ] % last design
% target output error
% 9 9 0
% 6 6 0
% 4 4 0
% 3 2.75 0.25
% 8 8 0
% 1 2.75 -1.75
% 3 2.75 0.25
% 4 2.75 1.25
%
subplot(311)
hold on
plot( x(1,:), y, 'r*', 'LineWidth', 2)
subplot(312)
hold on
plot( x(2,:), y, 'r*', 'LineWidth', 2)
subplot(313)
hold on
plot( x(3,:), y, 'r*', 'LineWidth', 2)
time = toc % 16.7882 sec
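The Hub bound used in the listing can be spelled out: requiring no more weights than training equations, Nw = (I+1)*H + (H+1)*O <= Neq, gives H <= (Neq-O)/(I+O+1). A quick sketch of that bookkeeping (standard library only):

```python
import math

def hub(Neq, I, O):
    # Greg's upper bound on the number of hidden nodes H: keep
    # Nw = (I+1)*H + (H+1)*O within the Neq training equations.
    # The -1 + ceil form gives one less than the bound when the
    # division is exact, leaving at least one degree of freedom.
    return -1 + math.ceil((Neq - O) / (I + O + 1))

I, O, Neq = 3, 1, 8
H  = hub(Neq, I, O)
Nw = (I + 1) * H + (H + 1) * O
print(H, Nw, Neq - Nw)   # 1 6 2
```

These are the same H = 1, Nw = 6, Ndof = 2 values printed by the MATLAB listing.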
Hope this helps.
Thank you for formally accepting my answer!
Greg
1 Comment
DemoiselX
on 16 Feb 2013
Greg Heath
on 16 Feb 2013
This is not the type of problem for a neural network. For the typical NN design, an approximation function of the type
y = b2 + LW*tanh(b1+IW*x);
is desired given input matrix x and target matrix t with dimensions
[ I N ] = size(x)
[ O N ] = size(t).
The solution desired are the weight matrices b1, IW, b2 and LW with dimensions
[ H I ] = size(IW)
[ H 1 ] = size(b1)
[ O H ] = size(LW)
[ O 1 ] = size(b2)
obtained by minimizing the mean-squared error
MSE = sumsqr(t-y)/(N*O)
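The same approximation function and weight shapes can be sketched in NumPy. The weights below are random, just to check that the dimensions compose; this is not a trained net:

```python
import numpy as np

# Shapes from this thread: I inputs, H hidden units, O outputs, N cases.
I, H, O, N = 3, 1, 1, 8
rng = np.random.default_rng(0)
x  = rng.standard_normal((I, N))
t  = rng.standard_normal((O, N))
IW = rng.standard_normal((H, I)); b1 = rng.standard_normal((H, 1))
LW = rng.standard_normal((O, H)); b2 = rng.standard_normal((O, 1))

y = b2 + LW @ np.tanh(b1 + IW @ x)   # biases broadcast across the N columns
MSE = np.sum((t - y) ** 2) / (N * O) # sumsqr(t - y)/(N*O)
print(y.shape)                       # (1, 8)
```

Training adjusts IW, b1, LW, b2 to drive MSE down; the shapes above are what net.IW, net.b and net.LW hold in the MATLAB code.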
Hope this helps.
Thank you for formally accepting my answer.
Greg
Greg Heath
on 17 Feb 2013
Edited: Greg Heath
on 17 Feb 2013
1 vote
I was confused: I thought that either the eight corners of the 3D cube represented the domain of x, or that the domain of x was the interior of the cube. Correspondingly, I thought you wanted an NN model to replace the equation. That is the answer I gave, and it is the best for H=1 (I checked with Ntrials = 100).
You mentioned testing with other values. I assume they are in the interior of the cube. Is that right? Exactly what kind of numbers are you getting?
Now it seems that you want the NN model to yield the coefficients of the analytic model. Is that right?
Well, NNs don't work that way. Typically, you get a sum of tanh model that approximates the I/O relationship between the numerical design inputs and corresponding targets.
However, a solution for the coefficients is available. You would have to write the equation in terms of the components x1, x2 and x3. Then you would have 8 linear equations for the 6 unknowns a,b,c*a1,b1,c1 and d. c and a1 cannot be separated.
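As a sketch of that linear-equations route: here I assume the intended formula is the full interaction expansion y = a*x1 + b*x2 + c*x3 + a1*x1*x2 + b1*x1*x3 + c1*x2*x3 + d*x1*x2*x3, i.e. with a "+" between the c and a1 terms (as literally written in the question, those two merge into one coefficient):

```python
import numpy as np

x1, x2, x3 = np.array([[-1, 1, -1, 1, -1, 1, -1, 1],
                       [-1, -1, 1, 1, -1, -1, 1, 1],
                       [-1, -1, -1, -1, 1, 1, 1, 1]], dtype=float)
t = np.array([9, 6, 4, 3, 8, 1, 3, 4], dtype=float)

# one column per monomial; at the cube corners these columns are orthogonal,
# so the 8 equations determine the 7 coefficients uniquely
A = np.column_stack([x1, x2, x3, x1*x2, x1*x3, x2*x3, x1*x2*x3])
coef, *_ = np.linalg.lstsq(A, t, rcond=None)
print(coef)   # a, b, c, a1, b1, c1, d
# any residual left over corresponds to the constant term missing from the model
```

Note the first three coefficients match the W0 linear fit from my earlier code, as they must, since the interaction columns are orthogonal to the linear ones.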
Hope this helps.
Thank you for formally accepting my answer.
Greg
Greg Heath
on 19 Feb 2013
1 vote
OK, let's try this:
You have 8 4-dimensional input vectors of the form
input = [ x1 x2 x3 x4 x5 x6 x7 x8 ; y1 y2 y3 y4 y5 y6 y7 y8 ];
size(input) = [ 4 8 ]
You have 8 8-dimensional output vectors which are all equal
output = repmat( C, 1, 8 );
size(output) = [ 8 8 ]
where C is the 8-dimensional column vector containing the polynomial coefficients.
help fitnet
will allow you to find network weights that will solve the problem.
In fact, if you initialize the random number generator and use a double loop over the number of hidden nodes and the weight initializations, you may get many solutions.
My previous code should give you a hint.
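A minimal sketch of that double loop in NumPy: a tiny tanh net trained by plain gradient descent. This is only my illustration of the loop structure over hidden sizes and random restarts; fitnet/train use much better optimizers:

```python
import numpy as np

x = np.array([[-1, 1, -1, 1, -1, 1, -1, 1],
              [-1, -1, 1, 1, -1, -1, 1, 1],
              [-1, -1, -1, -1, 1, 1, 1, 1]], dtype=float)
t = np.array([[9, 6, 4, 3, 8, 1, 3, 4]], dtype=float)
rng = np.random.default_rng(0)

def train_once(H, lr=0.01, epochs=2000):
    # y = b2 + LW*tanh(b1 + IW*x), trained by gradient descent on MSE
    IW = rng.standard_normal((H, 3)); b1 = rng.standard_normal((H, 1))
    LW = rng.standard_normal((1, H)); b2 = rng.standard_normal((1, 1))
    first = None
    for _ in range(epochs):
        h = np.tanh(b1 + IW @ x)
        e = (b2 + LW @ h) - t
        mse = np.mean(e ** 2)
        first = mse if first is None else first
        g = 2 * e / e.size                 # dMSE/dy
        gh = (LW.T @ g) * (1 - h ** 2)     # backprop through tanh
        LW -= lr * (g @ h.T);  b2 -= lr * g.sum(1, keepdims=True)
        IW -= lr * (gh @ x.T); b1 -= lr * gh.sum(1, keepdims=True)
    return first, mse

# double loop: hidden sizes x random weight initializations
results = {(H, trial): train_once(H) for H in (1, 2) for trial in range(3)}
best = min(final for _, final in results.values())
```

Each (H, trial) pair converges to a different local minimum; ranking the final errors over the loop is the same idea as the Ntrials loop in my earlier code.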
Hope this helps.
Thank you for formally accepting my answer
Greg
DemoiselX
on 18 Feb 2013
0 votes
1 Comment
Greg Heath
on 19 Feb 2013
But how did you know that the other 12 terms ( xi^2, xj*xi^2 ) were missing?
DemoiselX
on 21 Feb 2013
0 votes