how can i choose the parameters of my network?
Hi, I'm a beginner in neural networks. To justify using them in my doctoral research, I want to run some comparisons with other tools to show what this method can do. I want to create a network whose input is x = [ -1 1 -1 1 -1 1 -1 1; -1 -1 1 1 -1 -1 1 1; -1 -1 -1 -1 1 1 1 1] and whose output is, for example, y = [ 9 6 4 3 8 1 3 4], knowing that, for example, 9 = a*(-1)+b*(-1)+c*(-1)*a1*(-1)*(-1)+b1*(-1)*(-1)+c1*(-1)*(-1)+d*(-1)*(-1)*(-1), and the same way for the others. How can I choose the parameters of my network?
Greg Heath
on 16 Feb 2013
The example below demonstrates how I design NNs. Quite a few of the steps are there for understanding the data and the design procedure. I removed all command-ending semicolons so you can cut, paste, and study the printout.
The model parameters that you are looking for are the network weights net.IW, net.LW and net.b from the best of several (or many) designs.
I use the degree-of-freedom-adjusted coefficient of determination R2a (see Wikipedia) to rank the nets. The DOF adjustment reduces the bias caused by evaluating the model with the same data that was used to design it.
Out of 10 designs I obtained two with R2a = 0.677, interpreted as a model that accounts for ~68% of the variance of the probability distribution from which the design data were randomly drawn. Without the DOF adjustment I obtained R2 = 0.908, a biased estimate of ~91%. The large difference comes from using only 8 data-point equations from a noncontinuous data distribution to estimate the 6 weights of a continuous nonlinear model.
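In NumPy terms, the adjustment is a sketch of the standard adjusted R^2: divide SSE by N-Nw and SST by N-1 instead of dividing both by N. The numbers below are the linear-model values from the listing further down:

```python
# DOF-adjusted R^2 sketch: SSE/(N - Nw) against SST/(N - 1), instead of
# the biased SSE/N against SST/N.

def r2_plain(sse, sst):
    # ordinary coefficient of determination
    return 1 - sse / sst

def r2_adjusted(sse, sst, N, Nw):
    # penalize for the Nw weights estimated from only N equations
    return 1 - (sse / (N - Nw)) / (sst / (N - 1))

sse, sst, N, Nw = 22.0, 51.5, 8, 4   # linear-model values from the listing
print(round(r2_plain(sse, sst), 4))            # 0.5728
print(round(r2_adjusted(sse, sst, N, Nw), 4))  # 0.2524
```

With only 2 to 4 degrees of freedom left, the gap between the two statistics is large, which is exactly the warning sign discussed here.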
For large differences between R2 and R2a, it is useful to divide the data into 3 trn/val/tst subsets for
1. Estimating weights (training subset)
2. Choosing nonweight design parameters and evaluating different designs (validation subset)
3. Obtaining unbiased estimates of performance (test subset)
However, since this data set is too small for a statistically significant data division (one where all 3 subsets have approximately the same means and covariances), it would be better to use a regularized objective function like 'msereg'. I will let you investigate that on your own with the better-chosen examples listed in the MATLAB documentation (help nndata).
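For the linear part of the model, the effect of a weight-penalty objective like 'msereg' can be sketched in NumPy as plain ridge regression. The penalty strength lam below is an arbitrary illustration, not a recommendation:

```python
import numpy as np

x = np.array([[-1, 1, -1, 1, -1, 1, -1, 1],
              [-1, -1, 1, 1, -1, -1, 1, 1],
              [-1, -1, -1, -1, 1, 1, 1, 1]], dtype=float)
t = np.array([9, 6, 4, 3, 8, 1, 3, 4], dtype=float)

A = np.vstack([x, np.ones(8)]).T               # 8 equations, 4 unknowns
w_ols = np.linalg.lstsq(A, t, rcond=None)[0]   # plain least squares

lam = 0.5                                      # arbitrary penalty strength
# minimizing SSE + lam*||w||^2 has the closed form below
w_ridge = np.linalg.solve(A.T @ A + lam * np.eye(4), A.T @ t)

# penalizing the squared weights shrinks the solution toward zero
print(np.linalg.norm(w_ridge) < np.linalg.norm(w_ols))   # True
```

Shrinking the weights trades a little training-set fit for less variance, which is the point of regularizing when the data are too few to hold out validation and test subsets.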
The goal is to design a sum-of-tanh neural network model that approximates your target data t, given input x. The model output y has error e = t-y, whose mean-squared error is to be minimized.
close all, clear all, clc, plt = 0
tic
% Ending semicolons removed for clarity of results
x=[ -1 1 -1 1 -1 1 -1 1; -1 -1 1 1 -1 -1 1 1; -1 -1 -1 -1 1 1 1 1]
% x =
%
% -1 1 -1 1 -1 1 -1 1
% -1 -1 1 1 -1 -1 1 1
% -1 -1 -1 -1 1 1 1 1
t = [ 9 6 4 3 8 1 3 4]
[ I N ] =size(x) % [ 3 8 ]
[O N ] = size(t) % [ 1 8 ]
Neq = N*O % 8
plt = plt+1, figure(plt)
subplot(311)
plot( x(1,:), t, 'ko', 'LineWidth', 2)
xlim( [-2 2 ] )
subplot(312)
plot(x(2,:), t, 'ko', 'LineWidth', 2)
xlim( [-2 2 ] )
subplot(313)
plot(x(3,:), t, 'ko', 'LineWidth', 2)
xlim( [-2 2 ] )
% NAIVE CONSTANT MODEL
y00 = mean(t) % 4.7500
MSE00 = mean(var(t',1)) % 6.4375
MSE00a= mean(var(t')) % 7.3571
% LINEAR MODEL
W0 = t/[x ; ones(1,N)]
% -1.2500 -1.2500 -0.7500 4.7500
Nw0 = numel(W0) % 4
Ndof = Neq-Nw0 % 4
y0 = W0*[ x ;ones(1,N)]
% y0 = [ 8 5.5 5.5 3 6.5 4 4 1.5 ]
e0 = t-y0
% e0 = [ 1 0.5 -1.5 0 1.5 -3 -1 2.5 ]
SSE0 = sse(e0) % 22
MSE0 = SSE0/Neq % 2.75
NMSE0 = MSE0/MSE00 % 0.4272
R2 = 1-MSE0/MSE00 % 0.5728
MSE0a = SSE0/Ndof % 5.5
NMSE0a = MSE0a/MSE00a % 0.7476
R2a = 1-MSE0a/MSE00a % 0.2524
% NEURAL MODEL
% Nw = (I+1)*H+(H+1)*O
% H <= Hub ==> Neq >= Nw
Hub = -1 + ceil( (Neq-O)/(I+O+1)) % 1
H = 1
Nw = (I+1)*H+(H+1)*O % 6
Ndof = Neq-Nw % 2
Ntrials = 10
net = fitnet(H);
net.divideFcn = 'dividetrain' % N too low for trn/val/tst
rng(0)
for i = 1:Ntrials
net = configure(net,x,t)
view(net)
[net tr y e ] = train(net,x,t)
Nepochs(i,1) = tr.epoch(end)
SSE = sse(e)
MSE = SSE/Neq
NMSE = MSE/MSE00
R2(i,1) = 1-NMSE
MSEa = SSE/Ndof
NMSEa = MSEa/MSE00a
R2a(i,1) = 1-NMSEa
end
disp( ' Nepochs R2 R2a ' )
disp ([ Nepochs R2 R2a ] )
% Nepochs R2 R2a
% 6 0.0874 -2.1942
% 36 0.9078 0.6772 % Best
% 2 0.2427 -1.6505
% 4 0.4008 -1.0971
% 15 0.2492 -1.6278
% 26 0.2492 -1.6278
% 7 0.2492 -1.6278
% 9 0.3424 -1.3016
% 11 0.0906 -2.1828
% 16 0.9078 0.6772 % Best
bestresults = [ t' y' e' ] % last design
% target output error
% 9 9 0
% 6 6 0
% 4 4 0
% 3 2.75 0.25
% 8 8 0
% 1 2.75 -1.75
% 3 2.75 0.25
% 4 2.75 1.25
%
subplot(311)
hold on
plot( x(1,:), y, 'r*', 'LineWidth', 2)
subplot(312)
hold on
plot( x(2,:), y, 'r*', 'LineWidth', 2)
subplot(313)
hold on
plot( x(3,:), y, 'r*', 'LineWidth', 2)
time = toc % 16.7882 sec
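The Hub bound used in the listing can be spelled out: requiring no more weights than training equations, Nw = (I+1)*H + (H+1)*O <= Neq, gives H <= (Neq-O)/(I+O+1). A quick sketch of that bookkeeping (standard library only):

```python
import math

def hub(Neq, I, O):
    # Greg's upper bound on the number of hidden nodes H: keep
    # Nw = (I+1)*H + (H+1)*O within the Neq training equations.
    # The -1 + ceil form gives one less than the bound when the
    # division is exact, leaving at least one degree of freedom.
    return -1 + math.ceil((Neq - O) / (I + O + 1))

I, O, Neq = 3, 1, 8
H  = hub(Neq, I, O)
Nw = (I + 1) * H + (H + 1) * O
print(H, Nw, Neq - Nw)   # 1 6 2
```

These are the same H = 1, Nw = 6, Ndof = 2 values printed by the MATLAB listing.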
Hope this helps.
Thank you for formally accepting my answer!
Greg
1 Comment
DemoiselX
on 16 Feb 2013
Greg Heath
on 16 Feb 2013
This is not the type of problem for a neural network. For the typical NN design, an approximation function of the type
y = b2 + LW*tanh(b1+IW*x);
is desired given input matrix x and target matrix t with dimensions
[ I N ] = size(x)
[ O N ] = size(t).
The solution desired are the weight matrices b1, IW, b2 and LW with dimensions
[ H I ] = size(IW)
[ H 1 ] = size(b1)
[ O H ] = size(LW)
[ O 1 ] = size(b2)
obtained by minimizing the mean-squared error
MSE = sumsqr(t-y)/(N*O)
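The same approximation function and weight shapes can be sketched in NumPy. The weights below are random, just to check that the dimensions compose; this is not a trained net:

```python
import numpy as np

# Shapes from this thread: I inputs, H hidden units, O outputs, N cases.
I, H, O, N = 3, 1, 1, 8
rng = np.random.default_rng(0)
x  = rng.standard_normal((I, N))
t  = rng.standard_normal((O, N))
IW = rng.standard_normal((H, I)); b1 = rng.standard_normal((H, 1))
LW = rng.standard_normal((O, H)); b2 = rng.standard_normal((O, 1))

y = b2 + LW @ np.tanh(b1 + IW @ x)   # biases broadcast across the N columns
MSE = np.sum((t - y) ** 2) / (N * O) # sumsqr(t - y)/(N*O)
print(y.shape)                       # (1, 8)
```

Training adjusts IW, b1, LW, b2 to drive MSE down; the shapes above are what net.IW, net.b and net.LW hold in the MATLAB code.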
Hope this helps.
Thank you for formally accepting my answer.
Greg
Greg Heath
on 17 Feb 2013
Edited: Greg Heath
on 17 Feb 2013
1 vote
I was confused: I thought that either the eight corners of the 3D cube represented the domain of x, or that the domain of x was the interior of the cube. Correspondingly, I thought you wanted an NN model to replace the equation. That is the answer I gave, and it is the best for H=1 (I checked with Ntrials = 100).
You mentioned testing with other values. I assume they are in the interior of the cube. Is that right? Exactly what kind of numbers are you getting?
Now it seems that you want the NN model to yield the coefficients of the analytic model. Is that right?
Well, NNs don't work that way. Typically, you get a sum of tanh model that approximates the I/O relationship between the numerical design inputs and corresponding targets.
However, a solution for the coefficients is available. You would have to write the equation in terms of the components x1, x2 and x3. Then you would have 8 linear equations for the 6 unknowns a,b,c*a1,b1,c1 and d. c and a1 cannot be separated.
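As a sketch of that linear-equations route: here I assume the intended formula is the full interaction expansion y = a*x1 + b*x2 + c*x3 + a1*x1*x2 + b1*x1*x3 + c1*x2*x3 + d*x1*x2*x3, i.e. with a "+" between the c and a1 terms (as literally written in the question, those two merge into one coefficient):

```python
import numpy as np

x1, x2, x3 = np.array([[-1, 1, -1, 1, -1, 1, -1, 1],
                       [-1, -1, 1, 1, -1, -1, 1, 1],
                       [-1, -1, -1, -1, 1, 1, 1, 1]], dtype=float)
t = np.array([9, 6, 4, 3, 8, 1, 3, 4], dtype=float)

# one column per monomial; at the cube corners these columns are orthogonal,
# so the 8 equations determine the 7 coefficients uniquely
A = np.column_stack([x1, x2, x3, x1*x2, x1*x3, x2*x3, x1*x2*x3])
coef, *_ = np.linalg.lstsq(A, t, rcond=None)
print(coef)   # a, b, c, a1, b1, c1, d
# any residual left over corresponds to the constant term missing from the model
```

Note the first three coefficients match the W0 linear fit from my earlier code, as they must, since the interaction columns are orthogonal to the linear ones.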
Hope this helps.
Thank you for formally accepting my answer.
Greg
Greg Heath
on 19 Feb 2013
1 vote
OK, let's try this:
You have 8 4-dimensional input vectors of the form
input = [ x1 x2 x3 x4 x5 x6 x7 x8 ; y1 y2 y3 y4 y5 y6 y7 y8 ];
size(input) = [ 4 8 ]
You have 8 8-dimensional output vectors which are all equal
output = repmat( C, 1, 8 );
size(output) = [ 8 8 ]
where C is the 8-dimensional column vector containing the polynomial coefficients.
help fitnet
will allow you to find network weights that will solve the problem.
In fact, if you initialize the random number generator and use a double loop over the number of hidden nodes and the weight initializations, you may get many solutions.
My previous code should give you a hint.
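A minimal sketch of that double loop in NumPy: a tiny tanh net trained by plain gradient descent. This is only my illustration of the loop structure over hidden sizes and random restarts; fitnet/train use much better optimizers:

```python
import numpy as np

x = np.array([[-1, 1, -1, 1, -1, 1, -1, 1],
              [-1, -1, 1, 1, -1, -1, 1, 1],
              [-1, -1, -1, -1, 1, 1, 1, 1]], dtype=float)
t = np.array([[9, 6, 4, 3, 8, 1, 3, 4]], dtype=float)
rng = np.random.default_rng(0)

def train_once(H, lr=0.01, epochs=2000):
    # y = b2 + LW*tanh(b1 + IW*x), trained by gradient descent on MSE
    IW = rng.standard_normal((H, 3)); b1 = rng.standard_normal((H, 1))
    LW = rng.standard_normal((1, H)); b2 = rng.standard_normal((1, 1))
    first = None
    for _ in range(epochs):
        h = np.tanh(b1 + IW @ x)
        e = (b2 + LW @ h) - t
        mse = np.mean(e ** 2)
        first = mse if first is None else first
        g = 2 * e / e.size                 # dMSE/dy
        gh = (LW.T @ g) * (1 - h ** 2)     # backprop through tanh
        LW -= lr * (g @ h.T);  b2 -= lr * g.sum(1, keepdims=True)
        IW -= lr * (gh @ x.T); b1 -= lr * gh.sum(1, keepdims=True)
    return first, mse

# double loop: hidden sizes x random weight initializations
results = {(H, trial): train_once(H) for H in (1, 2) for trial in range(3)}
best = min(final for _, final in results.values())
```

Each (H, trial) pair converges to a different local minimum; ranking the final errors over the loop is the same idea as the Ntrials loop in my earlier code.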
Hope this helps.
Thank you for formally accepting my answer
Greg
DemoiselX
on 18 Feb 2013
0 votes
1 Comment
Greg Heath
on 19 Feb 2013
But how did you know that the other 12 terms ( xi^2, xj*xi^2 ) were missing?
DemoiselX
on 21 Feb 2013
0 votes