How can I forecast an integer related to 5 other numbers?
I have 15 sets of data, you can see below:
33 48 47 68 79 => 6;
26 32 32 53 56 => 11;
60 17 19 57 59 => 16;
49 29 13 44 51 => 23;
0 19 25 44 73 => 24;
31 9 3 56 63 => 31;
3 22 24 43 39 => 37;
-3 4 39 32 52 => 41;
79 3 7 48 24 => 51;
9 26 24 49 5 => 68;
7 -2 0 57 41 => 75;
93 18 26 0 -2 => 76;
33 22 39 -1 5 => 103;
14 26 7 31 11 => 114;
17 -2 19 1 40 => 138;
Here the first 5 numbers in each row are inputs and the last one in each row is the output. I want to forecast the result for other 5-number sets of inputs; for example:
60 50 30 50 60 => ?;
Accepted Answer
Image Analyst
on 21 Aug 2012
Edited: Image Analyst
on 21 Aug 2012
If you don't have the Stats Toolbox, or the toolbox with regress() in it, you can use base MATLAB. Try a multiple linear regression like this:
clc; % Clear command window.
workspace; % Make sure the workspace panel is showing.
data = [...
33 48 47 68 79 6;
26 32 32 53 56 11;
60 17 19 57 59 16;
49 29 13 44 51 23;
0 19 25 44 73 24;
31 9 3 56 63 31;
3 22 24 43 39 37;
-3 4 39 32 52 41;
79 3 7 48 24 51;
9 26 24 49 5 68;
7 -2 0 57 41 75;
93 18 26 0 -2 76;
33 22 39 -1 5 103;
14 26 7 31 11 114;
17 -2 19 1 40 138]
numberOfObservations = size(data, 1);
% Set up the "x1" through "x5" values.
% Also add a constant term - see the help:
% MATLAB->Users Guide->Data Analysis->Regression Analysis->Programmatic Fitting->Multiple Regression.
inputs = [ones(numberOfObservations, 1) data(:, 1:5)]
% Extract out the "y" values.
observations = data(:, 6)
% Get the multiple linear coefficients.
coefficients = inputs \ observations % Will do a least square solution.
for row = 1 : length(observations)
    % Column 1 of inputs is the constant term, so the predictors are columns 2 through 6.
    predictedValue = coefficients(1) + ...
        coefficients(2) * inputs(row, 2) + ...
        coefficients(3) * inputs(row, 3) + ...
        coefficients(4) * inputs(row, 4) + ...
        coefficients(5) * inputs(row, 5) + ...
        coefficients(6) * inputs(row, 6);
    % Equivalently: predictedValue = inputs(row, :) * coefficients;
    fprintf('For row #%d, predicted value = %.1f, observed value = %.1f\n',...
        row, predictedValue, observations(row));
end
2 Comments
Image Analyst
on 21 Aug 2012
Use the formula:
predictedValue = coefficients(1) + ...
    coefficients(2) * x1 + ...
    coefficients(3) * x2 + ...
    coefficients(4) * x3 + ...
    coefficients(5) * x4 + ...
    coefficients(6) * x5;
but just put in your 5 other numbers for x1 through x5 instead of the values from the data rows.
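For the specific inputs asked about in the question, a minimal sketch (this assumes the coefficients vector computed by the code above; newInputs is just an illustrative variable name):
newInputs = [60 50 30 50 60];
predictedValue = [1, newInputs] * coefficients  % constant term plus the five weighted inputs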
More Answers (6)
Sumit Tandon
on 21 Aug 2012
Agree with Matt. Do you have any idea about the nature of the data, the relationship of the predictors, etc.?
A place to start would be LinearModel.Fit, which can help you create a linear regression model - assuming the output can be represented as a linear combination of the predictors.
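A minimal sketch of that approach, assuming the Statistics Toolbox is installed and "data" is the 15-by-6 matrix from the accepted answer (fitlm plays the same role in newer releases):
mdl = LinearModel.fit(data(:, 1:5), data(:, 6));   % fit the output as a linear combination of the 5 inputs
forecast = predict(mdl, [60 50 30 50 60])          % prediction for the inputs asked about in the question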
Greg Heath
on 22 Aug 2012
Edited: Matt Fig
on 22 Aug 2012
% 70% of the 10 NN solutions using H = 1 hidden node yield good solutions with adjusted R^2 ~ 0.98. Running time ~ 7 sec
close all, clear all, clc, plt = 0;
tic
z = [...
33 48 47 68 79 6;
26 32 32 53 56 11;
60 17 19 57 59 16;
49 29 13 44 51 23;
0 19 25 44 73 24;
31 9 3 56 63 31;
3 22 24 43 39 37;
-3 4 39 32 52 41;
79 3 7 48 24 51;
9 26 24 49 5 68;
7 -2 0 57 41 75;
93 18 26 0 -2 76;
33 22 39 -1 5 103;
14 26 7 31 11 114;
17 -2 19 1 40 138]';
x = z(1:end-1,: );
t = z(end, : );
[ I N ] = size(x) % [ 5 15 ]
[ O N ] = size(t) % [ 1 15 ]
Neq = N*O % No. of training equations
% Naive Constant Model
y00 = round(repmat(mean(t,2),1,N)) % 54*ones(1,N)
Nw00 = O % 1 = No. of independent weights
e00 = t - y00;
MSE00 = sse(e00)/Neq % 1508.7
MSE00a = sse(e00)/(Neq-Nw00) % 1616.5 = degree-of-freedom adjusted(= var(t))
% Linear Model
W = t / [ ones(1,N) ; x]
Nw0 = numel(W) % 6
y0 = round(W*[ ones(1,N) ; x])
e0 = t - y0;
MSE0 = sse(e0)/Neq % 272.61
MSE0a =sse(e0)/(Neq-Nw0) % 455.34
R20 = 1-MSE0/MSE00 % 0.81931= coefficient of determination (Rsquared)
R20a = 1-MSE0a/MSE00a % 0.71893 = adjusted R^2
plt=plt+1,figure(plt)
hold on
plot( y00, 'k--', 'LineWidth' ,2 )
plot( y0, 'b--', 'LineWidth' ,2 )
plot( t, 'ko', 'LineWidth' ,2 )
legend('Constant Model','Linear Model','Data',2)
title('Constant and Linear Models')
% Neural Network Model
R2agoal = 0.98
% For H hidden nodes, the I-H-O node topology yields Nw = (I+1)*H+(H+1)*O
% weights to be estimated by Neq equations. Although typically we just require
% Neq >= Nw, we actually desire Neq >> Nw to mitigate measurement errors.
Hub = floor((Neq-O)/(I+O+1)) % 2
Hmax = 1 % Choice (larger values require a more complicated design)
Ntrials = 10 % Multiple random initial weights mitigate nonconvergence and local minima
rng(0) % Initialize RNG
j= 0
for h = 0:Hmax
j= j+1;
for i = 1:Ntrials
if h == 0 % Linear Model
Nw = (I+1)*O;
net = fitnet([]);
else
Nw = (I+1)*h+(h+1)*O;
net = fitnet(h);
end
net.divideFcn = '';
MSEgoal = (Neq - Nw)*(1-R2agoal)*MSE00/(Neq - Nw00);
net.trainParam.goal = MSEgoal ;
[net tr ] = train(net,x,t);
Nepochs(i,j) = tr.epoch(end);
MSE = tr.perf(end);
R2(i,j) = 1-MSE/MSE00;
if Nw >= Neq
MSEa = NaN;
R2a(i,j) = NaN;
else
MSEa = Neq*MSE/(Neq-Nw);
R2a(i,j) = 1-MSEa/MSE00a;
end
end
end
y = round(net(x)); % convenient choice of solutions (see tabulation below)
plt=plt+1,figure(plt)
hold on
plot( y00, 'k--', 'LineWidth' ,2 )
plot( y0, 'b--', 'LineWidth' ,2 )
plot( y, 'r--', 'LineWidth' ,2 )
plot( t, 'ko', 'LineWidth' ,2 )
legend('Constant Model','Linear Model','NN Model','Data',2)
title('Constant, Linear and NN Models')
H = 0:Hmax
Nepochs = Nepochs
R2 = R2
R2a = R2a
toc
% H = 0 1
%
%
% Nepochs = 2 18
% 2 17
% 2 7 <==
% 2 9
% 2 18
% 2 16 <==
% 2 7 <==
% 2 15
% 2 16
% 2 17
%
% R2 = 0.81931 0.99001
% 0.81931 0.99137
% 0.81931 0.619 <==
% 0.81931 0.99007
% 0.81931 0.99042
% 0.81931 0.14838 <==
% 0.81931 0.65724 <==
% 0.81931 0.9901
% 0.81931 0.99008
% 0.81931 0.99008
%
%
% R2a = 0.71893 0.98002
% 0.71893 0.98273
% 0.71893 0.238 <==
% 0.71893 0.98014
% 0.71893 0.98085
% 0.71893 -0.70323 <==
% 0.71893 0.31447 <==
% 0.71893 0.9802
% 0.71893 0.98016
% 0.71893 0.98017
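To forecast the value asked about in the question with the trained network, a minimal sketch (this assumes "net" holds one of the converged networks from the loop above; inputs are columns because the data matrix z was transposed):
xnew = [60; 50; 30; 50; 60];   % column vector, matching the transposed data
ynew = round(net(xnew))        % NN forecast for the new inputs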
8 Comments
Greg Heath
on 25 Aug 2012
No need to apologize!
I was intrigued by your brute force approach. It truly extends the Confucius adage "Try all, choose best" to "Try many, combine best".
A similar combination of trained network outputs (not weights) is called ensemble averaging in NN jargon. The individual NN training algorithm I used is based on MSE minimization using Levenberg-Marquardt optimization (doc trainlm). The weight initialization is random but constrained to parts of weight space chosen by the Nguyen-Widrow algorithm (initnw). Nevertheless, not all searches converge, and some searches only converge to local optima.
Obviously, my code can be modified to automatically reject solutions that don't achieve the Ra^2 goal within a specified number of search steps and/or to stop when a specified number of acceptable solutions are found. My tabulations show that the successful solutions were achieved within 20 epochs (passes over the data).
I was also intrigued by your use of target ranking instead of actual values. In particular, I am wondering if this can be modified to use as an objective function in a gradient search like LM.
Obviously, for serious algorithm comparisons more data is needed. However, if data is to be simulated, it should have, at least, similar 2nd order summary statistics (e.g., min, mean/median, std, cross-correlations, max) as the true data. Using similar histograms is probably over the top.
Greg
Matt Fig
on 25 Aug 2012
Thanks, Greg. Yes, I too wish we had more data just to see how these things work out. I initially chose such a large sample space because I wasn't sure that a (relatively) unique weight vector would give the correct ranking. So at first I saved all of the weight vectors that gave the correct ranking, then looked at them statistically. It turns out that they all fall within a narrow range (element-wise), which lends further confidence to the approach.
I have been fascinated by the idea of NN algorithms for some time, but never really looked into using them. This little experiment may give me the impetus to read up on them - finally! Though I have a feeling that such a technique is best for more hard-core problems than the present situation. I have the feeling that using NN to solve this particular problem is like using a sledge hammer to crack a walnut, though it is illustrative :-).
Thanks again,
Matt
José-Luis
on 21 Aug 2012
Edited: José-Luis
on 21 Aug 2012
Quick solution, you can try multiple linear regression:
data = rand(10,5); res = rand(10,1);
betahat = regress(res,data); %doc regress for more info
And your predictions:
prediction = score1*betahat(1) + score2*betahat(2) + score3*betahat(3) + score4*betahat(4) + score5*betahat(5); % or, in vector form: prediction = [score1 score2 score3 score4 score5]*betahat
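Applied to the data in the question, a minimal sketch (this assumes regress() from the Statistics Toolbox and the 15-by-6 "data" matrix from the accepted answer; a column of ones is added so the model has an intercept):
X = [ones(15, 1) data(:, 1:5)];              % predictors plus an intercept column
betahat = regress(data(:, 6), X);            % least-squares coefficients
prediction = [1, 60 50 30 50 60] * betahat   % forecast for the inputs asked about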
Cheers!
3 Comments
José-Luis
on 23 Aug 2012
Yes, I don't doubt that ANNs will give a better simulation result. It would be fun to compare the predictive power of the two methods. Either we get more data from the OP or we use some jackknifing/bootstrapping on what he gave us.
Greg Heath
on 24 Aug 2012
Edited: Greg Heath
on 24 Aug 2012
The Linear Model from REGRESS should yield EXACTLY the same results as my Backslash Linear Model ( y0 = W0*[ones(1,N); x ] ) AND my ten Linear ( H =0) NN models. They have been compared (Ra^2 = 0.82 vs 0.72) using the DOF-adjusted R^2 because the same data is used for both training and testing.
A more satisfying unbiased comparison via testing on nondesign data is readily achieved by randomly dividing the data into design (training + validation) and testing subsets.
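A minimal sketch of one such random division (the 70/30 split and the variable names are only assumptions for illustration; N = 15 rows as in the question):
idx = randperm(15);          % random permutation of the 15 rows
designRows = idx(1 : 11);    % design (training + validation) subset, ~70%
testRows = idx(12 : end);    % held-out test subset, ~30%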
I forgot what jackknifing is. Although I have designed and used several bootstrapping codes, they are no longer available to me. Furthermore, it is not an option in the NN Toolbox. Therefore, the easiest way for me to obtain an unbiased estimate of prediction error is to stick with multiple trials using random subset division.
What fractions do you suggest for training/validation/testing? If I use an adjusted R^2 stopping rule for the gradient descent optimization, I can eliminate the validation set.
As demonstrated above, because of random weight initialization, the NN steepest descent optimization is only expected to converge ~ 70% of the time when H = 1.
What trn/tst splits and how many trials per split do you suggest?
Greg
Matt Fig
on 21 Aug 2012
Edited: Matt Fig
on 22 Aug 2012
Here is another way that uses no toolbox. Find the weights....
% First the data.
D = [33 48 47 68 79 6;
26 32 32 53 56 11;
60 17 19 57 59 16;
49 29 13 44 51 23;
0 19 25 44 73 24;
31 9 3 56 63 31;
3 22 24 43 39 37;
-3 4 39 32 52 41;
79 3 7 48 24 51;
9 26 24 49 5 68;
7 -2 0 57 41 75;
93 18 26 0 -2 76;
33 22 39 -1 5 103;
14 26 7 31 11 114;
17 -2 19 1 40 138];
% And the value we want to predict
P = [60 50 30 50 60];
% P = [100 100 100 100 100] % Check we should get number 1
% Now find our weights! We only need to do this once!
% Once the M is found, we can use it to predict as many times
% as needed. See below to understand how to use M.
S = (D(:,1:5)+33.33333)/133.33333; % Scale from 0 to 1.
R = D(:,6);
T = (1:15).';
cnt = 0;
C = zeros(1,5);
for jj=1:30000
V = rand(1,5);
[~,I] = sort(sum(bsxfun(@times,S,V),2)/5,'descend');
if sum(T~=I)==0
C = C + V;
cnt = cnt + 1;
end
end
M = C/(cnt); % This we use to predict. Calculate once, store.
X = mean(bsxfun(@times,S,M),2); % For interp1
% Just to check and show we are close here(borrowed from IA.)
for ii = 1:size(S,1)
Y = interp1(X,R,mean(S(ii,:).*M),'linear');
fprintf('For row #%d, predicted value = %.1f, observed value = %.1f\n',...
ii, Y, R(ii));
end
% So, it works. Now for your P. This is how to use M.
% Once M is found we can do this for any P.
Y = round(interp1(X,R,mean((P+33.33333)/133.33333.*M),'linear'));
if isnan(Y) % In case Y is outside the range of outputs in the data, [6 138]
Y = round(interp1(X,R,mean((P+33.33333)/133.33333.*M),...
'linear','extrap')); % Extrapolate
Y = max(Y,1);
end
1 Comment
Greg Heath
on 26 Aug 2012
Matt,
I took a closer look at your approach and have several comments.
1. Depending on the RNG seed, only about 5 to 20 out of 30,000 weight vectors are chosen.
2. It appears that the nearest neighbor option for interp1 yields the best extrapolation results.
3. Correlations seem to be important. Only C45, C46 and C56 are significant.
4. Simulated data should probably be generated from a distribution with the same mean and covariance. A truncated Gaussian is probably the easiest.
close all, clear all, clc, plt = 0;
% >First the data.
D = [33 48 47 68 79 6;
26 32 32 53 56 11;
60 17 19 57 59 16;
49 29 13 44 51 23;
0 19 25 44 73 24;
31 9 3 56 63 31;
3 22 24 43 39 37;
-3 4 39 32 52 41;
79 3 7 48 24 51;
9 26 24 49 5 68;
7 -2 0 57 41 75;
93 18 26 0 -2 76;
33 22 39 -1 5 103;
14 26 7 31 11 114;
17 -2 19 1 40 138];
% >And the value we want to predict
P = [60 50 30 50 60];
S = (D(:,1:5)+33.33333)/133.33333; % Scale from 0 to 1.
R = D(:,6);
T = (1:15).';
rng(0)
n = 0
M = zeros(5,1);
for jj=1:30000
V = rand(5,1)/5;
[ ~, I ] = sort( S*V,'descend');
if I == T
n = n+1;
M = M + (V-M)/n;
end
end
n = n % 12 = Number of successful weight vectors
X = S*M
YS = round(interp1( X, R, S*M , 'linear' ));
MSE = mse(R-YS) %0
Ptst = (P+33.33333)/133.33333;
YS = round(interp1( X, R, Ptst*M , 'linear' )) % 9
Stst = [ min(S); mean(S); median(S); max(S) ]
Ytst(:,1) = round(interp1( X, R, Stst*M , 'nearest' )) ;
Ytst(:,2) = round(interp1( X, R, Stst*M , 'nearest' ,'extrap')); % Best Extrapolator
Ytst(:,3) = round(interp1( X, R, Stst*M , 'linear' )) ;
Ytst(:,4) = round(interp1( X, R, Stst*M , 'linear' ,'extrap')) ;
Ytst(:,5) = round(interp1( X, R, Stst*M , 'spline' )) ;
Ytst(:,6) = round(interp1( X, R, Stst*M , 'spline' ,'extrap')) ;
Ytst(:,7) = round(interp1( X, R, Stst*M , 'cubic' )) ;
Ytst(:,8) = round(interp1( X, R, Stst*M , 'cubic' ,'extrap')) ;
Ytst = Ytst
% Ytst =
%
% NaN 138 NaN 246 2126 2126 -1628 -1628
% 37 37 35 35 35 35 35 35
% 31 31 32 32 32 32 32 32
% NaN 6 NaN 4 315 315 6 6
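Regarding point 4 above, a minimal sketch of simulating extra rows with the same mean and covariance as the given data (this assumes mvnrnd from the Statistics Toolbox; the clipping step is only a crude stand-in for a properly truncated Gaussian):
mu = mean(D, 1);                               % 1-by-6 sample mean of the original data
Sigma = cov(D);                                % 6-by-6 sample covariance
Dsim = mvnrnd(mu, Sigma, 100);                 % 100 simulated rows
Dsim = min(max(Dsim, min(D(:))), max(D(:)));   % crude truncation to the observed range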
Alex Sha
on 3 Jun 2019
Use the model function below:
y = p1+p2*x1*x2+p3*x1*x3+p4*x1*x4+p5*x1*x5+p6*x2*x3+p7*x2*x4+p8*x2*x5+p9*x3*x4+p10*x3*x5+p11*x4*x5+p12*x1+p13*x2+p14*x3+p15*x4;
Root of Mean Square Error (RMSE): 2.11832321999897E-13
Sum of Squared Residual: 6.73093989658018E-25
Correlation Coef. (R): 1
R-Square: 1
Adjusted R-Square: 1
Determination Coef. (DC): 1
Chi-Square: 2.0707595735892E-26
F-Statistic: 0
Parameter Best Estimate
---------- -------------
p1 380.808647233767
p2 0.091333306284959
p3 0.0786391054081521
p4 0.135391754934587
p5 -0.0976320653522757
p6 -0.0232045598710267
p7 0.0743731650301964
p8 -0.011603371246762
p9 0.0745880654094293
p10 -0.0816092155858619
p11 0.0424039517835738
p12 -5.41887696727284
p13 -4.56307328623204
p14 -2.83539663209154
p15 -6.89992749398479
For prediction: with the input of (60 50 30 50 60), y=76.227553787907
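As a check, a minimal sketch of evaluating this fitted model for the same inputs, using the parameter estimates listed above:
p = [380.808647233767; 0.091333306284959; 0.0786391054081521; ...
     0.135391754934587; -0.0976320653522757; -0.0232045598710267; ...
     0.0743731650301964; -0.011603371246762; 0.0745880654094293; ...
     -0.0816092155858619; 0.0424039517835738; -5.41887696727284; ...
     -4.56307328623204; -2.83539663209154; -6.89992749398479];   % p1..p15 from the table
x = [60 50 30 50 60];
terms = [1, x(1)*x(2), x(1)*x(3), x(1)*x(4), x(1)*x(5), x(2)*x(3), ...
         x(2)*x(4), x(2)*x(5), x(3)*x(4), x(3)*x(5), x(4)*x(5), ...
         x(1), x(2), x(3), x(4)];
y = terms * p   % should reproduce the quoted prediction of about 76.23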