NaN in Neural network training and simulation; tonndata

Hello,
I have two questions. Thank you very much for any input and ideas! 1) How should NaN be handled in neural network training and simulation? My datasets are as follows. The input dataset is a 6*204 matrix with several NaNs. The output dataset is a 6*204 matrix with many NaNs. My simulation dataset is a 6*864000 matrix with many NaNs. I used the nntool GUI to train the network and run the simulation, but the simulation results contain numbers even for simulation samples that are all NaN. Is there a way to make sure NaN is not replaced by anything, so that it stays NaN during training and simulation?
2) When I use the cell array generated by tonndata, the neural network treats the cell array as one sample, so there is no way to divide the samples into training, validation, and test data. Can anyone explain why cell arrays are used in neural networks?
I googled but could not find a good answer or documentation about these two issues. Thank you very much for any input and help!
  3 Comments
Rong Yu on 19 Jun 2015
Hi Greg,
Thank you very much for your comments!
An example of the data is below. Entries with no value appear in the input, target, and simulation datasets. So I am wondering whether the neural network will treat NaNs as "no value" during training and simulation, and whether the trained network can predict "no value" entries, since there are lots of them in my target datasets.
INPUT = [47.1166687 12.54249954 12.49429989 35.65570068 37.42900085;
         970 5 38 1200 545;
         0.308835652 NaN 0.340448478 NaN 0.361919623;
         920.7548497 1479.826823 1509.006378 834.2235713 765.1860076;
         141.0118587 240.6891341 244.0491343 195.1695791 173.674052;
         3.242222171 27.43966675 27.58938853 12.09336166 13.20530582];
TARGET = [2.405 NaN NaN NaN NaN;
          NaN NaN 1.2 NaN NaN;
          NaN NaN 0.676 0.949 1.559;
          NaN NaN NaN -1.605 NaN;
          NaN 1.5 NaN NaN NaN;
          1.067 NaN NaN NaN NaN];
SIMU_DATA = [30.1250038146973 30.3750038146973 30.6250038146973;
             921.869995117188 994.147766113281 1023.59332275391;
             0.0893665217391298 NaN NaN;
             139.424161800494 138.530641879061 141.447721067883;
             242.061606131660 242.608550919427 243.153662321303;
             19.1725284152561 18.4053618537055 18.0085001627604];
Thank you very much!!!~~~
Rong Yu on 19 Jun 2015
Thank you, Greg, for your time and help! Would you mind trying it again? I copied INPUT, TARGET, and SIMU_DATA from here and pasted them directly into the MATLAB command window, and they work fine. INPUT and TARGET are 6*5 double matrices; SIMU_DATA is a 6*3 double matrix.
Thank you!~~~
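One possible way to get the behaviour asked about in question 1 (simulation outputs stay NaN wherever the simulation sample contains NaN) is to simulate only the complete columns and leave the rest as NaN. The following is a minimal sketch under stated assumptions, not the toolbox's own mechanism: `predictFcn` is a hypothetical stand-in for the trained network (with the toolbox it might be `@(X) net(X)`), and the small matrix `X` is made-up data.

```matlab
% Hypothetical stand-in for a trained network; an assumption for illustration.
% With the Neural Network Toolbox this could be predictFcn = @(X) net(X).
predictFcn = @(X) sum(X, 1);

% Made-up simulation data: one sample per column, NaN marks a missing value.
X = [1 NaN 3;
     4 5   6];

% Columns (samples) that contain no NaN at all.
valid = ~any(isnan(X), 1);

% Simulate only the complete samples.
Yvalid = predictFcn(X(:, valid));

% Start from an all-NaN output and fill in the simulated columns.
Y = nan(size(Yvalid, 1), size(X, 2));
Y(:, valid) = Yvalid;

disp(Y)
```

With this masking, any sample that has at least one NaN input yields an all-NaN output column, regardless of how the network itself would treat NaN inputs.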


Accepted Answer

Greg Heath on 22 Jun 2015
Do not refer to NaN as "No value". It stands for "Not a Number" and is simply referred to as NaN, pronounced "en-ay-en".
close all, clc % 'clear all' omitted here: it would erase INPUT and TARGET
x = INPUT; t = TARGET; % data posted in the comments on the question
[ I, N ] = size(x) % [ 6 5 ]
[ O, N ] = size(t) % [ 6 5 ]
net = fitnet; % default H = 10
net.trainFcn = 'trainbr'; % Bayesian regularization to mitigate overtraining
net.divideFcn = 'dividetrain'; % Not much data
Hub = -1+ceil((N*O-O)/(I+O+1)) % = 1, so H = 10 is overfitting: need overtraining mitigation
rng('default')
for i = 1:20
    net = configure(net,x,t); % reinitialize the weights for each trial
    [ net, tr ] = train(net,x,t);
    y = net(x);
    stopcrit{i,1} = tr.stop; % stopping criterion of trial i
    MSE(i,1) = mse(t-y);
end
lasty = y
% lasty = [  2.405    2.405    2.405    2.405    2.405
%            1.2      1.2      1.2      1.2      1.2
%            1.6819   NaN      0.676    NaN      1.559
%           -1.605   -1.605   -1.605   -1.605   -1.605
%            1.5      1.5      1.5      1.5      1.5
%            1.067    NaN      1.9648   NaN      1.1768 ]
stopcrit1 = stopcrit{1}
% = Minimum gradient reached.
stopcrit % every one of the 20 entries matches stopcrit1
MSEp = MSE'
% MSEp = 1e-17 * [ ...
%   104.51    6.77    8.44  103.22  200.35    0.30    0.14
%   131.41  177.22  195.06  832.83    0.55    0.22    0.89
%     0.87    0.85  583.46  430.18    0.54    0.33 ]
  3 Comments
Greg Heath on 23 Jun 2015
Don't skip the calculation. It helps set an upper bound on the search for an optimal H. For example, it is desirable to have H << Hub.
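The bound can be worked through by hand. The sketch below assumes, as a reading of the formula rather than something stated in the thread, that Hub comes from requiring no more unknown weights than training equations for a single-hidden-layer I-H-O network: Nw = (I+1)*H + (H+1)*O <= Ntrneq = N*O. The names `Ntrneq` and `Nw` are labels introduced here for illustration.

```matlab
I = 6;   % input dimension (rows of INPUT)
O = 6;   % target dimension (rows of TARGET)
N = 5;   % number of samples (columns)

% Number of training equations, assuming every target element is known.
Ntrneq = N*O;

% Largest integer H strictly below (Ntrneq - O)/(I + O + 1).
Hub = -1 + ceil((Ntrneq - O)/(I + O + 1));

% Weight count (including biases) for the default H = 10 and for H = Hub.
H = 10;
Nw    = (I + 1)*H   + (H + 1)*O;
NwHub = (I + 1)*Hub + (Hub + 1)*O;

fprintf('Ntrneq = %d, Hub = %d, Nw(H=%d) = %d, Nw(H=Hub) = %d\n', ...
        Ntrneq, Hub, H, Nw, NwHub);
```

For the 6*5 example this gives Hub = 1, while H = 10 needs 136 weights against only 30 equations. Since many of the targets are NaN, the number of usable equations is presumably even smaller than N*O, which would tighten the bound further.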


More Answers (1)

Eric Lin on 19 Jun 2015
  1 Comment
Rong Yu on 19 Jun 2015
Thank you, Eric! You are right. The link you shared answered part of my question. It looks like the Neural Network Toolbox fills missing values in the input data with values calculated by a specific function (such as averages). Do you know whether missing values in the target data are handled the same way? Thank you!
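As a rough illustration of what filling missing inputs with averages could look like, here is a minimal sketch in plain MATLAB. It is not the toolbox's own code: the choice of per-row means, the variable names, and the small matrix `X` are all assumptions made for illustration.

```matlab
% Made-up input data: one variable per row, one sample per column.
X = [1   2   3 4;
     4   NaN 6 5;
     NaN 2   3 NaN];

missing = isnan(X);

% Row means computed over the known entries only.
Xzero = X;
Xzero(missing) = 0;
rowMeans = sum(Xzero, 2) ./ sum(~missing, 2);

% Replace each NaN with the mean of its own row.
% find() and logical indexing both walk the matrix in column-major order,
% so the row indices line up with the entries being replaced.
[rowIdx, ~] = find(missing);
Xfilled = X;
Xfilled(missing) = rowMeans(rowIdx);

disp(Xfilled)
```

A row with no known entries would stay NaN under this scheme, because its mean is 0/0.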

