Convolutional neural network regression - low RMSE but poor prediction performance?

Dear community,
I'm using a convolutional neural network to predict a numerical output from one-hot encoded matrices. After a long time, I finally managed to configure a network that doesn't seem to overfit my data, judging by the learning curves:
[Figure: learning curves]
Happy with this result, I used the trained network to predict the test set. However, I get extremely poor performance:
[Figure: test set predictions]
Now I'm totally puzzled. I expected that such similarly low training and validation RMSE values would give nearly perfect predictions on the test set, but it seems my network just memorizes the training data. But shouldn't I then see the typical overfitting learning curves? Could the problem be related to the quality of the data?
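One way to put an RMSE value in context is to compare it with a trivial baseline that always predicts the mean of the training targets, and to look at R^2 on the test set. The sketch below is only an illustration on synthetic stand-in data: Ytrain, Ytest and Ypred are placeholder names, and in practice Ypred would come from something like predict(net,Xtest).

```matlab
% Synthetic stand-ins (assumptions, not the data from this thread).
rng(0);
Ytrain = 5 + 0.5*randn(200,1);   % hypothetical training targets
Ytest  = 5 + 0.5*randn(50,1);    % hypothetical test targets
Ypred  = 5 + 0.1*randn(50,1);    % predictions that hover around the mean
                                 % and carry no information about Ytest

% Root-mean-square error between targets and predictions
rmseFcn = @(y, yhat) sqrt(mean((y - yhat).^2));

% RMSE of the model versus a constant "predict the training mean" baseline
rmseModel    = rmseFcn(Ytest, Ypred);
rmseBaseline = rmseFcn(Ytest, mean(Ytrain)*ones(size(Ytest)));

% Coefficient of determination on the test set
r2 = 1 - sum((Ytest - Ypred).^2) / sum((Ytest - mean(Ytest)).^2);

fprintf('Model RMSE: %.3f | baseline RMSE: %.3f | R^2: %.3f\n', ...
    rmseModel, rmseBaseline, r2);
```

If the model's RMSE is close to the baseline's, or R^2 is near zero, a low-looking RMSE mostly reflects a narrow target range rather than real predictive skill.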
As supplementary information, here's my code:
% network configuration
filtersize=[3,3];
layer=[imageInputLayer(inputsize,"Normalization","none")
convolution2dLayer(filtersize,8,'Stride',1,"Padding","same")
reluLayer
maxPooling2dLayer([2,2],'Stride',2)
convolution2dLayer(filtersize,8,'Stride',1,"Padding","same")
reluLayer
maxPooling2dLayer([2,2],'Stride',2)
convolution2dLayer(filtersize,16,'Stride',1,"Padding","same")
reluLayer
maxPooling2dLayer([2,2],'Stride',1)
convolution2dLayer(filtersize,32,'Stride',1,"Padding","same")
reluLayer
maxPooling2dLayer([2,2])
convolution2dLayer(filtersize,64,'Stride',1,"Padding","same")
reluLayer
maxPooling2dLayer([2,2])
flatLayer % this is a flatten layer imported from Keras
dropoutLayer(.65)
fullyConnectedLayer(2048)
dropoutLayer(.65)
fullyConnectedLayer(128)
fullyConnectedLayer(32)
reluLayer
fullyConnectedLayer(1)
regressionLayer];
lgraph=layerGraph(layer);
% training options
miniBatchSize=8;
options = trainingOptions('adam', ...
'MaxEpochs',4096,...
'MiniBatchSize',miniBatchSize, ...
'Shuffle','every-epoch', ...
"OutputNetwork",'best-validation-loss',...
'Plots','training-progress', ...
'Verbose',0,...
'VerboseFrequency',200,...
"ValidationData",{Xval,Yval},...
'ValidationFrequency',32,...
'ValidationPatience',128,...
'InitialLearnRate',1e-4,...
'LearnRateSchedule','none',...
'GradientDecayFactor',.7,...
'L2Regularization',10^(-4),...
'ExecutionEnvironment','auto');
Thanks for any kind of help.

Answers (1)

Sai Pavan on 27 Sep 2023
Hi Daniel,
I understand that you want to know why your CNN regression model performs poorly on the test set despite low RMSE values on the training set.
As you rightly hypothesized, the most likely reason for this behaviour, where the model doesn't appear to overfit the training and validation datasets but performs poorly on the test set, is a mismatch between the training and test data distributions. It's crucial to have a balanced, representative test dataset that follows a distribution similar to that of the training dataset and captures the true variability of the real-world scenarios you want your model to perform well on.
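A quick way to check for such a mismatch is to compare summary statistics of the training and test targets. The following is a minimal sketch on synthetic data, where Ytrain and Ytest are placeholder names and the test targets are deliberately shifted; it is not a definitive test, only a first diagnostic.

```matlab
% Synthetic stand-ins (assumptions): replace with your own target vectors.
rng(1);
Ytrain = randn(500,1);           % hypothetical training targets
Ytest  = 1.5 + randn(100,1);     % hypothetical test targets, deliberately shifted

% Summary statistics per split: [mean, std, min, max]
stats  = @(y) [mean(y), std(y), min(y), max(y)];
sTrain = stats(Ytrain);
sTest  = stats(Ytest);

% Difference between the means, in units of the training standard deviation
meanShift = abs(sTest(1) - sTrain(1)) / sTrain(2);

% Fraction of test targets that fall outside the range seen during training
fracOutside = mean(Ytest < sTrain(3) | Ytest > sTrain(4));

fprintf('Train: mean %.2f, std %.2f | Test: mean %.2f, std %.2f\n', ...
    sTrain(1), sTrain(2), sTest(1), sTest(2));
fprintf('Standardized mean shift: %.2f | test targets outside training range: %.1f%%\n', ...
    meanShift, 100*fracOutside);

% If the Statistics and Machine Learning Toolbox is available, a two-sample
% Kolmogorov-Smirnov test is a more formal check:
% [h, p] = kstest2(Ytrain, Ytest);
```

A large standardized shift, or many test targets outside the training range, suggests that the splits were not drawn from the same distribution; re-splitting the data after a random shuffle (for example with randperm) is one way to address this. The same comparison can be applied to the inputs, for instance to the per-position frequencies of the one-hot encoded matrices.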
Hope it helps.
Regards,
Sai Pavan

Release

R2021b
