Transfer learning of various deep neural networks: validation accuracy is significantly lower than training accuracy

So just an overview of what I am trying to achieve: I am transfer learning several of MATLAB's pre-trained deep neural networks onto a brain tumour dataset, to see which gives the highest accuracy.
I am using a brain tumour dataset where the training set contains around 1200 images and the test set around 300, split equally across the 3 different classes in both.
The method I have used is the transfer learning code provided by MathWorks (here: https://uk.mathworks.com/help/deeplearning/ug/transfer-learning-using-pretrained-network.html), and I have swapped out "net = googlenet" for the various models that I have chosen.
The issue I am facing is that while the training accuracy is very high, the validation accuracy is extremely low in comparison. Here is the training progress of ResNet-50, which achieved the highest accuracy out of all the models I have transfer learned.
As you can see, the training accuracy is very good while the validation accuracy is significantly lower; the same is true for the loss as well.
I am fairly sure the issue is overfitting, but I do not know how to solve it. I understand that data augmentation or dropout layers could help prevent overfitting, but I have no idea how to implement them in this transfer learning context.
If anyone could help me get the validation accuracy and loss to properly track the training accuracy and loss, I would be extremely grateful.
Thanks in advance, and if any of my actual code is needed to solve this problem, I can provide it, no problem.
  2 Comments
Christopher Erickson on 24 Mar 2023
Using dropout in this context can be very straightforward: simply add dropoutLayers to your layer graph. I am supposing from your question that you are applying transfer learning in the Deep Network Designer, in which case please refer to the augmentation options referenced in the transfer learning page. Before attempting either of these, however, I would suggest seeing what happens if you just train for a longer period of time. Phenomena such as grokking or double descent might occur, so don't necessarily assume that a low training loss indicates that your model is done learning.
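If you do go the dropout route, note that addLayers on its own only adds a disconnected node to the graph; you also have to wire it in with connectLayers. Here is a rough sketch of splicing a dropout layer into a layer graph. The layer names are placeholders: 'new_fc' follows the naming in the MathWorks example, and 'prev_layer' stands for whatever layer currently feeds it, which you can look up in lgraph.Connections or with analyzeNetwork(lgraph).
% Break the existing connection, then route it through the dropout layer.
% 'prev_layer' is a placeholder -- check lgraph.Connections for the real name.
lgraph = disconnectLayers(lgraph,'prev_layer','new_fc');
lgraph = addLayers(lgraph,dropoutLayer(0.5,'Name','new_dropout'));
lgraph = connectLayers(lgraph,'prev_layer','new_dropout');
lgraph = connectLayers(lgraph,'new_dropout','new_fc');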
Ted on 27 Mar 2023
Hi,
Thanks for the help with your response. To answer your first question: no, I am not using Deep Network Designer; I am using the full code provided here: Transfer Learning Using Pretrained Network - MATLAB & Simulink - MathWorks United Kingdom, to complete this task manually. I have provided my code at the end of this response for reference.
  1. For your point on training for a longer time to see what happens, would I achieve this just by running it with more epochs, or is there another method?
  2. For your point on simply adding dropoutLayers, I have tried this since you posted your response, but it produces many errors that I cannot understand. I have tried to add additional code to insert a dropout layer, but I am unsure if I am implementing it correctly, as it throws a range of errors. The dropout code I used was:
newDropoutLayer = dropoutLayer(0.5, 'Name', 'new_dropout');
lgraph = addLayers(lgraph, newDropoutLayer);
Please see the full code below for how this fits in with the rest of the code. One error I got says that layer "new_dropout" is disconnected, but I don't know how to solve this, as I thought the addLayers command already connected it.
Full code:
unzip('train.zip');
imdsTrain = imageDatastore('train\','IncludeSubfolders',true, 'LabelSource', 'foldernames');
unzip('test.zip') ;
imdsTest = imageDatastore('test\', 'IncludeSubfolders', true, 'LabelSource', 'foldernames');
net = googlenet;
inputSize = net.Layers(1).InputSize;
lgraph = layerGraph(net);
[learnableLayer,classLayer] = findLayersToReplace(lgraph);
numClasses = numel(categories(imdsTrain.Labels));
% Replace Fully Connected Layer
if isa(learnableLayer,'nnet.cnn.layer.FullyConnectedLayer')
    newLearnableLayer = fullyConnectedLayer(numClasses, ...
        'Name','new_fc', ...
        'WeightLearnRateFactor',10, ...
        'BiasLearnRateFactor',10);
elseif isa(learnableLayer,'nnet.cnn.layer.Convolution2DLayer')
    newLearnableLayer = convolution2dLayer(1,numClasses, ...
        'Name','new_conv', ...
        'WeightLearnRateFactor',10, ...
        'BiasLearnRateFactor',10);
end
lgraph = replaceLayer(lgraph,learnableLayer.Name,newLearnableLayer);
% Replace Classification Layer
newClassLayer = classificationLayer('Name','new_classoutput');
lgraph = replaceLayer(lgraph,classLayer.Name,newClassLayer);
% Add Dropout Layer
newDropoutLayer = dropoutLayer(0.5, 'Name', 'new_dropout');
lgraph = addLayers(lgraph, newDropoutLayer);
layers = lgraph.Layers;
connections = lgraph.Connections;
layers(1:10) = freezeWeights(layers(1:10));
lgraph = createLgraphUsingConnections(layers,connections);
pixelRange = [-30 30];
scaleRange = [0.9 1.1];
imageAugmenter = imageDataAugmenter( ...
    'RandRotation',[-10 10], ...
    'RandXReflection',true, ...
    'RandXTranslation',pixelRange, ...
    'RandYTranslation',pixelRange, ...
    'RandXScale',scaleRange, ...
    'RandYScale',scaleRange);
augimdsTrain = augmentedImageDatastore(inputSize(1:2),imdsTrain,'DataAugmentation',imageAugmenter);
augimdsValidation = augmentedImageDatastore(inputSize(1:2),imdsTest);
miniBatchSize = 10;
valFrequency = floor(numel(augimdsTrain.Files)/miniBatchSize);
options = trainingOptions('sgdm', ...
    'MiniBatchSize',miniBatchSize, ...
    'MaxEpochs',6, ...
    'InitialLearnRate',3e-4, ...
    'Shuffle','every-epoch', ...
    'ValidationData',augimdsValidation, ...
    'ValidationFrequency',valFrequency, ...
    'Verbose',false, ...
    'Plots','training-progress');
net = trainNetwork(augimdsTrain,lgraph,options);
outputDir = "C:\Documents\Transfer Learning\GoogLeNet";
outputFile = fullfile(outputDir, "GoogLeNet.mat");
save(outputFile, "net");


Answers (1)

Sandeep on 27 Mar 2023
Hi Ted,
It's great that you're working on transfer learning for a brain tumor dataset. Overfitting is indeed a common issue in deep learning models, and data augmentation and dropout layers can both help prevent it.
Data augmentation helps by artificially increasing the size of the dataset and creating more variation in the training images. You can use MATLAB's built-in imageDataAugmenter and augmentedImageDatastore functions to apply random transformations to the images, such as rotation, scaling, and flipping.
Here is an example of how to use the augmentedImageDatastore function:
augmenter = imageDataAugmenter('RandRotation',[-10 10],'RandXReflection',true,'RandYReflection',true);
trainingData = augmentedImageDatastore(imageSize, trainData, 'DataAugmentation', augmenter);
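Here imageSize and trainData stand for your network's input size and your training image datastore. One useful property of this approach is that the transformations are applied on the fly as each mini-batch is read, so the network sees a newly transformed version of every image each epoch, rather than a fixed, enlarged copy of the dataset on disk.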
Dropout randomly sets some of the activations in the network to zero during training, which helps prevent the network from relying too heavily on any single input or feature. You can add dropout layers to your network using the dropoutLayer function.
An example for the dropoutLayer function is as follows:
layers = [
    % Add some layers from the pretrained network
    dropoutLayer(0.5)
    fullyConnectedLayer(numClasses)
    softmaxLayer
    classificationLayer];
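One caveat: the bracketed-array syntax above only builds a simple chain of layers, and GoogLeNet is a DAG network, so in your case you would instead splice the dropout layer into the layerGraph with disconnectLayers/connectLayers, as sketched in the comments above. Alternatively, GoogLeNet already ships with a dropout layer just before its classifier, so you may be able to simply raise its probability in place with replaceLayer. A minimal sketch, assuming the built-in layer is named 'pool5-drop_7x7_s1' (verify the exact name with analyzeNetwork(lgraph)):
% Replace GoogLeNet's built-in dropout layer with a stronger one.
% The layer name is an assumption -- confirm it with analyzeNetwork(lgraph).
lgraph = replaceLayer(lgraph,'pool5-drop_7x7_s1', ...
    dropoutLayer(0.5,'Name','pool5-drop_7x7_s1'));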
  1 Comment
Ted on 27 Mar 2023
Hi Sandeep,
The points you have made are very insightful and helpful to my issue, so thank you.
However, as you can see from my full code in the comment above, I have already implemented some of the augmentation techniques.
You can also see there how I have replaced the Fully Connected and Classification layers. Alongside this I have tried to add a dropout layer with additional code, such as:
newDropoutLayer = dropoutLayer(0.5, 'Name', 'new_dropout');
lgraph = addLayers(lgraph, newDropoutLayer);
But this throws many errors that I cannot solve, so I am very interested in implementing it your way, but unsure how, since I have built up the layers by a different method, as you can see in my full code.
My full code is the same as posted in my reply to Christopher Erickson above.

