Issues: Training CNN on LFW database.

Shammakh Naseer on 9 Jan 2021
Answered: Jack Xiao on 22 Feb 2021
I am working on a personal project to learn about CNNs. I have been using transfer learning to train a few CNNs on a combination of the "Labeled Faces in the Wild" (LFW) and AT&T face databases, and I want to discuss the results.
I took 100 individuals from LFW and all 40 from the AT&T database, and used 75% of the images for training and the rest for validation.
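A minimal sketch of a 75/25 split done per identity, so that every person appears in both the training and the validation set. The folder layout (one subfolder per person) and the names `personA`/`personB` are assumptions made only for illustration; the tiny random images stand in for the real LFW and AT&T files.

```matlab
% Build a tiny stand-in dataset: 2 identities x 8 images (hypothetical names).
rootDir = tempname;
people = {'personA', 'personB'};
for p = 1:numel(people)
    mkdir(fullfile(rootDir, people{p}));
    for k = 1:8
        img = uint8(randi(255, 32, 32, 3));   % random RGB placeholder image
        imwrite(img, fullfile(rootDir, people{p}, sprintf('img%02d.png', k)));
    end
end

% Label each image by the name of the folder it sits in.
imds = imageDatastore(rootDir, ...
    'IncludeSubfolders', true, 'LabelSource', 'foldernames');

% 75% of each identity's images go to training, the rest to validation.
[imdsTrain, imdsVal] = splitEachLabel(imds, 0.75, 'randomized');

disp(countEachLabel(imdsTrain));
disp(countEachLabel(imdsVal));
```

Splitting per label rather than over the pooled images matters most for identities with only a handful of pictures, which could otherwise end up entirely in one set.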
I also lack a proper understanding of the relationship between the number of parameters in a CNN and its number of layers, so could someone please clarify it? I think you will see where I am getting confused once I explain the data I have.
I first trained AlexNet on it and got this plot.
AlexNet has very few layers and is a small, light net (even though it has a lot of parameters), which is why I think it underfit the data?
I trained ResNet-50 on it and got a similar result, so I believe it also underfit the data? But this one fluctuates and sometimes reaches 100% training accuracy, so maybe it did not underfit?
I also trained Inception-ResNet-v2 on the data and got this result. I am not sure what is going on here.
I wanted to take a closer look, so I trained it again with a lower learning rate just to make sure that was not the cause. Could this be attributed to the mini-batch size?
I also trained EfficientNet on this data. It reached and pretty much stayed at 100% training accuracy with a constant 70% validation accuracy. Was that overfitting, or is it acceptable?
The networks that gave the best results were Xception and DenseNet, which had 100% training accuracy and 80% validation accuracy. I think DenseNet overfit, but I am not sure. Perhaps Xception did too?
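One rough way to read these numbers is the gap between training and validation accuracy. The sketch below uses only the accuracies quoted above; the two thresholds are hypothetical values picked for illustration, not established cut-offs.

```matlab
% Final accuracies reported in the post (percent).
names    = {'efficientnet', 'xception/densenet'};
trainAcc = [100 100];
valAcc   = [70 80];

gap = trainAcc - valAcc;   % generalization gap in percentage points

% Hypothetical thresholds, chosen only for illustration.
lowTrainThreshold = 90;    % below this, training accuracy itself is poor
gapThreshold      = 10;    % above this, the gap suggests overfitting

for i = 1:numel(names)
    if trainAcc(i) < lowTrainThreshold
        verdict = 'possible underfitting';
    elseif gap(i) > gapThreshold
        verdict = 'possible overfitting';
    else
        verdict = 'reasonable fit';
    end
    fprintf('%s: train %d%%, val %d%%, gap %d -> %s\n', ...
        names{i}, trainAcc(i), valAcc(i), gap(i), verdict);
end
```

Under these assumed thresholds both cases land in the "possible overfitting" branch: the training set is fit perfectly while validation lags by 20 to 30 points. Underfitting would instead show up as low training accuracy.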
Can someone explain these results and suggest improvements, please?
Edit #1
I forgot to mention that the LFW database sometimes has two faces in one picture (in very few pictures, though) and a good number of people who look similar. The validation accuracy is most likely stuck around 80% because of that. During my testing, I found that for a few images the network gave an output based on the face in the background, and sometimes it could not distinguish between two different people who looked similar.
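A confusion matrix is one way to check which identities get mixed up. The sketch below uses made-up labels for three hypothetical people (`anna`, `beth`, `carl`) so it runs on its own; with the trained network, the predictions would presumably come from something like `classify(myNet1, augimdsValidation)` and the true labels from `imda2.Labels`, assuming `imda2` is a labeled image datastore.

```matlab
% Hypothetical true and predicted labels for 3 identities.
classes = {'anna', 'beth', 'carl'};
yTrue = categorical({'anna','anna','anna','beth','beth','beth','carl','carl'}, classes);
yPred = categorical({'anna','beth','beth','beth','beth','anna','carl','carl'}, classes);

% Confusion matrix: rows = true identity, columns = predicted identity.
n = numel(classes);
C = accumarray([double(yTrue(:)) double(yPred(:))], 1, [n n]);

% Zero the diagonal so only mistakes remain, then find the largest entry.
errors = C - diag(diag(C));
[count, idx] = max(errors(:));
[r, c] = ind2sub(size(errors), idx);
fprintf('Most frequent mistake: %s predicted as %s (%d times)\n', ...
    classes{r}, classes{c}, count);
```

If most validation errors concentrate in a few pairs of look-alike people or in the multi-face images, that would support the explanation above more directly than the overall accuracy does.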
CODE EXPLANATION
% Augmentation ranges: translation in pixels, scale as a factor
pixelRange = [-10 10];
scaleRange = [0.5 1.5];
% Random reflection, translation, scaling, and rotation (degrees)
imageAugmenter = imageDataAugmenter( ...
    'RandXReflection',true, ...
    'RandXTranslation',pixelRange, 'RandYTranslation',pixelRange, ...
    'RandXScale',scaleRange, 'RandYScale',scaleRange, ...
    'RandRotation',[-45 45]);
%==========================================================================
% Input size expected by the first layer of g
inputSize = g.Layers(1).InputSize;
% Resize images in both training and validation sets (different folders);
% only the training set is augmented
augimdsTrain = augmentedImageDatastore(inputSize(1:2),imda, ...
    'DataAugmentation',imageAugmenter);
augimdsValidation = augmentedImageDatastore(inputSize(1:2),imda2);
%==========================================================================
miniBatchSize = 20;
valFrequency = 80;   % validate every 80 iterations
opts = trainingOptions('sgdm', ...
    'InitialLearnRate',0.0004, ...
    'LearnRateSchedule','piecewise', ...
    'LearnRateDropPeriod',7, ...
    'LearnRateDropFactor',0.4, ...
    'ExecutionEnvironment','gpu', ...
    'WorkerLoad',1, ...
    'Shuffle','every-epoch', ...
    'ValidationData',augimdsValidation, ...
    'ValidationFrequency',valFrequency, ...
    'MaxEpochs',200, ...
    'MiniBatchSize',miniBatchSize, ...
    'Plots','training-progress', ...
    'CheckpointPath','./DCHK');
myNet1 = trainNetwork(augimdsTrain,lg,opts)
This was my code for training all the networks.
I only adjusted the learning rates and the batch sizes, and the batch sizes only so that training would fit on my GPU.
The learning rates above were for AlexNet.
I increased the learning rate and drop period a bit for the deeper nets; for example, for Xception the initial learning rate was 0.001 and the drop period was 10.
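To see what these schedule settings imply, here is a small sketch that tabulates the learning rate per epoch for the AlexNet values above. It assumes the piecewise schedule multiplies the rate by the drop factor once every drop period, so the first reduced rate applies at epoch 8; that rule is my reading of the option, not something verified against the training log.

```matlab
initialLearnRate = 0.0004;
dropPeriod       = 7;
dropFactor       = 0.4;
maxEpochs        = 200;

epochs = 1:maxEpochs;
% Assumed rule: multiply by dropFactor once every dropPeriod epochs.
learnRate = initialLearnRate .* dropFactor .^ floor((epochs - 1) ./ dropPeriod);

% Print the rate at a few selected epochs.
sel = [1 8 15 50 200];
fprintf('Epoch %3d: %.3g\n', [epochs(sel); learnRate(sel)]);
```

Under that assumption the rate is already below 1e-6 by epoch 50, so most of the 200 epochs would change the weights very little. A flat accuracy curve late in training may therefore reflect the schedule rather than the network.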

Answers (1)

Jack Xiao on 22 Feb 2021
Reduce the learning rate to a smaller value such as 0.0001, try adding more data and a dropout layer, or switch to a slightly weaker (smaller) network.

Release

R2020b
