
Zero Mini-batch Accuracy and drastic increase in Mini-batch Loss during LSTM network training

I am training an LSTM with a weighted loss on unbalanced data, as below:
% classes with fewer number of samples have larger weight
classWeights = [.10,.60,.30]
% sum up to 1
layers = [ ...
sequenceInputLayer(numFeatures)
lstmLayer(numHiddenUnits,'OutputMode','last')
dropoutLayer(0.2)
fullyConnectedLayer(100)
fullyConnectedLayer(50)
fullyConnectedLayer(numClasses)
softmaxLayer
weightedClassificationLayerLSTM(classWeights,"classoutput")]
miniBatchSize = 36;
options = trainingOptions('adam', ...
'InitialLearnRate',0.0001, ...
'SquaredGradientDecayFactor',0.99, ...
'Epsilon',1.0000e-08, ...
'MaxEpochs',130, ...
'GradientThreshold',2, ...
'MiniBatchSize',miniBatchSize , ...
'ValidationFrequency', 10, ...
'Verbose',1, ...
'Plots','training-progress',...
'LearnRateDropFactor',0.9, ...
'LearnRateDropPeriod',10, ...
'LearnRateSchedule','piecewise', ...
'Shuffle', 'never')
During training, the Mini-batch Accuracy sometimes drops to zero and the Mini-batch Loss increases at the same time. I would appreciate any hint to resolve this issue.

Answers (1)

Krishna on 24 Apr 2024
Hello Poorya,
I understand that your network's mini-batch accuracy suddenly falls to zero during training. If this is happening with the training accuracy, it could be a sign that your model is overfitting the data. To confirm this, I suggest splitting your dataset into training, validation, and test sets. If the training accuracy is good but the validation and test accuracy are noticeably worse, the model is overfitting. To prevent this:
  1. Consider shuffling your dataset at the start of each epoch. Your options set 'Shuffle' to 'never', so every epoch presents the same mini-batches in the same order; with 'every-epoch' the composition of the mini-batches changes from one epoch to the next.
  2. Additionally, reducing your network's complexity can help lower the risk of overfitting. I observe that you stack fullyConnectedLayer(100), fullyConnectedLayer(50) and fullyConnectedLayer(numClasses) with no activation function in between. Consecutive fully connected layers without an activation are equivalent to a single fullyConnectedLayer, so you can delete the extra ones.
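The two suggestions above could look roughly like the following. This is only a sketch under assumptions, not a tested fix: XAll (a cell array of sequences) and YAll (a categorical label vector) are hypothetical names for your full dataset, the 70/15/15 split is arbitrary, and weightedClassificationLayerLSTM is your own custom layer, reused unchanged. Note also that the options in the question set 'ValidationFrequency' but no 'ValidationData', so validation only takes place once a validation set is passed in.

```matlab
% Assumed inputs (hypothetical names): XAll is an N-by-1 cell array of
% sequences, YAll is an N-by-1 categorical vector of labels.
numObs  = numel(XAll);
idx     = randperm(numObs);            % random, non-stratified split
numVal  = round(0.15*numObs);
numTest = round(0.15*numObs);

idxVal   = idx(1:numVal);
idxTest  = idx(numVal+1:numVal+numTest);
idxTrain = idx(numVal+numTest+1:end);

XTrain = XAll(idxTrain);  YTrain = YAll(idxTrain);
XVal   = XAll(idxVal);    YVal   = YAll(idxVal);
XTest  = XAll(idxTest);   YTest  = YAll(idxTest);

% Single fully connected layer instead of three stacked linear layers.
layers = [ ...
    sequenceInputLayer(numFeatures)
    lstmLayer(numHiddenUnits,'OutputMode','last')
    dropoutLayer(0.2)
    fullyConnectedLayer(numClasses)
    softmaxLayer
    weightedClassificationLayerLSTM(classWeights,"classoutput")];

% Same solver settings as in the question, but reshuffled every epoch
% and with a validation set so overfitting shows up in the progress plot.
options = trainingOptions('adam', ...
    'InitialLearnRate',0.0001, ...
    'MaxEpochs',130, ...
    'GradientThreshold',2, ...
    'MiniBatchSize',36, ...
    'Shuffle','every-epoch', ...
    'ValidationData',{XVal,YVal}, ...
    'ValidationFrequency',10, ...
    'Verbose',1, ...
    'Plots','training-progress');

net = trainNetwork(XTrain,YTrain,layers,options);

% Accuracy on held-out data, to compare against the training accuracy.
YPred = classify(net,XTest,'MiniBatchSize',36);
testAccuracy = mean(YPred == YTest);
```

Because your classes are unbalanced, a purely random split like this one may leave very few samples of the rarest class in the validation or test set, so splitting per class may be preferable. Removing the two intermediate fully connected layers does not reduce what the network can represent in principle: with no activation in between, W2*(W1*x + b1) + b2 equals (W2*W1)*x + (W2*b1 + b2), which is again a single fully connected layer.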
Hope this helps.

Release

R2020b
