augmentedImageDatastore for image segmentation

Hello,
I want to create an augmented image datastore to use in training. Previously I augmented all image pairs with my own custom function before training, but my project supervisor suggested augmenting during training so that the network sees many more distinct images. I understand that another approach would be to augment even more images before training and reduce the number of epochs, but I would also like to get MATLAB's built-in augmenter working. Here is the problem I am facing:
size(X_train) = [224 224 3 200]
size(Y_train) = [224 224 200]
For the provided example in MATLAB's documentation of augmentedImageDatastore, Y_train is just a 1D categorical array. In my case, I need to augment the X data as well as the Y data, with the same augmentation on each pair. I tried something like this:
%% Built-in augmenter
imageAugmenter = imageDataAugmenter( ...
'RandRotation',[0 360], ...
'RandXTranslation',[-5 5], ...
'RandYTranslation',[-5 5], ...
'RandXReflection', true, ...
'RandYReflection', true );
training = combine(ds_X_training, ds_Y_training);
aug_training = augmentedImageDatastore([224 224 3], training, 'DataAugmentation', imageAugmenter);
And I get the error:
This works fine, however:
X_aug_training = augmentedImageDatastore([224 224 3], ds_X_aug_training, 'DataAugmentation', imageAugmenter);
I understand the error arises because I can't feed a combined datastore or a pixelLabelDatastore into augmentedImageDatastore. I saw some examples of augmenting pixel label images, such as Augment Pixel Labels for Semantic Segmentation, but that article does not mention augmentedImageDatastore, which is the one I am interested in because it won't keep the augmented images in memory while training.

Accepted Answer

Matt J on 8 Mar 2024
Edited: Matt J on 8 Mar 2024
Supply the training data in numeric form:
X_training = rand([224 224 3 200]) ; %Fake
Y_training = rand([224 224 1 200]) ; %Fake
imageAugmenter = imageDataAugmenter( ...
'RandRotation',[0 360], ...
'RandXTranslation',[-5 5], ...
'RandYTranslation',[-5 5], ...
'RandXReflection', true);
aug_training = augmentedImageDatastore([224 224], X_training, Y_training,...
'DataAugmentation', imageAugmenter)
aug_training =
  augmentedImageDatastore with properties:

         NumObservations: 200
           MiniBatchSize: 128
        DataAugmentation: [1x1 imageDataAugmenter]
      ColorPreprocessing: 'none'
              OutputSize: [224 224]
          OutputSizeMode: 'resize'
    DispatchInBackground: 0
  1 Comment
Alexander Resare on 9 Mar 2024
Thank you, I can now create aug_training. When I try to use it for training, however, I get a new error:
options = trainingOptions('adam', ...
'ExecutionEnvironment', 'gpu', ...
'MaxEpochs', 12, ...
'MiniBatchSize', 16, ...
'ValidationData', validation, ...
'InitialLearnRate',0.001, ...
'LearnRateSchedule','piecewise', ...
'LearnRateDropFactor', 0.2, ...
'LearnRateDropPeriod', 3, ...
'ValidationFrequency', 10, ...
'OutputFcn', @(info) opt_func(info));
trained_net = trainNetwork(aug_training, lgraph, options);
I then tried converting Y_training from arrays of numeric labels to a categorical array, which my supervisor had recommended earlier to avoid unexpected values in the ground-truth data caused by an inappropriate interpolation method. I was then faced with another error, even before attempting to train:
It doesn't make sense to me that when I pass Y_train as a categorical 224x224x1x200 array, it complains about Y_train not being a vector. I assume augmentedImageDatastore expects a 1D vector containing single labels of each image. Do you know how I can work around this issue for my segmentation task?
In case it is useful, here are some details about the end layer of the network:
classNames = ["live" "dead" "background"];
classWeights = [20 20 1];
numClasses = 3;
imageSize = [224 224 3];
labelIDs = [255, 128, 0];
network = 'mobilenetv2';
lgraph = deeplabv3plusLayers(imageSize,numClasses,network);
end_layer = genDiceLossPixelClassificationLayer('end_layer', classWeights, labelIDs, classNames, true);
lgraph = replaceLayer(lgraph, 'classification', end_layer);
So in essence I am wondering whether each pixel in the Y images should be 255, 128 or 0, or "live", "dead" or "background", before augmentation.
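As a rough illustration of why the numeric-versus-categorical choice matters, the sketch below reuses labelIDs and classNames from above on a small made-up label image (Y_numeric is fake data, not from the project). It assumes imresize is available and only stands in for "any interpolating operation": linear interpolation of numeric IDs can create values that are not valid labels, while nearest-neighbor interpolation or a categorical array keeps every pixel in one of the three classes.

```matlab
labelIDs   = [255 128 0];
classNames = ["live" "dead" "background"];

% Fake 4x4 label image holding only the three valid IDs (illustration only)
Y_numeric = [255 255   0 0;
             255 128   0 0;
             128 128   0 0;
               0   0   0 0];

% Map numeric IDs to class names; values not listed in labelIDs would
% become <undefined>
Y_cat = categorical(Y_numeric, labelIDs, classNames);

% Linear interpolation of numeric IDs can produce in-between values
% that do not correspond to any class
Y_lin       = imresize(Y_numeric, 2, 'bilinear');
invalid_lin = ~all(ismember(Y_lin(:), labelIDs));

% Nearest-neighbor interpolation only copies existing IDs
Y_nn     = imresize(Y_numeric, 2, 'nearest');
valid_nn = all(ismember(Y_nn(:), labelIDs));
```

Either representation can work as long as the labels are never linearly interpolated; the categorical form simply makes that mistake harder to make.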


More Answers (1)

Birju Patel on 1 Apr 2024
I recommend combining imageDatastore and pixelLabelDatastore and then using a transform to implement data augmentation for semantic segmentation.
Here is an example:
augmentedImageDatastore was not designed to augment data for semantic segmentation.
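A minimal sketch of the combine-plus-transform approach might look like the following. It is not the example originally linked here: the folder names and the helper augmentPair are hypothetical, the class names and label IDs are taken from the question, and randomAffine2d and affineOutputView are assumed to be available from the Image Processing Toolbox. The same random transform is applied to the image and to its label image.

```matlab
% Hypothetical folder names; replace with the actual data locations
imageDir = fullfile('data', 'images');
labelDir = fullfile('data', 'labels');

classNames = ["live" "dead" "background"];
labelIDs   = [255 128 0];

imds = imageDatastore(imageDir);
pxds = pixelLabelDatastore(labelDir, classNames, labelIDs); % Computer Vision Toolbox

% Each read of the combined datastore returns a cell row {image, labels}
ds_training  = combine(imds, pxds);
aug_training = transform(ds_training, @augmentPair);

% Local functions in a script go at the end of the file
function data = augmentPair(data)
% Apply one random affine transform to both the image and its labels.
for i = 1:size(data, 1)
    tform = randomAffine2d( ...
        'Rotation',     [0 360], ...
        'XTranslation', [-5 5], ...
        'YTranslation', [-5 5], ...
        'XReflection',  true, ...
        'YReflection',  true);
    rout = affineOutputView(size(data{i,1}), tform, 'BoundsStyle', 'centerOutput');
    data{i,1} = imwarp(data{i,1}, tform, 'OutputView', rout);
    % Categorical label images are warped without blending class values
    data{i,2} = imwarp(data{i,2}, tform, 'OutputView', rout);
end
end
```

Because the augmentation runs each time the datastore is read, nothing augmented is stored between epochs. One caveat: pixels that rotate in from outside the image have no label after warping (they come out as <undefined> in the categorical output), so it is worth checking how the custom Dice loss layer treats such pixels.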
  1 Comment
Matt J
Matt J on 1 Apr 2024
Unfortunately, though, this requires the Computer Vision Toolbox.
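If the data are already in numeric arrays, as in the question, a similar pipeline can probably be built without pixelLabelDatastore by using arrayDatastore. The following is only a sketch under assumptions: the data are fake, augmentArrayPair is a hypothetical helper, and randomAffine2d, affineOutputView and imwarp are assumed to come from the Image Processing Toolbox. The labels are warped with nearest-neighbor interpolation and only then converted to categorical.

```matlab
labelIDs   = [255 128 0];
classNames = ["live" "dead" "background"];

% Fake data shaped like the question's arrays, with fewer observations
X_train = rand(224, 224, 3, 8);
Y_train = labelIDs(randi(3, 224, 224, 8));   % fake numeric label images

% Iterate over the last dimension so each read yields one image / one label map
ds_X = arrayDatastore(X_train, 'IterationDimension', 4);
ds_Y = arrayDatastore(Y_train, 'IterationDimension', 3);

ds_training  = combine(ds_X, ds_Y);
aug_training = transform(ds_training, ...
    @(data) augmentArrayPair(data, labelIDs, classNames));

pair = read(aug_training);   % cell row {augmented image, categorical labels}

% Local functions in a script go at the end of the file
function data = augmentArrayPair(data, labelIDs, classNames)
% Warp image and numeric labels with the same random affine transform.
for i = 1:size(data, 1)
    tform = randomAffine2d( ...
        'Rotation',     [0 360], ...
        'XTranslation', [-5 5], ...
        'YTranslation', [-5 5], ...
        'XReflection',  true, ...
        'YReflection',  true);
    rout = affineOutputView(size(data{i,1}), tform, 'BoundsStyle', 'centerOutput');
    data{i,1} = imwarp(data{i,1}, tform, 'OutputView', rout);
    % 'nearest' keeps label values within labelIDs; the default numeric
    % fill value 0 coincides with the "background" ID used here
    Y = imwarp(data{i,2}, tform, 'nearest', 'OutputView', rout);
    data{i,2} = categorical(Y, labelIDs, classNames);
end
end
```

Whether trainNetwork accepts this datastore with the custom end layer has not been verified here, so treat it as a starting point rather than a confirmed fix.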
