How do I use polygon labeling for an instance segmentation neural network?

I am trying to perform instance segmentation using a network trained with the trainFasterRCNNObjectDetector function from Computer Vision Toolbox. The end goal is a network that can find the location and size of defects in my images. My data source is 1200x2200 greyscale X-ray images of material samples. I have 200 images labelled with the Image Labeler app, both as pixel labels (from a previous semantic segmentation attempt) and now as polygons.
Since my images are greyscale, I opted not to use a pre-trained network, rather than replicating the data across three RGB channels and trying anyway.
In order to increase the amount of training data, I augment it liberally.
patch = [480 480 1];
%% Make training data
augmenter = imageDataAugmenter( ...
    'RandXReflection', true, ...
    'RandYReflection', true, ...
    'RandRotation', [0 360], ...
    'RandScale', [0.3 1.5], ...
    'RandXShear', [-30 30], ...
    'RandYShear', [-30 30], ...
    'RandXTranslation', [-5 5], ...
    'RandYTranslation', [-5 5]); % Transformations for more training data
batchsize = 16; % Number of patches extracted per image
patchds = randomPatchExtractionDatastore(imds, pxds, patch(1), ...
    'PatchesPerImage', batchsize, 'DataAugmentation', augmenter);
%% Options
options = trainingOptions("adam", ...
    MaxEpochs=5, ...
    ExecutionEnvironment='parallel', ...
    InitialLearnRate=0.0001, ...
    L2Regularization=0.0001, ...
    LearnRateSchedule='piecewise', ...
    LearnRateDropPeriod=10, ...
    LearnRateDropFactor=0.1, ...
    MiniBatchSize=512, ...
    Plots="training-progress", ...
    Shuffle="every-epoch", ...
    ValidationData=valdata, ...
    ValidationFrequency=50, ...
    ValidationPatience=inf);
The network I'm using is a modified ResNet-18, created using the Deep Network Designer app.
THE PROBLEMS:
options and patchds worked when using trainNetwork with a U-Net-based layer structure (with its own issues), but when I run this:
>> trainFasterRCNNObjectDetector(patchds,lgraph,options);
Error using vision.internal.inputValidation.checkGroundTruthDatastore
The read method of the training input datastore must return an M-by-3 cell or table.
Error in trainFasterRCNNObjectDetector>iParseInputs (line 1094)
params.ClassNames = vision.internal.inputValidation.checkGroundTruthDatastore(trainingDs);
Error in trainFasterRCNNObjectDetector (line 442)
[trainingData, options, params] = iParseInputs(...
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
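For context on that error: trainFasterRCNNObjectDetector expects each read() of the training datastore to return a cell array of the form {image, boxes, labels}, where boxes is an M-by-4 matrix of [x y w h] rectangles and labels is an M-by-1 categorical. A minimal sketch of satisfying that contract, assuming axis-aligned boxes are already available somewhere; makeDetectionSample and lookupBoxes are illustrative names, not toolbox functions:

```matlab
% Sketch: wrap a patch datastore so each read() yields {image, boxes, labels}.
% lookupBoxes is a hypothetical helper that returns M-by-4 [x y w h] boxes
% for a given file; replace it with however your boxes are actually stored.
imds = imageDatastore(patchFolder);
ds   = transform(imds, @makeDetectionSample, 'IncludeInfo', true);

function [out, info] = makeDetectionSample(img, info)
    bboxes = lookupBoxes(info.Filename);                       % M-by-4 boxes
    labels = repmat(categorical("defect"), size(bboxes,1), 1); % M-by-1 labels
    out = {img, bboxes, labels};                               % 1-by-3 cell
end
```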
I went online to find where my issue could be:
This answer seemed to indicate that polygons just... would not work.
Staff stated that this now works, however:
I followed the lead from there, which looked very promising, and landed on:
"Postprocess Exported Labels for Instance or Semantic Segmentation Networks
You can use the exported, labeled ground truth for training an instance segmentation network or a semantic segmentation network.
Follow these steps to process the polygon data for either semantic segmentation or instance segmentation." Huzzah!
So I followed these steps.
out = gatherLabelData(gTruth, [labelType.Polygon], 'GroupLabelData', 'LabelType');
imageSize = [480 480];
for q = 1:10
    polygons = out{1}.PolygonData{q}(:,1);
    numPolygons = size(polygons,1);
    maskStack = false([imageSize(1:2) numPolygons]);
    for i = 1:numPolygons
        maskStack(:,:,i) = poly2mask(polygons{i}(:,1), ...
            polygons{i}(:,2), imageSize(1), imageSize(2));
    end
    filename = sprintf('%s_%d', 'C:\...\PolygonLabelData\mask', q);
    save(filename, "maskStack")
end
And I get a bunch of files, each with depth equal to the number of masks. Then... what? Where can I use this output? Every guide I can find just says to get TrainingData as an M-by-2 table, with the first column being images and the second being rectangles of interest. combine() isn't useful since my maskStack isn't a datastore.
Error using objectDetectorTrainingData
The input groundTruth object/s do not contain any valid object detector training data.
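For what it's worth, the Mask R-CNN route wants a datastore rather than a groundTruth object: per the trainMaskRCNN documentation, each read() should return a 1-by-4 cell {image, boxes, labels, masks}, where masks is exactly the H-by-W-by-M logical stack built above. A hedged sketch of stitching the saved .mat files back into such a datastore (readMaskSample and the "defect" class name are my own; adapt the folders to your data):

```matlab
% Sketch: combine patch images with the saved maskStack files into a
% datastore whose read() returns {image, boxes, labels, masks}.
imds   = imageDatastore(patchFolder);                       % 480x480 patches
maskds = fileDatastore(maskFolder, ...
    'ReadFcn', @(f) getfield(load(f), 'maskStack'), ...     % H-by-W-by-M logical
    'FileExtensions', '.mat');
trainds = transform(combine(imds, maskds), @readMaskSample);

function out = readMaskSample(data)
    img       = data{1};
    maskStack = data{2};
    numObj    = size(maskStack, 3);
    bboxes    = zeros(numObj, 4);
    for k = 1:numObj
        % Derive each instance's axis-aligned box from its mask.
        stats = regionprops(maskStack(:,:,k), 'BoundingBox');
        bboxes(k,:) = stats(1).BoundingBox;                 % [x y w h]
    end
    labels = repmat(categorical("defect"), numObj, 1);
    out = {img, bboxes, labels, maskStack};
end
```

One caveat: both datastores sort by filename, so make sure mask_1.mat pairs with the right patch; zero-padding the index in sprintf (e.g. '%s_%03d') keeps the ordering consistent.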
So my questions:
1. Is there a better way; a better method or network, for example?
2. What should I do with the maskStacks in order to teach my net?
3. (Bonus) Since my idea is to find dark-grey boxes on light-grey backgrounds, it sounds... hard. But how do I balance training? Any time I try, it trends to either 0% or 100% REAL fast. I've tried 2-, 3-, and 4-layer U-Nets, and the ResNet-18, but I am still not fully grasping it.
Thankful for any input, as I'd rather not have to hand-label 200 images again...

Accepted Answer

Birju Patel on 17 Nov 2022
For instance segmentation, you should first try Mask R-CNN via the trainMaskRCNN function.
Faster R-CNN is designed for 2-D object detection, not instance segmentation.
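If it helps, a minimal sketch of that route, assuming a datastore trainds whose read() returns {image, boxes, labels, masks} as the trainMaskRCNN documentation describes. Note the pretrained "resnet50-coco" weights expect 3-channel input, so greyscale patches would need replicating across channels:

```matlab
% Sketch: configure and train a Mask R-CNN network for one "defect" class
% (R2022a Computer Vision Toolbox with the Mask R-CNN support package).
net = maskrcnn("resnet50-coco", "defect", InputSize = [480 480 3]);
options = trainingOptions("adam", ...
    InitialLearnRate = 1e-4, ...
    MaxEpochs = 5, ...
    MiniBatchSize = 2, ...          % Mask R-CNN batches are memory-heavy
    Shuffle = "every-epoch");
trainedNet = trainMaskRCNN(trainds, net, options);
```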
Alex on 25 Nov 2022
Thank you!
I was already using a ResNet-18 structure, which is supposed to work as a backbone for Mask R-CNN. So this might indeed be the issue.
I'll see if this solves it.

Release: R2022a
