Creating an RCNN with two image inputs and a regression output

I'm trying to create an RCNN that compares two image samples (each 40x40x3, in the example below) and then gives a numerical response.
I feel that the two halves of the model need to be segregated until quite a way along and then joined using a cat layer.
The problem I have is that I can't find a way to make the model load two images into two separate sides of the model.
However, this relies on an input format that doesn't seem to work with a regression output layer: it uses an imageDatastore, which in turn specifies a custom image loader, "inRead":
imds = imageDatastore(trainImages.imageFilename, 'LabelSource', 'foldernames','IncludeSubfolders',true,'ReadFcn',@inRead);
I've tried adapting Mahmoud's code to take a single image, cut it in half, and send the left and right halves to the two sides of the RCNN. This seems to work functionally, but I get an error from the "Analyze" button in the Deep Network Designer. I also get an error when I try to train it; see below.
I get the following error:
Error using trainNetwork (line 170)
Invalid network.
Caused by:
Layer 'leftConv': Input size mismatch. Size of input to this layer is different from the expected input size.
Inputs to this layer:
from layer 'leftSplitter' (40×40 output)
Layer 'rightConv': Input size mismatch. Size of input to this layer is different from the expected input size.
Inputs to this layer:
from layer 'rightSplitter' (40×40 output)
The input layer image is 40x80x3 and the output of each splitter layer is 40x40*, so MATLAB seems to 'know' what the input size of the convolution layer should be, yet it doesn't work. Does anyone have any suggestions?
The demo code below creates a small model that should learn to return the difference in mean value between the left and right image. I have no idea whether the architecture is correct for this kind of problem; I just knocked it together to show the error I'm getting.
Thanks,
George
* Not sure what happened to the colour planes. Should it be 40x40x3, and is that the problem?
% Some simple code to demonstrate the problem
imageSize = [40 80 3]
input = imageInputLayer(imageSize,'Name','InputLayer','Normalization','zerocenter');
layerL=splittingLayerLeftRight('leftSplitter','left')
layerR=splittingLayerLeftRight('rightSplitter','right')
RtwodConvLayer = convolution2dLayer(5, 32, 'Padding', 2, ...
'BiasLearnRateFactor', 2, 'Weights', ones([5 5 1 32]),'Name','rightConv','numChannels',1);
LtwodConvLayer = convolution2dLayer(5, 32, 'Padding', 2, ...
'BiasLearnRateFactor', 2, 'Weights', ones([5 5 1 32]),'Name','leftConv','numChannels',1);
leftCol = [input; layerL;LtwodConvLayer]
rightCol= [layerR;RtwodConvLayer]
% outputlayers
% cat convs
numInputs = 2;
cat_dim = 3; %third dimension
cat_Layer = concatenationLayer(cat_dim,numInputs,'Name','Cat-Layer');
fc1 = fullyConnectedLayer(1024,'Name', 'FC-1');
relu1 = reluLayer('Name','ReLu-FC-1');
dropout1 = dropoutLayer('Name','dropOut-FC-1');
fcWidth=10;
fc = fullyConnectedLayer(fcWidth,'Name', 'FC-out');
%softmxLayer = softmaxLayer('Name','Softmaxx');
%endLayer = classificationLayer('Name','outLayer');
rLayer = regressionLayer("Name","regressionoutput");
outputLayers = [ cat_Layer
fc1
relu1
dropout1
fc
rLayer];
%fullModel = [leftCol; outputLayers; rightCol]
layers= layerGraph([leftCol; outputLayers]);
layers= addLayers(layers,rightCol);
% % layers = connectLayers(layers,sprintf('rightConv',...
% % length(layerDepths),length(layerDepths)),'Cat-Layer/in2');
layers = connectLayers(layers,'rightConv','Cat-Layer/in2');
% connect input to column 2
layers = connectLayers(layers,'InputLayer','rightSplitter');
tempoutputfolder = tempname;
mkdir(tempoutputfolder)
numImages = 100;
filenames = {};
imDiffs = [];
for i=1:numImages
    leftImg = rand(40,40,3)+rand;
    rightImg = rand(40,40,3)+rand;
    filenames{i} = [tempoutputfolder filesep 'inputImg_' num2str(i) '.png'];
    imwrite([leftImg rightImg],filenames{i});
    imDiffs(i) = mean(leftImg(:))-mean(rightImg(:));
end
trainData = table(filenames',imDiffs');
MiniBatchSize= 128;
InitialLearnRate= 1e-3;
LearnRateSchedule= 'piecewise';
LearnRateDropFactor= 0.1;
MaxEpochs = 3000;
LearnRateDropPeriod= 1000;
ValidationFrequency=100;
Verbose=false;
Plots='training-progress';
options = trainingOptions('sgdm', ...
'MiniBatchSize', MiniBatchSize, ...
'InitialLearnRate', InitialLearnRate, ...
'LearnRateSchedule', LearnRateSchedule, ...
'LearnRateDropFactor', LearnRateDropFactor, ...
'LearnRateDropPeriod', LearnRateDropPeriod, ...
'MaxEpochs', MaxEpochs, ...
'Verbose', Verbose, ...
'Plots',Plots);
rcnn=trainNetwork(trainData,layers,options)
% Modified by George Lovell from original code written by Mahmoud Afifi -- mafifi@eecs.yorku.ca | m.3afifi@gmail.com
% Split an image into left/right halves for output to a network with
% multiple image inputs.
%
% Requires Matlab 2019b or higher
classdef splittingLayerLeftRight < nnet.layer.Layer
    properties
        target   % 'left' or 'right': which half of the input to return
    end
    properties (Learnable)
    end
    methods
        function layer = splittingLayerLeftRight(name,target)
            layer.Name = name;
            layer.Description = "splittingLayerLeftRight";
            layer.target = target;
        end
        function Z = predict(layer, X)
            imWidth = size(X,2);
            if rem(imWidth,2)~=0
                error('To split an image into two left/right halves it needs to have an even width');
            else
                imHalf=imWidth/2;
            end
            switch layer.target
                case 'left'
                    Z = X(:,1:imHalf,1:3);       % left half, three colour planes
                case 'right'
                    Z = X(:,imHalf+1:end,1:3);   % right half, three colour planes
            end
            %figure;imagesc(Z);
        end
    end
end
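One way to check what the splitting layer hands to the next layer is to run the same indexing on a plain array outside the network. The sketch below is only an illustration: it uses a random 40x80x3 array as a stand-in for one input image, and the variable names are not from the original code. It suggests that each half keeps all three colour planes, whereas the demo's convolution layers are created with 'numChannels',1 and 5x5x1x32 weights, so that mismatch may be worth ruling out.

```matlab
% Stand-in for one 40x80x3 input image (illustrative data only).
X = rand(40, 80, 3);

% Same indexing as the predict method of the splitting layer.
imHalf    = size(X, 2) / 2;
leftHalf  = X(:, 1:imHalf, :);
rightHalf = X(:, imHalf+1:end, :);

% Show the size each stream would receive.
disp(size(leftHalf));
disp(size(rightHalf));

% The Weights array of a convolution2dLayer is laid out as
% filterHeight x filterWidth x numChannels x numFilters, so 5x5 filters
% with 32 outputs applied to this input would need the following size.
expectedWeightSize = [5 5 size(leftHalf, 3) 32];
disp(expectedWeightSize);
```

If the sizes printed here disagree with the 'numChannels' and 'Weights' arguments passed to convolution2dLayer, the channel count is a likely candidate for the size mismatch, although this has not been confirmed against the original error.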
  2 Comments
George Lovell on 12 Dec 2019
Some progress:
It seems that the size of the input layer, [40 80 3], is propagated forward, so the convolution layer expects this size. The splitting layer makes the image [40 40 3], and that's why it fails. If I modify the splitting layer so that each half of the input image is copied to the centre with padding on either side, the model seems to run without error; see below.
So my question is ultimately about how I 'tell' the convolution layer that its input should be [40 40 3] and not [40 80 3].
switch layer.target
    case 'left'
        Z = X; % First copy
        Z(:) = 0; % make it a big zero
        Z(:,(imHalf/2):(imHalf+(imHalf/2))-1,:) = X(:,1:imHalf,:); % now copy what I want into the middle
    case 'right'
        Z = X; % First copy
        Z(:) = 0; % make it a big zero
        Z(:,(imHalf/2):(imHalf+(imHalf/2))-1,:) = X(:,imHalf+1:end,:); % now copy what I want into the middle
end
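The column arithmetic in this workaround can be checked on a plain array. The following is a minimal sketch, assuming a random 40x80x3 stand-in image (not data from the thread); it reuses the same index range as the snippet above and confirms that the destination range holds exactly one half-width of columns. Note that imHalf/2 is only a valid index when the image width is a multiple of 4.

```matlab
% Stand-in for one 40x80x3 input image (illustrative data only).
X = rand(40, 80, 3);
imWidth = size(X, 2);
imHalf  = imWidth / 2;

% Destination columns used by the workaround (requires imHalf to be even).
cols = (imHalf/2):(imHalf + imHalf/2 - 1);

% Zero canvas with the same size and class as the input, with the left
% half of the image copied into the destination columns.
Z = zeros(size(X), 'like', X);
Z(:, cols, :) = X(:, 1:imHalf, :);

% The output keeps the full input size, which is why the size check passes.
disp(size(Z));
disp([cols(1) cols(end)]);
```

Because Z keeps the full [40 80 3] size, the downstream layers still process the zero padding, which is one reason a true crop may be preferable.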
George Lovell on 12 Dec 2019
I think I've solved it, posting here just in case someone else has a similar problem.
Instead of splitting the input with a new layer type, I've cropped the input layer's output, taking the left side down one stream and the right side down the other. To achieve this, my input layer is [128 256 3] and feeds into a crop2dLayer on each side. Each crop2dLayer needs a reference input to know what size to crop to. For this I created a new type of layer that simply cuts its input in half. The halfcropLayer takes its input from the input layer and outputs the left half of the image. It doesn't really matter what this output contains; it is only used as a reference for the array size. It feeds into the reference inputs of both crops.
[Image: takingTwoHalvesOfAnInput.png]
The crop locations in the two crop2dLayers are set manually to [1 1] and [129 1].
The halfcropLayer looks like this:
% HalfCropLayer
% Requires Matlab 2019b or higher
classdef halfcropLayer < nnet.layer.Layer
    properties
        target
    end
    properties (Learnable)
    end
    methods
        function layer = halfcropLayer(name,target)
            layer.Name = name;
            layer.Description = "halfcropLayer";
            if nargin > 1
                layer.target = target;
            end
        end
        function Z = predict(layer, X)
            imWidth = size(X,2);
            if rem(imWidth,4)~=0
                error('To split an image into two left/right halves it needs to have a width that is a multiple of 4');
            else
                imHalf=imWidth/2;
            end
            Z = X(:,1:imHalf,:); % left half; only its size matters (used as the crop reference)
        end
    end
end
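The wiring described above might be assembled roughly as follows. This is a sketch under assumptions, not the exact graph from the screenshot: the layer names ('InputLayer', 'halfRef', 'cropLeft', 'cropRight') and the 'none' normalization are made up for illustration, and halfcropLayer is the custom class listed above, which must be on the MATLAB path. Only the input, reference, and crop layers are shown; the two convolution streams would attach to the crop outputs.

```matlab
% Input is one 128x256x3 image holding the two samples side by side.
inputSize = [128 256 3];

lgraph = layerGraph();
lgraph = addLayers(lgraph, imageInputLayer(inputSize, ...
    'Name', 'InputLayer', 'Normalization', 'none'));

% Custom layer from the post: its output only supplies the target size.
lgraph = addLayers(lgraph, halfcropLayer('halfRef'));

% crop2dLayer takes its location as [x y]; x = 129 starts the right half.
lgraph = addLayers(lgraph, crop2dLayer([1 1],   'Name', 'cropLeft'));
lgraph = addLayers(lgraph, crop2dLayer([129 1], 'Name', 'cropRight'));

% Feed the image into the reference layer and into both crops.
lgraph = connectLayers(lgraph, 'InputLayer', 'halfRef');
lgraph = connectLayers(lgraph, 'InputLayer', 'cropLeft/in');
lgraph = connectLayers(lgraph, 'InputLayer', 'cropRight/in');

% The half-size reference tells each crop what output size to produce.
lgraph = connectLayers(lgraph, 'halfRef', 'cropLeft/ref');
lgraph = connectLayers(lgraph, 'halfRef', 'cropRight/ref');
```

Running analyzeNetwork(lgraph) after adding the remaining layers should show whether each crop output has the expected 128x128x3 size.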


Accepted Answer

Kenta on 31 Mar 2020
As of R2019b, a new mechanism called a "custom training loop" is available, which lets you implement a multi-input CNN.
For example, you can refer to the example below. As you have been trying, you could instead separate the input image into two streams after the input layer, but that seems a little complicated to implement. I think the demo below will give you some tips for your study.
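To give a rough idea of what a custom training loop with two image inputs can look like, here is a minimal sketch. It is not the demo referred to above: the model function, parameter names, filter counts, learning rate, and the random stand-in data are all assumptions made for illustration. Each image passes through its own convolution, the pooled features are concatenated along the channel dimension, and a fully connected operation produces one regression value.

```matlab
% Hypothetical two-stream regression model (all sizes are assumptions).
rng(0);
numFilters = 8;
params.convL.W = dlarray(0.01 * randn(5, 5, 3, numFilters, 'single'));
params.convL.b = dlarray(zeros(numFilters, 1, 'single'));
params.convR.W = dlarray(0.01 * randn(5, 5, 3, numFilters, 'single'));
params.convR.b = dlarray(zeros(numFilters, 1, 'single'));
params.fc.W    = dlarray(0.01 * randn(1, 2 * numFilters, 'single'));
params.fc.b    = dlarray(zeros(1, 1, 'single'));

% Random stand-in mini-batch: 16 left/right pairs of 40x40x3 images.
batchSize = 16;
xl = rand(40, 40, 3, batchSize, 'single');
xr = rand(40, 40, 3, batchSize, 'single');
% Target: difference between the mean values of the two images in a pair.
t  = mean(reshape(xl, [], batchSize), 1) - mean(reshape(xr, [], batchSize), 1);
XL = dlarray(xl, 'SSCB');
XR = dlarray(xr, 'SSCB');
T  = dlarray(t, 'CB');

% A few SGD-with-momentum steps on the same batch, for illustration only.
vel = [];
learnRate = 1e-3;
momentum  = 0.9;
for iteration = 1:20
    [loss, grads] = dlfeval(@modelGradients, params, XL, XR, T);
    [params, vel] = sgdmupdate(params, grads, vel, learnRate, momentum);
end
disp(extractdata(loss));

function [loss, grads] = modelGradients(params, XL, XR, T)
    % Forward pass, loss, and gradients with respect to all parameters.
    Y = twoStreamModel(params, XL, XR);
    loss = mse(Y, T);
    grads = dlgradient(loss, params);
end

function Y = twoStreamModel(params, XL, XR)
    % One convolution per stream, then global average pooling.
    FL = avgpool(relu(dlconv(XL, params.convL.W, params.convL.b, 'Padding', 2)), 'global');
    FR = avgpool(relu(dlconv(XR, params.convR.W, params.convR.b, 'Padding', 2)), 'global');
    % Join the two streams along the channel dimension, then regress.
    F = cat(3, FL, FR);
    Y = fullyconnect(F, params.fc.W, params.fc.b);
end
```

In a script file the two local functions have to stay at the end of the file. A real loop would also iterate over mini-batches read from a datastore rather than reusing one random batch.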
