Error using trainNetwork (line 184) Invalid network. Caused by: Layer 4: The size of the pooling dimension of the padded input data must be larger than or equal to the pool si

Hi everyone,
I want to study the performance of a given network as I vary the size and number of filters in convolution1dLayer and the number of windows provided as input to the trainNetwork function itself.
As you can see from the picture, I have already defined x_train and y_train (in a form suitable for MATLAB), respectively the input and the output of my network. The network should receive a sequence of data from 3 sensors (in the Workspace, the variable defined by n_window) over 5723 time points (in the variable length). If I wish, I could also give the network an input sequence (x_train) as a 1x1 cell (as it already is in the picture) with an even greater number of windows (repeated sequences of 3x5723 matrices in x_train). This obviously means changing y_train as well, and the code handles that well enough. For completeness, I set the following options: solver to 'adam', MaxEpochs equal to 100, SequencePaddingDirection to 'left' and Verbose to 0.
However, when I ran the algorithm (just once, which explains the break in the inner loop) to check whether it works, I got the error in the title. I don't know what is wrong, because inputSize, the parameter of the sequenceInputLayer layer, is [3 5723]. Since I set the pool size to 2 with a stride of 2, there should be no errors, at least in theory. I also tried changing the stride to 1, but the error stayed the same. What am I doing wrong? Thanks in advance for your help! :)
Also, I just tried to add a MinLength parameter in sequenceInputLayer, as suggested by the error message: setting it to either 1, n_features (equal to 3) or length (equal to 5723) doesn't change the final result.

Accepted Answer

Ben on 21 Jun 2022
The issue is that the 2nd layer convolution1dLayer(3,10) has no padding. Currently its input size is [3, 5723, ?] where 3 is n_features, 5723 is length and ? represents the sequence dimension that is variable.
For a sequenceInputLayer where the input size is a two dimensional vector like [n_features, length] we interpret the first dimension as space and the 2nd dimension as features/channels.
The convolution1dLayer(3,10) operates on just the space dimension here, and it will "squash" that size to [1, 10, ?], which is too small for a pooling window of size 2.
You can see this by calling analyzeNetwork(layers) and checking the Activations column.
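The size argument above can be checked with a few lines of arithmetic. This is only a sketch: it assumes an unpadded, stride-1 convolution, and the helper name convOutSize and the variable names are illustrative, not part of any toolbox.

```matlab
% Output size of an unpadded, stride-1 convolution along one dimension.
% convOutSize is a hypothetical helper, not a toolbox function.
convOutSize = @(inSize, filterSize) inSize - filterSize + 1;

spatialSize = 3;   % first entry of inputSize = [3 5723], interpreted as "space"
filterSize  = 3;   % filter size in convolution1dLayer(3,10)
poolSize    = 2;   % pool size of the pooling layer (Layer 4 in the error)

sizeAfterConv = convOutSize(spatialSize, filterSize);   % 3 - 3 + 1 = 1
canPool = sizeAfterConv >= poolSize;                    % false, since 1 < 2

fprintf("size after convolution = %d, pooling possible = %d\n", ...
    sizeAfterConv, canPool);
```

With only one element left along the pooling dimension, a pool of size 2 has nothing to slide over, which matches the error message.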
You can set the "Padding" option to make things run, for example convolution1dLayer(3,10,"Padding","same"). However, I think you probably want to adjust some things: the 5723 appears to be the sequence length, which you shouldn't need to specify in sequenceInputLayer's input size, although you might use it in the "MinLength" name-value pair.
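As a rough illustration of both suggestions, the layer array might look like the sketch below. The original layers are only visible in the screenshot, so everything after the first pooling layer, the classification head, and the values of minLength and numClasses are assumptions made for the example, not the poster's actual network. It needs Deep Learning Toolbox.

```matlab
% Sketch only: layers after the first pooling layer are assumed, because
% the original layer array is in a screenshot that is not reproduced here.
n_features = 3;    % number of sensors, used as the channel dimension
minLength  = 10;   % hypothetical lower bound on the sequence length
numClasses = 2;    % hypothetical; depends on what y_train contains

layers = [
    sequenceInputLayer(n_features, "MinLength", minLength)  % no 5723 here
    convolution1dLayer(3, 10, "Padding", "same")  % padding keeps the length
    reluLayer
    maxPooling1dLayer(2, "Stride", 2)
    globalAveragePooling1dLayer     % assumed: collapses the time dimension
    fullyConnectedLayer(numClasses) % assumed classification head
    softmaxLayer
    classificationLayer];

analyzeNetwork(layers)   % check the Activations column for each layer
```

Here each cell of x_train would hold an n_features-by-length matrix, so the convolution and pooling run along the time dimension rather than across the 3 sensors.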
  7 Comments
Ben on 7 Sep 2023
Hi Kwasi,
Do you mean the computations in my last comment to find the minimum sequence length?
Unfortunately I don't have a specific reference. The computations follow from the definition of the convolution and pooling operations with padding and strides.
There are some diagrams for 2d convolution on our documentation page. The general idea for padding and stride is the same in 1d and for pooling.
I can try to explain the logic behind my argument above:
  • I used $T$ as just an arbitrary sequence length variable for computations. Let's call the sequence $x_1, x_2, \ldots, x_T$.
  • A convolution with filter size 3 and no padding reduces $T$ to $T - 2$, because without padding you can't compute convolutions at the end points of the sequence, $x_1$ and $x_T$. You could write this as $y_t = w_1 x_{t-1} + w_2 x_t + w_3 x_{t+1}$, where $w = (w_1, w_2, w_3)$ for some size 3 convolution kernel. You can see this general formula can't make sense for $t = 1$ and $t = T$, because $x_0$ and $x_{T+1}$ aren't defined (without padding). This logic extends to arbitrary convolution sizes $k$ to give $T - k + 1$.
  • Strides of $k$ roughly divide the size of the input by $k$. Technically it's a little more complex. Given an input of length $\tau$, a pool with stride $k$ is computed starting at $t = 1, 1 + k, 1 + 2k, \ldots$, or $t = 1 + nk$ for $n = 0, 1, \ldots$ up to some maximum $N$. If the pool has size $p$, then to compute the pool at $t = 1 + Nk$ requires we have values of the sequence up to $1 + Nk + (p - 1)$ - i.e. $1 + Nk + (p - 1) \le \tau$. The $p - 1$ is because the pool includes the current element it's starting at, and the following $p - 1$ elements. Rearranging the above gives $N \le (\tau - p)/k$. So for the explicit example of a pool of size $p = 2$ with stride $k = 2$ on a sequence (output from the first convolution) of length $\tau = T - 2$ you get $N \le (T - 4)/2$. Since $N$ has to be an integer and the count starts at $n = 0$, you technically want $\lfloor (T - 4)/2 \rfloor + 1$ to get the sequence length of the output of the strided pool.
  • The argument for the 2nd convolution is the same as the first, so (for filter size 3 again) the sequence length drops to $\lfloor (T - 4)/2 \rfloor + 1 - 2 = \lfloor (T - 4)/2 \rfloor - 1$.
  • The final argument is that a pool of size $p$ requires the input sequence to have at least length $p$, so you get $\lfloor (T - 4)/2 \rfloor - 1 \ge p$.
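The bullet-point argument can be turned into a small search for the minimum sequence length. This is a sketch under stated assumptions: the stack is taken to be convolution (size 3), pool (size 2, stride 2), convolution (size 3), then a final pool whose size pFinal = 2 is assumed here; convLen and poolLen are illustrative helper names, not toolbox functions.

```matlab
% Length after an unpadded, stride-1 convolution with filter size k.
convLen = @(T, k) T - k + 1;
% Length after a pool of size p with stride s: floor((T - p)/s) + 1.
poolLen = @(T, p, s) floor((T - p) / s) + 1;

% Assumed stack: conv(3) -> pool(2, stride 2) -> conv(3) -> pool(pFinal)
pFinal = 2;   % assumed size of the final pool

% Length that reaches the final pool, as a function of the input length T.
lenBeforeFinalPool = @(T) convLen(poolLen(convLen(T, 3), 2, 2), 3);

% Smallest T for which the final pool still has at least pFinal elements.
T = 1;
while lenBeforeFinalPool(T) < pFinal
    T = T + 1;
end

fprintf("minimum sequence length: %d\n", T);   % prints 10 for this stack
```

Under these assumptions the result could be passed as the "MinLength" value of sequenceInputLayer; a different filter size, pool size, or stride changes the answer.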
Hope that helps.

Release

R2021b
