LSTM padding and masking

Ao Du on 10 Dec 2020
Answered: Haijun Ruan on 21 Jul 2021
I am solving a sequence-to-sequence classification problem with an LSTM in MATLAB 2020b. The sequences have variable length, so padding within each minibatch is needed. However, I am not sure whether MATLAB automatically masks the padded time steps when calculating the cross-entropy loss and the training/validation accuracy. In the training plot, the reported accuracy (around 70%) is much lower than the accuracy I calculate manually from checkpoints (around 90%). I suspect that although MATLAB 2020b supports sequence padding and validation data for LSTMs, it still does not offer a masking option to reduce the influence of padding. Any insights?
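To make the suspected effect concrete, here is a minimal sketch with made-up labels (not the data from this problem). It assumes that, without a mask, padded time steps are scored like real ones; whether the training plot in MATLAB 2020b does exactly this is not confirmed. All variable names are hypothetical.

```matlab
% Made-up labels for two sequences padded to a common length of 6.
% Sequence 2 has only 3 real time steps; the rest is padding.
yTrue = [1 2 1 2 1 2;
         1 1 2 0 0 0];          % 0 marks padded steps (no true label)
yPred = [1 2 1 2 1 1;
         1 1 2 1 2 1];          % the network emits a prediction for every step
mask  = logical([1 1 1 1 1 1;
                 1 1 1 0 0 0]); % true where a real time step exists

correct = (yPred == yTrue);

% Assumption: an unmasked metric averages over all steps, padding included.
accUnmasked = mean(correct(:));

% Masked metric: average only over the real time steps.
accMasked = sum(correct(mask)) / nnz(mask);

fprintf('Unmasked: %.2f, masked: %.2f\n', accUnmasked, accMasked);
% prints: Unmasked: 0.67, masked: 0.89
```

In this toy case the same predictions score 8 of 12 without the mask and 8 of 9 with it, a gap of the same kind as the one described above.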

Answers (2)

Aditya Patil on 22 Dec 2020
Currently, masking is not supported in MATLAB. I have forwarded the request to the relevant team.
As a workaround, you can sort the inputs by sequence length so that the amount of padding required is minimized. You can also set the minibatch size to 1, so that no padding is required.
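The sorting workaround could look roughly like the sketch below. The toy data and the variable names (XTrain, YTrain, seqLens) are placeholders rather than anything from the original problem, and the minibatch size of 2 is arbitrary. 'Shuffle' is set to 'never' because shuffling would undo the sort order. The trainingOptions call requires Deep Learning Toolbox.

```matlab
% Hypothetical toy data: 5 sequences with 3 features and varying lengths.
rng(0);
seqLens = [12 4 9 4 7]';
XTrain = arrayfun(@(n) rand(3, n), seqLens, 'UniformOutput', false);
YTrain = arrayfun(@(n) categorical(randi(2, 1, n), 1:2), seqLens, ...
    'UniformOutput', false);

% Sort the sequences by length so that each minibatch groups sequences
% of similar length and therefore needs little padding.
sequenceLengths = cellfun(@(x) size(x, 2), XTrain);
[~, idx] = sort(sequenceLengths);
XTrain = XTrain(idx);
YTrain = YTrain(idx);

% Pad each minibatch only up to its own longest sequence, and keep
% the sorted order by disabling shuffling.
options = trainingOptions('adam', ...
    'MiniBatchSize', 2, ...
    'SequenceLength', 'longest', ...
    'Shuffle', 'never');
```

For the second workaround, set 'MiniBatchSize' to 1 instead; no padding is then added, though training is likely to be slower.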
  1 Comment
Yildirim Kocoglu on 16 Jan 2021
Thank you! I was really curious about this as well, since it can be done in Python. I really hope they can add this feature.


Haijun Ruan on 21 Jul 2021
I am wondering whether masking is supported in MATLAB now.