Input structure for BiLSTM

Sanjana Sankar on 19 Aug 2019
Commented: Sanjana Sankar on 29 Aug 2019
I keep getting this error when I run my BiLSTM model:
"Error using trainNetwork (line 165)
Invalid training data. Predictors must be a N-by-1 cell array of sequences, where N is the number of sequences. All sequences must have the same feature dimension and at least one time step."
My X_train data is a 134949x1 cell array with Nx1 nested cell arrays that hold cells containing a 70x1 double,
and my Y_train data is a 134949x1 cell array with Nx1 nested cell arrays that hold cells containing a 65x1 double.
I do not know what is wrong, as the error says the predictors should be an N-by-1 cell array and I believe that is how my data is stored.
  2 Comments
Pravin Jagtap on 22 Aug 2019
Hello Sanjana,
I would like to know whether the way you have structured the data has any significance. I need more information about the data than just the sizes of the cell arrays, so that we can think about reshaping/unwrapping the data into the required form.
Kind Regards
~Pravin
Sanjana Sankar on 29 Aug 2019
Hello!
Yes, basically I am storing the data as words, split into letters for the input and, correspondingly, into phonemes for the output. Hence I cannot just expand all the cells; I need the demarcation between words.

Answers (1)

Pravin Jagtap on 23 Aug 2019
Edited: Pravin Jagtap on 23 Aug 2019
Hello Sanjana,
Refer to the following example to understand the structure of the training-data cell arrays that a Bi-LSTM model expects:
In your case, there are 3 levels of nested cell arrays, whereas the example mentioned above uses 2. The important thing to note is that each cell of the main cell array can hold a 2-D array. In your case, you can combine the N x 1 and 70 x 1 levels into one 2-D array, which will be consistent with the structure used in the above example.
To do the unwrapping/reshaping, you can use horizontal or vertical concatenation, as in the following code:
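%% Generate example data: a 3x1 cell array of 3x1 cell arrays
data = {{1;2;3}; {4;5;6}; {7;8;9}};
%% Vertical concatenation: stacks everything into a 9x1 double
v = cell2mat(vertcat(data{:}))
%% Horizontal concatenation: 3x3 double, one column per inner cell array
h = cell2mat(horzcat(data{:}))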
In general, for Bi-LSTM, make sure that 'X_Train' is an 'n x 1' cell array in which each cell holds an 'f x l' numeric matrix, where f is the number of features and l is the length (number of time steps) of that sequence.
(Note: f must be the same for every cell, whereas l can vary. Follow the example given above to understand the desired structure of the cell array for the Bi-LSTM model.)
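If the word boundaries have to be preserved, one possible approach is to concatenate only within each word, so that every word becomes one 'f x l' matrix with one column per letter. The following is a minimal sketch under assumptions, not a tested solution for your data: the toy variable X_nested and its random contents are hypothetical stand-ins for your nested X_train, and each letter is assumed to be a 70x1 double as described in the question.

```matlab
% Hypothetical toy data standing in for the nested X_train:
% 2 words, each an Nx1 cell array of 70x1 letter vectors.
f = 70;                                       % feature dimension (assumed from the question)
word1 = {rand(f,1); rand(f,1); rand(f,1)};    % a word with 3 letters
word2 = {rand(f,1); rand(f,1)};               % a word with 2 letters
X_nested = {word1; word2};                    % 2x1 cell array of cell arrays

% Concatenate the letters of each word horizontally, one word at a time.
% Each word becomes an f-by-l numeric matrix; the outer cell array is kept,
% so the demarcation between words is preserved.
X_train = cellfun(@(w) horzcat(w{:}), X_nested, 'UniformOutput', false);

% Sanity check: every cell is numeric and has f rows.
ok = all(cellfun(@(x) isnumeric(x) && size(x,1) == f, X_train));

disp(size(X_train));      % size of the outer cell array
disp(size(X_train{1}));   % size of the first word's matrix
disp(ok);
```

The outer cell array stays 'n x 1' and only the innermost level is flattened. Whether the same call is appropriate for Y_train depends on the network's output layer and on how letters and phonemes are aligned, which this thread does not show.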
Kind Regards
~Pravin
