Converting a numeric matrix to a cell array, nesting each row's columns into one cell (LSTM training input data)

How can I convert a numeric matrix (1400 rows/time steps x 30 columns/features) to an Nx1 cell array (1400x1), as required for LSTM training?
In this case, the 30 columns of each row must be nested inside a single cell for that row.
Viewed from outside, the cell array will be 1400x1, but clicking on a cell will open a 1x30 row containing the 30 feature columns.
Simplified example:
From a 5x4 matrix A (rows: time steps/observations; columns: input features):
1 2 3 4
2 3 4 5
3 4 5 6
4 5 6 7
5 6 7 8
To a 5x1 cell array C (Nx1 as required by the LSTM, N = number of steps), with the feature columns nested inside each cell:
{1x4} => (1 2 3 4) inside line cell (1x4)
{1x4} => (2 3 4 5) inside line cell (1x4)
{1x4} => (3 4 5 6) inside line cell (1x4)
{1x4} => (4 5 6 7) inside line cell (1x4)
{1x4} => (5 6 7 8) inside line cell (1x4)
I could build it manually by starting from an empty cell array and pasting the 30 columns into each cell, but my data has 1400 rows.
The label (output) categorical vector for the LSTM is already prepared (1400x1 in the real data, 5x1 in this example) and contains the binary class (0 or 1) expected for each row.
Thanks!

 Accepted Answer

Use mat2cell(). Here is an example:
A = [ ...
1 2 3 4
2 3 4 5
3 4 5 6
4 5 6 7
5 6 7 8];
B = mat2cell(A, ones(size(A,1),1), size(A,2));
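As a side note, `num2cell` with a dimension argument appears to give the same result in a single call. This is an alternative sketch, not part of the original answer, and the names `B1` and `B2` are only illustrative.

```matlab
A = [ ...
    1 2 3 4
    2 3 4 5
    3 4 5 6
    4 5 6 7
    5 6 7 8];

% Approach from the answer: split A into blocks of 1 row by size(A,2) columns.
B1 = mat2cell(A, ones(size(A,1),1), size(A,2));

% Alternative: pack dimension 2 (the columns) of each row into one cell.
B2 = num2cell(A, 2);

disp(size(B1))          % size of the outer cell array
disp(isequal(B1, B2))   % logical 1 if both approaches agree
```

Either way the outer array has one cell per row of `A`, and each cell holds that row as a row vector.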

6 Comments

Thank you very much! Your answer is perfect for what I asked! The result is a 1400x1 cell array (each cell 1x30).
But why does the LSTM now display this error:
The training sequences are of feature dimension 1 but the input layer expects sequences of feature dimension 30.
Why are the 30 columns nested inside each cell not recognized as a feature dimension of 30?
How can I convert these 30 columns inside each cell to 30 rows instead? Desired result: a 1400x1 cell array of 30x1 cells.
Then I can see whether the LSTM reads the features along rows instead of along columns inside the cells. Thanks!
I haven't used LSTM in MATLAB. However, if you can give sample code that produces this error, I can give it a try.
I'm following the basic LSTM classification tutorial but adapting it to my training data. Tutorial:
In that tutorial, I now see that the 12 input features are arranged as 12 rows inside each cell, not as columns as I did. Outside the cells, each row is a time step (Nx1, N = steps), but inside the cells, the rows are input features.
The only change needed in your answer is inside each cell: convert columns to rows. My training data is like the 5x4 matrix A in your answer, but the real one is a 1400x30 matrix. Those 30 columns must become 30 rows inside each cell. Using the 5x4 example A, the outer cell array will be 5x1 (Nx1), and each cell will be 4x1 (4 features):
{4x1} => [1; 2; 3; 4] inside the cell (4x1 column)
{4x1} => [2; 3; 4; 5] inside the cell (4x1 column)
{4x1} ...
{4x1} ...
{4x1} ...
Thanks!
Try this:
A = [ ...
1 2 3 4
2 3 4 5
3 4 5 6
4 5 6 7
5 6 7 8];
B = mat2cell(A.', size(A,2), ones(size(A,1),1)).';
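The transpose matters because, as far as the Deep Learning Toolbox documentation describes it, each cell passed to `trainNetwork` is read as a numFeatures-by-numTimeSteps matrix. A 1x30 row therefore looks like 1 feature over 30 time steps, while a 30x1 column looks like 30 features at a single time step. A small check on the 5x4 example (the name `featureDims` is only illustrative):

```matlab
A = [ ...
    1 2 3 4
    2 3 4 5
    3 4 5 6
    4 5 6 7
    5 6 7 8];

% One cell per row of A, each cell holding that row as a column vector.
B = mat2cell(A.', size(A,2), ones(size(A,1),1)).';

% The first dimension of each cell is what the input layer treats as the feature count.
featureDims = cellfun(@(x) size(x,1), B);

disp(size(B))                          % outer cell array: one cell per time step
disp(all(featureDims == size(A,2)))    % logical 1 if every cell has size(A,2) features
```

For the real 1400x30 matrix the same check should report 30 features in each of the 1400 cells, matching `sequenceInputLayer(30)`.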
Wow... thank you! This is perfect, and now the LSTM training works with no errors.
Just before you posted this solution, I took your previous solution and transposed each cell, which works as well:
B = mat2cell(A, ones(size(A,1),1), size(A,2));
D = cellfun(@transpose,B,'UniformOutput',false);
But I prefer your last solution since it's straight to the point, so I'll go on using it:
B = mat2cell(A.', size(A,2), ones(size(A,1),1)).';
I'll send you a private message (through your profile page) about the field where I'm trying to use LSTM for prediction; maybe you'd be interested in trying it too. You are so smart.
Thank you very much!
Hello Eric,
I am dealing with a time series with 2 features, where I have to treat one series as endogenous or exogenous because the lagged value of one series correlates with the current value of the second. I am applying an LSTM, but it takes both series as input, so when defining the LSTM I need to set
numFeatures = 2;
numResponses = 1;
but I am unable to train the network. I am attaching the code. Can you tell me how to tackle this issue and where I am making a mistake?
%% loading data
data = data_p2;
% Use the first 30 time steps for training the model and the rest of the data to test the model
trainingData = data(:,1:30);
testData = data(:,31:end);
figure
subplot(2,1,1);
plot(1:30,trainingData(1,:),31:120,testData(1,:))
title("X1")
legend("Training Data", "Test Data")
subplot(2,1,2)
plot(1:30,trainingData(2,:),31:120,testData(2,:))
title("X2")
legend("Training Data", "Test Data")
%% Prepare training data
%in this case, the model will take one time step as input and provide the next time step as output.
%Prepare the data into this by creating a 'output' (y) that is one time step ahead of the 'input' (x)
XTrain = trainingData(:,1:end-1);
YTrain = trainingData(:,2:end);
%% define model
numFeatures = 2;
numResponses = 1;
numHiddenUnits1 = 50;
numHiddenUnits2 = 50;
layers = [
    sequenceInputLayer(numFeatures)
    lstmLayer(numHiddenUnits1,'OutputMode','sequence')
    dropoutLayer(0.2)
    lstmLayer(numHiddenUnits2,'OutputMode','last')
    dropoutLayer(0.2)
    fullyConnectedLayer(numResponses)
    regressionLayer
    ];
%% train model
options = trainingOptions('adam', ...
    'MaxEpochs',400, ...
    'MiniBatchSize',32, ...
    'GradientThreshold',1, ...
    'InitialLearnRate',0.005, ...
    'LearnRateSchedule','piecewise', ...
    'LearnRateDropPeriod',125, ...
    'LearnRateDropFactor',0.2, ...
    'Verbose',0, ...
    'Plots','training-progress');
net = trainNetwork(XTrain,YTrain,layers,options);
%% Run Inference: Predict the Future
net = resetState(net);
[net,YBefore] = predictAndUpdateState(net,trainingData);
%The last of the predictions from inputs 'before' is the first of predictions beyond the training data
YPred = YBefore(:,end);
numTimeStepsTest = 90;
for i = 2:numTimeStepsTest
    XThis = YPred(:,i-1);
    [net,YNext] = predictAndUpdateState(net,XThis);
    YPred(:,i) = YNext;
end
% Plot
figure
%tiledlayout("flow")
%nexttile
subplot(2,1,1)
plot(1:30,trainingData(1,:),31:120,testData(1,:),1:30,YBefore(1,:),31:120,YPred(1,:))
title("x1")
legend("Training Data", "Test Data", "Prediction", "Forecast")
%nexttile
subplot(2,1,2)
plot(1:30,trainingData(2,:),31:120,testData(2,:),1:30,YBefore(2,:),31:120,YPred(2,:))
title("x2")
legend("Training Data", "Test Data", "Prediction", "Forecast")
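One possible reading of the problem, offered as a sketch rather than a confirmed fix: in the code above `YTrain` has 2 rows while `numResponses = 1`, and the second `lstmLayer` uses `'OutputMode','last'`, which returns a single output per sequence instead of one per time step. The sketch below assumes the second row is the series to predict and the first row is the exogenous one, and it uses random placeholder data in place of `data_p2`. It needs Deep Learning Toolbox for the layer functions.

```matlab
% Placeholder standing in for data_p2 (2 series x 120 time steps); replace with the real data.
data = rand(2, 120);
trainingData = data(:, 1:30);

% Inputs: both series at time t. Response: only the target series at time t+1.
% Assumption: row 2 is the series to predict, row 1 is the exogenous driver.
XTrain = trainingData(:, 1:end-1);   % numFeatures x (T-1)
YTrain = trainingData(2, 2:end);     % numResponses x (T-1)

numFeatures  = size(XTrain, 1);
numResponses = size(YTrain, 1);

% For one prediction per time step, both LSTM layers keep 'OutputMode','sequence'.
layers = [
    sequenceInputLayer(numFeatures)
    lstmLayer(50, 'OutputMode', 'sequence')
    dropoutLayer(0.2)
    lstmLayer(50, 'OutputMode', 'sequence')
    dropoutLayer(0.2)
    fullyConnectedLayer(numResponses)
    regressionLayer
    ];

% Training would then use the same call and options as in the post:
% net = trainNetwork(XTrain, YTrain, layers, options);
```

With a single response, the closed-loop forecast in the post can no longer feed `YPred` back as the next input, because the network still expects 2 features per step. Future values of the exogenous series would have to be supplied some other way.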

Release: R2020a
Asked: 21 May 2020
Commented: 15 May 2023
