How can I work with hybrid inputs (numerical + categorical variables) to create a neural network?

I want to create a neural network model with two inputs:
Input 1: numerical values
Input 2: categorical dummy variables (calendar date: Sunday, Monday, ..., January, February, ...)
The first input is a vector. The second input is a matrix.
How can I deal with this hybrid input data to create a NN model, or use any other machine learning technique?
Thanks

 Accepted Answer

Hi Ismael,
One thing you could do is create a very simple neural network using two featureInputLayers. Let's break down the workflow into steps.
Preparing the raw data
The standard data format for the featureInputLayer is numObservations x numFeatures. If I understand your data correctly, your first input has a single feature, so it will be a numObservations x 1 vector. Your second input will be a matrix of dummified categorical inputs (neural networks can't process categorical inputs directly, so we need to turn them into numeric matrices by dummifying, or 'one-hot encoding', them). A very useful function for transforming a categorical array into a dummified matrix is onehotencode. The result is a numObservations x numFeatures matrix, where numFeatures is equal to the total number of categories in your categorical array.
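As a rough sketch of that encoding step for calendar data, the snippet below derives weekday and month categories from a column of dates and one-hot encodes them. The dates and the variable names (d, dayCat, monthCat, XTrain2) are made up for illustration, and the category labels are my own choice; the numeric weekday and month functions are used so the result does not depend on the system locale.

```matlab
% Hypothetical dates: every day of December 2019 as a 31 x 1 datetime column
d = datetime(2019, 12, (1:31)');

% Weekday number (1 = Sunday ... 7 = Saturday) turned into a categorical
% with a fixed set of seven categories, so the encoding always has 7 columns
dayCat = categorical(weekday(d), 1:7, ...
    ["Sun","Mon","Tue","Wed","Thu","Fri","Sat"]);

% Month number (1-12) turned into a categorical with twelve categories
monthCat = categorical(month(d), 1:12, ...
    ["Jan","Feb","Mar","Apr","May","Jun","Jul","Aug","Sep","Oct","Nov","Dec"]);

% One-hot encode along dimension 2: one row per observation,
% one column per category
dayOneHot = onehotencode(dayCat, 2);      % 31 x 7
monthOneHot = onehotencode(monthCat, 2);  % 31 x 12

% numObservations x numFeatures matrix for the second input
XTrain2 = [dayOneHot, monthOneHot];       % 31 x 19
```

Fixing the category list up front means a weekday or month that happens to be missing from the training dates still gets its own (all-zero) column, which keeps the input size stable between training and prediction.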
Creating the datastores
The way to feed multiple inputs to trainNetwork is to use a combinedDatastore. A combinedDatastore object holds multiple underlying datastores and reads from all of them at training time. Because your inputs are 2D arrays, you can use arrayDatastores as the underlying datastores. Below is some code where I create some dummy data and responses, store them in individual arrayDatastores, and then combine them into one combinedDatastore:
% Create data
XTrain1 = rand(10, 1); % numObservations (10) x numFeatures (1)
XTrain2 = rand(10, 7); % numObservations (10) x numFeatures (7)
YTrain = rand(10, 1); % numObservations (10) x numResponses(1)
% Create arrayDatastores. We transpose the arrays because the datastore needs to read out
% predictors in the format numFeatures x 1, and so the 'IterationDimension' becomes the
% second one.
dsX1 = arrayDatastore(XTrain1',"IterationDimension", 2,'OutputType','cell');
dsX2 = arrayDatastore(XTrain2',"IterationDimension", 2,'OutputType','cell');
dsY = arrayDatastore(YTrain',"IterationDimension", 2,'OutputType','cell');
% Create the combined datastore.
ds = combine(dsX1, dsX2, dsY);
% Read one observation from the datastore
ds.read
For more information regarding how to build datastores for feature input, see the appropriate section of the trainNetwork doc.
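To check that the combined datastore has the layout the network will expect, it can help to read one observation and inspect the sizes. This is only a sketch with made-up data: XNum, XCat and YResp are hypothetical stand-ins for a numeric feature, a 7-column one-hot weekday matrix and a response.

```matlab
% Hypothetical data for 30 observations
XNum = rand(30, 1);                                % numeric feature
dayCat = categorical(randi(7, 30, 1), 1:7, ...
    ["Sun","Mon","Tue","Wed","Thu","Fri","Sat"]);  % random weekdays
XCat = onehotencode(dayCat, 2);                    % 30 x 7 one-hot matrix
YResp = rand(30, 1);                               % response

% One datastore per input, plus one for the response, iterating over columns
dsNum = arrayDatastore(XNum', "IterationDimension", 2, "OutputType", "cell");
dsCat = arrayDatastore(XCat', "IterationDimension", 2, "OutputType", "cell");
dsResp = arrayDatastore(YResp', "IterationDimension", 2, "OutputType", "cell");
dsAll = combine(dsNum, dsCat, dsResp);

% Each read returns a 1 x 3 cell: {input 1, input 2, response}
sample = read(dsAll);
sizes = cellfun(@(x) size(x, 1), sample);          % rows in each cell entry
reset(dsAll);                                      % rewind before training
```

The order of the columns matters: the first two cells go to the network inputs in order, and the last cell is the response.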
Creating the network
I've attached code below to build a simple network that fully-connects on each input branch, concatenates both branches, and then fully-connects one more time before output. Feel free to modify this network to suit your needs!
% Number of response variables (YTrain above has a single column)
numResponses = 1;
% Branch for the numeric input (1 feature)
inputOne = [
featureInputLayer(1, 'Name', 'input1')
fullyConnectedLayer(20, 'Name', 'fc1')];
% Branch for the one-hot encoded input (7 features)
inputTwo = [
featureInputLayer(7, 'Name', 'input2')
fullyConnectedLayer(10, 'Name', 'fc2')];
% Concatenate the two branches along the feature dimension
concat = concatenationLayer(1,2,'Name','concat');
lgraph = layerGraph(inputOne);
lgraph = addLayers(lgraph, inputTwo);
lgraph = addLayers(lgraph, concat);
lgraph = connectLayers(lgraph, 'fc1', 'concat/in1');
lgraph = connectLayers(lgraph, 'fc2', 'concat/in2');
outputLayers = [
fullyConnectedLayer(numResponses,'Name','fc3')
regressionLayer('Name', 'regression')];
lgraph = addLayers(lgraph, outputLayers);
lgraph = connectLayers(lgraph, 'concat', 'fc3');
% Visualize the network architecture
analyzeNetwork(lgraph)
Training the network
All that's left to do is train the network!
options = trainingOptions('sgdm', ...
'MaxEpochs', 5, ...
'MiniBatchSize', 8, ...
'Verbose', true);
net = trainNetwork(ds, lgraph, options);
Let me know if this solves your problem and if there is anything more I can do to help!

4 Comments

Thank you Tomaso for your detailed answer and example. I never thought about the two functions, combinedDatastore and arrayDatastore.
I see you used two inputs with different dimensions, which is similar to my case, in which I want to combine a vector input and a matrix input. If the second input has a different format from the one you used (categorical values rather than numerical), should I use the function onehotencode for that transformation?
This is a sample of my data inputs:
Input 1: numerical values for each day
Input 2: Date to be converted into numerical matrix using onehotencode
Would you elaborate a bit on how to use the data below to create a network that accepts these inputs with different data types? I appreciate it.
Thanks a lot
37.32787162 1-Dec-19
36.67737722 2-Dec-19
37.41777652 3-Dec-19
39.734368 4-Dec-19
37.88293372 5-Dec-19
38.26238362 6-Dec-19
39.77109503 7-Dec-19
39.86013046 8-Dec-19
39.68261584 9-Dec-19
40.29640182 10-Dec-19
41.6400048 11-Dec-19
39.95059449 12-Dec-19
39.62802039 13-Dec-19
37.40819874 14-Dec-19
36.68214825 15-Dec-19
36.73912537 16-Dec-19
36.68314781 17-Dec-19
37.54965602 18-Dec-19
37.11374408 19-Dec-19
36.32551903 20-Dec-19
37.98692758 21-Dec-19
37.01819373 22-Dec-19
38.05062417 23-Dec-19
39.36474734 24-Dec-19
39.02832305 25-Dec-19
35.89698223 26-Dec-19
38.72212976 27-Dec-19
40.78112308 28-Dec-19
40.17113391 29-Dec-19
40.86563348 30-Dec-19
40.67546353 31-Dec-19
Hi Ismael,
Thanks for the extra information! There are a couple of things you can do here.
The feature input route
To me it feels like, in any case, you don't need two inputs to your network. You can hold all the information in one structure. If the reason you felt like you needed two inputs was because one input is numeric and one input is categorical, we can work around this by putting everything into a table, and then one-hot encoding the categorical variable. An example of this is shown below:
% Define categorical vector of dates
dates = categorical(["12-Dec-2019", "13-Dec-2019","14-Dec-2019","15-Dec-2019",...
"16-Dec-2019","17-Dec-2019","18-Dec-2019","19-Dec-2019","20-Dec-2019",...
"21-Dec-2019","22-Dec-2019","23-Dec-2019","24-Dec-2019","25-Dec-2019",...
"26-Dec-2019","27-Dec-2019","28-Dec-2019","29-Dec-2019","30-Dec-2019"]');
% Define numeric vector of random numbers
sz = size(dates,1);
X = rand(sz,1);
% Create table with both variables
tbl = table(X, dates);
% Replace the categorical variable with its one-hot encoded version.
oh = onehotencode(tbl(:,2));
tbl = addvars(tbl,oh);
tbl(:,2) = [];
tbl = splitvars(tbl);
numFeatures = size(tbl,2);
This gives you a table of predictors that you can feed into a network with a single featureInputLayer (with numFeatures as the inputSize), and you can forget all about arrayDatastores and combinedDatastores! Note that for training with trainNetwork, the table also needs to contain your response variable. The code above is adapted from this example, which might give you more insight into feature input workflows. However, there might be a good reason to have the two variables go into separate input branches, so you can do that too if need be!
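For completeness, here is a minimal sketch of how the table route could look end to end once a response is included. Everything in it is made up for illustration: the variable names value, dayName and energy, the random data, and the small network are assumptions, not part of the original example. Encoding weekdays rather than unique dates keeps the number of columns small.

```matlab
% Hypothetical data: 14 days, one numeric predictor, one categorical
% predictor (weekday) and one numeric response
value = rand(14, 1);
dayName = categorical(repmat(["Mon";"Tue";"Wed";"Thu";"Fri";"Sat";"Sun"], 2, 1));
energy = rand(14, 1);
tbl = table(value, dayName, energy);

% Replace the categorical variable with its one-hot encoded version,
% keeping the response as the last column
oh = onehotencode(tbl(:, "dayName"));
tbl = addvars(tbl, oh, 'After', "dayName");
tbl(:, "dayName") = [];
tbl = splitvars(tbl);

% All columns except the response are features
numFeatures = size(tbl, 2) - 1;

% Small single-input regression network
layers = [
    featureInputLayer(numFeatures)
    fullyConnectedLayer(10)
    reluLayer
    fullyConnectedLayer(1)
    regressionLayer];

options = trainingOptions('adam', 'MaxEpochs', 5, 'Verbose', false);

% Train, naming the response column explicitly
net = trainNetwork(tbl, "energy", layers, options);
```

With real data you would also want to hold out some rows for validation; the random values here only show the mechanics.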
The sequence input route
Depending on how we want to think about your 'dates' variable, it might be better to treat the input data as sequences. I'm not sure what problem you're trying to solve here (e.g. classification or regression), but if your 'dates' variable is just a chronological list of evenly spaced days, and the days don't repeat (i.e. '21-Dec-2019' doesn't appear twice), then you won't gain much by one-hot encoding this variable. Depending on how many days you have, your data table could also get very big, because each day becomes one column. With a little more information about your data and the problem you're trying to solve with your neural network, I might be able to offer more insight into the optimal solution. But if the feature input route outlined above works for you, then feel free to use it!
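If the sequence route turns out to fit, one possible shape for it is sketched below. This is an assumption about the problem rather than a recommendation: it treats the data as a single time series, uses the value on day t plus the one-hot weekday of day t+1 as predictors, and the value on day t+1 as the response. The data, the variable names and the LSTM size are all hypothetical.

```matlab
% Hypothetical daily series for December 2019
d = datetime(2019, 12, (1:31)');
energy = 36 + 5 * rand(31, 1);

% One-hot encoded weekday (1 = Sunday ... 7 = Saturday), 31 x 7
dayCat = categorical(weekday(d), 1:7, ...
    ["Sun","Mon","Tue","Wed","Thu","Fri","Sat"]);
dayOneHot = onehotencode(dayCat, 2);

% Sequence data is numFeatures x numTimeSteps.
% Predictors at step t: value on day t and weekday of day t+1
X = [energy(1:end-1)'; dayOneHot(2:end, :)'];   % 8 x 30
% Response at step t: value on day t+1
Y = energy(2:end)';                              % 1 x 30

% Sequence-to-sequence regression network
layers = [
    sequenceInputLayer(size(X, 1))
    lstmLayer(16, 'OutputMode', 'sequence')
    fullyConnectedLayer(1)
    regressionLayer];

options = trainingOptions('adam', 'MaxEpochs', 20, 'Verbose', false);

% One training sequence, passed as 1 x 1 cell arrays
net = trainNetwork({X}, {Y}, layers, options);
```

Here the weekday is kept as a repeating categorical feature, while the unique calendar dates are not encoded at all, since their order is already carried by the time steps.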
Tomaso
This is exactly what I was looking for.
It is an energy-forecasting problem with multiple inputs:
1. Energy measurement per day (nonrepeatable numerical values)
2. Weather information (nonrepeatable numerical values)
3. Date: mm-dd-yyyy (repeatable dummified variables)
4. Date: weekdays (repeatable dummified variables)
5. Date type: working days/holidays (repeatable dummified variables)
The aim is to solve a forecasting problem given the above information and produce an estimate of one vector of data, the energy forecast.
I believe training a network with an input like the above (a large matrix input) will be my next challenge, but at least I now have a method to start with. Since I have many zeros and ones in the input data, I will think about using a sparse matrix if possible.
Thank you very much Tomaso for the help.
Ismael
No worries Ismael, let me know if there is anything else I can help with!

Release

R2021b
