MATLAB Answers

Handling imbalanced data with patternnet

Dr D on 23 Jan 2020
Commented: Dr D on 28 Jan 2020
I have been experimenting with training different machine learning methods for a classification problem. A typical dataset might have 40,000 samples representing four different classes, but is highly imbalanced: maybe 98.5% of the samples belong to one class and each of the other three classes has about 0.5% representation. With such data, which is typical of the datasets I deal with in my domain, a classifier can achieve a high degree of accuracy by always returning the majority class. That class is the least important one in the overall scheme; the other three are of much greater interest. (For those familiar with the classic credit card dataset, this is like finding the few incidents of fraud among the ocean of proper transactions.)
When using something like fitcensemble, I can specify a 'cost' or 'prior' to modify the misclassification penalty or prior probabilities, respectively, to help deal with the imbalanced data. I would like to do something similar with patternnet, but I don't see how to do this. The default performance function for patternnet is 'crossentropy'. When used on its own, crossentropy has an optional parameter perfWeights, which looks like it might act very similarly to cost in fitcensemble. However, there doesn't seem to be any way to access perfWeights from within patternnet. I've tried the code below, but it returns the message: "Warning: 'perfWeights' is not a legal parameter."
oNet = patternnet( nNeurons, 'trainrp' );
oNet.performFcn = 'crossentropy';
oNet.performParam.perfWeights = nWeights;
Scouting around the patternnet documentation and the object itself, I don't see anywhere to specify or modify penalty weights for the various classes. Please tell me I've missed something super obvious; I'm pulling my hair out.
Thanks in advance!


Accepted Answer

Vimal Rathod on 28 Jan 2020
Hi,
To weight the errors during training, you can pass "error weights" (the EW argument) to 'train'; see the documentation for 'train'.
When evaluating the trained network, you can pass the same error weights to the performance function, either by calling 'perform' with 'ew' or 'crossentropy' with 'perfWeights'.
For example (to weight the classes in the Iris dataset by 0.1, 0.2 and 0.7):
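[x,t] = iris_dataset;                          % 4x150 inputs, 3x150 one-hot targets
net = patternnet(10);                          % one hidden layer with 10 neurons
net = train(net,x,t,[],[],[0.1,0.2,0.7]');     % sixth argument is EW, one weight per class
y = net(x);                                    % network outputs for the training inputs
perf = perform(net,t,y,[0.1,0.2,0.7]')         % performance using the same weights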
Or:
perf = crossentropy(net,t,y,[0.1,0.2,0.7]')
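For an imbalanced dataset, one possible way to choose these weights is to make them inversely proportional to class frequency. The following is only a sketch under that assumption: the inverse-frequency formula is a common heuristic rather than anything prescribed by patternnet or train, and the variable names (classCounts, classWeights, perfWeighted) are illustrative. It assumes every class appears at least once in t; otherwise the division produces Inf.

```matlab
% Sketch: derive per-class error weights from class frequencies
% (an assumed heuristic, not a built-in patternnet option).
[x,t] = iris_dataset;                     % t is 3x150, one-hot by column
classCounts  = sum(t,2);                  % number of samples per class (Kx1)
numClasses   = numel(classCounts);
% Inverse-frequency weights: rarer classes get proportionally larger weights.
classWeights = sum(classCounts) ./ (numClasses * classCounts);

net = patternnet(10);
net.trainParam.showWindow = false;        % suppress the training GUI
net = train(net,x,t,[],[],classWeights);  % Kx1 EW: one weight per output row
y = net(x);
perfWeighted = perform(net,t,y,classWeights);
```

The Iris data is balanced, so all three weights come out equal here; with a 98.5% / 0.5% / 0.5% / 0.5% split the minority classes would receive much larger weights than the majority class.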
Hope this helps!

Dr D on 28 Jan 2020
Follow-up question: Is it necessary to pass the weights to perform() and crossentropy() if they have already been used in train()? Doesn't using them in train() bake them into the neural network by influencing the internal neuron weights?
