Evaluate performance of Self-organizing map for classification

I'm trying to use a Self-Organizing map (SOM) as a clustering method on the Iris dataset.
% Load dataset
load iris.dat
X = iris(:,1:end-1);
true_labels = iris(:,end);
% Train SOM
net = newsom(X',[10,10],'hextop','linkdist');
net.trainParam.epochs = 100;
net = train(net,X');
% Assign examples to clusters
outputs = sim(net,X');
[~,assignment] = max(outputs);
How can I evaluate the performance of this SOM? I tried to compute the Adjusted Rand Index (ARI) between the true_labels and the assignments, but I don't know if this makes any sense. The ARI seems to converge to 0 for a large grid.
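For reference, the ARI mentioned above can be computed directly from the contingency table of the two labelings. The sketch below uses made-up toy labels (not the Iris data) and a hypothetical helper `comb2`. It also shows one likely reason the score drifts toward 0 on a large grid: when almost every sample ends up in its own cluster, as tends to happen with 100 neurons and 150 samples, the ARI is 0 by construction.

```matlab
% Toy labels (hypothetical): 6 samples, 2 true classes.
true_labels = [1 1 1 2 2 2]';
% Two candidate clusterings: a reasonable one, and one cluster per sample.
candidates  = {[1 1 2 3 3 3]', (1:6)'};

comb2 = @(v) v .* (v - 1) / 2;          % "n choose 2", element-wise
ari = zeros(1, numel(candidates));

for k = 1:numel(candidates)
    [~, ~, ti] = unique(true_labels);   % map labels to 1..numClasses
    [~, ~, ci] = unique(candidates{k}); % map clusters to 1..numClusters
    N = accumarray([ti ci], 1);         % contingency table (classes x clusters)

    sumN = sum(comb2(N(:)));            % pairs that agree in both labelings
    sumA = sum(comb2(sum(N, 2)));       % pairs within the same true class
    sumB = sum(comb2(sum(N, 1)));       % pairs within the same cluster
    expected = sumA * sumB / comb2(numel(true_labels));

    ari(k) = (sumN - expected) / ((sumA + sumB) / 2 - expected);
end

disp(ari)   % second entry is 0: all-singleton clusters carry no pair agreement
```

Comparing ARI across grid sizes is therefore mostly a comparison of how finely the data is split, which is why it may be more informative to group neurons into a handful of clusters before computing it.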

Answers (1)

Sourabh on 21 Mar 2023
Instead of using ARI, you can try to evaluate the SOM by visualizing the results.
One common way to see how the data is being clustered by the SOM is by plotting the data points along with their corresponding neuron on a two-dimensional map.
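A minimal sketch of this kind of inspection, assuming the Deep Learning Toolbox and its bundled `iris_dataset` sample data (used here in place of the `iris.dat` file from the question). `plotsomhits` and `plotsomnd` are the toolbox plotting functions. The `hitsPerClass` table is just an illustrative numeric counterpart of the hit plot.

```matlab
[x, t] = iris_dataset;               % x: 4-by-150 inputs, t: 3-by-150 one-hot classes

net = selforgmap([10 10]);           % 10-by-10 map with default topology
net.trainParam.epochs = 100;
net.trainParam.showWindow = false;   % suppress the training GUI
net = train(net, x);

plotsomhits(net, x);                 % how many samples each neuron wins
plotsomnd(net);                      % weight distances between neighbouring neurons

% Numeric view of the same idea: hits per neuron, split by true class.
winners = full(net(x));              % 100-by-150, one-hot winner per sample
hitsPerClass = winners * t';         % 100-by-3 count table
disp(hitsPerClass(any(hitsPerClass, 2), :))   % only neurons that won something
```

Neurons whose row is dominated by a single class suggest that region of the map separates that class well; rows with mixed counts point to overlap.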
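Another approach is to use internal validation metrics. The quantization error is the average distance between each input vector and its nearest neuron (its best-matching unit). The topographic error is the fraction of inputs whose best and second-best matching neurons are not adjacent on the map, so it measures how well the map preserves the topology of the data rather than a distance. These metrics can help you tune the hyperparameters of the SOM, such as the grid size or learning rate.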
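To make the two metrics concrete, here is a small worked sketch on a made-up three-neuron map (all numbers are hypothetical and chosen so that the map is "folded"). It takes the quantization error as the mean distance to the best-matching unit, and the topographic error as the share of samples whose best and second-best units are not grid neighbours. It needs no toolbox, only implicit expansion (R2016b or later).

```matlab
% Toy 1-by-3 map: neurons 1 and 3 sit at opposite ends of the grid
% but are close together in input space (a folded map).
W   = [0 0; 5 0; 0.5 0];               % neuron weights, one row per neuron
pos = [1 1; 2 1; 3 1];                 % grid coordinates of the neurons
X   = [0.1 0; 0.4 0; 5 0.2; 4.8 0];    % four 2-D samples, one per row

nX = size(X, 1);
D  = zeros(size(W, 1), nX);            % D(j,i): distance from neuron j to sample i
for i = 1:nX
    D(:, i) = sqrt(sum((W - X(i, :)).^2, 2));
end

[sortedD, order] = sort(D, 1);         % ascending distance per sample
bmu1 = order(1, :);                    % best-matching unit
bmu2 = order(2, :);                    % second-best matching unit

% Quantization error: mean distance to the best-matching unit.
qe = mean(sortedD(1, :));

% Topographic error: share of samples whose two best units are not adjacent.
gridDist = sqrt(sum((pos(bmu1, :) - pos(bmu2, :)).^2, 2));
te = mean(gridDist > 1);

fprintf('QE = %.3f, TE = %.2f\n', qe, te);
```

Here the first two samples fall between neurons 1 and 3, which are two grid steps apart, so half of the samples count as topographic errors even though the quantization error is small.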
Therefore, instead of relying solely on ARI or other traditional clustering metrics, you may want to use a combination of visualization, internal validation metrics, and comparison to other clustering methods to evaluate the performance of your SOM.
This guide on Iris Clustering using SOM might give you some insight into how to visualise the performance of your network.
Also note that newsom has been replaced by selforgmap in newer versions of MATLAB.
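As a rough sketch, the training code from the question could be rewritten with `selforgmap` as follows. It assumes the Deep Learning Toolbox and uses its bundled `iris_dataset` instead of `iris.dat`. The second and third arguments (cover steps and initial neighbourhood size) are simply the documented defaults written out.

```matlab
x = iris_dataset;                       % 4-by-150, one sample per column

% 10-by-10 hexagonal map with link distance, matching the newsom call
net = selforgmap([10 10], 100, 3, 'hextop', 'linkdist');
net.trainParam.epochs = 100;
net.trainParam.showWindow = false;      % suppress the training GUI
net = train(net, x);

% Index of the winning neuron for each sample
assignment = vec2ind(net(x));
```

`vec2ind` does the same job as the `max` call in the question, turning the one-hot network output into neuron indices.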
  1 Comment
Mikael on 24 Oct 2023
How does one obtain output to calculate the internal validation metrics? Both quantization and topographic error are computed using the distance of data vectors from the map nodes, but evaluating a selforgmap network object results in only the best matched node being identified (I assume since selforgmap uses a competitive transfer function). I know it's possible to get these distance values with the externally developed "SOMPAK toolbox" so I'm a bit surprised it takes so much effort with the built-in function. Thanks for any guidance you can provide!
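One possible way to get at these distances without an external toolbox is to read the neuron weights from the network object and compute the distances yourself. This is a sketch, not an official recipe. It assumes the weights are in `net.IW{1,1}`, that `net.layers{1}.distances` holds the neuron-to-neuron grid distances (link distance for the default `linkdist`), and it uses the bundled `iris_dataset`.

```matlab
x = iris_dataset;                      % 4-by-150 inputs
net = selforgmap([10 10]);
net.trainParam.epochs = 100;
net.trainParam.showWindow = false;     % suppress the training GUI
net = train(net, x);

W = net.IW{1,1};                       % 100-by-4, one row of weights per neuron
D = dist(W, x);                        % 100-by-150 Euclidean distances

[sortedD, order] = sort(D, 1);         % ascending distance per sample
bmu1 = order(1, :);                    % best-matching unit
bmu2 = order(2, :);                    % second-best matching unit

qe = mean(sortedD(1, :));              % quantization error

G  = net.layers{1}.distances;          % 100-by-100 grid (link) distances
te = mean(G(sub2ind(size(G), bmu1, bmu2)) > 1);   % topographic error

fprintf('QE = %.4f, TE = %.4f\n', qe, te);
```

The full matrix `D` is what the competitive transfer function discards, so any other distance-based measure can be built from it in the same way.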