How do I find out which cluster each point belongs to in kmeans?
Hi, I used the kmeans function on a simple example:
x=[100 2 4 10 200; 50 100 20 1 5];
[p o]=kmeans(x,2)
What I got:
p =
1
2
o =
100 2 4 10 200
50 100 20 1 5
What I need to know is which cluster each point belongs to. The code just returns the two clusters I asked for and the points I gave it.
How do I find out which cluster each point belongs to?
Thanks.
Accepted Answer
More Answers (1)
Walter Roberson
on 2 Aug 2012
The first output from kmeans() is the cluster number for each sample.
IDX = kmeans(X,k) partitions the points in the n-by-p data matrix X into k clusters. This iterative partitioning minimizes the sum, over all clusters, of the within-cluster sums of point-to-cluster-centroid distances. Rows of X correspond to points, columns correspond to variables. kmeans returns an n-by-1 vector IDX containing the cluster indices of each point. By default, kmeans uses squared Euclidean distances. When X is a vector, kmeans treats it as an n-by-1 data matrix, regardless of its orientation.
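As a minimal sketch of what this means for the example in the question: assuming the five columns of x were meant to be the five points (the question does not say so explicitly), transposing x puts one point in each row. The names X, idx and C are only illustrative.

```matlab
% Data from the question: 2 rows, 5 columns.
x = [100 2 4 10 200; 50 100 20 1 5];

% Assumption: each COLUMN is one point, so transpose to get a 5-by-2 matrix
% (rows = points, columns = variables), which is the layout kmeans expects.
X = x';

% idx(i) is the cluster number assigned to row i of X.
% C(j,:) is the centroid of cluster j.
[idx, C] = kmeans(X, 2);

disp([X idx])   % each point shown next to its cluster number
```

Without the transpose, kmeans sees only two points (the two rows of x), so asking for two clusters puts each row in its own cluster and returns the rows themselves as the centroids, which matches the output shown in the question.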
7 Comments
huda nawaf
on 2 Aug 2012
huda nawaf
on 2 Aug 2012
Peter Perkins
on 3 Aug 2012
Please read my previous comment.
huda nawaf
on 5 Aug 2012
Walter Roberson
on 5 Aug 2012
I do not mean that the output will be the number of clusters for each row. I mean that the output will be the cluster number for each row. "number of clusters" would be "how many clusters occur in that row". "cluster number" is the index of which cluster the row was assigned to.
The cluster numbers have no separate meaning. It does not matter whether the order is "cluster 2 then cluster 3" or "cluster 3 and then cluster 2": the important part is consistency and that the cluster centroid is the right one (you are not outputting the cluster centroids).
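A small sketch of that point, using made-up data (the matrix X below is hypothetical, not from the question): requesting the second output gives the centroids, and indexing them with the cluster numbers pairs every row with its own centroid, whichever way the clusters happen to be numbered on a given run.

```matlab
% Hypothetical data: two well-separated groups of three points each.
X = [1 1; 1.2 0.9; 0.8 1.1; 10 10; 10.1 9.8; 9.9 10.2];

% idx holds the cluster number of each row, C holds one centroid per cluster.
% 'Replicates' reruns kmeans from several starting points and keeps the best.
[idx, C] = kmeans(X, 2, 'Replicates', 5);

% Row i of rowCentroid is the centroid of the cluster that row i of X belongs to.
rowCentroid = C(idx, :);

disp([X idx rowCentroid])
```

Running this twice may swap the labels 1 and 2, but rows 1 to 3 should always share one label and rows 4 to 6 the other, and each row stays paired with the same centroid.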
huda nawaf
on 6 Aug 2012
Peter Perkins
on 6 Aug 2012
Walter, huda is using hierarchical clustering now, not k-means, so I think he was responding to my earlier post.
huda, again, you have given clusterdata three points, and asked for three clusters, and it has returned what you asked for. The cluster number is completely arbitrary. I suspect that if you used linkage, then dendrogram, you'd see what you were expecting.
d = [0 10 20; 10 0 5; 20 5 0] % d is a distance matrix
z = linkage(squareform(d)) % convert d to vector form first
dendrogram(z)
The dendrogram plot has labels on its horizontal axis that refer to the original points, not the cluster numbers.
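If a cluster number per point is wanted from the same tree, one possible sketch (not part of the original reply) is to cut it with cluster. The choice of two clusters here is only an assumption for illustration.

```matlab
d = [0 10 20; 10 0 5; 20 5 0];   % same distance matrix as above
z = linkage(squareform(d));      % default single linkage on the vector form
T = cluster(z, 'maxclust', 2);   % T(i) is the cluster number of point i
disp(T')
```

With these distances, points 2 and 3 (distance 5) should share a label and point 1 should get the other one. Which group is called 1 and which is called 2 is arbitrary.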