How to plot 2D location vs corresponding data in MATLAB
Hi,
I have the locations of users in terms of latitude and longitude, and each user has a particular demand value. I have to create clusters using both the positions of the users and their demand. Can anyone help me design this, given that there are three variables?
The other part of the problem is how to plot the data so that latitude is on the x axis, longitude is on the y axis, and the user demand is shown at the corresponding location.
It is like this
User Latitude Longitude Demand
1 38.8643 9.2866 13
and so on
Please help me out.
9 Comments
Dyuman Joshi
on 14 May 2022
What are the criteria for forming a cluster? What is the data type: numeric, string, char, cell, or table? Just copy-pasting doesn't help. Please state this specifically so that it is easy for us to help you.
Shwet Kashyap
on 16 May 2022
Shwet Kashyap
on 16 May 2022
Dyuman Joshi
on 16 May 2022
Can you share the data here?
Shwet Kashyap
on 16 May 2022
Dyuman Joshi
on 16 May 2022
I plotted a 3D plot with x = latitude, y = longitude, z = demand.
This is what I obtained. I still don't know what you want to plot. And what have you tried?

Shwet Kashyap
on 16 May 2022
Shwet Kashyap
on 18 May 2022
Walter Roberson
on 18 May 2022
(Context appears to be fixed-beam signals for geostationary communication satellites.)
Answers (3)
Read about kmeans.
T = readtable('https://in.mathworks.com/matlabcentral/answers/uploaded_files/999455/MATLAB.xlsx')
x = T.(1) ;
y = T.(2) ;
d = T.(3) ;
idx = kmeans([x y d],71) ;
scatter3(x,y,d,[],idx,'filled')
colorbar
cmap = turbo(71) ;
colormap(cmap)
view(2)
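If the goal is only to have latitude on the x axis, longitude on the y axis, and the demand shown at each location, a plain 2-D scatter with demand mapped to colour may be enough. A minimal sketch: the first row is the sample from the question, the other two rows are made-up values for illustration.

```matlab
% Sample data: row 1 is from the question, rows 2-3 are hypothetical.
lat    = [38.8643; 40.1200; 37.5000];
lon    = [ 9.2866; 10.5000;  8.9000];
demand = [13; 255; 555];

markerSize = 60;                          % constant marker area in points^2
scatter(lat, lon, markerSize, demand, 'filled');
xlabel('Latitude');
ylabel('Longitude');
cb = colorbar;                            % colour encodes demand
ylabel(cb, 'Demand');
```

With real data, `lat`, `lon` and `demand` would come from the corresponding table columns.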
5 Comments
Shwet Kashyap
on 19 May 2022
KSSV
on 19 May 2022
Shwet Kashyap
on 19 May 2022
Shwet Kashyap
on 20 May 2022
Shwet Kashyap
on 22 May 2022
Walter Roberson
on 18 May 2022
0 votes
You can create a triangulation object and then use trimesh() to plot the data.
Or you can create a scatteredInterpolant() object and interpolate over a grid of coordinates and then imagesc() or pcolor() to create a map.
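The second suggestion could look roughly like the sketch below. The six sample points are hypothetical, and the choice of linear interpolation with no extrapolation is an assumption, not something specified in the thread.

```matlab
% Hypothetical scattered samples (latitude, longitude, demand).
lat    = [38.0; 38.5; 39.0; 38.2; 38.8; 39.0];
lon    = [ 9.0;  9.6;  9.1; 10.0;  9.9; 10.0];
demand = [13; 255; 13; 555; 13; 255];

% Linear interpolation inside the convex hull, no extrapolation outside it.
F = scatteredInterpolant(lat, lon, demand, 'linear', 'none');

% Regular grid spanning the data.
[LAT, LON] = meshgrid(linspace(min(lat), max(lat), 50), ...
                      linspace(min(lon), max(lon), 50));
D = F(LAT, LON);      % NaN where a grid point lies outside the hull

pcolor(LAT, LON, D);
shading flat
colorbar
xlabel('Latitude');
ylabel('Longitude');
```

Note that this draws a smoothed demand surface; it does not assign users to clusters.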
7 Comments
Walter Roberson
on 18 May 2022
"i want to make 71 clusters in which the user demand becomes uniform"
Suppose that you have one of the 555 demand sites, and then a gap, and then (say) 200 km away you have one of the 13 demand sites. Suppose that you have 3 satellites available for this subset. (555+13)/3 is about 189 load per satellite. So you park two of them directly over the 555 demand, covering 378 of the 555 demand with high efficiency in the near-vertical direction. The remaining 177 of the 555 and the 13 of the other have to be covered by the remaining satellite.
Do you park that third satellite directly over the 555, giving full signal to all 555 units but inferior service, from 200 km away and at a steeper angle, to the 13 site? Do you declare that each site is of equal importance regardless of its demand, and so position it halfway between, with the 177 units and the 13 units each getting mediocre service from 100 km away, so that the first 378 units of the 555 would get excellent service but the rest would get mediocre service?
Do you position a distance proportional to the demand, so 13/190*200km away from the 555 (more generally, at the center of mass of the demand being served)?
... or do you position two satellites over the 555 and have them serve the 555 between them, and position the third over the 13, giving efficient service to all of them, assuming that the 555 load can be satisfied between the two satellites?
What is the consequence of exceeding the ideal load? Failure of all communication? Smooth (linear) degradation of service? Exponential degradation of service (finding a communication slot gets less and less probable as competing signals from independent stations clash, forcing both to back off)? Attempts beyond the limit just do not get served?
Is it truly the case that the allocation is to be purely by total demand over the area, and no consideration is to be given to cost of beaming the signal longer distances, or the fact that steeper angles through the atmosphere can result in more skipping or more path loss, or that for reliability, longer distances might call for a lower payload rate (increased length of error correcting code)?
It would be my expectation that allocation strictly by total demand on the satellite, as if all demand is the same cost, would not be appropriate.
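The arithmetic in the example above can be checked in a few lines. This is only a worked version of the numbers already given (555 and 13 units of demand, 3 satellites, sites 200 km apart on a line); the variable names are illustrative.

```matlab
% Equal share of the total demand among three satellites.
perSatellite = (555 + 13) / 3;            % about 189.3

% Demand left for the third satellite after two satellites take 189 each.
leftover555 = 555 - 2 * 189;              % 177

% Centre of mass of the demand served by the third satellite,
% measured in km from the 555 site along the line to the 13 site.
pos      = [0, 200];                      % 555 site at 0 km, 13 site at 200 km
served   = [leftover555, 13];             % 177 and 13 units of demand
centroid = sum(pos .* served) / sum(served);   % 13/190*200, about 13.7 km
```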
Walter Roberson
on 18 May 2022
Suppose that you have a 555 demand and a 13 demand, with the two being a substantial distance apart. And suppose that you have two satellites available. You put one satellite over the 555, and the other goes where? Based on dividing the demand equally, 284 of the 555 gets served by one satellite, and the other has to serve the remainder and the 13. But if they are far enough apart then any possible position of the second satellite is over the horizon from at least one of the two sites.
If you position the satellites at equal distances from each other and from the sites, the layout is: site 555, then 1/3 of the distance between sites to the first satellite, then another third of the distance to the second satellite, then the remaining third to site 13. Seems fair, right? But now the satellites might be over the horizon from both sites, and over the horizon from each other! No communication at all!
Maybe give up on equal demand per satellite, and position one over each of the two sites so that each gets service?
... though how are the satellites communicating with each other? Don't you need to reserve some satellites to be relay stations, possibly not in geostationary orbits in order to be able to reach multiple satellites? There could potentially be a hierarchy of distances away rather than relying on each satellite to store-and-forward in series to the next geostationary satellite within view...
Shwet Kashyap
on 19 May 2022
Walter Roberson
on 22 May 2022
Given the conditions you have set out, the algorithm is this:
- adjust all beams to cover the exact same area, all locations
- each endpoint should generate a random beam number to associate with. If the throughput drops below an acceptable value, the endpoint should re-associate with a random beam.
This algorithm works because the throughput of a beam is not affected by the area the beam is covering, so you might as well have all beams cover all areas.
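A rough sketch of that random association, using only base MATLAB. The endpoint count, the demand values, and the capacity threshold used as a stand-in for "throughput drops below an acceptable value" are all assumptions made for illustration.

```matlab
rng(0);                                   % fixed seed so the sketch is repeatable
numBeams = 71;                            % number of beams mentioned in the thread
demand   = randi([13 555], 500, 1);       % hypothetical demand for 500 endpoints

% Each endpoint associates with a random beam.
beam = randi(numBeams, numel(demand), 1);

% Total demand carried by each beam.
beamLoad = accumarray(beam, demand, [numBeams 1]);

% Hypothetical capacity; endpoints on overloaded beams pick again, once.
capacity    = 1.5 * sum(demand) / numBeams;
overloaded  = find(beamLoad > capacity);
retry       = ismember(beam, overloaded);
beam(retry) = randi(numBeams, nnz(retry), 1);
beamLoad    = accumarray(beam, demand, [numBeams 1]);
```

A real system would repeat the re-association step until every beam is acceptable; one pass is shown here to keep the sketch short.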
Walter Roberson
on 23 May 2022
%kmeans
target_clusters = 71;
[idx_kmeans, centers_kmeans] = kmeans([x y d], target_clusters, 'EmptyAction', 'singleton') ;
T.cluster_kmeans = idx_kmeans;
%dbscan
min_per_cluster = floor(numel(x)/target_clusters * 0.95);
[idx_dbscan, core_dbscan] = dbscan([x y d], 0.5, min_per_cluster);
idx_dbscan(idx_dbscan < 0) = nan;
T.cluster_dbscan = idx_dbscan;
%plot kmeans
subplot(2,1,1)
scatter3(x, y, d, [], idx_kmeans, 'filled');
colorbar();
title('cluster by kmeans');
%plot dbscan
subplot(2,1,2)
scatter3(x, y, d, [], idx_dbscan, 'filled')
colorbar();
title('cluster by dbscan');
%illustrating taking a subset by cluster number
idx = T.cluster_kmeans == 1;
subset1 = T(idx,:);
This is the code you asked for.
This code will not do what you want.
When you use kmeans, the only control over cluster size is EmptyAction -- you can prevent a cluster from becoming completely empty.
kmeans makes no attempt to equalize the cluster size. None. Equalizing cluster size is completely absent from the algorithm.
For example if you had
13 255
and two clusters, then it would put one of the clusters at the 13, and the other at the 255, and absolutely would not consider putting a cluster in the middle to split the 255 to give more equal cluster sizes.
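A small check of that behaviour, assuming the Statistics and Machine Learning Toolbox is available for `kmeans`. The two coordinates are hypothetical; only the demands 13 and 255 come from the example.

```matlab
% Requires Statistics and Machine Learning Toolbox (kmeans).
rng(1);                          % repeatable initialization
xy     = [0 0; 10 10];           % two hypothetical locations
demand = [13; 255];              % demand at each location

idx = kmeans(xy, 2);             % each location ends up in its own cluster

% Total demand per cluster: one cluster carries 13, the other 255.
clusterDemand = accumarray(idx, demand);
```

Summing demand per cluster with `accumarray` is also a quick way to see how unbalanced any clustering of the real data is.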
dbscan, on the other hand, has no way to request a specific number of clusters. You might notice that I have configured a minimum cluster size of 95% of what would give you an equal distribution for the target number of clusters, but since that is only a minimum, dbscan could decide to just put everything into (say) 5 clusters.
Furthermore, in both cases, the "demand" (d) information is being used as a coordinate, not as a replicate. A location with a demand of 255 will not be split between two beams for either kmeans or dbscan. Instead, the demand will be taken pretty much as a Z coordinate, and treated as a distance. And since "13" is a fair difference from "255", the effect would tend to be to group all of the 13 together providing that they are geographically not too far apart (250-ish degrees... which is basically further away than it is possible to get on the world.)
In order to have any possibility of splitting a location between different beams, instead of using [x, y, d] coordinates, what you would need to do is similar to
repmat([x, y], d, 1)
and then add "jitter" to the coordinates, converting a single point with demand 255 into 255 nearby points with demand 1. That at least could result in the location being split. But in practice, unless you add an amount of jitter roughly the same as half the distance between locations, then you will just get a cluster placed at the centroid that will "absorb" the 255 individual points. See again what I said about kmeans never even attempting to use clusters the same size.
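For several locations at once, `repelem` can do the per-row replication that `repmat` does for a single point. The sketch below is an illustration only: the second location and the jitter scale are made up, and it assumes the demands are positive integers.

```matlab
rng(0);                          % repeatable jitter
x = [38.8643; 40.0];             % latitude (second row is hypothetical)
y = [ 9.2866; 10.0];             % longitude (second row is hypothetical)
d = [13; 255];                   % integer demand per location

% One row per unit of demand: row i of [x y] is repeated d(i) times.
pts = repelem([x, y], d, 1);

% Hypothetical jitter scale in degrees. As noted above, it would need to be
% comparable to half the spacing between locations to make any difference.
jitter = 0.05;
pts = pts + jitter * randn(size(pts));

numPoints = size(pts, 1);        % 13 + 255 = 268 unit-demand points
```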
Do algorithms exist at all? Probably yes. For example the task is much the same as dividing land up into political districts of equal-ish population. See for example http://autoredistrict.org/
Do you have a hope using kmeans or dbscan? NO.
I would like to take this opportunity to remind you that I already posted a solution that is fully compliant with all rules that you have established: namely to have all of the beams cover the entire area, and then join beams at random. If that is not an acceptable solution, then the implication is that you have other criteria that you have not discussed.
Shwet Kashyap
on 24 May 2022
Walter Roberson
on 24 May 2022
kmeans and dbscan are completely incompetent at making the distribution fair. I believe some of the other algorithms you mentioned are as well.
Shwet Kashyap
on 25 May 2022
0 votes
11 Comments
Walter Roberson
on 25 May 2022
https://www.mathworks.com/matlabcentral/answers/1719025-how-to-plot-2d-location-vs-corresponding-data-in-matlab#comment_2173025
Take that code and remove d from the [x y d] in the kmeans and dbscan calls. You will get clusters that ignore demand.
Walter Roberson
on 25 May 2022
"Can this data be processed using support vector machine method"
Yes, of course it can be. The question you should be asking is whether it can be usefully processed with SVM.
SVM tries to find a hyperplane that best separates two classes. So to use it, each item being processed must be associated with a class label. For example you could label each item with its demand, and then ask to find a parabola that best separates the 13s on one side and the 255s on the other side.
Would that help? No.
You could assign 71 different class labels, one for each beam, and ask SVM to find 70 dividing lines.
Would that help? No. It would not give you any guidance as to which label to assign to which point to start with.
But, hey, you can totally use SVM if you want to. All that will happen is that you will waste your time, other than you will be able to point to the failure of the method in your report. Make-work reports always appreciate if you prove experimentally that a technique is doomed to failure, rather than just proving from theory that it is doomed to failure, since the working hypothesis of such useless reports is that you probably don't understand the theory.
Shwet Kashyap
on 27 May 2022
Walter Roberson
on 27 May 2022
By default kmeans initializes cluster centers randomly.
To get repeatable behaviour, you can tell kmeans to initialize with particular values that you supply, or you can use rng() to set the random number generator to a consistent state before each call to kmeans that you need to repeat.
To set the number of iterations for kmeans, pass 'MaxIter', and the maximum number you want.
kmeans++ is not supplied by Mathworks. If you use https://www.mathworks.com/matlabcentral/fileexchange/28804-k-means then there is no way to set the number of iterations.
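For the built-in kmeans, the two ways of getting repeatable results together with 'MaxIter' might be used as follows. This is a sketch on hypothetical two-blob data and assumes the Statistics and Machine Learning Toolbox; the seed value and starting centers are arbitrary.

```matlab
% Requires Statistics and Machine Learning Toolbox (kmeans).
X = [randn(50, 2); randn(50, 2) + 5];     % hypothetical data: two blobs

% Option 1: reset the random number generator before each call.
rng(42);
idx1 = kmeans(X, 2, 'MaxIter', 1000);
rng(42);
idx2 = kmeans(X, 2, 'MaxIter', 1000);
sameResult = isequal(idx1, idx2);         % identical assignments

% Option 2: supply the starting centers explicitly (one row per cluster).
startCenters = [0 0; 5 5];
idx3 = kmeans(X, 2, 'Start', startCenters, 'MaxIter', 1000);
```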
Shwet Kashyap
on 29 May 2022
Walter Roberson
on 29 May 2022
"MaxIter'
You start with double quote but you end with single quote
Shwet Kashyap
on 30 May 2022
Walter Roberson
on 30 May 2022
Edited: Walter Roberson
on 30 May 2022
[idx_kmeans, centers_kmeans] = kmeans([x y d], target_clusters, 'EmptyAction', 'singleton','MaxIter',1000) ;
Shwet Kashyap
on 31 May 2022
Walter Roberson
on 31 May 2022
I already explained multiple times that the kmeans algorithm makes absolutely no attempt to balance the number of items in a cluster. Balance is completely outside of the kmeans algorithm.
Walter Roberson
on 31 May 2022
The entire point of kmeans is to group points that are close together, and separate the groups (clusters). If you have a demand of 555 at one location and, some distance away, a demand of 13, and you ask for two clusters, then it is going to put one at the 13 and the other at the 555, and will not try to split the load.