How to create a structure?

I have a data set (DATA) that contains 100 stations with daily prices and demand.
With the code below I want to end up with 100 stations, each holding the data that belongs to it.
Can you help me fix the code?
N=length(DATA(:,1))
for i=1:N
x=DATA(i,1)
station=['S', num2str(x)]
station=[station;DATA(i,:)]
end

1 Comment

Using numbered variables is a sign that you are doing something wrong.
Indexing is simple and efficient (unlike what you are trying to do).


 Accepted Answer

Walter Roberson on 5 Nov 2019
You can use num2cell to split the data into cell array by column.
You can construct an array of field names in a couple of ways, including using sprintfc or string objects with the + (concatenation) operation.
Once you have both of the above you can use cell2struct to create your structure.
However, we wonder why you are not just leaving the data in the numeric matrix, as that is the fastest and smallest storage.
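A minimal sketch of the approach described above, assuming the station number is in column 1 and that one field per station (named S followed by the station number) is the goal. The toy DATA values are made up for illustration. Because each station occupies several rows in this kind of data set, the sketch collects the rows with accumarray instead of num2cell; that substitution is an assumption, not part of the original answer. The string concatenation with + needs R2016b or later.

```matlab
% Toy data (assumed layout): [station, month, day, price, quantity]
DATA = [101 1 1 10 5;
        205 1 1  8 2;
        101 1 2 20 5];

% Group row numbers by station, then pull each station's rows into a cell.
% sort(r) keeps the rows in their original order within each station.
[ids, ~, idx] = unique(DATA(:,1));
rowsPerStation = accumarray(idx, (1:size(DATA,1)).', [], ...
                            @(r) {DATA(sort(r),:)});

% Build one field name per station: 'S101', 'S205', ...
names = cellstr("S" + ids);

% Scalar structure with one field per station; each field holds that
% station's rows of DATA
S = cell2struct(rowsPerStation, names, 1);
```

With this layout, S.S101 returns the two rows recorded for station 101. Looking up a station still requires building a field name at run time, which is one reason the numeric matrix is usually easier to work with.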

12 Comments

gjashta on 5 Nov 2019
Edited: gjashta on 5 Nov 2019
Thanks, Walter Roberson! Yes, I can leave the data in the matrix that I uploaded below, but it takes time to write code for each station. Hence, I want to split the stations and then calculate the mean price and demand for each month in each station.
I uploaded the data in case you can help me with some simple code.
Why write code for each station when you could just loop?
Calculating using a struct array is not going to be faster than using a numeric array.
In the code above I created a for loop, but it is not working for this data set.
In your loop replace
station=[station;DATA(i,:)]
With
MyStructureThatIsJustGoingToMakeEverythingHarder.(station) =DATA(i,:);
Stephen23 on 5 Nov 2019
Edited: Stephen23 on 5 Nov 2019
"How can I create a structure like in this link:"
What you are creating is nothing like the structure in that link. Instead of creating a non-scalar structure you are using slow, complex, inefficient code to generate multiple structures.
It is very unlikely that creating variable names dynamically is a good solution to whatever you are trying to achieve. Much better would be if you explained what your actual goal is.
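N = size(DATA,1);   % number of rows (one record per station and day)
for i = 1:N
x = DATA(i,:);      % take the whole row, not just the first column
stations(i).station_number = x(1);
stations(i).month = x(2);
stations(i).day = x(3);
stations(i).price = x(4);
stations(i).quantity = x(5);
end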
But possibly you are trying to aggregate by station number? If so then what information do you want to group on? Your earlier discussion implies you want to group by month.
Have you considered just using accumarray() or grpstats()?
[unique_stations, ~, station_idx] = unique(DATA(:,1));
array_idx = [station_idx, DATA(:,2)];
mean_price_by_month = accumarray(array_idx, DATA(:,4), [], @mean);
total_demand_by_month = accumarray(array_idx, DATA(:,5));
total_sales_by_month = accumarray(array_idx, DATA(:,4).*DATA(:,5));
No structure needed. The output in each case would be a 2D array, one row for each unique station ID, and one column for each month. The contents would be the mean price, the total units, and the total dollar sales, depending on the variable.
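A small worked example may make the output shape easier to see. The toy DATA below is made up for illustration and assumes the same column layout as the code above (station, month, day, price, quantity). It also shows one thing to watch for: accumarray fills station/month combinations that have no data with 0 by default, and its fifth argument sets a different fill value.

```matlab
% Toy data (assumed layout): [station, month, day, price, quantity]
DATA = [101 1 1 10 5;
        101 1 2 20 5;
        101 2 1 30 1;
        205 1 1  8 2];

[unique_stations, ~, station_idx] = unique(DATA(:,1));
array_idx = [station_idx, DATA(:,2)];

% Rows follow unique_stations, columns follow the month number.
% Station 205 has no month-2 data, so that entry is filled with 0.
mean_price_by_month = accumarray(array_idx, DATA(:,4), [], @mean);

% Same calculation, with NaN marking station/month pairs without data
mean_price_nan = accumarray(array_idx, DATA(:,4), [], @mean, NaN);
```

Here mean_price_by_month is [15 30; 8 0], while mean_price_nan has NaN in place of the 0, which keeps a missing month from being mistaken for a genuine mean of zero.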
% One group per unique (station, month) pair
[~, ~, id_idx] = unique(DATA(:,1:2), 'rows');
num_id = length(id_idx);
stations(id_idx) = struct('station_number', {[]}, 'month', {[]}, 'price', {[]}, 'quantity', {[]});
for i = 1 : num_id
x = DATA(i,:);
g = id_idx(i);   % group that row i belongs to
stations(g).station_number = x(1);
stations(g).month = x(2);
stations(g).price(end+1) = x(4);
stations(g).quantity(end+1) = x(5);
end
There, a struct all aggregated by month and station. And now... ?
Thank you, Walter! One last question: using your code, how can I find the number of days in a specific month for a specific station? I know that some daily data are missing in some months for several stations.
For the structure version (not recommended):
mask = [stations.station_number] == specific_station & [stations.month] == specific_month;
length(stations(mask).price)
For the accumarray version:
day_counts = accumarray(array_idx, 1);
mask = ismember(unique_stations, specific_stations);
day_counts(mask,specific_month)
gjashta on 5 Nov 2019
Edited: gjashta on 5 Nov 2019
It’s a bit hard for me to understand your code but thank you very much for the time you spent Walter and your help.
I would suggest that you spend some time studying the documentation for accumarray(). It can take a bit of getting accustomed to, but it can be very useful.
Alternately you might want to read about findgroups(), and grpstats(), and splitapply(), which can do similar tasks in ways that are sometimes less cryptic.
(If you do use splitapply() then be sure to use @(x) mean(x,1) instead of just @mean, to account for the possibility that there might only be a single row of data for a combination.)
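As a rough sketch of the findgroups() and splitapply() route, assuming the same column layout as the earlier code (station, month, day, price, quantity): the toy DATA and the variable names mean_price, day_count and summary are made up for illustration.

```matlab
% Toy data (assumed layout): [station, month, day, price, quantity]
DATA = [101 1 1 10 5;
        101 1 2 20 5;
        101 2 1 30 1;
        205 1 1  8 2];

% One group number per (station, month) pair that actually occurs
[G, station, month] = findgroups(DATA(:,1), DATA(:,2));

% mean(x,1) averages down the rows even when a group has a single row
mean_price = splitapply(@(x) mean(x,1), DATA(:,4), G);

% Number of daily records present in each group
day_count = splitapply(@numel, DATA(:,3), G);

% Collect the results in a table, one row per (station, month) pair
summary = table(station, month, mean_price, day_count);
```

Unlike the accumarray version, this produces one row per combination that exists in the data, so months without any records simply do not appear rather than showing up as a fill value.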
I have not done much work with linear regression.
