Binning 2 columns and summing 3rd

I have three very large datasets: latitude, potential temperature, and ozone. I want to bin the data by latitude and potential temperature and, for each bin, sum the ozone values that fall into it. I tried using accumarray, which gives me:
"Error using accumarray
First input SUBS must contain positive integer subscripts."
Indeed, the ozone data contains floating-point numbers.
Am I missing something, or is there something I have not understood?
My script is attached and also pasted below.
I have also attached MAT-files of the data.
clc;
clear all;
close all;
ozonef = cell2mat(ozone);
poslatf = cell2mat(poslat);
Tpotf = cell2mat(Tpot);
% % %bining the data-
poslatbin = -90:2:90;
Tpotbin = 250:5:380; %%%% I have values in between this range
nposlat = length(poslatbin);
nTpot = length(Tpotbin);
PL = discretize(poslatf, poslatbin);
TP = discretize(Tpotf, Tpotbin);
ozonef(isnan(ozonef))=min(ozonef);
idx = sub2ind([nposlat nTpot], PL, TP);
FO = accumarray(idx, ozonef, [nposlat*nTpot 1], @(x) mean(x));
contourf(PL, TP, FO);

 Accepted Answer

I would put your data into a table, then use groupsummary to bin and sum the data as you describe. You will find the Access Data in Tables page helpful.
load latitude
load temperature
load ozone
% create table
data = table(latitude,temperature,ozone)
data = 47101×3 table
    latitude    temperature    ozone
    ________    ___________    _____
     24.876       290.97       1.41
     24.866       290.92       1.41
     24.856       291.11       1.41
     24.845       291.27       1.41
     24.833       291.28       1.41
     24.825       291.13       1.41
     24.815       291.24       1.41
     24.805       291.43       1.41
     24.795        291.5       1.41
     24.785       291.56       1.41
     24.775       291.52       1.41
     24.765       291.43       1.41
     24.755       291.54       1.41
     24.746       291.75       1.41
     24.736        291.8       1.41
     24.726       292.03       1.41
% define bins
poslatbin = -90:2:90;
Tpotbin = 250:5:380;
% create table of binned data
binTbl = groupsummary(data,["latitude","temperature"],{poslatbin,Tpotbin},'sum',"ozone")
binTbl = 370×4 table
    disc_latitude    disc_temperature    GroupCount    sum_ozone
    _____________    ________________    __________    __________
    [-34, -32)       [330, 335)              36        15973
    [-34, -32)       [335, 340)              56        21635
    [-34, -32)       [340, 345)             743        52250
    [-34, -32)       [345, 350)             455        45569
    [-34, -32)       <undefined>            497        30712
    [-32, -30)       [335, 340)              44        16214
    [-32, -30)       [340, 345)             517        1.3762e+05
    [-32, -30)       [345, 350)             648        78325
    [-32, -30)       <undefined>            351        19507
    [-30, -28)       [335, 340)              39        11469
    [-30, -28)       [340, 345)             953        1.1891e+05
    [-30, -28)       [345, 350)             475        50072
    [-30, -28)       <undefined>            195        18177
    [-28, -26)       [335, 340)             332        16639
    [-28, -26)       [340, 345)            1003        62685
    [-28, -26)       [345, 350)             540        51063
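On the original error: the message refers to the first input of accumarray (the subscripts), not to the ozone values, so floating-point ozone data is not the problem. The most likely cause is that discretize returns NaN for any value that is NaN or lies outside the bin edges, and NaN is not a valid subscript. If you would rather stay with accumarray, here is a minimal sketch under those assumptions. The numbers are made up for illustration and are not taken from the attached files, and the names lat, theta, o3 and binSum are placeholders.

```matlab
% Made-up stand-ins for the real vectors (assumption: column vectors of equal length)
lat   = [-33; -33; -31; 24.8; 24.9; NaN];
theta = [341; 342; 346; 291; 400; 300];   % 400 lies outside the bin edges
o3    = [1; 2; 3; 4; 5; 6];

latEdges   = -90:2:90;
thetaEdges = 250:5:380;
nLat   = numel(latEdges) - 1;     % the number of bins is one less than the number of edges
nTheta = numel(thetaEdges) - 1;

% Bin indices; NaN wherever the value is NaN or outside the edges
iLat   = discretize(lat, latEdges);
iTheta = discretize(theta, thetaEdges);

% Keep only rows with valid subscripts and valid ozone values
ok = ~isnan(iLat) & ~isnan(iTheta) & ~isnan(o3);

% Two-column subscripts give a 2-D result directly, so sub2ind is not needed
binSum = accumarray([iLat(ok) iTheta(ok)], o3(ok), [nLat nTheta]);
```

Here binSum(i,j) holds the ozone sum for latitude bin i and temperature bin j. Bins that receive no data are 0, because accumarray fills them with zeros by default.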

6 Comments

I looked at your m-file. It looks like the desired end result is a filled contour plot. To achieve that, you have to adjust some of the settings in groupsummary. By default, it does not return results for empty and missing bins (for example, you have no data in the [-90,-88) bin). To create a contour plot, you need data for every bin. Here's what that code might look like.
% Start the same as before
load latitude
load temperature
load ozone
data = table(latitude,temperature,ozone);
poslatbin = -90:2:90;
Tpotbin = 250:5:380;
% The two new settings will include data for every bin
binTbl = groupsummary(data,["latitude","temperature"],{poslatbin,Tpotbin},'sum',"ozone",...
"IncludeEmptyGroups",true,"IncludeMissingGroups",true)
binTbl = 2457×4 table
    disc_latitude    disc_temperature    GroupCount    sum_ozone
    _____________    ________________    __________    _________
    [-90, -88)       [250, 255)              0             0
    [-90, -88)       [255, 260)              0             0
    [-90, -88)       [260, 265)              0             0
    [-90, -88)       [265, 270)              0             0
    [-90, -88)       [270, 275)              0             0
    [-90, -88)       [275, 280)              0             0
    [-90, -88)       [280, 285)              0             0
    [-90, -88)       [285, 290)              0             0
    [-90, -88)       [290, 295)              0             0
    [-90, -88)       [295, 300)              0             0
    [-90, -88)       [300, 305)              0             0
    [-90, -88)       [305, 310)              0             0
    [-90, -88)       [310, 315)              0             0
    [-90, -88)       [315, 320)              0             0
    [-90, -88)       [320, 325)              0             0
    [-90, -88)       [325, 330)              0             0
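% need to rearrange the data into an n-by-m matrix
% rows correspond to y (poslatbin), and columns to x (Tpotbin)
% reshape fills column by column, so sort first so that latitude varies fastest
binTbl = sortrows(binTbl,["disc_temperature","disc_latitude"]);
cdata = reshape(binTbl.sum_ozone,length(poslatbin),length(Tpotbin));
cdata(cdata==0)=min(ozone); % Replace 0s with min ozone value (might not be necessary)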
% Create your contour plot
contourf(Tpotbin,poslatbin,cdata)
colorbar
Thank you so much. I appreciate your help.
It worked, but I want to plot this data. I do not know how to plot it; it seems complex to me. I want a contour plot, as in the script I posted above.
Thanks a lot for your help. It is working.
Vaidehi Joshi on 18 Jan 2022
Edited: Vaidehi Joshi on 18 Jan 2022
I have one question. When I swap the X and Y axes (currently temperature and latitude, respectively), the matrix changes too, but I suspect the values are then wrong. How can I check whether the values we have computed here are correct?
I don't understand the question. My suggestion: make the changes, be sure to update the corresponding variable names in the code I shared, and see if it works.
Vaidehi Joshi on 18 Jan 2022
Edited: Vaidehi Joshi on 18 Jan 2022
I did swap the axes. I changed the reshape line to use Tpotbin in place of poslatbin and poslatbin in place of Tpotbin:
cdata = reshape(binTbl.sum_ozone,length(Tpotbin),length(poslatbin));
and kept
contourf(Tpotbin,poslatbin,cdata)
the same as above.
The cdata matrix changes as a result, right?
What I noticed is that with reshape the positions of the values change as well, and the plot changes accordingly.
I want to cross-check whether these values are correct.
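One way to cross-check the orientation is to run the same steps on a tiny example where the bin sums can be worked out by hand, and to compare one cell against a direct sum over a logical mask. The sketch below assumes that groupsummary lists the second grouping variable (temperature) fastest within each latitude group, as the tables printed in this thread suggest, and that reshape fills its output column by column. The data values and the names latEdges, tempEdges, binGrid, naive and direct are made up for illustration.

```matlab
% Tiny made-up data set: 2 latitude bins x 3 temperature bins
latitude    = [-89; -89; -87; -87; -87];
temperature = [251; 256; 251; 252; 258];
ozone       = [1; 2; 3; 4; 5];
data = table(latitude, temperature, ozone);

latEdges  = -90:2:-86;    % bins [-90,-88) and [-88,-86]
tempEdges = 250:5:265;    % bins [250,255), [255,260), [260,265]
nLat  = numel(latEdges) - 1;
nTemp = numel(tempEdges) - 1;

binTbl = groupsummary(data, ["latitude","temperature"], {latEdges, tempEdges}, ...
    "sum", "ozone", "IncludeEmptyGroups", true);

% Drop any <undefined> groups so exactly nLat*nTemp rows remain
keep = ~ismissing(binTbl.disc_latitude) & ~ismissing(binTbl.disc_temperature);
binTbl = binTbl(keep, :);

v = binTbl.sum_ozone;     % order: lat bin 1 (temp 1..3), then lat bin 2 (temp 1..3)

% Temperature varies fastest, so fill an nTemp-by-nLat matrix and transpose:
% rows = latitude bins, columns = temperature bins
binGrid = reshape(v, nTemp, nLat).';

% Reshaping the unsorted sums straight to nLat-by-nTemp scrambles the cells
naive = reshape(v, nLat, nTemp);

% Independent check of one cell: latitude bin 2, temperature bin 1
inBin  = latitude >= -88 & latitude <= -86 & temperature >= 250 & temperature < 255;
direct = sum(ozone(inBin));
```

If binGrid(2,1) equals direct, the rows really are latitude bins and the columns temperature bins. The same mask check can be repeated on the real data for a few bins that are known to contain points.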
