MATLAB Answers

Run this code for every certain columns of tables in a cell and saving the result in new cell

1 view (last 30 days)
Hi all,
I have a cell named C. In the C I have 71 tables. Each table has a column named rrr24. I want to run this code below for each rrr24 in the tables and save the results in another cell named C_new.
Here this code which in the first line just uses the first table in the C.
I want the name of each table at the new cell (C_new) to be the name of each table highlighted in the station_name column of each table. Or copy station_name column from each table and paste it as a first column of the same table in the C_new.
I attached C.
Thank you
td = C{1, 1}.rrr24; % example for C{1, 1}
n=length(td);
sc_set= [3,6,12,24];
SI=zeros(n,length(sc_set));
for q = 1:length(sc_set)
sc = sc_set(k);
if length(td(td>=0))/length(td)~=1
SI(n,q)=nan;
else
SI(1:sc-1,q)=nan;
A1=[];
for i=1:sc,
A1=[A1,td(i:length(td)-sc+i)];
end
Y=sum(A1,2);
nn=length(Y);
SI1=zeros(nn,1);
for k=1:12
d=Y(k:12:nn);
nnn=length(d);
bp=zeros(nnn,1);
for i=1:nnn
bp(i,1)=sum(d(:,1)<=d(i,1));
end
y=(bp-0.44)./(nnn+0.12);
SI1(k:12:nn,1)=y;
end
SI1(:,1)=norminv(SI1(:,1));
%output
SI(sc:end,q)=SI1;
end
end

  2 Comments

Guillaume
Guillaume on 5 Feb 2020
I'm sure I've answered a question of yours before, where I advised that having several tables is not a good idea. Most likely, it would be simpler to do whatever you want if you all your data in just one table.
Indeed, here it is: Counting all NaNs..., where the 1st thing I did is merge back all these tables to make the processing much easier.
And I see that Stephen gave you the same advice more recently.
With regards to the code you've posted above, I'm sorry it's unreadable. Good code should have comments and variable names than describe the content of the variable.
Behzad Navidi
Behzad Navidi on 5 Feb 2020
Dear Guillaume,
You are truly right, I'm sorry I should mention about:
C = vertcat(C{:});
When I have one integrated table it's mean I have one column of stations_name and corresponding rrr24, So then I should use the same rrr24 of stations_name in the code. I'm sorry for my ignorant. Now I have a merged table named C, I want to recognize the rrr24 column of each station_name and use it as an input to this code below, and after that save separated results in a cell.
This code gains SPI in 3, 6, 12, and 24 months intervals which is a Drought Index,
About the output, I want the output to get me for every 71 stations: station_name column and calculated SPI for 3, 6,12, 24, and 48 columns.
If I describe it more clear:
I want to use each rrr24 of station_name as an input to this code against td. and save results for each station_name.
I use this code from Matlab file exchange here is the commented code, sorry this process is complicated for me and I don't know how to handle them. maybe I should use groupsummaty again but haven't any idea about how to do it.
% Source code: https://www.mathworks.com/matlabcentral/fileexchange/51080-nonparametric-standardized-precipitation-index-spi
% this code calculates SPI 1, 3, 6, 12, 24, 48 months for a given station.
%if I had splited C
td = Abadan.rrr24; % get rrr24 of first station_name I want do it for all station_name's rrr24
Date = Abadan.date % date is constant for all station_name
sc_vect=[1 3 6 12 24 48]; % a set including all values for iterations
% sc: scale of the index (>1, e.g., 3-month SPI or SSI)
n=length(td);
SI=zeros(n,numel(sc_vect)); % columns correspond to values in sc_vect
% iterate over sc_vect
for p=1:numel(sc_vect)
sc = sc_vect(p);
%For some grid, no observation exist.
if length(td(td>=0))/length(td)~=1
SI(n,p)=nan;
else
% Obtain the prcp and smc for the specified time scale and
% compute the standarized drought index (for SPI and SSI)
SI(1:sc-1,p)=nan;
A1=[];
for i=1:sc,
A1=[A1,td(i:length(td)-sc+i)];
end
Y=sum(A1,2);
% Compute the SPI or SSI
nn=length(Y);
SI1=zeros(nn,1);
for k=1:12
d=Y(k:12:nn);
%compute the empirical probability
nnn=length(d);
bp=zeros(nnn,1);
for i=1:nnn
bp(i,1)=sum(d(:,1)<=d(i,1));
end
y=(bp-0.44)./(nnn+0.12);
SI1(k:12:nn,1)=y;
end
SI1(:,1)=norminv(SI1(:,1));
%output
SI(sc:end,p)=SI1;
end
end
SI_table = array2table(SI);
SI_table.Properties.VariableNames = {'SPI_1month' 'SPI_3month' 'SPI_6month'...
'SPI_12month' 'SPI_24month' 'SPI_48month'};
Thank you so much

Sign in to comment.

Accepted Answer

fred  ssemwogerere
fred ssemwogerere on 5 Feb 2020
Hello, I have taken a look at your workspace variable ("C"), and i think you should consider writing that program as a function file, so that you can use it for all tables in the cell array. You can use the code below and save it as your function file in the working (current) folder.
function SI = myfunction(Cn,rnum,colnum,tcol) % save this as a functionfile with filename: "myfunction.m"
%Summary of "myfunction"
% SI = output
% Cn = Cell array storing all tables
% rnum = row index of table in cell array
% colnum = column index of table in cell array
% tcol = column number of table variable ("rrr24") on which you are doing the computations
td = table2array(Cn{rnum, colnum}(:,tcol)); % example for C{1, 1}
n=numel(td);
sc_set= [3,6,12,24];
SI=zeros(n,length(sc_set));
for q = 1:length(sc_set)
sc = sc_set(q); % I have replaced "k" with "q" on this line
if numel(td(td>=0))/numel(td)~=1
SI(n,q)=nan;
else
SI(1:sc-1,q)=nan;
A1=[];
for i=1:sc
A1=[A1,td(i:numel(td)-sc+i)]; %#ok<AGROW>
end
Y=sum(A1,2);
nn=numel(Y);
SI1=zeros(nn,1);
for k=1:12
d=Y(k:12:nn);
nnn=numel(d);
bp=zeros(nnn,1);
for i=1:nnn
bp(i,1)=sum(d(:,1)<=d(i,1));
end
y=(bp-0.44)./(nnn+0.12);
SI1(k:12:nn,1)=y;
end
SI1(:,1)=norminv(SI1(:,1));
%output
SI(sc:end,q)=SI1;
end
end
end
After that, consider using the code below. It should do nicely:
Cnew=cell(1,size(C,2)); % pre-allocating your new array
for r=1:size(C,1)
for m=1:size(C,2)
SI{r,m}= myfunction(C,r,m,11);% On this line, "11" is the column index of "rrr24" (your table variable)
Cnew{r,m}=cell2table(C{r,m}.station_name); % input names under the "station_name" variable of your old tables into the new cell array based on their indexed locations
Cnew{r,m}.Output=SI{r,m}; % Add column variable, "Output" holding computed values to each table in the new cell array based on their indexed positions
Cnew{r,m}.Properties.VariableNames{1}='Station Name'; % variable name for first column of each table
end
end

  3 Comments

Behzad Navidi
Behzad Navidi on 5 Feb 2020
It's worked thank you. just one answer do you know how to change the output part into this: SPI_1month, SPI_3month, SPI_6month, SPI_12month, SPI_24month, and SPI_48month?
Capture.JPG
Thank you
fred  ssemwogerere
fred ssemwogerere on 5 Feb 2020
Hello, something like this for example, should do nicely;
varNames={'Station_Name','SPI_3month','SPI_6month','SPI_12month','SPI_24month'};
for k=1:size(Cnew,1)
for y=1:size(Cnew,2)
CEdit{k,y} =[(Cnew{k,y}.Station_Name),array2table(Cnew{k,y}.Output)]; %#ok<*SAGROW>
CEdit{k,y}.Properties.VariableNames=varNames;
end
end
Behzad Navidi
Behzad Navidi on 10 Feb 2020
Dear fred ssemwogerere,
Hello and Respect
Thank you for your answer. This code worked perfectly and accurately. Today I want to run this code using new input (same as C but with little change in values and number of rows and columns, named Cgrid). I tried to adapt this code (just code, not function) and use it for my new input. Everything is good, the code even runs successfully, But I saw that in the output all tables have similar results, all is 2.02! At the first, I think this is my data issue so I calculate for one table manually and saw there is a difference between manually results and code output. I check everything I cant recognize what is the problem everything seems similar.
Again I want to tell that this code works very well with C. My problem is when I want to use Cgrid as input against C.
Thank you so much.

Sign in to comment.

More Answers (1)

Guillaume
Guillaume on 5 Feb 2020
Whatever the calculation is, yes it can be done with groupsummary. As Fred wrote, you will first have to write a function to perform that calculation. However, this function should take as input: the precipitation for a station and that index scale, something like:
function spi = NonParametricStandardPrecipitationIndex(precip, indexscale)
%???
end
Then it's a straighforward groupsummary
results = groupsummary(bigtable, 'GroupingVariables', 'station_name', ...
{@(p) NonParametricStandardPrecipitationIndex(p, 1), ...
@(p) NonParametricStandardPrecipitationIndex(p, 3), ...
..etc}, 'rrr24');
Or you can make it neater by building the list of processing function for your desired scale indices beforehand:
npspi = arrayfun(@(sc) @(p) NonParametricStandardPrecipitationIndex(p, sc), [1 3 6 12 24 48], 'UniformOutput', false);
results = groupsummary(bigtable, 'GroupingVariables', 'station_name', npspi, 'rrr24');
Now with regards to the code for that function, first are you sure that the code in that fileexchange entry does what it's supposed to? Having looked at it a bit, it's very badly written! For example I was puzzling over
if length(td(td>=0))/length(td)~=1
which is simply
if any(td < 0)
why such a convoluted expression? Why write if A/B ~= 1 when if A ~= B is a lot clearer. Why calculate length(A(A>=0)) when sum(A>=0) does the same faster?
You may find that starting from scratch may be more efficient than adapting poorly written code.

Sign in to answer this question.

Products


Release

R2018b