How to compute lognormal distribution

Hi everyone,
I have 19 years of data and need to compute the lognormal distribution for each month of each year. I have already computed the monthly mean and the monthly standard deviation for each year. I know the quickest way to do this is with a for loop, but I'm a beginner in MATLAB and don't know how to use a for loop in this situation. I think I need two for loops, one indexed over the years and the other over the months, but I'm not sure. The code below does it manually and is far too long; I need to make it automatic. Thank you in advance!
format long g
folder = 'D:\Valerio\data\IPCC_midcent\RCP4.5\BCC_CSM\BCC_CSM.xlsx';
file = xlsread(folder);
dt = datetime([file(:,1:3) file(:,4)/1E4 repmat([0 0],size(file,1),1)]);
tt = timetable(dt, file(:,5:end));
data = tt.Var1;
% file(:, 3:4) = []; % delete day and hour columns as they are not important for yearly mean
[grps, Years, Months] = findgroups(file(:,1), file(:,2));
result_mean = splitapply(@(x) mean(x, 1), file(:,5:end), grps);
result_mean = [Years Months result_mean];
result_std = splitapply(@(x) std(x, [], 1), file(:,5:end), grps);
result_std = [Years Months result_std];
TR1 = timerange('01-Jan-2026 00:00:00','01-Feb-2026 00:00:00');
tt_TR1 = tt(TR1,:);
Hs1 = tt_TR1.Var1(:,1);
Tp1 = tt_TR1.Var1(:,2);
d_Hs1 = lognpdf(Hs1,result_mean(1,3),result_std(1,3));
d_Tp1 = lognpdf(Tp1,result_mean(1,4),result_std(1,4));
TR2 = timerange('01-Feb-2026 00:00:00','01-Mar-2026 00:00:00');
tt_TR2 = tt(TR2,:);
Hs2 = tt_TR2.Var1(:,1);
Tp2 = tt_TR2.Var1(:,2);
d_Hs2 = lognpdf(Hs2,result_mean(2,3),result_std(2,3));
d_Tp2 = lognpdf(Tp2,result_mean(2,4),result_std(2,4));
TR3 = timerange('01-Mar-2026 00:00:00','01-Apr-2026 00:00:00');
tt_TR3 = tt(TR3,:);
Hs3 = tt_TR3.Var1(:,1);
Tp3 = tt_TR3.Var1(:,2);
d_Hs3 = lognpdf(Hs3,result_mean(3,3),result_std(3,3));
d_Tp3 = lognpdf(Tp3,result_mean(3,4),result_std(3,4));
TR4 = timerange('01-Apr-2026 00:00:00','01-May-2026 00:00:00');
tt_TR4 = tt(TR4,:);
Hs4 = tt_TR4.Var1(:,1);
Tp4 = tt_TR4.Var1(:,2);
d_Hs4 = lognpdf(Hs4,result_mean(4,3),result_std(4,3));
d_Tp4 = lognpdf(Tp4,result_mean(4,4),result_std(4,4));
TR5 = timerange('01-May-2026 00:00:00','01-Jun-2026 00:00:00');
tt_TR5 = tt(TR5,:);
Hs5 = tt_TR5.Var1(:,1);
Tp5 = tt_TR5.Var1(:,2);
d_Hs5 = lognpdf(Hs5,result_mean(5,3),result_std(5,3));
d_Tp5 = lognpdf(Tp5,result_mean(5,4),result_std(5,4));

Accepted Answer

Srivardhan Gadila on 29 May 2020
Edited: Srivardhan Gadila on 29 May 2020
You can try something like below:
format long g
folder = 'BCC_CSM.xlsx';
file = xlsread(folder);
dt = datetime([file(:,1:3) file(:,4)/1E4 repmat([0 0],size(file,1),1)]);
tt = timetable(dt, file(:,5:end));
data = tt.Var1;
% file(:, 3:4) = []; % delete day and hour columns as they are not important for yearly mean
[grps, Years, Months] = findgroups(file(:,1), file(:,2));
result_mean = splitapply(@(x) mean(x, 1), file(:,5:end), grps);
result_mean = [Years Months result_mean];
result_std = splitapply(@(x) std(x, [], 1), file(:,5:end), grps);
result_std = [Years Months result_std];
monthNames = ["Jan", "Feb", "Mar", "Apr", "May", "Jun", "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"];
% Build the start-of-month date string for every (year, month) group
count = 1;
for i = 1:numel(Years)
    allMonths{count} = "01-"+monthNames(Months(i))+"-"+num2str(Years(i))+" 00:00:00";
    count = count+1;
end
% Each range runs from one month start to the next one, so the last
% month in the data has no end bound and is not covered by this loop
for i = 1:count-2
    TR{i} = timerange(allMonths{i},allMonths{i+1});
    tt_TR{i} = tt(TR{i},:);
    Hs{i} = tt_TR{i}.Var1(:,1);
    Tp{i} = tt_TR{i}.Var1(:,2);
    % If TR, tt_TR, Hs and Tp are only temporary, don't use
    % cell arrays for them
    % Row i of result_mean and result_std belongs to the same month as range i
    d_Hs{i} = lognpdf(Hs{i},result_mean(i,3),result_std(i,3));
    d_Tp{i} = lognpdf(Tp{i},result_mean(i,4),result_std(i,4));
end
The code above also covers time ranges that cross the year end, such as '01-Dec-2026 00:00:00' to '01-Jan-2027 00:00:00'. If you don't want such ranges, use the code below instead. Note that it then leaves out December of every year:
Years = ["2026", "2027", "2028"];
Months = ["Jan", "Feb", "Mar", "Apr", "May", "Jun", "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"];
i = 1;
for year = Years
    for monthNum = 1:11
        TR{i} = timerange("01-"+Months(monthNum)+"-"+year+" 00:00:00","01-"+Months(monthNum+1)+"-"+year+" 00:00:00");
        tt_TR{i} = tt(TR{i},:);
        Hs{i} = tt_TR{i}.Var1(:,1);
        Tp{i} = tt_TR{i}.Var1(:,2);
        % If TR, tt_TR, Hs and Tp are only temporary, don't use
        % cell arrays for them
        % Look up the statistics row for this year and month; i skips
        % December, so it cannot index result_mean and result_std directly
        row = find(result_mean(:,1) == str2double(year) & result_mean(:,2) == monthNum, 1);
        d_Hs{i} = lognpdf(Hs{i},result_mean(row,3),result_std(row,3));
        d_Tp{i} = lognpdf(Tp{i},result_mean(row,4),result_std(row,4));
        i = i + 1;
    end
end
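One point worth checking in the code above: the second and third arguments of lognpdf are the mean and standard deviation of log(x), not of the data itself, whereas result_mean and result_std hold the statistics of the raw values. A minimal sketch of the usual moment-matching conversion is below. The numbers m and s are made-up stand-ins for one entry of result_mean and result_std, so treat this as an illustration under that assumption rather than a drop-in fix.

```matlab
% Made-up example values standing in for result_mean(i,3) and result_std(i,3)
m = 2.5;   % sample mean of the raw data for one month
s = 0.8;   % sample standard deviation of the raw data for the same month

% Moment matching: parameters of the normal distribution of log(x)
mu    = log(m^2 / sqrt(s^2 + m^2));
sigma = sqrt(log(s^2 / m^2 + 1));

% lognstat returns the mean and variance implied by mu and sigma,
% which should reproduce m and s^2
[mCheck, vCheck] = lognstat(mu, sigma);

% Evaluate the lognormal density on a grid of positive values
x = linspace(0.01, 8, 200);
pdfValues = lognpdf(x, mu, sigma);
```

Another option, if the raw values for each month are at hand anyway, is to estimate both parameters directly with lognfit instead of converting the mean and standard deviation.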
  3 Comments
Srivardhan Gadila on 29 May 2020
@Valerio Gianforte, I have updated the code to work exactly with your problem. The initial answer was meant to give a general idea of how to approach it.
Valerio Gianforte on 29 May 2020
Thank you so much, now it works. Next I have to plot the results with one plot per year, where each plot contains the lognormal distribution of every month of that year. I tried putting the scatter plot inside the for loop, but obviously I got 217 plots, one for every month in the 19 years. Do you have any ideas? Thank you
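One possible way to get a single figure per year is to open the figure in an outer loop over the years and draw the months in an inner loop with hold on. The sketch below is self-contained and runs on synthetic data: dt and Hs are made-up stand-ins for the real timetable, and the parameters are estimated with lognfit, which is an assumption of this sketch and not the method used in the answer above.

```matlab
% Synthetic stand-in for the real data: two years of daily values (made up)
rng(0);
dt = (datetime(2026,1,1):days(1):datetime(2027,12,31))';
Hs = lognrnd(0.5, 0.4, numel(dt), 1);

% One group per (year, month) pair, as in the answer
[grps, Years, Months] = findgroups(year(dt), month(dt));

uniqueYears = unique(Years);
for y = uniqueYears'                      % outer loop: one figure per year
    figure('Name', sprintf('Year %d', y));
    hold on
    for k = find(Years == y)'             % inner loop: the months of that year
        x = sort(Hs(grps == k));          % sorted so the curve is drawn left to right
        params = lognfit(x);              % [mu sigma] of log(x) for this month
        plot(x, lognpdf(x, params(1), params(2)), ...
            'DisplayName', sprintf('Month %d', Months(k)));
    end
    hold off
    xlabel('Hs');
    ylabel('Lognormal PDF');
    title(sprintf('Monthly lognormal PDFs, %d', y));
    legend('show');
end
```

With the real data, Hs would be replaced by tt.Var1(:,1) and the grouping by the existing findgroups call on file(:,1) and file(:,2), so 19 years give 19 figures with up to 12 curves each.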

Release

R2018b
