Speed up nested loops with parfor

Question

0 votes

I'm trying to speed up this part of code. I have a constraint: the inputs of ReturnFn must all be scalars. If it was not for this restriction, I could easily vectorize the code. So I would like to know if there is a way to make the code below faster, while still satisfying this restriction of the inputs of ReturnFn.

Any help is really appreciated!

N_d = 50;
N_a = 300;
N_z = 10;
% ParamCell contains: K_to_L,alpha,delta,pen,gamma,crra
% I need the cell array to handle variable number of inputs in ReturnFn
Fmatrix=zeros(N_d*N_a,N_a,N_z);
parfor i4i5=1:N_z
    Fmatrix_z=zeros(N_d*N_a,N_a);
    for i3=1:N_a % a today
        for i2=1:N_a % a' tomorrow
            for i1=1:N_d % d choice
                Fmatrix_z(i1+(i2-1)*N_d,i3)=ReturnFn(d_gridvals(i1),a_gridvals(i2),a_gridvals(i3),z_gridvals(i4i5,1),z_gridvals(i4i5,2),ParamCell{:});
            end
        end
    end
    Fmatrix(:,:,i4i5)=Fmatrix_z;
end
function F = f_ReturnFn(d,aprime,a,e,age,K_to_L,alpha,delta,pen,gamma,crra)
% INPUTS (always 5 inputs, plus some extra parameter inputs)
%   d:       Hours worked
%   aprime:  Next-period's assets
%   a:       Current period assets
%   e:       Labor efficiency shock
%   age:     Age of individual: young or old
% TOOLKIT NOTATION
% (d,aprime,a,z), where z = [e;age]
F = -inf;
r = alpha*K_to_L^(alpha-1)-delta;
w = (1-alpha)*K_to_L^alpha;
income = (w*e*d)*(age==1)+pen*(age==2)+r*a;
c = income+a-aprime; % Budget Constraint
if c>0
    % NOTE: 0<d<1 is already built into the grid
    % WARNING: this will not work if crra=1
    inside = (c^gamma)*((1-d)^(1-gamma));
    F = inside^(1-crra)/(1-crra);
end
end

7 Comments
Show 5 older comments Hide 5 older comments

Alessandro Maria Marco on 9 Aug 2024

Edited: Alessandro Maria Marco on 9 Aug 2024

Open in MATLAB Online

This code is part of toolkit that runs in two different ways depending on whether the user's computer has a NVIDIA gpu or not. With gpu what we do is the following and is fast. But what if the user's computer does not have a supported gpu? On the cpu the best would be to vectorize f_ReturnFn, but this would require to keep two different versions of f_ReturnFn, one with scalar inputs and another one vectorized.

Please let me know if my explanation is not clear enough

ReturnFnParamsVec = [5.0,0.3,0.1,0.1,2.0,2.0];
ParamCell=cell(length(ReturnFnParamsVec),1);
for ii=1:length(ReturnFnParamsVec)
    ParamCell(ii,1)={ReturnFnParamsVec(ii)};
end
N_a         = 100;
N_z         = 50;
a_grid      = linspace(0,50,N_a)';
z_grid      = linspace(0.5,1.5,N_z)';
agrid_vals1 = a_grid; % column vector (N_a,1,1)
agrid_vals2 = shiftdim(a_grid,-1); % row vector (1,N_a,1)
z_grid_vals3 = shiftdim(z_grid,-2); % (1,1,N_z)
% IMPORTANT: arrayfun on the GPU uses automatic expansion
% arrayfun on the cpu does not and is too slow in practice
% Here it gives an error because it is trying to use the GPU but mastlab
% online does not have access to parallel computing toolbox
Fmatrix = arrayfun(@f_ReturnFn,agrid_vals1,agrid_vals2,z_grid_vals3,ParamCell{:});
Error using arrayfun
All of the input arguments must be of the same size and shape.
Previous inputs had size 100 in dimension 1. Input #3 has size 1
function F = f_ReturnFn(aprime,a,z,K_to_L,alpha,delta,pen,gamma,crra)
% INPUTS (always 5 inputs, plus some extra parameter inputs)
%   d:       Hours worked
%   aprime:  Next-period's assets
%   a:       Current period assets
%   e:       Labor efficiency shock
%   age:     Age of individual: young or old
% TOOLKIT NOTATION
% (d,aprime,a,z), where z = [e;age]
F = -inf;
r = alpha*K_to_L^(alpha-1)-delta;
w = (1-alpha)*K_to_L^alpha;
income = w*z+r*a;
c = income+a-aprime; % Budget Constraint
if c>0
    % WARNING: this will not work if crra=1
    F = c^(1-crra)/(1-crra);
end
end %end function "f_ReturnFn"

Alessandro Maria Marco on 12 Aug 2024

That's a good suggestion. Do you have something specific in mind? Like putting an if branch and using isscalar on the input?

Umar on 12 Aug 2024

Hi @Alessandro Maria Marco,

Below is a refactored version of your function that incorporates these suggestions:

       function F = f_ReturnFn(aprime, a, z, K_to_L, alpha, delta, pen,        gamma, crra)

       % Check if inputs are scalar or vector

       isScalar = @(x) isscalar(x);

       % Initialize F with -inf for all entries

       F = -inf(size(aprime));  % Make sure F has the same size as aprime

       % Calculate common parameters

       r = alpha * K_to_L^(alpha - 1) - delta;

       w = (1 - alpha) * K_to_L^alpha;

       income = w .* z + r .* a;  % Element-wise multiplication

       % Budget constraint calculation

       % Ensure this handles array operations correctly

       c = income + a - aprime;

       % Compute utility only where c > 0

       validIndices = c > 0;  % Logical array indicating valid consumption

       if any(validIndices)

           % Use element-wise operations for valid indices

           F(validIndices) = c(validIndices).^(1 - crra) / (1 - crra);

           if crra == 1

               % Handle CRRA=1 case separately

               F(validIndices) = log(c(validIndices));

end

end

end

Explanation of Changes: The modify code now incorporates the use of `.*` so that multiplications are applied element-wise across arrays.By creating a logical index array (`validIndices`), conditions can be applied directly to the output array `F`, allowing for efficient computation without separate handling for scalars and vectors and special cases like `crra = 1` are handled separately within the same function scope. Also, if you are interested, MATLAB provides functions such as `gpuArray` that can convert regular arrays into GPU-compatible arrays easily. Utilizing these functions in your code will optimize performance further. Fore more information on this function, please refer to

Speed up nested loops with parfor

7 Comments
Show 5 older comments Hide 5 older comments

Answers (0)

Categories

Products

Release

Tags

Community Treasure Hunt

Speed up nested loops with parfor

7 Comments Show 5 older comments Hide 5 older comments

Answers (0)

Categories

Products

Release

Tags

See Also

Community Treasure Hunt

7 Comments
Show 5 older comments Hide 5 older comments