Not able to calculate gradient of loss function in a neural network program
Dr. Veerababu Dharanalakota
on 8 Apr 2023
Commented: Jast
on 4 Jan 2024
Hi,
I am trying to solve a physics-informed neural network problem in which I constructed the loss function as follows:
function [loss,gradients] = loss_fun(parameters,x,C,alpha)
% C is a complex-valued constant
% alpha is a real-valued constant
NN = model(parameters,x); % Feedforward neural network
f = C*NN; % Intermediate function
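% Assumed reconstruction: the question states that fxx (the second
% derivative of f with respect to x) is computed via dlgradient; since
% C is a constant, fxx = C*NNxx.
NNx = dlgradient(sum(NN,"all"),x,EnableHigherDerivatives=true); % dNN/dx
NNxx = dlgradient(sum(NNx,"all"),x,EnableHigherDerivatives=true); % d2NN/dx2
fxx = C*NNxx; % Second derivative of f = C*NN w.r.t. x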
g = fxx+alpha*f; % Objective function
gr = real(g); % Real-part of g
gi = imag(g); % Imaginary-part of g
zeroTarget_r = zeros(size(gr),"like",gr); % Zero targets for the real-part
loss_r = l2loss(gr, zeroTarget_r); % Real-part loss function
zeroTarget_i = zeros(size(gi),"like",gi); % Zero targets for the imaginary-part
loss_i = l2loss(gi, zeroTarget_i); % Imaginary-part loss function
loss = loss_r+loss_i; % Total loss function (real-valued)
gradients = dlgradient(loss,parameters); % Loss function gradients with respect to parameters
end
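The loss function is evaluated through dlfeval so that dlgradient can trace the computation, along the lines of:
[loss,gradients] = dlfeval(@loss_fun,parameters,x,C,alpha);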
The function 'model' returns a feedforward neural network. I would like to minimize the function g with respect to the parameters θ. The input variable x as well as the parameters θ of the neural network are real-valued. Here fxx, the second derivative of f with respect to x, is calculated using dlgradient (automatic differentiation). The presence of the complex-valued constant C makes the objective function g complex-valued. Hence, I split it into real and imaginary parts, calculated the individual loss functions, and added them.
While calculating the gradients I am encountering the following error
"Encountered complex value when computing gradient with respect to an input to fullyconnect. Convert all inputs to fullyconnect to real".
I checked the individual loss values and the parameter values; they are purely real.
I would be grateful if you could tell me the possible reasons for this error and how to resolve it.
I am using fmincon with the 'lbfgs' Hessian approximation for the optimization.
2 Comments
Richard
on 17 May 2023
I posted an answer regarding the complex value issue, but as an aside, you might be interested in the lbfgsupdate function which was recently added to Deep Learning Toolbox in R2023a.
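A minimal sketch of that workflow (assuming the loss_fun from the question, existing parameters, x, C, and alpha, and a hypothetical iteration budget maxIterations):
lossFcn = @(p) dlfeval(@loss_fun,p,x,C,alpha); % Wrap the loss for the solver
solverState = lbfgsState; % Initial L-BFGS solver state
for iter = 1:maxIterations
    [parameters,solverState] = lbfgsupdate(parameters,lossFcn,solverState); % One L-BFGS step
end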
Accepted Answer
Richard
on 17 May 2023
I think this may be due to your introduction of the complex value into the output of the model, NN. Even though you later split this into two real halves, the backward gradient computation will step back through the (complex) C * (real) NN operation, which reintroduces a complex gradient during the backward pass.
Try calling NN = real(NN) before this step to insulate the real-valued model from the complex part of the calculation:
NN = model(parameters,x); % Feedforward neural network
NN = real(NN);
f = C*NN; % Intermediate function
It may seem counter-intuitive to apply this before the complex values are created, and indeed in the forward computation it has no effect because NN is already real. But in the backward pass the computation flows in the other direction through the code, so the backward step for real(NN) runs after the backward step for C*NN. It discards the imaginary parts of the gradient, which at this point have no meaning because NN has no imaginary part.
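A small self-contained sketch (not from the thread; the helper names are hypothetical, and a plain dlarray x stands in for the real-valued model output) demonstrating the guard:
function demoRealGuard
x = dlarray([1 2 3]); % Stand-in for the real-valued network output
C = 2 + 3i; % Complex constant
[loss,grad] = dlfeval(@guardedLoss,x,C);
disp(grad) % Real-valued thanks to the real() guard
end
function [loss,grad] = guardedLoss(x,C)
xr = real(x); % No-op forwards; discards imaginary gradient parts backwards
f = C*xr; % Complex-valued intermediate
loss = sum(real(f).^2,"all") + sum(imag(f).^2,"all"); % Real scalar loss
grad = dlgradient(loss,x); % Gradient w.r.t. x
end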
4 Comments
Jast
on 4 Jan 2024
This answer is really appreciated. Thanks so much! It helped me during a late night debugging session!!!
More Answers (1)
Kartik
on 17 May 2023
Hi,
The error message suggests that there is a complex value in the input to the fully connected layer of your neural network model. This could be due to the fact that the output of the intermediate function "f" includes a complex constant "C" multiplied by the neural network output "NN". If "C" is complex, then "f" will be complex-valued as well, and the subsequent computations involving "f" may introduce complex values.
To resolve this error and perform backpropagation through your neural network, you need to ensure that all inputs to the neural network are real-valued. One way to do this would be to separate the real and imaginary parts of the complex input to the fully connected layer, and pass them separately as inputs. You can do this by using the "real" and "imag" functions to extract the real and imaginary parts of "f" separately:
NN = model(parameters,x); % Feedforward neural network
f = C*NN; % Intermediate function
f_real = real(f); % Real part of f
f_imag = imag(f); % Imaginary part of f
fc_in = [f_real; f_imag]; % Concatenate f_real and f_imag
fc_out = fullyconnect(fc_in, weights_fc, bias_fc); % Fully connected layer output (weights_fc and bias_fc are that layer's learnable weights and bias)
Here, the "fc_in" matrix is formed by concatenating the real and imaginary parts of "f", and then passed to the fully connected layer.
Refer to the MathWorks documentation for the fullyconnect and dlgradient functions for more information.
3 Comments
Kartik
on 18 May 2023
Yes, that can be a possible workaround, if we separate the real and imaginary parts and perform all the other calculations, like loss calculation and gradient descent, on them separately.