Not able to calculate gradient of loss function in a neural network program
Dr. Veerababu Dharanalakota
on 8 Apr 2023
Commented: Jast
on 4 Jan 2024
Hi,
I am trying to solve a physics-informed neural network problem in which I constructed the loss function as follows:
function [loss,gradients] = loss_fun(parameters,x,C,alpha)
% C is a complex-valued constant
% alpha is a real-valued constant
NN = model(parameters,x); % Feedforward neural network
f = C*NN; % Intermediate function
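% Assumed reconstruction: the question states that fxx (the second
% derivative of f with respect to x) is computed via dlgradient; since
% C is a constant, fxx = C*NNxx.
NNx = dlgradient(sum(NN,"all"),x,EnableHigherDerivatives=true); % dNN/dx
NNxx = dlgradient(sum(NNx,"all"),x,EnableHigherDerivatives=true); % d2NN/dx2
fxx = C*NNxx; % Second derivative of f = C*NN w.r.t. x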
g = fxx+alpha*f; % Objective function
gr = real(g); % Real-part of g
gi = imag(g); % Imaginary-part of g
zeroTarget_r = zeros(size(gr),"like",gr); % Zero targets for the real-part
loss_r = l2loss(gr, zeroTarget_r); % Real-part loss function
zeroTarget_i = zeros(size(gi),"like",gi); % Zero targets for the imaginary-part
loss_i = l2loss(gi, zeroTarget_i); % Imaginary-part loss function
loss = loss_r+loss_i; % Total loss function (real-valued)
gradients = dlgradient(loss,parameters); % Loss function gradients with respect to parameters
end
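The loss function is evaluated through dlfeval so that dlgradient can trace the computation, along the lines of:
[loss,gradients] = dlfeval(@loss_fun,parameters,x,C,alpha);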
The function 'model' returns a feedforward neural network. I would like to minimize the function g with respect to the parameters θ. The input variable x as well as the parameters θ of the neural network are real-valued. Here fxx, the second derivative of f with respect to x, is calculated using dlgradient (automatic differentiation). The presence of the complex-valued constant C makes the objective function g complex-valued. Hence, I split it into real and imaginary parts, calculated the individual loss functions, and added them.
While calculating the gradients I am encountering the following error
"Encountered complex value when computing gradient with respect to an input to fullyconnect. Convert all inputs to fullyconnect to real".
I checked the individual loss values and the parameter values; they are purely real.
I would be grateful if you could tell me the possible reasons for this error and how to resolve it.
I am using fmincon with the 'lbfgs' Hessian approximation for the optimization.
2 Comments
Richard
on 17 May 2023
I posted an answer regarding the complex value issue, but as an aside, you might be interested in the lbfgsupdate function which was recently added to Deep Learning Toolbox in R2023a.
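A minimal sketch of that workflow (assuming the loss_fun from the question, existing parameters, x, C, and alpha, and a hypothetical iteration budget maxIterations):
lossFcn = @(p) dlfeval(@loss_fun,p,x,C,alpha); % Wrap the loss for the solver
solverState = lbfgsState; % Initial L-BFGS solver state
for iter = 1:maxIterations
    [parameters,solverState] = lbfgsupdate(parameters,lossFcn,solverState); % One L-BFGS step
end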
Accepted Answer
Richard
on 17 May 2023
I think this may be due to your introduction of the complex value into the output of the model, NN. Even though you later split this into two real halves, the backward gradient computation will step back through the (complex) C * (real) NN operation, which reintroduces a complex gradient during the backward pass.
Try calling NN = real(NN) before this step to insulate the real-valued model from the complex part of the calculation:
NN = model(parameters,x); % Feedforward neural network
NN = real(NN);
f = C*NN; % Intermediate function
It may seem counter-intuitive to apply this before the complex values are created, and indeed in the forward computation it has no effect because NN is already real. But in the backward pass the computation flows in the other direction through the code, so the backward step for real(NN) runs after the backward step for C*NN. It discards the imaginary parts of the gradient, which at this point have no meaning because NN has no imaginary part.
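A small self-contained sketch (not from the thread; the helper names are hypothetical, and a plain dlarray x stands in for the real-valued model output) demonstrating the guard:
function demoRealGuard
x = dlarray([1 2 3]); % Stand-in for the real-valued network output
C = 2 + 3i; % Complex constant
[loss,grad] = dlfeval(@guardedLoss,x,C);
disp(grad) % Real-valued thanks to the real() guard
end
function [loss,grad] = guardedLoss(x,C)
xr = real(x); % No-op forwards; discards imaginary gradient parts backwards
f = C*xr; % Complex-valued intermediate
loss = sum(real(f).^2,"all") + sum(imag(f).^2,"all"); % Real scalar loss
grad = dlgradient(loss,x); % Gradient w.r.t. x
end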
4 Comments
Jast
on 4 Jan 2024
This answer is really appreciated. Thanks so much! It helped me during a late night debugging session!!!
More Answers (1)
Kartik
on 17 May 2023
Hi,
The error message suggests that there is a complex value in the input to the fully connected layer of your neural network model. This could be due to the fact that the output of the intermediate function "f" includes a complex constant "C" multiplied by the neural network output "NN". If "C" is complex, then "f" will be complex-valued as well, and the subsequent computations involving "f" may introduce complex values.
To resolve this error and perform backpropagation through your neural network, you need to ensure that all inputs to the neural network are real-valued. One way to do this would be to separate the real and imaginary parts of the complex input to the fully connected layer, and pass them separately as inputs. You can do this by using the "real" and "imag" functions to extract the real and imaginary parts of "f" separately:
NN = model(parameters,x); % Feedforward neural network
f = C*NN; % Intermediate function
f_real = real(f); % Real part of f
f_imag = imag(f); % Imaginary part of f
fc_in = [f_real; f_imag]; % Concatenate f_real and f_imag
fc_out = fullyconnect(fc_in, weights_fc, bias_fc); % Fully connected layer output (weights_fc and bias_fc are that layer's learnable weights and bias)
Here, the "fc_in" matrix is formed by concatenating the real and imaginary parts of "f", and then passed to the fully connected layer.
Refer to the MathWorks documentation for the fullyconnect and dlgradient functions for more information.
3 Comments
Kartik
on 18 May 2023
Yes, that can be a possible workaround, if we separate the real and imaginary parts and perform all the other calculations, like loss calculation and gradient descent, on them separately.