Physics-informed NN for parameter identification
73 views (last 30 days)
Show older comments
Dear all,
I am trying to use the physics-informed neural network (PINN) for an inverse parameter identification for any ODE or PDE.
I followed the tutorial https://uk.mathworks.com/help/deeplearning/ug/solve-partial-differential-equations-with-lbfgs-method-and-deep-learning.html provided in the help center.
I am wondering does PINN could extract the identified parameters (coefficients in the PDE). Unfortunately, I do not know how to convert the identified parameters in NN to the real parameters.
Thanks in advance.
0 Comments
Answers (4)
Ben
on 24 Aug 2022
Hi Dawei,
The PINN in that example is assuming the PDE has fixed coefficients. To follow the method of Raissi et al. you can consider a parameterized class of PDEs, e.g. for the Burger's equation you can consider:
The method is then to simply minimize the loss with respect to both the neural network learnable parameters, and the coefficients .
To adapt the example you can extend the parameters in the Define Deep Learning Model section:
parameters.lambda = dlarray(0);
parameters.mu = dlarray(-6);
Next you will need to modify the modelLoss function to replace the line f = Ut + U.*Ux - (0.01./pi).*Uxx with the following:
lambda = parameters.lambda;
mu = exp(parameters.mu);
f = Ut + lambda.*U.*Ux - mu.*Uxx;
Finally you will have to fix the computation for numLayers in the model function, as adding lambda and mu to parameters invalidated it. I simply did the following:
numLayers = (numel(fieldnames(parameters))-2)/2;
This will make the example similar to the author's code. I didn't get very good results for coefficient identification when I tried this, this is possibly due to differences in the options between fmincon and the author's use of ScipyOptimizerInterface. I'm trying that out currently, but hopefully this much will help you get started.
15 Comments
Rohit
on 27 Jul 2023
Edited: Rohit
on 28 Jul 2023
Hello @CSCh @Ben @James Gross, I am stuck in the same position of solving an inverse problem code using PINN. Is it possible to share the latest full code for this? I believe that there are some nomenclature changes in the codes with ADAM and L-BFGS in the help center, which is making a bit confusing to understand. Specifcically, I wanted to know how to add the unknown paramters of the differential equation to the trainable paramters of the NN?
Thanks for your help.
Rohit
4 Comments
Wenyu Li
on 28 Dec 2023
@Ben Hi Ben, I have some difficulty doing your step 2. When you said "Implementing modelLoss to take parameters in place of net", I know how to modify my modelLoss to include the parameter mu but how should I modify modelLoss so the gradient contains information for both net.Learnables and mu? I can calculate the gradients using dlgradient but the output are in different types, one is a table and the other simply a dlarray.
Ben
on 2 Jan 2024
@Wenyu Li - In the above I use a struct called parameters to contain lambda and mu. You can also use a struct to contain the gradients, even though they are different types. Here's some toy code to demonstrate this - note that this doesn't actually train and solve the inverse problem well:
% Solve heat equation du/dt = c * d2u/dx^2 with neural net and unknown c.
batchSize = 100;
dimension = 2; % X = (t,x)
% Set up neural net.
hiddenSize = 100;
net = [
featureInputLayer(dimension)
fullyConnectedLayer(hiddenSize)
tanhLayer
fullyConnectedLayer(hiddenSize)
tanhLayer
fullyConnectedLayer(1)];
% Create struct of parameters
parameters.net = dlnetwork(net);
parameters.c = dlarray(1);
% There are multiple solutions to the heat equation above since we haven't specified initial or boundary conditions.
% We'll avoid this issue by passing in samples of the "true solution" we're trying to approximate.
% In practice this may be data you have collected from simulations.
% Use the heat kernel p(t,x) as true solution. The heat kernel solves dp/dt = d2p/dx^2
% for t>0.
% Setting q(c,t,x) = p(c*t,x) gives a solution dq/dt = c * d2q/dx^2.
heatKernel = @(X) exp(-(X(2,:).^2)./(4*X(1,:)) )./sqrt(4*pi*X(1,:));
trueC = 0.95;
solutionFcn = @(X) heatKernel([trueC;1] .* X);
% Training loop
avgG = [];
avgSqG = [];
maxIters = 1000;
lossFcn = dlaccelerate(@modelLoss);
% Sample X in (0.001, 1.001) x (0.001, 1.001). The 0.001 is to avoid t=0.
X = dlarray(0.001+rand(dimension,batchSize),"CB");
uactual = solutionFcn(X);
for iter = 1:maxIters
[loss,gradients] = dlfeval(lossFcn,X,parameters,uactual);
fprintf("Iteration : %d, Loss : %.4f \n",iter, extractdata(loss));
[parameters,avgG,avgSqG] = adamupdate(parameters,gradients,avgG,avgSqG,iter);
end
function [loss,gradients] = modelLoss(X,parameters,uactual)
% X = (t,x).
% PDE: du/dt = c * d2u/dx2
u = predict(parameters.net, X);
du = dlgradient(sum(u,2), X, RetainData=true, EnableHigherDerivatives=true);
dudt = du(1,:);
dudx = du(2,:);
d2u = dlgradient(sum(dudx,2),X,RetainData=true);
d2udx2 = d2u(2,:);
pdeLoss = l2loss(dudt, parameters.c*d2udx2);
reconstructionLoss = l2loss(u,uactual);
loss = pdeLoss + reconstructionLoss;
[gradients.net, gradients.c] = dlgradient(pdeLoss,parameters.net.Learnables,parameters.c);
end
Alternatively you can use a cell array: parameters = {net,c} and write:
function [loss,gradients] = modelLoss(X,parameters,uactual)
% X = (t,x).
% PDE: du/dt = c * d2u/dx2
u = predict(parameters{1}, X);
du = dlgradient(sum(u,2), X, RetainData=true, EnableHigherDerivatives=true);
dudt = du(1,:);
dudx = du(2,:);
d2u = dlgradient(sum(dudx,2),X,RetainData=true);
d2udx2 = d2u(2,:);
pdeLoss = l2loss(dudt, parameters{2}*d2udx2);
reconstructionLoss = l2loss(u,uactual);
loss = pdeLoss + reconstructionLoss;
[gradients{1},gradients{2}] = dlgradient(pdeLoss,parameters{1}.Learnables,parameters{2});
end
I find the struct approach more readable.
Finally you can also just have separate gradient outputs. This means writing multiple adamupdate calls but gives a little extra flexibility, for example you can set separate learn rates for the net and c parameters:
% Training loop
avgG = [];
avgSqG = [];
avgGCoeff = [];
avgSqGCoeff = [];
lrCoeff = 1e-1;
maxIters = 1000;
lossFcn = dlaccelerate(@modelLoss);
X = dlarray(0.001+rand(dimension,batchSize),"CB");
uactual = solutionFcn(X);
for iter = 1:maxIters
[loss,netGradient,coeffGradient] = dlfeval(lossFcn,X,parameters,uactual);
fprintf("Iteration : %d, Loss : %.4f \n",iter, extractdata(loss));
[parameters{1},avgG,avgSqG] = adamupdate(parameters{1},netGradient,avgG,avgSqG,iter);
[parameters{2},avgGCoeff,avgSqGCoeff] = adamupdate(parameters{2},coeffGradient,avgGCoeff,avgSqGCoeff,iter,lrCoeff);
end
function [loss,netGradient,coeffGradient] = modelLoss(X,parameters,uactual)
% X = (t,x).
% PDE: du/dt = c * d2u/dx2
u = predict(parameters{1}, X);
du = dlgradient(sum(u,2), X, RetainData=true, EnableHigherDerivatives=true);
dudt = du(1,:);
dudx = du(2,:);
d2u = dlgradient(sum(dudx,2),X,RetainData=true);
d2udx2 = d2u(2,:);
pdeLoss = l2loss(dudt, parameters{2}*d2udx2);
reconstructionLoss = l2loss(u,uactual);
loss = pdeLoss + reconstructionLoss;
[netGradient,coeffGradient] = dlgradient(pdeLoss,parameters{1}.Learnables,parameters{2});
end
This latter approach may be relevant, since if you inspect the gradients in this toy example, the gradients for the network learnables and the coefficient have different orders of magnitude, and may benefit from distinct learning rates.
Hope that helps.
Joshua Prince
on 1 Aug 2023
Hello @CSCh @Ben @James Gross, I am also in need of help. Would it be possible to use this same PINN but where the geometry of the problem is defined by a mesh (simialr to this type fo geometry https://www.mathworks.com/help/pde/ug/pde.pdemodel.geometryfrommesh.html).
2 Comments
Ben
on 2 Aug 2023
@Joshua Prince yes, you should be able to use a mesh to define the collocation points used to train the model. You may also be able to use the PDE Toolbox features to solve the PDE on the mesh, which you could use to either compare with the PINN approach, or use as extra data for training the PINN (which is mesh-free). Unfortunately I don't have code to hand working through such an example.
binlong liu
on 2 Aug 2023
Hello @Ben@James Gross, I want to use the PINN to solve PDEs by using two neural networks (net_1 with input of x,t and output of c_w; net_2 with inputs of x,t and r and output of c_intra) but with one loss function (loss_total = loss_PDE+lossBCIC+loss_OB). I tried to use Adam to update the hyperparameters of net_1 and net_2 and it works but the errors are not small enough and I want to try the L-BFGS method, but I have no idea how to do it in matlab. The following code shows the way how I did it in matlab, but it didn't work. Could you help me to solve this problem.
Thanks for your help!
[loss_total,loss_PDE,loss_BCIC,loss_OB,t_BTC,c_w_BTC_Pred,gradients_1,gradients_2]= dlfeval(@modelLoss,net_1,net_2,t_w,x_w,r_w,t_intra,x_intra,r_intra,...
tBC1_w,tBC2_w,xBC1_w,xBC2_w,cBC1_w,tBC1_intra,tBC2_intra,xBC1_intra,...
xBC2_intra,rBC1_intra,rBC2_intra,tIC_w,xIC_w,cIC_w,tIC_intra,xIC_intra,rIC_intra,cIC_intra);
% Initialize the TrainingProgressMonitor object. Because the timer starts
% when you create the monitor object, make sure that you create the object
% close to the training loop.
monitor = trainingProgressMonitor( ...
Metrics="TrainingLoss", ...
Info="Epoch", ...
XLabel="Epoch");
% Train the network using a custom training loop. Use the full data set at
% each iteration. Update the network learnable parameters and solver state using the lbfgsupdate function. At the end of each iteration, update the training progress monitor.
for i = 1:numEpochs
[net_1, solverState] = lbfgsupdate(net_1,[loss_total,gradients_1],solverState);
[net_2, solverState] = lbfgsupdate(net_2,[loss_total,gradients_2],solverState);
updateInfo(monitor,Epoch=i);
recordMetrics(monitor,i,TrainingLoss=solverState.Loss);
end
1 Comment
Ben
on 2 Aug 2023
@binlong liu - you appear to be using lbfgsupdate incorrectly. The 2nd input should be the loss function as a function_handle, see this part of the doc.
Note that you don't need two lbfgsupdate calls, you can put the two networks together in a cell-array or struct as described here. Actually this is likely to be necessary for lbfgsupdate since it calls the loss function for you, and you likely need both networks to do this, and to get the correct gradients.
See Also
Categories
Find more on Sequence and Numeric Feature Data Workflows in Help Center and File Exchange
Community Treasure Hunt
Find the treasures in MATLAB Central and discover how the community can help you!
Start Hunting!