Physics-informed NN for parameter identification

Question

1 投票

Dear all,

I am trying to use the physics-informed neural network (PINN) for an inverse parameter identification for any ODE or PDE.

I followed the tutorial https://uk.mathworks.com/help/deeplearning/ug/solve-partial-differential-equations-with-lbfgs-method-and-deep-learning.html provided in the help center.

I am wondering does PINN could extract the identified parameters (coefficients in the PDE). Unfortunately, I do not know how to convert the identified parameters in NN to the real parameters.

Thanks in advance.

0 件のコメント
-2 件の古いコメントを表示 -2 件の古いコメントを非表示

サインインしてコメントする。

サインインしてこの質問に回答する。

Follow Question

Answer 1

Ben 2022 年 8 月 24 日

0 投票

Hi Dawei,

The PINN in that example is assuming the PDE has fixed coefficients. To follow the method of Raissi et al. you can consider a parameterized class of PDEs, e.g. for the Burger's equation you can consider:

The method is then to simply minimize the loss with respect to both the neural network learnable parameters, and the coefficients

.

To adapt the example you can extend the parameters in the Define Deep Learning Model section:

parameters.lambda = dlarray(0);

parameters.mu = dlarray(-6);

Next you will need to modify the modelLoss function to replace the line f = Ut + U.*Ux - (0.01./pi).*Uxx with the following:

lambda = parameters.lambda;

mu = exp(parameters.mu);

f = Ut + lambda.*U.*Ux - mu.*Uxx;

Finally you will have to fix the computation for numLayers in the model function, as adding lambda and mu to parameters invalidated it. I simply did the following:

numLayers = (numel(fieldnames(parameters))-2)/2;

This will make the example similar to the author's code. I didn't get very good results for coefficient identification when I tried this, this is possibly due to differences in the options between fmincon and the author's use of ScipyOptimizerInterface. I'm trying that out currently, but hopefully this much will help you get started.

15 件のコメント
13 件の古いコメントを表示 13 件の古いコメントを非表示

James Gross 2023 年 6 月 12 日

MATLAB Online で開く

Hi Chris,

Just wanted to jump in and say that the discrepancies in results and performance between the old example with fmincon and the new one with lbfgsupdate are likely caused by two issues.

1) The fmincon example does all calculations in double precision by default while the new example does the calculations in single precision. In this case, the workaround for doing the calculations in double precision is to use the following code after creating the dlnetwork

% Convert learnables to double
learnables = pinn.Learnables;
learnables = dlupdate( @double, learnables );
pinn.Learnables = learnables;

2) The fmincon example has appropriate stopping conditions, while the new example does not. You can add these by adding the following code to the training loop

% Stop training if stopping conditions are met
if solverState.GradientsNorm < gradientTolerance || solverState.StepNorm < stepTolerance || strcmp(solverState.LineSearchStatus, "failed")
    break
end

where gradientTolerance and stepTolerance are tolerances you set yourself. I believe the default for these in fmincon are 1e-6.

I have gone ahead and created an enhancement request for that example so that our documentation team can consider adding the suggested code to the example. In the meantime, I hope the workarounds I have suggested work for you and allow you to get better results with improved performance. If it does not, please let me know!

Cheers,

James

James Gross 2023 年 6 月 14 日

MATLAB Online で開く

Hi Chris,

The issue with using single precision is that you might not be able to get as accurate of a result as double precision. This is an issue because if suitable break conditions are not set in the training loop, lbfgsupdate will perform lots of unnecessary calculations at the tail end of the optimization during the line search procedure because it can't find a suitable solution at that precision level.

One other suggestion to improve the performance would be to accelerate the custom training loop via dlaccelerate. In particular, you can replace the loss function specification with

accFun = dlaccelerate(@modelLoss);
lossFcn = @(net) dlfeval(accFun,net,X,T,X0,T0,U0);

Another thing to bear in mind is that the new example includes plotting which can add additional overhead. This means that when comparing these two examples, you should comment out any code for plotting in the new example for a fairer comparison.

Finally, it is important to note that L-BFGS is very sensitive to network initialization (even more so than stochastic optimizers), and if the network is initialized with a poor set of parameters, a good set of parameters may never be found (or at least may take a very long time to find). This means that, in order to make a fair comparison, multiple training runs should be performed and the results should be averaged.

I made the above modifications for a very similar example and can confirm that doing so improved the performance of the updated workflow via lbfgsupdate to the point where its performance was comparable and was even able to achieve more accurate results than the old workflow via fmincon.

I hope this information helps!

Cheers,

James

Mitra 2023 年 8 月 25 日

編集済み: Mitra 2023 年 8 月 25 日

Has anyone successfully implemented an inverse PINN using MATLAB? I tried, but the additional parameters don't seem to change and the corresponding gradient remains at 0.

Zhe 2024 年 1 月 8 日

@Ben

Hello Ben,

Could you please help me with this issue?

https://ww2.mathworks.cn/matlabcentral/answers/2067501-physics-informed-nn-for-parameter-identification

Thanks a lot

サインインしてコメントする。

Answer 2

Rohit 2023 年 7 月 27 日

編集済み: Rohit 2023 年 7 月 28 日

0 投票

Hello @CSCh @Ben @James Gross, I am stuck in the same position of solving an inverse problem code using PINN. Is it possible to share the latest full code for this? I believe that there are some nomenclature changes in the codes with ADAM and L-BFGS in the help center, which is making a bit confusing to understand. Specifcically, I wanted to know how to add the unknown paramters of the differential equation to the trainable paramters of the NN?

Thanks for your help.

Rohit

4 件のコメント
2 件の古いコメントを表示 2 件の古いコメントを非表示

Wenyu Li 2023 年 12 月 28 日

@Ben Hi Ben, I have some difficulty doing your step 2. When you said "Implementing modelLoss to take parameters in place of net", I know how to modify my modelLoss to include the parameter mu but how should I modify modelLoss so the gradient contains information for both net.Learnables and mu? I can calculate the gradients using dlgradient but the output are in different types, one is a table and the other simply a dlarray.

Ben 2024 年 1 月 2 日

@Wenyu Li - In the above I use a struct called parameters to contain lambda and mu. You can also use a struct to contain the gradients, even though they are different types. Here's some toy code to demonstrate this - note that this doesn't actually train and solve the inverse problem well:

% Solve heat equation du/dt = c * d2u/dx^2 with neural net and unknown c.

batchSize = 100;

dimension = 2; % X = (t,x)

% Set up neural net.

hiddenSize = 100;

net = [

featureInputLayer(dimension)

fullyConnectedLayer(hiddenSize)

tanhLayer

fullyConnectedLayer(hiddenSize)

tanhLayer

fullyConnectedLayer(1)];

% Create struct of parameters

parameters.net = dlnetwork(net);

parameters.c = dlarray(1);

% There are multiple solutions to the heat equation above since we haven't specified initial or boundary conditions.

% We'll avoid this issue by passing in samples of the "true solution" we're trying to approximate.

% In practice this may be data you have collected from simulations.

% Use the heat kernel p(t,x) as true solution. The heat kernel solves dp/dt = d2p/dx^2

% for t>0.

% Setting q(c,t,x) = p(c*t,x) gives a solution dq/dt = c * d2q/dx^2.

heatKernel = @(X) exp(-(X(2,:).^2)./(4*X(1,:)) )./sqrt(4*pi*X(1,:));

trueC = 0.95;

solutionFcn = @(X) heatKernel([trueC;1] .* X);

% Training loop

avgG = [];

avgSqG = [];

maxIters = 1000;

lossFcn = dlaccelerate(@modelLoss);

% Sample X in (0.001, 1.001) x (0.001, 1.001). The 0.001 is to avoid t=0.

X = dlarray(0.001+rand(dimension,batchSize),"CB");

uactual = solutionFcn(X);

for iter = 1:maxIters

[loss,gradients] = dlfeval(lossFcn,X,parameters,uactual);

fprintf("Iteration : %d, Loss : %.4f \n",iter, extractdata(loss));

[parameters,avgG,avgSqG] = adamupdate(parameters,gradients,avgG,avgSqG,iter);

end

function [loss,gradients] = modelLoss(X,parameters,uactual)

% X = (t,x).

% PDE: du/dt = c * d2u/dx2

u = predict(parameters.net, X);

du = dlgradient(sum(u,2), X, RetainData=true, EnableHigherDerivatives=true);

dudt = du(1,:);

dudx = du(2,:);

d2u = dlgradient(sum(dudx,2),X,RetainData=true);

d2udx2 = d2u(2,:);

pdeLoss = l2loss(dudt, parameters.c*d2udx2);

reconstructionLoss = l2loss(u,uactual);

loss = pdeLoss + reconstructionLoss;

[gradients.net, gradients.c] = dlgradient(pdeLoss,parameters.net.Learnables,parameters.c);

end

Alternatively you can use a cell array: parameters = {net,c} and write:

function [loss,gradients] = modelLoss(X,parameters,uactual)

% X = (t,x).

% PDE: du/dt = c * d2u/dx2

u = predict(parameters{1}, X);

du = dlgradient(sum(u,2), X, RetainData=true, EnableHigherDerivatives=true);

dudt = du(1,:);

dudx = du(2,:);

d2u = dlgradient(sum(dudx,2),X,RetainData=true);

d2udx2 = d2u(2,:);

pdeLoss = l2loss(dudt, parameters{2}*d2udx2);

reconstructionLoss = l2loss(u,uactual);

loss = pdeLoss + reconstructionLoss;

[gradients{1},gradients{2}] = dlgradient(pdeLoss,parameters{1}.Learnables,parameters{2});

end

I find the struct approach more readable.

Finally you can also just have separate gradient outputs. This means writing multiple adamupdate calls but gives a little extra flexibility, for example you can set separate learn rates for the net and c parameters:

% Training loop

avgG = [];

avgSqG = [];

avgGCoeff = [];

avgSqGCoeff = [];

lrCoeff = 1e-1;

maxIters = 1000;

lossFcn = dlaccelerate(@modelLoss);

X = dlarray(0.001+rand(dimension,batchSize),"CB");

uactual = solutionFcn(X);

for iter = 1:maxIters

[loss,netGradient,coeffGradient] = dlfeval(lossFcn,X,parameters,uactual);

fprintf("Iteration : %d, Loss : %.4f \n",iter, extractdata(loss));

[parameters{1},avgG,avgSqG] = adamupdate(parameters{1},netGradient,avgG,avgSqG,iter);

[parameters{2},avgGCoeff,avgSqGCoeff] = adamupdate(parameters{2},coeffGradient,avgGCoeff,avgSqGCoeff,iter,lrCoeff);

end

function [loss,netGradient,coeffGradient] = modelLoss(X,parameters,uactual)

% X = (t,x).

% PDE: du/dt = c * d2u/dx2

u = predict(parameters{1}, X);

du = dlgradient(sum(u,2), X, RetainData=true, EnableHigherDerivatives=true);

dudt = du(1,:);

dudx = du(2,:);

d2u = dlgradient(sum(dudx,2),X,RetainData=true);

d2udx2 = d2u(2,:);

pdeLoss = l2loss(dudt, parameters{2}*d2udx2);

reconstructionLoss = l2loss(u,uactual);

loss = pdeLoss + reconstructionLoss;

[netGradient,coeffGradient] = dlgradient(pdeLoss,parameters{1}.Learnables,parameters{2});

end

This latter approach may be relevant, since if you inspect the gradients in this toy example, the gradients for the network learnables and the coefficient have different orders of magnitude, and may benefit from distinct learning rates.

Hope that helps.

サインインしてコメントする。

Answer 3

Joshua Prince 2023 年 8 月 1 日

0 投票

Hello @CSCh @Ben @James Gross, I am also in need of help. Would it be possible to use this same PINN but where the geometry of the problem is defined by a mesh (simialr to this type fo geometry https://www.mathworks.com/help/pde/ug/pde.pdemodel.geometryfrommesh.html).

2 件のコメント
なしを表示なしを非表示

Ben 2023 年 8 月 2 日

@Joshua Prince yes, you should be able to use a mesh to define the collocation points used to train the model. You may also be able to use the PDE Toolbox features to solve the PDE on the mesh, which you could use to either compare with the PINN approach, or use as extra data for training the PINN (which is mesh-free). Unfortunately I don't have code to hand working through such an example.

Joshua Prince 2023 年 8 月 3 日

Thanks for the help!

サインインしてコメントする。

Answer 4

binlong liu 2023 年 8 月 2 日

MATLAB Online で開く

0 投票

Hello @Ben @James Gross, I want to use the PINN to solve PDEs by using two neural networks (net_1 with input of x,t and output of c_w; net_2 with inputs of x,t and r and output of c_intra) but with one loss function (loss_total = loss_PDE+lossBCIC+loss_OB). I tried to use Adam to update the hyperparameters of net_1 and net_2 and it works but the errors are not small enough and I want to try the L-BFGS method, but I have no idea how to do it in matlab. The following code shows the way how I did it in matlab, but it didn't work. Could you help me to solve this problem.

Thanks for your help!

[loss_total,loss_PDE,loss_BCIC,loss_OB,t_BTC,c_w_BTC_Pred,gradients_1,gradients_2]= dlfeval(@modelLoss,net_1,net_2,t_w,x_w,r_w,t_intra,x_intra,r_intra,...
    tBC1_w,tBC2_w,xBC1_w,xBC2_w,cBC1_w,tBC1_intra,tBC2_intra,xBC1_intra,...
    xBC2_intra,rBC1_intra,rBC2_intra,tIC_w,xIC_w,cIC_w,tIC_intra,xIC_intra,rIC_intra,cIC_intra);
% Initialize the TrainingProgressMonitor object. Because the timer starts 
% when you create the monitor object, make sure that you create the object
% close to the training loop.
monitor = trainingProgressMonitor( ...
    Metrics="TrainingLoss", ...
    Info="Epoch", ...
    XLabel="Epoch");
% Train the network using a custom training loop. Use the full data set at 
% each iteration. Update the network learnable parameters and solver state using the lbfgsupdate function. At the end of each iteration, update the training progress monitor.
for i = 1:numEpochs
    [net_1, solverState] = lbfgsupdate(net_1,[loss_total,gradients_1],solverState);
    [net_2, solverState] = lbfgsupdate(net_2,[loss_total,gradients_2],solverState);
    updateInfo(monitor,Epoch=i);
    recordMetrics(monitor,i,TrainingLoss=solverState.Loss);
end

1 件のコメント
-1 件の古いコメントを表示 -1 件の古いコメントを非表示

Ben 2023 年 8 月 2 日

@binlong liu - you appear to be using lbfgsupdate incorrectly. The 2nd input should be the loss function as a function_handle, see this part of the doc.

Note that you don't need two lbfgsupdate calls, you can put the two networks together in a cell-array or struct as described here. Actually this is likely to be necessary for lbfgsupdate since it calls the loss function for you, and you likely need both networks to do this, and to get the correct gradients.

サインインしてコメントする。

Physics-informed NN for parameter identification

0 件のコメント
-2 件の古いコメントを表示 -2 件の古いコメントを非表示

回答 (4 件)

15 件のコメント
13 件の古いコメントを表示 13 件の古いコメントを非表示

4 件のコメント
2 件の古いコメントを表示 2 件の古いコメントを非表示

2 件のコメント
なしを表示なしを非表示

1 件のコメント
-1 件の古いコメントを表示 -1 件の古いコメントを非表示

カテゴリ

製品

リリース

タグ

Community Treasure Hunt

Physics-informed NN for parameter identification

0 件のコメント -2 件の古いコメントを表示 -2 件の古いコメントを非表示

回答 (4 件)

15 件のコメント 13 件の古いコメントを表示 13 件の古いコメントを非表示

4 件のコメント 2 件の古いコメントを表示 2 件の古いコメントを非表示

2 件のコメント なしを表示 なしを非表示

1 件のコメント -1 件の古いコメントを表示 -1 件の古いコメントを非表示

カテゴリ

製品

リリース

タグ

参考

Community Treasure Hunt

0 件のコメント
-2 件の古いコメントを表示 -2 件の古いコメントを非表示

15 件のコメント
13 件の古いコメントを表示 13 件の古いコメントを非表示

4 件のコメント
2 件の古いコメントを表示 2 件の古いコメントを非表示

2 件のコメント
なしを表示なしを非表示

1 件のコメント
-1 件の古いコメントを表示 -1 件の古いコメントを非表示