Can I disable intermediate calculations for fmincon
In an fmincon optimization there are always many 'intermediate calculations' (see https://www.mathworks.com/help/optim/ug/iterations-and-function-counts.html#mw_dc044841-a6b6-43c0-8b29-0af2fbbcb66c) that increase the function count during optimization. The link says that "intermediate calculations can involve evaluating the objective function and any constraints at points near the current iterate x_i. For example, the solver might estimate a gradient by finite differences."
What is the purpose of these intermediate calculations? Since I have provided the gradient calculation for my objective function, why would the optimizer need to compute a finite-difference gradient? My example objective function does not have any constraints.
My example code and the optimization output are shown below. The 'Iter' and 'F-count' columns show that many intermediate calculations are involved.
If the objective and gradient calculations are expensive, these intermediate calculations can take a lot of time.
options = optimoptions('fmincon','SpecifyObjectiveGradient',true,'Display',...
'iter');
fun = @rosenboth;
x0 = [-1,2];
A = [];
b = [];
Aeq = [];
beq = [];
lb = [];
ub = [];
nonlcon = [];
[x,f] = fmincon(fun,x0,A,b,Aeq,beq,lb,ub,nonlcon,options);
First-order Norm of
Iter F-count f(x) Feasibility optimality step
0 1 1.040000e+02 0.000e+00 3.960e+02
1 7 1.028667e+02 0.000e+00 6.444e+02 7.071e-01
2 9 8.035769e+01 0.000e+00 4.797e+02 7.071e-01
3 10 6.132722e+00 0.000e+00 7.816e+00 1.155e+00
4 11 6.065238e+00 0.000e+00 5.189e+00 3.681e-02
5 12 5.678075e+00 0.000e+00 6.330e+00 2.437e-01
6 14 5.112684e+00 0.000e+00 3.119e+01 5.825e-01
7 15 4.769085e+00 0.000e+00 2.229e+01 8.987e-02
8 16 4.630101e+00 0.000e+00 4.064e+01 6.700e-01
9 17 3.708221e+00 0.000e+00 7.080e+00 1.786e-01
10 18 3.175089e+00 0.000e+00 7.950e+00 2.845e-01
11 20 3.165815e+00 0.000e+00 2.115e+01 2.951e-01
12 21 2.899436e+00 0.000e+00 9.888e+00 1.282e-01
13 22 2.725372e+00 0.000e+00 7.164e+00 6.340e-02
14 23 2.382814e+00 0.000e+00 1.316e+01 3.822e-01
15 24 2.129017e+00 0.000e+00 4.236e+00 1.134e-01
16 25 1.874512e+00 0.000e+00 4.274e+00 1.216e-01
17 27 1.784218e+00 0.000e+00 1.310e+01 2.576e-01
18 28 1.522263e+00 0.000e+00 3.092e+00 1.092e-01
19 29 1.353081e+00 0.000e+00 2.848e+00 7.749e-02
20 31 1.178302e+00 0.000e+00 8.209e+00 1.660e-01
21 32 1.014260e+00 0.000e+00 3.229e+00 2.716e-02
22 33 7.723798e-01 0.000e+00 5.463e+00 1.596e-01
23 34 6.002908e-01 0.000e+00 3.264e+00 8.887e-02
24 35 4.638434e-01 0.000e+00 4.710e+00 1.346e-01
25 36 2.907823e-01 0.000e+00 5.478e+00 2.276e-01
26 39 1.881240e-01 0.000e+00 6.312e+00 1.949e-01
27 40 1.729287e-01 0.000e+00 1.782e+00 9.193e-02
28 41 1.396410e-01 0.000e+00 8.969e-01 6.110e-02
29 43 1.223560e-01 0.000e+00 2.758e+00 6.740e-02
30 44 1.073474e-01 0.000e+00 2.984e+00 4.174e-02
First-order Norm of
Iter F-count f(x) Feasibility optimality step
31 45 5.633254e-02 0.000e+00 1.855e+00 1.399e-01
32 46 3.253577e-02 0.000e+00 1.257e+00 9.971e-02
33 47 1.470709e-02 0.000e+00 1.393e+00 1.220e-01
34 48 1.418260e-02 0.000e+00 3.657e+00 9.564e-02
35 50 2.088770e-04 0.000e+00 4.220e-01 1.271e-01
36 51 1.699139e-04 0.000e+00 4.750e-02 7.371e-03
37 52 6.403872e-05 0.000e+00 7.905e-02 1.168e-02
38 53 7.152289e-06 0.000e+00 9.437e-02 1.448e-02
39 54 3.937940e-07 0.000e+00 2.330e-02 2.254e-03
40 55 1.379737e-10 0.000e+00 6.095e-05 4.873e-04
41 56 3.901588e-14 0.000e+00 1.103e-06 2.559e-05
42 57 1.179907e-20 0.000e+00 4.271e-09 4.383e-07
Local minimum found that satisfies the constraints.
Optimization completed because the objective function is non-decreasing in
feasible directions, to within the value of the optimality tolerance,
and constraints are satisfied to within the value of the constraint tolerance.
function [f, g] = rosenboth(x)
f = 100*(x(2) - x(1)^2)^2 + (1-x(1))^2;
if nargout > 1 % gradient required
g = [-400*(x(2)-x(1)^2)*x(1)-2*(1-x(1));
200*(x(2)-x(1)^2)];
end
end
Accepted Answer
Matt J
24 Jun 2022
Edited: Matt J
24 Jun 2022
Since you are specifying the objective gradient, finite-difference calculations will not be executed for that particular piece of the iteration loop. However, if second derivatives are needed and you haven't provided a Hessian calculation, finite differences will still be needed for that. Also, multiple function evaluations may still be necessary, depending on the algorithm, for things like line searches.
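One way to see for yourself where the extra evaluations come from is to count the calls. A minimal sketch (the wrapper function countedDemo and its nested objective are illustrative, not part of the original code):

```matlab
function countedDemo
% Sketch: count how often fmincon evaluates the objective.
evalCount = 0;
opts = optimoptions('fmincon','SpecifyObjectiveGradient',true,'Display','iter');
fmincon(@objCounted,[-1,2],[],[],[],[],[],[],[],opts);
fprintf('objective evaluated %d times\n',evalCount);

    function [f,g] = objCounted(x)   % nested so it can update evalCount
        evalCount = evalCount + 1;   % counts every call, including line-search trial points
        f = 100*(x(2) - x(1)^2)^2 + (1 - x(1))^2;
        if nargout > 1
            g = [-400*(x(2)-x(1)^2)*x(1) - 2*(1-x(1));
                  200*(x(2)-x(1)^2)];
        end
    end
end
```

Comparing the final count with the number of iterations shows how many evaluations went to trial points rather than accepted iterates.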
15 Comments
Fangning Zheng
24 Jun 2022
Can the interior-point algorithm work without using the Hessian? Or is there any other algorithm that does not require a Hessian calculation? Thank you!
Torsten
24 Jun 2022
All fmincon algorithms use the Hessian.
If you want an algorithm without a Hessian, you could program the "steepest descent method" on your own.
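For illustration, a bare-bones steepest-descent loop with a backtracking (Armijo) line search might look like the sketch below; the step-size constants are arbitrary and untuned:

```matlab
% Steepest descent with a backtracking (Armijo) line search -- illustrative sketch.
fun = @rosenboth;                 % the same objective/gradient function as above
x = [-1; 2];
for k = 1:5000
    [f,g] = fun(x);
    if norm(g) < 1e-6, break; end % simple first-order stopping test
    t = 1;                        % shrink t until sufficient decrease is achieved
    while fun(x - t*g) > f - 1e-4*t*(g.'*g)
        t = t/2;
    end
    x = x - t*g;                  % step along the negative gradient
end
```

Note how the inner while loop re-evaluates the objective, but not the gradient, at trial points; those trial evaluations are exactly the kind of 'intermediate calculations' that inflate F-count in fmincon's display.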
Matt J
24 Jun 2022
You're quite welcome, but please Accept-click the answer if you consider the matter closed.
Matt J
24 Jun 2022
Edited: Matt J
24 Jun 2022
I don't think the extra evaluations in your case are coming from finite-difference approximations of the Hessian. The default interior-point algorithm settings use the gradient only. The extra evaluations are likely coming from line searches.
Also, you do not need to implement the steepest descent algorithm on your own. This example shows how to do it with fminunc.
However, steepest descent, or any algorithm that does not use at least an approximation to the Hessian, tends to perform badly, so it is not recommended.
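If I read the fminunc options correctly, its quasi-Newton algorithm can be switched to a steepest-descent search direction via the HessUpdate option (a sketch; verify the option name against your MATLAB release):

```matlab
% Run fminunc as steepest descent (HessUpdate 'steepdesc'; check your docs).
opts = optimoptions('fminunc', ...
    'Algorithm','quasi-newton', ...
    'HessUpdate','steepdesc', ...        % use the negative gradient as the search direction
    'SpecifyObjectiveGradient',true, ...
    'Display','iter');
[x,f] = fminunc(@rosenboth,[-1,2],opts);
```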
Fangning Zheng
24 Jun 2022
If the extra evaluations are coming from line searches, then the gradient is not required, is it? My understanding of line search is that once the gradient is calculated, the algorithm uses it to compute the descent direction p_k. Then it uses a line search to find a step size alpha_k so that f(x_k + alpha_k * p_k) < f_k. In that case the gradient is not required, so if we compute the gradient in the same function as the objective, that gradient calculation is wasted?
Torsten
24 Jun 2022
Edited: Torsten
24 Jun 2022
The gradient is requested in every call. This seems to indicate that the objective function is not being called just to make a line search.
Also, why do you specify "FiniteDifferenceStepSize" = 100? It should be on the order of 1e-8.
options = optimoptions('fmincon','SpecifyObjectiveGradient',true,'Display',...
'iter','FiniteDifferenceStepSize',100);
fun = @rosenboth;
x0 = [-1,2];
A = [];
b = [];
Aeq = [];
beq = [];
lb = [];
ub = [];
nonlcon = [];
[x,f] = fmincon(fun,x0,A,b,Aeq,beq,lb,ub,nonlcon,options);
ans = 2
First-order Norm of
Iter F-count f(x) Feasibility optimality step
0 1 1.040000e+02 0.000e+00 3.960e+02
ans = 2
ans = 2
ans = 2
ans = 2
ans = 2
ans = 2
1 7 1.028667e+02 0.000e+00 6.444e+02 7.071e-01
ans = 2
ans = 2
2 9 8.035769e+01 0.000e+00 4.797e+02 7.071e-01
ans = 2
3 10 6.132722e+00 0.000e+00 7.816e+00 1.155e+00
ans = 2
4 11 6.065238e+00 0.000e+00 5.189e+00 3.681e-02
ans = 2
5 12 5.678075e+00 0.000e+00 6.330e+00 2.437e-01
ans = 2
ans = 2
6 14 5.112684e+00 0.000e+00 3.119e+01 5.825e-01
ans = 2
7 15 4.769085e+00 0.000e+00 2.229e+01 8.987e-02
ans = 2
8 16 4.630101e+00 0.000e+00 4.064e+01 6.700e-01
ans = 2
9 17 3.708221e+00 0.000e+00 7.080e+00 1.786e-01
ans = 2
10 18 3.175089e+00 0.000e+00 7.950e+00 2.845e-01
ans = 2
ans = 2
11 20 3.165815e+00 0.000e+00 2.115e+01 2.951e-01
ans = 2
12 21 2.899436e+00 0.000e+00 9.888e+00 1.282e-01
ans = 2
13 22 2.725372e+00 0.000e+00 7.164e+00 6.340e-02
ans = 2
14 23 2.382814e+00 0.000e+00 1.316e+01 3.822e-01
ans = 2
15 24 2.129017e+00 0.000e+00 4.236e+00 1.134e-01
ans = 2
16 25 1.874512e+00 0.000e+00 4.274e+00 1.216e-01
ans = 2
ans = 2
17 27 1.784218e+00 0.000e+00 1.310e+01 2.576e-01
ans = 2
18 28 1.522263e+00 0.000e+00 3.092e+00 1.092e-01
ans = 2
19 29 1.353081e+00 0.000e+00 2.848e+00 7.749e-02
ans = 2
ans = 2
20 31 1.178302e+00 0.000e+00 8.209e+00 1.660e-01
ans = 2
21 32 1.014260e+00 0.000e+00 3.229e+00 2.716e-02
ans = 2
22 33 7.723798e-01 0.000e+00 5.463e+00 1.596e-01
ans = 2
23 34 6.002908e-01 0.000e+00 3.264e+00 8.887e-02
ans = 2
24 35 4.638434e-01 0.000e+00 4.710e+00 1.346e-01
ans = 2
25 36 2.907823e-01 0.000e+00 5.478e+00 2.276e-01
ans = 2
ans = 2
ans = 2
26 39 1.881240e-01 0.000e+00 6.312e+00 1.949e-01
ans = 2
27 40 1.729287e-01 0.000e+00 1.782e+00 9.193e-02
ans = 2
28 41 1.396410e-01 0.000e+00 8.969e-01 6.110e-02
ans = 2
ans = 2
29 43 1.223560e-01 0.000e+00 2.758e+00 6.740e-02
ans = 2
30 44 1.073474e-01 0.000e+00 2.984e+00 4.174e-02
ans = 2
First-order Norm of
Iter F-count f(x) Feasibility optimality step
31 45 5.633254e-02 0.000e+00 1.855e+00 1.399e-01
ans = 2
32 46 3.253577e-02 0.000e+00 1.257e+00 9.971e-02
ans = 2
33 47 1.470709e-02 0.000e+00 1.393e+00 1.220e-01
ans = 2
34 48 1.418260e-02 0.000e+00 3.657e+00 9.564e-02
ans = 2
ans = 2
35 50 2.088770e-04 0.000e+00 4.220e-01 1.271e-01
ans = 2
36 51 1.699139e-04 0.000e+00 4.750e-02 7.371e-03
ans = 2
37 52 6.403872e-05 0.000e+00 7.905e-02 1.168e-02
ans = 2
38 53 7.152289e-06 0.000e+00 9.437e-02 1.448e-02
ans = 2
39 54 3.937940e-07 0.000e+00 2.330e-02 2.254e-03
ans = 2
40 55 1.379737e-10 0.000e+00 6.095e-05 4.873e-04
ans = 2
41 56 3.901588e-14 0.000e+00 1.103e-06 2.559e-05
ans = 2
42 57 1.179907e-20 0.000e+00 4.271e-09 4.383e-07
Local minimum found that satisfies the constraints.
Optimization completed because the objective function is non-decreasing in
feasible directions, to within the value of the optimality tolerance,
and constraints are satisfied to within the value of the constraint tolerance.
function [f, g] = rosenboth(x)
nargout
f = 100*(x(2) - x(1)^2)^2 + (1-x(1))^2;
if nargout > 1 % gradient required
g = [-400*(x(2)-x(1)^2)*x(1)-2*(1-x(1));
200*(x(2)-x(1)^2)];
end
end
Fangning Zheng
24 Jun 2022
Edited: Fangning Zheng
24 Jun 2022
I was trying to see if there are any differences with different values of "FiniteDifferenceStepSize"; you can disable this parameter....
I agree with you that it calls both the objective function and the gradient calculation every time. I'm still confused about the intermediate calculations: what exactly are they computing?
Fangning Zheng
24 Jun 2022
Edited: Fangning Zheng
24 Jun 2022
I printed out fval:
I added the Hessian calculation. The total number of iterations is smaller, but at some iterations the algorithm still makes intermediate calculations that call the gradient. I think it is doing a line search, but then why is it still calling the gradient calculation? That seems like a waste. Is it because it's doing an exact line search?
options = optimoptions('fmincon','SpecifyObjectiveGradient',true,'Display',...
'iter','HessianFcn',@hessinterior);
fun = @rosenboth;
x0 = [-1,2];
A = [];
b = [];
Aeq = [];
beq = [];
lb = [];
ub = [];
nonlcon = [];
x = fmincon(fun,x0,A,b,Aeq,beq,lb,ub,nonlcon,options);
f = 104
nargout = 2
First-order Norm of
Iter F-count f(x) Feasibility optimality step
0 1 1.040000e+02 0.000e+00 3.960e+02
f = 7.381693e+02
nargout = 2
f = 1.097243e+02
nargout = 2
f = 6.568461e+00
nargout = 2
1 4 6.568461e+00 0.000e+00 5.317e+01 3.536e-01
f = 4.452518e+00
nargout = 2
2 5 4.452518e+00 0.000e+00 1.574e+01 7.071e-01
f = 4.330045e+00
nargout = 2
3 6 4.330045e+00 0.000e+00 3.727e+01 7.771e-01
f = 2.840799e+00
nargout = 2
4 7 2.840799e+00 0.000e+00 4.947e+00 7.604e-02
f = 3.832249e+01
nargout = 2
f = 4.105706e+00
nargout = 2
f = 2.398100e+00
nargout = 2
5 10 2.398100e+00 0.000e+00 1.131e+01 3.305e-01
f = 1.835225e+00
nargout = 2
6 11 1.835225e+00 0.000e+00 5.917e+00 1.914e-01
f = 1.485471e+00
nargout = 2
7 12 1.485471e+00 0.000e+00 1.023e+01 2.588e-01
f = 1.025018e+00
nargout = 2
8 13 1.025018e+00 0.000e+00 2.046e+00 1.030e-01
f = 1.820866e+00
nargout = 2
f = 8.166366e-01
nargout = 2
9 15 8.166366e-01 0.000e+00 6.841e+00 1.713e-01
f = 5.455115e-01
nargout = 2
10 16 5.455115e-01 0.000e+00 2.276e+00 1.271e-01
f = 5.033114e-01
nargout = 2
11 17 5.033114e-01 0.000e+00 9.925e+00 2.588e-01
f = 2.126112e-01
nargout = 2
12 18 2.126112e-01 0.000e+00 4.565e-01 1.061e-01
f = 1.093302e+00
nargout = 2
f = 1.626042e-01
nargout = 2
13 20 1.626042e-01 0.000e+00 6.961e+00 2.376e-01
f = 6.438567e-02
nargout = 2
14 21 6.438567e-02 0.000e+00 4.387e-01 1.038e-01
f = 1.012490e-01
nargout = 2
f = 3.497577e-02
nargout = 2
15 23 3.497577e-02 0.000e+00 2.614e+00 1.589e-01
f = 1.234638e-02
nargout = 2
16 24 1.234638e-02 0.000e+00 1.065e+00 1.239e-01
f = 3.343805e-03
nargout = 2
17 25 3.343805e-03 0.000e+00 1.357e+00 1.291e-01
f = 3.938513e-04
nargout = 2
18 26 3.938513e-04 0.000e+00 2.067e-01 5.722e-02
f = 1.223894e-05
nargout = 2
19 27 1.223894e-05 0.000e+00 1.079e-01 3.746e-02
f = 1.383574e-08
nargout = 2
20 28 1.383574e-08 0.000e+00 1.339e-03 4.663e-03
f = 2.260423e-14
nargout = 2
21 29 2.260423e-14 0.000e+00 4.744e-06 2.514e-04
f = 5.099602e-26
nargout = 2
22 30 5.099602e-26 0.000e+00 2.594e-12 2.046e-07
Local minimum found that satisfies the constraints.
Optimization completed because the objective function is non-decreasing in
feasible directions, to within the value of the optimality tolerance,
and constraints are satisfied to within the value of the constraint tolerance.
function [f, g] = rosenboth(x)
f = 100*(x(2) - x(1)^2)^2 + (1-x(1))^2;
fprintf('f = %d \n',f)
if nargout > 1 % gradient required
fprintf('nargout = %d \n',nargout)
g = [-400*(x(2)-x(1)^2)*x(1)-2*(1-x(1));
200*(x(2)-x(1)^2)];
end
end
function h = hessinterior(x,lambda)
h = [1200*x(1)^2-400*x(2)+2, -400*x(1);
-400*x(1), 200];
end
Torsten
24 Jun 2022
Edited: Torsten
24 Jun 2022
The extra calls are only for the first iteration. Maybe it's a check of whether the Hessian you supplied is correct; I don't know.
But obviously, the intermediate calls in your previous code were due to the calculation of the Hessian, because they are absent in the code above.
But why do you care so much about the internals of fmincon? You won't be able to get answers with absolute certainty. Just try the different options and see which are the fastest and/or most reliable for your problem.
Fangning Zheng
24 Jun 2022
Edited: Fangning Zheng
24 Jun 2022
The extra calls happened twice at iterations 1 and 7, and once at iterations 13, 18 and 21.
My own optimization problem requires a function call (a forward simulation) whose run time is about 1.5 hr. I use finite differences (my own script) to calculate the gradient; with 45 control variables in total, every gradient calculation requires 45 forward simulations. I run 30 nodes in parallel, so the average run time for calculating both the objective function and the gradient is about 3 hrs per iteration (and it would be much more expensive with a finite-difference Hessian, which is why I don't use one). I cannot really afford the intermediate calculations unless they are needed...
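For reference, a forward-difference gradient over the 45 control variables can be parallelized with parfor along these lines (a sketch; forwardSim and the step size h are placeholders for your own simulation and tolerances):

```matlab
% Parallel forward-difference gradient -- sketch; forwardSim is a placeholder
% for the expensive forward simulation.
function [f,g] = objWithParallelGradient(x)
    f = forwardSim(x);               % one baseline simulation
    n = numel(x);
    g = zeros(n,1);
    h = 1e-6;                        % illustrative step size; tune for your problem
    parfor i = 1:n                   % the n perturbed simulations run in parallel
        xp = x;
        xp(i) = xp(i) + h;
        g(i) = (forwardSim(xp) - f) / h;
    end
end
```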
Torsten
24 Jun 2022
Edited: Torsten
24 Jun 2022
I cannot really afford the intermediate calculation unless it is needed...
But you cannot change fmincon's behaviour.
and this will be much more expensive if adding hessian calculation using FD so I do not use hessian
fmincon will get its Hessian: either you supply it, or it uses the gradients to build a finite-difference approximation.
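If I'm not mistaken, the interior-point algorithm also accepts a quasi-Newton Hessian approximation built from gradients alone, which avoids extra objective evaluations for second derivatives (a sketch; check the HessianApproximation option against your MATLAB release):

```matlab
% Ask interior-point for a quasi-Newton Hessian approximation instead of
% finite differences (option values per the fmincon documentation; verify
% against your release).
opts = optimoptions('fmincon', ...
    'Algorithm','interior-point', ...
    'SpecifyObjectiveGradient',true, ...
    'HessianApproximation','lbfgs');   % or 'bfgs'
```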
Fangning Zheng
24 Jun 2022
I think I will switch to gradient descent and test it without a Hessian calculation. Thank you for the answers!
John D'Errico
24 Jun 2022
You can. However, you should note that basic gradient descent converges extremely slowly on many problems.
More Answers (0)