Ezyfit: Single vs. Two-Step Fitting Problem

Question

Sim 2025 年 3 月 13 日

0
リンク

この質問への直接リンク

https://jp.mathworks.com/matlabcentral/answers/2175120-ezyfit-single-vs-two-step-fitting-problem

編集済み: Sim 2025 年 3 月 17 日

Ezfit (actually I am using the updated version from Walter Roberson) fails to accurately determine all 4 parameters in a single curve fit (see Method A). A workaround, setting 1 parameter initially and then fitting the remaining 3 in a second step (Method B), looks instead successful.

Is there a method or setting within Ezfit to obtain accurate 4-parameter results in a single fitting run (Method A)?

% (1) Input Data
x = [6, 16, 26, 36, 46, 56, 66, 66, 76, 86, 96, 126, 176, 186, 246, 286, 306, 336, 366, 396, 426, 486, 496, 506, 546, 546, 616, 656, 726, 756, 806, 816, 836, 946, 966, 976, 1056, 1066, 1116, 1136, 1176, 1296, 1326, 1386, 1426, 1456, 1476, 1586, 1596, 1736, 1806, 1836, 1846, 1886, 2016, 2046, 2106, 2196, 2296, 2346, 2356, 2366, 2556, 2666, 2806, 2856, 2916, 2976, 3116, 3266, 3366, 3416, 3526, 3726, 3876, 3976, 4386, 4636, 4686, 5246, 5346, 6966];
y = [22.932, 13.063, 10.877, 8.7408, 7.4074, 5.8344, 12.693, 4.8185, 3.5444, 2.6637, 2.2515, 4.5284, 1.7597, 1.5207, 1.1754, 0.74499, 0.75637, 0.28157, 0.35862, 0.37542, 0.27759, 0.25008, 0.11389, 0.23569, 0.16777, 0.084717, 0.11865, 0.03168, 0.085227, 0.042803, 0.038632, 0.016028, 0.049869, 0.034689, 0.025157, 0.0064107, 0.017007, 0.0067062, 0.012242, 0.0065558, 0.014677, 0.0048147, 0.0048443, 0.0045972, 0.0059391, 0.0024968, 0.0050464, 0.0024622, 0.0053329, 0.0038574, 0.0035999, 0.0022632, 0.00147, 0.00098146, 0.0012814, 0.0024717, 0.00073639, 0.00073862, 0.0013448, 0.00088996, 0.00096546, 0.00084, 0.00062343, 0.0014789, 0.00050959, 0.00027922, 0.00055839, 0.0019602, 0.00098146, 0.00020941, 0.00028421, 0.00075468, 0.00031831, 0.00055524, 0.00055844, 0.00039789, 0.00031207, 0.00072048, 0.00021507, 0.0004718, 0.00018294, 0.00018086];
% (2) Method A: Perform a simple curve fitting with Ezfit
ft1 = ezfit(x,log(y),'log(a*x^(-b) + c*exp(-d*x))','MaxFunEvals', 1e6, 'MaxIter', 1e6);   %, 'usega', usega);
array2table([ft1.m(1) ft1.m(2) ft1.m(3) ft1.m(4) (ft1.r)^2],'VariableNames',{'a','b','c','d','R2'})
%   a        b         c         d        R2   
% _____    ______    ______    _____    _______
% 
% 56106    2.2236    -85026    49582    0.85991
% (3) Method B: Perform curve fitting by using Ezfit 2 times
% (3.1) With the first Ezfit, find the exponent "b" (among a range of values between 1.5 and 2.5) 
% which maximise the coefficient of determination R^2 (I set it as R^2 = 0.99). 
% The resulting exponent will be "b_range(idx_max_b) = 1.86".
i = 1;
b_range = 1.5 : 0.01 : 2.5;
for b = b_range
    ft2  = ezfit(x,log(y),sprintf('log(a*x^(-%.2f) + c*exp(-d*x))',b),'options',optimset('MaxFunEvals', 1e6, 'MaxIter', 1e6));
    ftr(i) = (ft2.r)^2;
    i = i + 1;
end
idx_max_b = find(ftr == max(ftr(ftr < 0.99)));
% (3.2) Following the initial determination of the optimal exponent "b" (i.e. "b_range(idx_max_b)"), 
% the second Ezfit step finds the remaining parameters
ft3  = ezfit(x,log(y),sprintf('log(a*x^(-%.2f) + c*exp(-d*x))',b_range(idx_max_b)),'options',optimset('MaxFunEvals', 1e6, 'MaxIter', 1e6));
array2table([ft3.m(1) b_range(idx_max_b) ft3.m(2) ft3.m(3) (ft3.r)^2],'VariableNames',{'a','b','c','d','R2'})
%   a        b        c          d          R2   
% ______    ____    _____    _________    _______
% 
% 2084.4    1.86    4.806    0.0064196    0.98323
% (4) Plot 
hold on
plot(x, y, 'o', 'MarkerFaceColor',[0.7 0.7 0.7],'DisplayName', 'Input Data');
xx = 10 : 10000;
plot(xx, ft1.m(1) .* xx.^(-ft1.m(2)) + ft1.m(3) .* exp(-ft1.m(4) .* xx), 'color','blue', 'Linewidth',2, 'DisplayName', 'Method A');
plot(xx, ft3.m(1) .* xx.^(-b_range(idx_max_b)) + ft3.m(2) .* exp(-ft3.m(3) .* xx), 'color', 'red', 'Linewidth',2, 'DisplayName', 'Method B')
xlabel('x');
ylabel('y');
legend('show');
set(gca, 'XScale', 'log', 'YScale', 'log')

7 件のコメント
5 件の古いコメントを表示5 件の古いコメントを非表示

Torsten 2025 年 3 月 14 日

編集済み: Torsten 2025 年 3 月 14 日

MATLAB Online で開く

The solution with "fminspeas"

https://uk.mathworks.com/matlabcentral/fileexchange/10093-fminspleas

from the File Exchange looks quite acceptable for the original data:

   x = [6, 16, 26, 36, 46, 56, 66, 66, 76, 86, 96, 126, 176, 186, 246, 286, 306, 336, 366, 396, 426, 486, 496, 506, 546, 546, 616, 656, 726, 756, 806, 816, 836, 946, 966, 976, 1056, 1066, 1116, 1136, 1176, 1296, 1326, 1386, 1426, 1456, 1476, 1586, 1596, 1736, 1806, 1836, 1846, 1886, 2016, 2046, 2106, 2196, 2296, 2346, 2356, 2366, 2556, 2666, 2806, 2856, 2916, 2976, 3116, 3266, 3366, 3416, 3526, 3726, 3876, 3976, 4386, 4636, 4686, 5246, 5346, 6966];
   y = [22.932, 13.063, 10.877, 8.7408, 7.4074, 5.8344, 12.693, 4.8185, 3.5444, 2.6637, 2.2515, 4.5284, 1.7597, 1.5207, 1.1754, 0.74499, 0.75637, 0.28157, 0.35862, 0.37542, 0.27759, 0.25008, 0.11389, 0.23569, 0.16777, 0.084717, 0.11865, 0.03168, 0.085227, 0.042803, 0.038632, 0.016028, 0.049869, 0.034689, 0.025157, 0.0064107, 0.017007, 0.0067062, 0.012242, 0.0065558, 0.014677, 0.0048147, 0.0048443, 0.0045972, 0.0059391, 0.0024968, 0.0050464, 0.0024622, 0.0053329, 0.0038574, 0.0035999, 0.0022632, 0.00147, 0.00098146, 0.0012814, 0.0024717, 0.00073639, 0.00073862, 0.0013448, 0.00088996, 0.00096546, 0.00084, 0.00062343, 0.0014789, 0.00050959, 0.00027922, 0.00055839, 0.0019602, 0.00098146, 0.00020941, 0.00028421, 0.00075468, 0.00031831, 0.00055524, 0.00055844, 0.00039789, 0.00031207, 0.00072048, 0.00021507, 0.0004718, 0.00018294, 0.00018086];
   funlist = {@(coef,xdata) xdata.^(-coef(1)),@(coef,xdata) exp(-coef(2)*xdata )};
   NLPstart = [1,1];
   options = optimset('disp','iter');
   [INLP,ILP] = fminspleas(funlist,NLPstart,x,y,[],[],[],options)
   hold on
   plot(x,ILP(1)*x.^(-INLP(1))+ILP(2)*exp(-INLP(2)*x))
   plot(x,y,'*')
   hold off

Sim 2025 年 3 月 14 日

MATLAB Online で開く

Thanks @Torsten :-) With that code, I get the following on a log-log scale (it looks like it does not fit well for middle-to-large values of "x"):

   x = [6, 16, 26, 36, 46, 56, 66, 66, 76, 86, 96, 126, 176, 186, 246, 286, 306, 336, 366, 396, 426, 486, 496, 506, 546, 546, 616, 656, 726, 756, 806, 816, 836, 946, 966, 976, 1056, 1066, 1116, 1136, 1176, 1296, 1326, 1386, 1426, 1456, 1476, 1586, 1596, 1736, 1806, 1836, 1846, 1886, 2016, 2046, 2106, 2196, 2296, 2346, 2356, 2366, 2556, 2666, 2806, 2856, 2916, 2976, 3116, 3266, 3366, 3416, 3526, 3726, 3876, 3976, 4386, 4636, 4686, 5246, 5346, 6966];
   y = [22.932, 13.063, 10.877, 8.7408, 7.4074, 5.8344, 12.693, 4.8185, 3.5444, 2.6637, 2.2515, 4.5284, 1.7597, 1.5207, 1.1754, 0.74499, 0.75637, 0.28157, 0.35862, 0.37542, 0.27759, 0.25008, 0.11389, 0.23569, 0.16777, 0.084717, 0.11865, 0.03168, 0.085227, 0.042803, 0.038632, 0.016028, 0.049869, 0.034689, 0.025157, 0.0064107, 0.017007, 0.0067062, 0.012242, 0.0065558, 0.014677, 0.0048147, 0.0048443, 0.0045972, 0.0059391, 0.0024968, 0.0050464, 0.0024622, 0.0053329, 0.0038574, 0.0035999, 0.0022632, 0.00147, 0.00098146, 0.0012814, 0.0024717, 0.00073639, 0.00073862, 0.0013448, 0.00088996, 0.00096546, 0.00084, 0.00062343, 0.0014789, 0.00050959, 0.00027922, 0.00055839, 0.0019602, 0.00098146, 0.00020941, 0.00028421, 0.00075468, 0.00031831, 0.00055524, 0.00055844, 0.00039789, 0.00031207, 0.00072048, 0.00021507, 0.0004718, 0.00018294, 0.00018086];
   funlist = {@(coef,xdata) xdata.^(-coef(1)),@(coef,xdata) exp(-coef(2)*xdata )};
   NLPstart = [1,1];
   options = optimset('disp','iter');
   [INLP,ILP] = fminspleas(funlist,NLPstart,x,y,[],[],[],options)
   hold on
   plot(x,ILP(1)*x.^(-INLP(1))+ILP(2)*exp(-INLP(2)*x))
   plot(x,y,'*')
   hold off
   set(gca, 'XScale', 'log','YScale', 'log')

Torsten 2025 年 3 月 14 日

編集済み: Torsten 2025 年 3 月 14 日

As said: you must decide whether you want to fit log(y) against log(x) using the function or y against x. Both will give different parameters for the fitting process. Usually, fitting y against x is intended.

Say you have the function y = a*x^b as fitting function. Then taking the log on both sides gives log(y) = log(a) + b*log(x), thus a function where log(y) depends linearily on log(x). The latter is numerically simpler, but the results for a and b will differ from the nonlinear fit of y against x.

Sim 2025 年 3 月 14 日

Thanks @Torsten for your clarifications, thanks :-)

サインインしてコメントする。

サインインしてこの質問に回答する。

Answer 1

Walter Roberson 2025 年 3 月 13 日

1
リンク

この回答への直接リンク

https://jp.mathworks.com/matlabcentral/answers/2175120-ezyfit-single-vs-two-step-fitting-problem#answer_1561761

Is there a method or setting within Ezfit to obtain accurate 4-parameter results in a single fitting run (Method A)?

No.

8 件のコメント
6 件の古いコメントを表示6 件の古いコメントを非表示

Sam Chak 2025 年 3 月 14 日

MATLAB Online で開く

Hi @Sim

It should be noted that although @Alex Sha's global solution yields the highest R-squared value for your chosen model and the model appears to fit the exponential decay region (the first 10% of the data range) reasonably well, it does not fit the remaining 90% of the data range. The data converges to zero, while the model continues to decrease as x increases.

In standard curve fitting practice, you should first analyze whether the chosen model theoretically fits the data.

%% Data

x = [6, 16, 26, 36, 46, 56, 66, 66, 76, 86, 96, 126, 176, 186, 246, 286, 306, 336, 366, 396, 426, 486, 496, 506, 546, 546, 616, 656, 726, 756, 806, 816, 836, 946, 966, 976, 1056, 1066, 1116, 1136, 1176, 1296, 1326, 1386, 1426, 1456, 1476, 1586, 1596, 1736, 1806, 1836, 1846, 1886, 2016, 2046, 2106, 2196, 2296, 2346, 2356, 2366, 2556, 2666, 2806, 2856, 2916, 2976, 3116, 3266, 3366, 3416, 3526, 3726, 3876, 3976, 4386, 4636, 4686, 5246, 5346, 6966];

y = [22.932, 13.063, 10.877, 8.7408, 7.4074, 5.8344, 12.693, 4.8185, 3.5444, 2.6637, 2.2515, 4.5284, 1.7597, 1.5207, 1.1754, 0.74499, 0.75637, 0.28157, 0.35862, 0.37542, 0.27759, 0.25008, 0.11389, 0.23569, 0.16777, 0.084717, 0.11865, 0.03168, 0.085227, 0.042803, 0.038632, 0.016028, 0.049869, 0.034689, 0.025157, 0.0064107, 0.017007, 0.0067062, 0.012242, 0.0065558, 0.014677, 0.0048147, 0.0048443, 0.0045972, 0.0059391, 0.0024968, 0.0050464, 0.0024622, 0.0053329, 0.0038574, 0.0035999, 0.0022632, 0.00147, 0.00098146, 0.0012814, 0.0024717, 0.00073639, 0.00073862, 0.0013448, 0.00088996, 0.00096546, 0.00084, 0.00062343, 0.0014789, 0.00050959, 0.00027922, 0.00055839, 0.0019602, 0.00098146, 0.00020941, 0.00028421, 0.00075468, 0.00031831, 0.00055524, 0.00055844, 0.00039789, 0.00031207, 0.00072048, 0.00021507, 0.0004718, 0.00018294, 0.00018086];

%% Alex Sha's Global Solution

a = 60599268711757.3;

b = 6.13357573642796;

c = 1.26384716322201;

d = 7.17292632604614E-5;

%% Model

xf = linspace(x(1), x(end), numel(x)*1e3+1);

yf = log(a*xf.^(-b) + c*exp(-d*xf));

%% Plot results

subplot(211)

plot(x, y, 'o'), hold on

plot(xf, yf), xlim([0 700]), grid on

legend('data', 'model')

xlabel('x'), ylabel('y')

title({'Displaying model fitness in interval $x \in [0, 700]$'}, 'interpreter', 'latex')

subplot(212)

plot(x, y, 'o'), hold on

plot(xf, yf), xlim([700 7000]), grid on

legend('data', 'model')

xlabel('x'), ylabel('y')

title({'Displaying model fitness in interval $x \in [700, 7000]$'}, 'interpreter', 'latex')

Analysis

syms f(x) a b c d

assume(a > 0)

assume(b > 0)

assume(c > 0)

assume(d > 0)

%% Chosen Model, f(x)

f = log(a*x^(-b) + c*exp(-d*x))

f =

%% Derivative of the model, df/dx

df = diff(f, x)

df =

%% Limit of the df/dx when x approaches 1 million.

ldf = limit(df, x, sym(1e6))

ldf =

%% Checks if the limit of df/dx < 0

tf = isAlways(ldf < sym(0))

tf = logical

1

Sim 2025 年 3 月 14 日

編集済み: Sim 2025 年 3 月 14 日

Hi @Torsten, thanks for your comment about the initial guesses :-)

Hi @Alex Sha, thanks for your comment and for the global and local solutions. Just a doubt: did you use ezfit or another function to get the global and local solutions? Maybe, did you use the @Torsten line of code, which includes the initial guesses? Indeed, I see the results but not the function/code used to get those results.

Hi @Sam Chak, thanks for your comment.. As far as I understand you make an analysis of the region of large x, by using the @Alex Sha's Global Solution. By analyzing the limit of the derivative as "x" approaches 10^6, you check whether the rate of change of the model remains negative ("lim(x-->10^6) df/dx <0"). The negative slope ensures that the model is consistently decreasing as "x" increases towards 10^6, which is expected for a valid decay model, right?

In summary, my understand from your comment is the following. By using the @Alex Sha's Global Solution, the model works for around 10% of the data, i.e. for very small "x", but it does not fit well the remaining 90% of the data. However, the theoretical analysis, related to the negative slope of the model's derivative for very large "x" (i.e. "lim(x-->10^6) df/dx <0") ensures that the model behaves correctly, i.e. "y" continues to decrease as "x" approaches 10^6. Hope my understanding is correct. Btw, in your code you only showed data, model and result, but I do not see the code to arrive to those results, i.e. did you use the the @Torsten line of code or other functions?

Thanks to everyone for your interesting comments! :-)

Walter Roberson 2025 年 3 月 14 日

@Sim

@Alex Sha used a third-party optimizer named 1stOpt. It is a somewhat expensive tool (about $US2000 per user), but from what I have seen, it produces excellent results much faster than competitors. I would consider buying it myself, but I cannot currently justify the price.

Sim 2025 年 3 月 17 日

Thanks @Walter Roberson for sharing! :-) I have checked it on the web, and it looks very interesting and promising (even though the academics might find the cost prohibitive if it is the same for them).

サインインしてコメントする。

Answer 2

Sam Chak 2025 年 3 月 14 日

1
リンク

この回答への直接リンク

https://jp.mathworks.com/matlabcentral/answers/2175120-ezyfit-single-vs-two-step-fitting-problem#answer_1561810

MATLAB Online で開く

Hi @Sim

You can verify this. The model proposed by @Torsten fits quite well within the first 10% of the data range. However, after that point, the model decays much more slowly than the data, although, in theory, it will eventually converge to zero. Inspired by this model, it suggests that you should consider implementing a nonlinear power rate, as illustrated below. You can also find a better a nonlinear function, preferably a bounded one.

format long g

x = [6, 16, 26, 36, 46, 56, 66, 66, 76, 86, 96, 126, 176, 186, 246, 286, 306, 336, 366, 396, 426, 486, 496, 506, 546, 546, 616, 656, 726, 756, 806, 816, 836, 946, 966, 976, 1056, 1066, 1116, 1136, 1176, 1296, 1326, 1386, 1426, 1456, 1476, 1586, 1596, 1736, 1806, 1836, 1846, 1886, 2016, 2046, 2106, 2196, 2296, 2346, 2356, 2366, 2556, 2666, 2806, 2856, 2916, 2976, 3116, 3266, 3366, 3416, 3526, 3726, 3876, 3976, 4386, 4636, 4686, 5246, 5346, 6966];

y = [22.932, 13.063, 10.877, 8.7408, 7.4074, 5.8344, 12.693, 4.8185, 3.5444, 2.6637, 2.2515, 4.5284, 1.7597, 1.5207, 1.1754, 0.74499, 0.75637, 0.28157, 0.35862, 0.37542, 0.27759, 0.25008, 0.11389, 0.23569, 0.16777, 0.084717, 0.11865, 0.03168, 0.085227, 0.042803, 0.038632, 0.016028, 0.049869, 0.034689, 0.025157, 0.0064107, 0.017007, 0.0067062, 0.012242, 0.0065558, 0.014677, 0.0048147, 0.0048443, 0.0045972, 0.0059391, 0.0024968, 0.0050464, 0.0024622, 0.0053329, 0.0038574, 0.0035999, 0.0022632, 0.00147, 0.00098146, 0.0012814, 0.0024717, 0.00073639, 0.00073862, 0.0013448, 0.00088996, 0.00096546, 0.00084, 0.00062343, 0.0014789, 0.00050959, 0.00027922, 0.00055839, 0.0019602, 0.00098146, 0.00020941, 0.00028421, 0.00075468, 0.00031831, 0.00055524, 0.00055844, 0.00039789, 0.00031207, 0.00072048, 0.00021507, 0.0004718, 0.00018294, 0.00018086];

funlist = {@(coef, xdata) xdata.^(-coef(1)*(xdata).^(0.4)), @(coef, xdata) exp(-coef(2)*xdata)};

NLPstart = [1, 1];

options = optimset('disp', 'none');

[INLP, ILP] = fminspleas(funlist, NLPstart, x, y, [], [], [], options)

Warning: Rank deficient, rank = 1, tol = 1.957112e+05.

Warning: Rank deficient, rank = 1, tol = 6.998085e-03.

INLP = 1×2

0.0594819972748915 3.50973243531023

ILP = 2×1

1.0e+00 * 22.0279801300898 7302263919.12418

xf = linspace(x(1), x(end), numel(x)*1e3+1);

yf = ILP(1)*xf.^(-INLP(1)*(xf).^(0.4)) + ILP(2)*exp(-INLP(2)*xf);

%% Plot results

figure

plot(x, y, 'o'), hold on

plot(xf, yf), grid on, hold off

legend('data', 'model')

xlabel('x'), ylabel('y')

set(gca, 'XScale', 'log','YScale', 'log')

●

figure

subplot(211)

plot(x, y, 'o'), hold on

plot(xf, yf), xlim([0 700]), grid on, hold off

legend('data', 'model')

xlabel('x'), ylabel('y')

title({'Displaying model fitness in interval $x \in [0, 700]$'}, 'interpreter', 'latex')

subplot(212)

plot(x, y, 'o'), hold on

plot(xf, yf), xlim([700 7000]), grid on, hold off

legend('data', 'model')

xlabel('x'), ylabel('y')

title({'Displaying model fitness in interval $x \in [700, 7000]$'}, 'interpreter', 'latex')

3 件のコメント
1 件の古いコメントを表示1 件の古いコメントを非表示

Sim 2025 年 3 月 14 日

編集済み: Sim 2025 年 3 月 14 日

MATLAB Online で開く

Hi @Sam Chak, thanks for your nice answer :-)

I do not know how to get R-squared with "fminspleas", but it looks a nice fit!

It looks like you used a "power tower" or "tetration", plus the exponential decay... this makes things a bit more complicated :-) I was thinking that, by using the "power tower", probably, the exponential decay is not necessary anymore (?):

x           = [6, 16, 26, 36, 46, 56, 66, 66, 76, 86, 96, 126, 176, 186, 246, 286, 306, 336, 366, 396, 426, 486, 496, 506, 546, 546, 616, 656, 726, 756, 806, 816, 836, 946, 966, 976, 1056, 1066, 1116, 1136, 1176, 1296, 1326, 1386, 1426, 1456, 1476, 1586, 1596, 1736, 1806, 1836, 1846, 1886, 2016, 2046, 2106, 2196, 2296, 2346, 2356, 2366, 2556, 2666, 2806, 2856, 2916, 2976, 3116, 3266, 3366, 3416, 3526, 3726, 3876, 3976, 4386, 4636, 4686, 5246, 5346, 6966];
y           = [22.932, 13.063, 10.877, 8.7408, 7.4074, 5.8344, 12.693, 4.8185, 3.5444, 2.6637, 2.2515, 4.5284, 1.7597, 1.5207, 1.1754, 0.74499, 0.75637, 0.28157, 0.35862, 0.37542, 0.27759, 0.25008, 0.11389, 0.23569, 0.16777, 0.084717, 0.11865, 0.03168, 0.085227, 0.042803, 0.038632, 0.016028, 0.049869, 0.034689, 0.025157, 0.0064107, 0.017007, 0.0067062, 0.012242, 0.0065558, 0.014677, 0.0048147, 0.0048443, 0.0045972, 0.0059391, 0.0024968, 0.0050464, 0.0024622, 0.0053329, 0.0038574, 0.0035999, 0.0022632, 0.00147, 0.00098146, 0.0012814, 0.0024717, 0.00073639, 0.00073862, 0.0013448, 0.00088996, 0.00096546, 0.00084, 0.00062343, 0.0014789, 0.00050959, 0.00027922, 0.00055839, 0.0019602, 0.00098146, 0.00020941, 0.00028421, 0.00075468, 0.00031831, 0.00055524, 0.00055844, 0.00039789, 0.00031207, 0.00072048, 0.00021507, 0.0004718, 0.00018294, 0.00018086];
funlist     = {@(coef, xdata) xdata.^(-coef(1)*(xdata).^(coef(2)))};
NLPstart    = [1, 1];
options     = optimset('disp', 'none');
[INLP, ILP] = fminspleas(funlist, NLPstart, x, y, [], [], [], options)
xf  = linspace(x(1), x(end), numel(x)*1e3+1);
yf  = ILP(1)*xf.^(-INLP(1)*(xf).^(INLP(2)));
hold on
plot(x, y, 'o')
plot(xf, yf)
set(gca, 'XScale', 'log','YScale', 'log')

However, the idea to use a "simple" power law and an exponential decay, would be supported by some theoretical argument... by using, instead, a power tower, the theoretical argument would be more complicated.... :-)

Sim 2025 年 3 月 14 日

編集済み: Sim 2025 年 3 月 14 日

Thanks @Sam Chak! I have just read your comment, just a second after I posted mine...! Yes, it could also be an exponential function-based model. I will look into it further. However, the sum of a simple power law and a simple exponential decay can be justified through a theoretical argument (in particular the power law term). If we use nonlinear power rates—whether power law-based or exponential-based one—it could make the results much harder to interpret.... :-)

...I can keep an eye on this webpage to see if @Alex Sha might be interested in adding more on this topic. By the way, there are some great fitting tools I wasn’t aware of, like "fminspleas", which looks quite impressive! :-)

サインインしてコメントする。

Answer 3

Alex Sha 2025 年 3 月 15 日

1
リンク

この回答への直接リンク

https://jp.mathworks.com/matlabcentral/answers/2175120-ezyfit-single-vs-two-step-fitting-problem#answer_1561860

移動済み: Sam Chak 2025 年 3 月 15 日

MATLAB Online で開く

Hi, all, if add one more parameter to the original fitting function, i.e. y=ln(a*x^(-b)-c*exp(-d*x))*f, the result will be improved like:

Sum Squared Error (SSE): 62.0296923820521
Root of Mean Square Error (RMSE): 0.869746896054109
Correlation Coef. (R): 0.970745675455549
R-Square: 0.94234716641565
Parameter	Best Estimate      
---------	-------------      
a        	0.97506498591964   
b        	-0.0034886080404169
c        	1.00586174849318   
d        	0.0069594641193838 
f        	-5.56698125787981

4 件のコメント
2 件の古いコメントを表示2 件の古いコメントを非表示

Sam Chak 2025 年 3 月 15 日

MATLAB Online で開く

Hi @Alex Sha

I believe that your solution deserves recognition. It appears that @Sim wants the fitting model to follow the trend of the data distribution on a base-10 logarithmic scale for both the x-axis and the y-axis.

In fact, @Sim is open to suggestions for other more appropriate power law or exponential-based functions, where its derivative should eventually converge to zero, or at least to a very small negative value at x = 7000.

%% Data

x = [6, 16, 26, 36, 46, 56, 66, 66, 76, 86, 96, 126, 176, 186, 246, 286, 306, 336, 366, 396, 426, 486, 496, 506, 546, 546, 616, 656, 726, 756, 806, 816, 836, 946, 966, 976, 1056, 1066, 1116, 1136, 1176, 1296, 1326, 1386, 1426, 1456, 1476, 1586, 1596, 1736, 1806, 1836, 1846, 1886, 2016, 2046, 2106, 2196, 2296, 2346, 2356, 2366, 2556, 2666, 2806, 2856, 2916, 2976, 3116, 3266, 3366, 3416, 3526, 3726, 3876, 3976, 4386, 4636, 4686, 5246, 5346, 6966];

y = [22.932, 13.063, 10.877, 8.7408, 7.4074, 5.8344, 12.693, 4.8185, 3.5444, 2.6637, 2.2515, 4.5284, 1.7597, 1.5207, 1.1754, 0.74499, 0.75637, 0.28157, 0.35862, 0.37542, 0.27759, 0.25008, 0.11389, 0.23569, 0.16777, 0.084717, 0.11865, 0.03168, 0.085227, 0.042803, 0.038632, 0.016028, 0.049869, 0.034689, 0.025157, 0.0064107, 0.017007, 0.0067062, 0.012242, 0.0065558, 0.014677, 0.0048147, 0.0048443, 0.0045972, 0.0059391, 0.0024968, 0.0050464, 0.0024622, 0.0053329, 0.0038574, 0.0035999, 0.0022632, 0.00147, 0.00098146, 0.0012814, 0.0024717, 0.00073639, 0.00073862, 0.0013448, 0.00088996, 0.00096546, 0.00084, 0.00062343, 0.0014789, 0.00050959, 0.00027922, 0.00055839, 0.0019602, 0.00098146, 0.00020941, 0.00028421, 0.00075468, 0.00031831, 0.00055524, 0.00055844, 0.00039789, 0.00031207, 0.00072048, 0.00021507, 0.0004718, 0.00018294, 0.00018086];

%% Alex Sha's New Global Solution

a = 0.97506498591964;

b = -0.0034886080404169;

c = 1.00586174849318;

d = 0.0069594641193838;

f = -5.56698125787981;

%% Model

xf = linspace(x(1), x(end), numel(x)*1e3+1);

yf = log(a*xf.^(-b) - c*exp(-d*xf))*f;

%% Plot results

figure

subplot(211)

plot(x, y, 'o'), hold on

plot(xf, yf), xlim([0 700]), grid on

legend('data', 'model')

xlabel('x'), ylabel('y')

title({'Displaying model fitness in interval $x \in [0, 700]$'}, 'interpreter', 'latex')

subplot(212)

plot(x, y, 'o'), hold on

plot(xf, yf), xlim([700 7000]), grid on

legend('data', 'model')

xlabel('x'), ylabel('y')

title({'Displaying model fitness in interval $x \in [700, 7000]$'}, 'interpreter', 'latex')

●

figure

plot(x, y, 'o'), hold on

plot(xf, yf), grid on, hold off

legend('data', 'model')

xlabel('x'), ylabel('y')

set(gca, 'XScale', 'log','YScale', 'log')

Warning: Negative data ignored

Sim 2025 年 3 月 17 日

編集済み: Sim 2025 年 3 月 17 日

MATLAB Online で開く

Thanks a lot to @Alex Sha for your solution, which fits the data very nicely!

And thanks to @Sam Chak for further comments and analysis!

I was focusing on the @Alex Sha's improved model, i.e.

y = ln(a*x^(-b) - c*exp(-d*x)) * f

that can be rewritten as:

y = f * ln(a*x^(-b) - c*exp(-d*x))

with the following parameters:

%% Alex Sha's New Global Solution
a   =  0.97506498591964;
b   = -0.0034886080404169;
c   =  1.00586174849318;
d   =  0.0069594641193838;
f   = -5.56698125787981;

I was thinking that, maybe, given "a" and "c" being very close, or basically equal, to 1, the equation can be simplified as follows:

y = f * ln(x^(-b) - exp(-d*x))

(However, as shown by @Sam Chak the "model drops" way much faster than the data, after around x = 10^3)

Yes, these data are quite challenging to describe, and I truly appreciate your insightful comments and analysis!

This discussion on methods and approaches has raised a few doubts, which I believe could apply to other problems and analyses, not just this one. My doubts are the following ones:

From a curve fitting or regression perspective, does the "Method B" I initially described, which seems to reasonably fit the data (at least to me, to some extent), actually make sense?
Why does "Method B" appear to offer a generally good representation of the data (at least to me, to some extent)?

I think these are valid concerns from both computational and mathematical perspectives, even though I may never get a definitive answer. Probably, a more general discussion might be suitable for Stack Overflow. However, given the typical reactions from that community, the question is likely to be closed immediately for trivial reasons (I much prefer this forum!).

Thanks again to everyone for your thoughtful and engaging comments!

Alex Sha 2025 年 3 月 17 日

MATLAB Online で開く

Not sure whether or not my understanding is correct, if want fitting result good in log() scale, is it reasonab to change y data to y1=log(y) firstly, then the fitting function become "y1=log(a*x^(-b)-c*exp(-d*x))", the result will be:

Sum Squared Error (SSE): 3.16256795387294
Root of Mean Square Error (RMSE): 0.196387122481336
Correlation Coef. (R): 0.991266174106857
R-Square: 0.982608627928446
Parameter	Best Estimate      
---------	-------------      
a        	10.6590120083535   
b        	0.69245173591868   
c        	-1.65676470503943  
d        	0.00313457062827038

if taking function as "y1=f*log(a*x^(-b)-c*exp(-d*x))"

Sum Squared Error (SSE): 2.95997975085014
Root of Mean Square Error (RMSE): 0.189992931538933
Correlation Coef. (R): 0.99182732299325
R-Square: 0.983721438635958
Parameter	Best Estimate      
---------	-------------      
f        	4.78460068319797   
a        	0.918658120426212  
b        	0.0761834706279075 
c        	-0.540969703020373 
d        	0.00132426521534454

Sim 2025 年 3 月 17 日

編集済み: Sim 2025 年 3 月 17 日

MATLAB Online で開く

Thanks a lot @Alex Sha! Now it is clear, thanks! :-)

Therefore, we can get an R-square equal to 0.982608627928446, by using

y1 = log(a*x^(-b) - c*exp(-d*x))

and we can get an R-square equal to 0.983721438635958, by using

y1 = f * log(a*x^(-b) - c*exp(-d*x))

By multiplying by a factor "f", the fit improves further, but even without it, the fit is already quite good. :-)

Thanks a lot!

サインインしてコメントする。

Ezyfit: Single vs. Two-Step Fitting Problem

7 件のコメント
5 件の古いコメントを表示5 件の古いコメントを非表示

採用された回答

8 件のコメント
6 件の古いコメントを表示6 件の古いコメントを非表示

その他の回答 (2 件)

3 件のコメント
1 件の古いコメントを表示1 件の古いコメントを非表示

4 件のコメント
2 件の古いコメントを表示2 件の古いコメントを非表示

参考

カテゴリ

タグ

Community Treasure Hunt

Ezyfit: Single vs. Two-Step Fitting Problem

7 件のコメント 5 件の古いコメントを表示5 件の古いコメントを非表示

採用された回答

8 件のコメント 6 件の古いコメントを表示6 件の古いコメントを非表示

その他の回答 (2 件)

3 件のコメント 1 件の古いコメントを表示1 件の古いコメントを非表示

4 件のコメント 2 件の古いコメントを表示2 件の古いコメントを非表示

参考

カテゴリ

タグ

Community Treasure Hunt

7 件のコメント
5 件の古いコメントを表示5 件の古いコメントを非表示

8 件のコメント
6 件の古いコメントを表示6 件の古いコメントを非表示

3 件のコメント
1 件の古いコメントを表示1 件の古いコメントを非表示

4 件のコメント
2 件の古いコメントを表示2 件の古いコメントを非表示