Broad confidence bound range when fitting in Matlab

Shaily_T on 27 Nov 2022
Commented: Walter Roberson on 29 Nov 2022
Hey,
I fit a function to data in Matlab, and for the fitted parameters Matlab reports quite wide confidence bounds. I have attached a picture. The confidence bounds shown for some of my parameters are far wider than the lower and upper bounds I set for those parameters. I know the function I am fitting is very sensitive to two of the fitting parameters; even very small changes in these two make huge changes in the result. Why am I getting such wide confidence bounds from Matlab, and can I trust the fitting result in this situation?
Thanks in advance!

Answers (3)

John D'Errico on 27 Nov 2022
Edited: John D'Errico on 27 Nov 2022
Wide confidence limits are typically a symptom, a reflection of uncertainty in some form. Unfortunately, we don't have your data, so it is difficult to be positive where that uncertainty lies. I suppose I could make up an example of each problem I mention below, but that would take a lot of time to build.
It might be that you have insufficient data to fit the curve well, that is, too many parameters in your model for the information content in the data. This is not uncommon.
It might be that some of your parameters can trade off with each other to some extent, so that a change in one parameter can be offset by a similar change in another. Even if there is a global optimum, it might be difficult to resolve. Again, you can call this a variation of the first issue: your model is too complex for the available data to fit well.
It might be that your model is just not a good fit to your data. In that case, the wide confidence intervals are merely a reflection of the intrinsically wrong model.
Another issue is how the confidence intervals are derived. They are only approximations that ignore correlations between your parameters. Remember those tradeoffs I mentioned above? The confidence intervals you see assume tradeoffs don't exist.
Given some time, I could probably come up with some other scenarios too, but in the end, remember the word uncertainty. Wide confidence intervals suggest uncertainty, but that uncertainty might arise from different sources, in different ways. Can you trust the result? Hard question there, since we don't see anything beyond the confidence intervals. Trust is sort of meaningless in this context, not really a good word. They are numbers - trust that. Only as good as your data and the validity of your model. With more data and less noisy data, the confidence intervals will potentially be tighter. And remember that in a real world context, in the presence of noise and other confounding factors, no model is a perfect description of data.
Honestly, mine is not a very useful answer in my eyes. But we don't have your data. We don't have your model. We don't know why it is that you think that is a good model for your data.
  3 Comments
Bjorn Gustavsson on 29 Nov 2022
In addition to John's points about the uncertainty, it is worth mentioning that you should try to get out at least the parameter covariance matrix from the fit. That would allow you to get the ellipsoid for the parameter uncertainty. This should help.
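As a rough sketch of what that could look like with the Curve Fitting Toolbox: `fit` can return the Jacobian in its third output, and the usual linearized estimate of the parameter covariance is MSE * inv(J'*J). The data and the model below are made up for illustration, and the formula is the standard linear approximation, which is not necessarily the exact computation the toolbox performs internally.

```matlab
% Synthetic data and model: assumptions for illustration only.
rng(0);                                   % reproducible noise
x = linspace(0, 1, 40)';
y = 2*exp(-3*x) + 0.5 + 0.05*randn(size(x));

ft = fittype('a*exp(-b*x) + c', 'independent', 'x');
[f, gof, out] = fit(x, y, ft, 'StartPoint', [1.5 2 0.3]);

% Linearized parameter covariance: Cov ~ MSE * inv(J'*J),
% where MSE = SSE / (degrees of freedom for error).
J    = full(out.Jacobian);                % columns follow coeffnames(f)
mse  = gof.sse / gof.dfe;
covB = mse * ((J' * J) \ eye(size(J, 2)));

% Standard errors and the parameter correlation matrix.
se    = sqrt(diag(covB));
corrB = covB ./ (se * se');

disp(coeffnames(f)');
disp(corrB);

% Axes of the uncertainty ellipsoid: eigenvectors give the directions,
% square roots of the eigenvalues give the relative axis lengths.
[V, D] = eig((covB + covB') / 2);         % symmetrize against round-off
```

Off-diagonal entries of `corrB` close to +1 or -1 indicate parameters that trade off against each other. The ellipsoid itself is the set of parameter vectors p satisfying (p - phat)' * inv(covB) * (p - phat) <= const, where phat is the fitted estimate.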
Shaily_T on 29 Nov 2022
Thanks for your comment @Bjorn Gustavsson! I am not familiar with the approach you mentioned. I searched a bit but am still struggling. Could you please point me to a source for it or clarify it further? I appreciate it.



Star Strider on 27 Nov 2022
The important thing to note here is that the confidence intervals for ‘n’, ‘L’ and ‘A’ include zero (their bounds have opposite signs), so those parameters are not actually needed in the model and contribute nothing significant to the fit to the data. The idea of ‘trust’ is obviously subjective; however, it may not be appropriate to assume that the model actually describes the process that created the data and that the data measurements are accurate.
I would examine the data to be certain that the process that created them and measured them (specifically that the measuring equipment was appropriately calibrated) conform to the assumptions of the model being used to estimate their parameters. If that is not actually the situation, then the model may not be appropriate to the data, and a different model (specifically one that describes the process that produced the data) may be required.
  2 Comments
John D'Errico on 27 Nov 2022
Edited: John D'Errico on 27 Nov 2022
If a model is nonlinear, the fact that the confidence band includes zero does not necessarily indicate the parameter is not needed. Even if the model is linear in the parameters, it may simply indicate insufficient data (or too much noise in the data) for the complexity of the model. For example:
x = randn(5,1)/10;
y = x.^3 + randn(size(x));
mdl = fittype('a + b*x.^3','indep','x');
fittedmdl = fit(x,y,mdl)
Warning: Start point not provided, choosing random start point.
fittedmdl =
     General model:
     fittedmdl(x) = a + b*x.^3
     Coefficients (with 95% confidence bounds):
       a =      0.1464  (-1.43, 1.723)
       b =      -540.3  (-2073, 992.9)
plot(fittedmdl,x,y)
So we have a model with relatively huge noise. In fact, the model is almost correct, though with the inclusion of an unnecessary constant term. But all necessary terms are included in the model too.
Note that the confidence interval on the constant term certainly includes zero, so your assertion would be correct there. It was unnecessary. But the cubic term would then also be deemed just as unnecessary. In fact, the estimated sign of the cubic term was completely wrong.
Here the width of the confidence intervals is a signal that the data is wildly inadequate to fit that model, given the noise in the data.
As well, that a parameter confidence interval includes zero may simply be evidence of nonlinearity, or possibly lack of fit.
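To illustrate the point about insufficient data, here is a small sketch that repeats the fit above with 5 and then 500 points and compares the width of the 95% interval on b. The sample sizes, the seed, and the explicit start point are my own choices for the illustration, and the exact numbers will depend on the random draw.

```matlab
rng(1);                                    % reproducible draw (arbitrary seed)
mdl   = fittype('a + b*x.^3', 'independent', 'x');
nList = [5 500];                           % sample sizes to compare
width = zeros(numel(nList), 1);

for k = 1:numel(nList)
    x = randn(nList(k), 1)/10;
    y = x.^3 + randn(size(x));             % same noisy process as above
    f = fit(x, y, mdl, 'StartPoint', [0 1]);
    ci = confint(f);                       % rows: [lower; upper], columns: [a b]
    width(k) = ci(2, 2) - ci(1, 2);        % width of the interval on b
    fprintf('n = %4d: b = %9.3g, interval width = %9.3g\n', ...
            nList(k), f.b, width(k));
end
```

The interval on b should shrink substantially with the larger sample, although with noise this large it may still be wide compared with the true value b = 1.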
Shaily_T on 27 Nov 2022
@Star Strider Thanks for your response! I just edited my question and included the data and the custom function for the fit.



Walter Roberson on 27 Nov 2022
Edited: Walter Roberson on 27 Nov 2022
When you see a coefficient shown with bounds that are close to equal in magnitude but opposite in sign, it typically means that the fitting process could not decide whether the coefficient should be positive or negative. Consider, for example, fitting the model A^2*x + B: negative and positive A give the same result, so negative versus positive cannot be resolved.
If an even number of coefficients follow the same pattern, it can mean that the model cannot distinguish between (negative for one coefficient, positive for a second) and (positive for the first coefficient, negative for the second). For example, with A*exp(-B*x) + C*exp(-D*x), if you swap A and C and also B and D simultaneously, you have the same equation; 2*exp(-3*x) + (-5)*exp(-7*x) is the same as (-5)*exp(-7*x) + 2*exp(-3*x), so A = 2 versus A = -5 cannot be resolved.
In such cases it can help to constrain one of the variables to the range 0 to Inf. If you have not analyzed the function to see which signs are important, then add one constraint at a time to see how the other variables react.
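The two symmetries can be checked numerically, and the constraint can be expressed with the 'Lower' option of `fit`. This is only a sketch: the data are synthetic, and the start point and bounds are arbitrary choices for the illustration.

```matlab
x = linspace(0, 1, 11)';

% Sign symmetry: A enters only as A^2, so A and -A give identical curves.
m1 = @(A, B) A^2*x + B;
d1 = max(abs(m1(3, 1) - m1(-3, 1)));

% Label swapping: exchanging (A,B) with (C,D) gives the same curve.
m2 = @(A, B, C, D) A*exp(-B*x) + C*exp(-D*x);
d2 = max(abs(m2(2, 3, -5, 7) - m2(-5, 7, 2, 3)));

fprintf('max difference: %g (sign flip), %g (swap)\n', d1, d2);

% Constrain A to [0, Inf) so the sign can no longer flip.
% Synthetic data generated with A = 3, B = 1 (an assumption).
rng(2);
y  = 9*x + 1 + 0.1*randn(size(x));
ft = fittype('A^2*x + B', 'independent', 'x');
f  = fit(x, y, ft, 'StartPoint', [1 0], 'Lower', [0 -Inf]);
disp(f);
```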
  2 Comments
Shaily_T on 28 Nov 2022
Edited: Shaily_T on 28 Nov 2022
Thanks for your response! I have already set up the upper and lower bounds on the fitting parameters but the shown confidence bounds for some of the obtained fitting parameters are not consistent with the upper and lower bounds I've set up. The upper and lower bound I've set up are:
[r1, r2, n, L, A, s]:
Startpoint = [sqrt(0.4), sqrt(0.99), 1.8217, 4E-3, 230, 2];
Lower = [sqrt(0.38) sqrt(0.97) 1.79 3.5E-3 120 1];
Upper = [sqrt(0.49) sqrt(1) 1.83 4.5E-3 280 3];
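For reference, one way bounds like these can be passed to `fit` is through a `fitoptions` object. The sketch below uses a placeholder model expression, since the actual custom function is not reproduced in this thread; only the coefficient names and the numeric values come from the post above. It also prints `coeffnames`, because the bound vectors are matched to the coefficients by position, and that order need not be the order you had in mind.

```matlab
% Placeholder model: hypothetical, NOT the custom function from the question.
% The 'coefficients' argument is used here to request the order
% [r1, r2, n, L, A, s]; check coeffnames to confirm it took effect.
ft = fittype('A*(r1 + r2*exp(-s*x/L))/n', ...
             'independent', 'x', ...
             'coefficients', {'r1', 'r2', 'n', 'L', 'A', 's'});
disp(coeffnames(ft)');                     % order the bound vectors must follow

Startpoint = [sqrt(0.4),  sqrt(0.99), 1.8217, 4E-3,   230, 2];
Lower      = [sqrt(0.38), sqrt(0.97), 1.79,   3.5E-3, 120, 1];
Upper      = [sqrt(0.49), sqrt(1),    1.83,   4.5E-3, 280, 3];

opts = fitoptions('Method', 'NonlinearLeastSquares', ...
                  'StartPoint', Startpoint, ...
                  'Lower', Lower, ...
                  'Upper', Upper);

% With data vectors x and y available, the fit would then be:
% f = fit(x, y, ft, opts);
```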
Walter Roberson on 29 Nov 2022
How did you set the bounds on the fitting process?
It is possible to have bounds that are strictly positive but for the calculated range to include some negative values. The calculation involves a mean and a standard deviation, and when the distribution does not happen to match a Normal distribution, it is possible that 3 standard deviations below the mean is negative even though none of the underlying values are negative. However, you can generally tell the two situations apart: when the fitting cannot tell the difference between positive and negative, the range is pretty much equal in the positive and negative directions, whereas when standard deviations merely predict negative values, the reported range is distinctly biased to one side.
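A small arithmetic sketch of that effect: a linearized confidence interval is the estimate plus or minus a multiple of its standard error, and nothing in that formula refers to the Lower and Upper settings. The standard error below is a made-up number, and 1.96 is the Normal approximation to the 95% critical value (a t-based value would be somewhat larger for small samples). The bounds are the ones quoted for L above.

```matlab
est = 4.0e-3;           % estimate of L (hypothetical, taken as the start point)
se  = 1.0e-3;           % hypothetical standard error, for illustration only
lb  = 3.5e-3;           % lower bound on L from the comment above
ub  = 4.5e-3;           % upper bound on L from the comment above

z  = 1.96;              % Normal approximation to the 95% critical value
ci = est + z*se*[-1 1]; % symmetric interval around the estimate

outside = ci(1) < lb || ci(2) > ub;
fprintf('interval = (%.3g, %.3g), bounds = (%.3g, %.3g), outside = %d\n', ...
        ci(1), ci(2), lb, ub, outside);
```

Here the interval is centered on the estimate and extends past both bounds, which is the pattern described in the question.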

