Find nonlinear function to optimize parameters

Matthew Blomquist on 2 Nov 2021
Commented: Matthew Blomquist on 2 Nov 2021
Hello,
I'm trying to optimize over a large dataset that contains 36 predictors and 1 response variable. To optimize, I am using fminsearchbnd, which I found on the MathWorks File Exchange. However, I don't know the best formula/function to use for the optimization (e.g., coefficients, highest order, etc.). I tried using fitlm with linear, squared, and interaction terms between all 36 predictors, but the fit isn't great, and the predicted response goes below 0 (which it shouldn't, because it is an RMSE). It should be a nonlinear function, but I don't know of what form.
Is there a function / toolbox I could use to find the formula/function to optimize the predictor variables so that the response variable (RMSE) is minimized?
Thank you in advance!

Accepted Answer

Walter Roberson on 2 Nov 2021
"Is there a function / toolbox I could use to find the formula/function to optimize the predictor variables so that the response variable (RMSE) is minimized?"
No.
It can be proven mathematically (and I have personally posted proofs in the past) that any finite set of points of finite precision can be fitted exactly (to within round-off error) by an uncountable infinity of different formulas. If a program were to pick one of those formulas, the probability that it picked the "right" formula would be 0.
If you do not have a restricted set of possible forms, then there is no possible program that can find the "right" form of the equation.
Even if you have a restricted set of possible forms, due to round-off error and noise in measurements, it is notoriously true that a form known in advance to be the "wrong" equation can end up with a lower RMSE than the "right" equation.
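A minimal sketch of the non-uniqueness argument, using made-up data points that are not from the question. A polynomial through the points, plus any multiple of a polynomial that vanishes at every data point, reproduces the data equally well while disagreeing everywhere else. The names x, y, p, w, and f are chosen only for this illustration.

```matlab
% Hypothetical data: 5 points, chosen only for illustration.
x = [0 1 2 3 4];
y = [1.0 2.7 1.3 4.1 3.8];

% Degree-4 polynomial that passes through all 5 points.
p = polyfit(x, y, 4);

% w(t) = (t-0)(t-1)(t-2)(t-3)(t-4) is zero at every data point,
% so p(t) + c*w(t) passes through the same points for ANY value of c.
w = poly(x);
f = @(t, c) polyval(p, t) + c*polyval(w, t);

% Each c gives a different formula with the same (near-zero) residual
% on the data but a different value between the data points.
for c = [0 1 -7.5]
    fprintf('c = %5.1f  max residual = %.2e  f(2.5) = %.4f\n', ...
        c, max(abs(f(x, c) - y)), f(2.5, c));
end
```

Since c can be any real number, this one construction already yields a continuum of exact fits, and the data alone cannot distinguish between them.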
  1 Comment
Matthew Blomquist on 2 Nov 2021
Yes, that makes sense. Thank you for the explanation, I appreciate it.


More Answers (1)

John D'Errico on 2 Nov 2021
Edited: John D'Errico on 2 Nov 2021
NO. Do NOT use fminsearchbnd to try to optimize a problem with 36 parameters. You will be wasting your time and mine, when you next send me a plaintive e-mail asking why it does not work.
fminsearchbnd is an overlay on fminsearch: fminsearch does the work, and the overlay applies the bound constraints. fminsearch is able to optimize problems with perhaps 6-8 parameters. Maybe 10 in a pinch. But 36 unknowns? Give me a break. It won't work. PERIOD.
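To make the "overlay" idea concrete, here is a minimal sketch of one common way to impose box bounds on top of fminsearch: optimize an unconstrained variable z and map it into the box with a sine transformation. This is an illustration under assumptions, not the File Exchange code itself; the objective obj, the bounds LB and UB, and the helper toBox are all hypothetical.

```matlab
% Hypothetical 2-parameter objective; its unconstrained minimum is at (3, -2).
obj = @(x) (x(1) - 3)^2 + (x(2) + 2)^2;

LB = [0 0];     % assumed lower bounds, for illustration only
UB = [2 5];     % assumed upper bounds, for illustration only

% Map any real z onto the box [LB, UB]; sin(z) is always in [-1, 1].
toBox = @(z) LB + (UB - LB) .* (sin(z) + 1) / 2;

% fminsearch works on z without constraints; the objective sees toBox(z).
z0   = [0 0];                                 % maps to the middle of the box
zOpt = fminsearch(@(z) obj(toBox(z)), z0);

% Map the solution back; it lies inside the box by construction.
xOpt = toBox(zOpt);
disp(xOpt);
```

The transformation adds no information about the problem. The underlying Nelder-Mead search still has to explore the full parameter space, which is why the dimension limit mentioned above applies to the bounded version too.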
What we are not told is how many data points you have. Far too often people think they don't need many data points. With too few data points, expect garbage for results no matter what. You say the dataset is large, but is it? Do you have sufficient information to reasonably estimate that many parameters?
Next, we are given no clue if the model is even reasonable for your data. Too often, people try to cram their own favorite model into their data. You can't fit a square peg into a round hole. Well, you can, but either the peg or the hole will suffer.
And, oh, it looks like you have no idea what model to use here, so you are trying to use a multinomial model (a polynomial in multiple dimensions). Expect random garbage for results with that model.
Finally, you need good starting values for a nonlinear model. A 36 dimensional search space is IMMENSE. Provide poor starting values, and expect crapola for a result. But if your model is LINEAR, as it would be if you used fitlm, then there is no reason to even bother with an iterative method like fminsearchbnd. fitlm will give you the optimal answer. It may not be a model that you like, but that is the fault of your data and your choice of model.
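A short sketch of why no iterative search is needed for a model that is linear in its coefficients, even when it contains squared and interaction terms. The data here are synthetic and the two predictors x1 and x2 are hypothetical. Base MATLAB's backslash operator is used in place of fitlm so the example runs without the Statistics and Machine Learning Toolbox.

```matlab
rng(0);                               % reproducible synthetic data
n  = 200;
x1 = rand(n, 1);                      % hypothetical predictor 1
x2 = rand(n, 1);                      % hypothetical predictor 2
y  = 1 + 2*x1 - 3*x2 + 0.5*x1.*x2 + 0.01*randn(n, 1);

% Design matrix: intercept, linear terms, interaction, squared terms.
% The model is nonlinear in x1 and x2 but LINEAR in the coefficients b.
X = [ones(n, 1), x1, x2, x1.*x2, x1.^2, x2.^2];

% Least-squares coefficients in a single solve; no starting values needed.
b = X \ y;

rmse = sqrt(mean((y - X*b).^2));
disp(b.');
fprintf('RMSE of fit: %.4f\n', rmse);
```

The solve returns the least-squares optimum for this model form directly, so running fminsearch or fminsearchbnd on the same coefficients could at best reproduce it.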
  1 Comment
Matthew Blomquist on 2 Nov 2021
I have a table of 10000 rows by 37 columns. I ran simulations that randomly varied the first 36 parameters, then compared each simulation's output to experimental data; that comparison (the RMSE of the simulation data relative to the experimental data) is the 37th column. I don't know the functional form for these parameters, and I'm not sure how they all interact, so I was checking whether there was some way to fit some sort of response surface to all the data I have, then check whether some set of parameter values would further decrease the RMSE.
I don't have a favorite model. I believe it is nonlinear, but I'm just trying to use different functions to see what best fits the data.
For the initial starting value, I used the set of parameters that gave the lowest RMSE in the table.
Does that clear up any of your questions? If it is still impossible / too complex, then just let me know.
Thanks
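For scale, the model described in the question (linear, squared, and pairwise interaction terms in 36 predictors) can be sized against the 10000 rows mentioned in this comment. The count below assumes one intercept and only two-way interactions; the variable names are chosen for illustration.

```matlab
p     = 36;        % number of predictors (from the question)
nRows = 10000;     % rows in the table (from the comment above)

nLinear      = p;                  % x_i
nSquared     = p;                  % x_i^2
nInteraction = nchoosek(p, 2);     % x_i*x_j with i < j, which is 630

% Total coefficients, including the intercept.
nCoef = 1 + nLinear + nSquared + nInteraction;

fprintf('%d coefficients, about %.1f rows per coefficient\n', ...
    nCoef, nRows/nCoef);
% prints: 703 coefficients, about 14.2 rows per coefficient
```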

Release

R2021a
