How to define a custom equation in fitlm function for linear regression?

27 ビュー (過去 30 日間)
Spirit
Spirit 2017 年 11 月 22 日
コメント済み: the cyclist 2023 年 6 月 8 日
I'd like to define a custom equation for linear regression. For example y = a*log(x1) + b*x2^2 + c*x3 + k. This is a linear regression problem - but how to do this within FitLm function?
Thanks, Shriram

回答 (3 件)

the cyclist
the cyclist 2017 年 11 月 22 日
編集済み: the cyclist 2017 年 11 月 22 日
% Set the random number seed for reproducibility
rng default
% Make up some pretend data
N = 100;
x1 = rand(N,1);
x2 = rand(N,1);
x3 = rand(N,1);
a = 2;
b = 3;
c = 5;
k = 7;
noise = 0.2*randn(N,1);
y = a*log(x1) + b*x2.^2 + c*x3 + k + noise;
% Put the variables into a table, naming them appropriately
tbl = table(log(x1),x2.^2,x3,y,'VariableNames',{'log_x1','x2_sqr','x3','y'});
% Specify and carry out the fit
mdl = fitlm(tbl,'y ~ 1 + log_x1 + x2_sqr + x3')
  5 件のコメント
Walter Roberson
Walter Roberson 2023 年 6 月 8 日
If you have a discontinuity in the first or second derivatie of the model, then you surely do not have a linear regression situation.
You probably need to use ga(). Not fmincon() or similar -- those optimizers cannot handle discontinuities in derivatives either.
the cyclist
the cyclist 2023 年 6 月 8 日
I suggest you search the keywords segmented regression matlab and/or piecewise regression matlab. Although I don't believe there are any built-in functions for this, you should find a few different threads that you might find useful. I also think you might want to start a brand-new question for this, after you have done that search. In that question, I would suggest posting your data, which makes it easier for people to try out code suggestions.

サインインしてコメントする。


laurent jalabert
laurent jalabert 2021 年 12 月 19 日
編集済み: laurent jalabert 2021 年 12 月 19 日
To proceed with a custom function it is possible to use the non linear regression model
The example below is intended to fit a basic Resistance versus Temperature at the second order such as R=R0*(1+alpha*(T-T0)+beta*(T-T0)^2), and the fit coefficient will be b(1)=R0, b(2) = alpha, and b(3)=beta.
The advantage here, is that the SE will be computed directly for R0, alpha and beta.
beta0 is an initial range of [R0 alpha beta]
b(n) is retrieved using mdl.Coefficients.Estimate(n), for n=1,2,3
standard deviation on the coefficients are retrieved by mdl.Coefficients.SE(n)
(Curve fitting toolbox and Statistical/Machine Learning toolbox are both requiered)
clear tbl mdl
% your vector data T_T0 and R of same dimension
tbl = table(T_T0,R);
modelfun = @(b,x)b(1).*(1+b(2).*x(:,1)+b(3).*x(:,1).^2);
beta0 = [100 1e-3 1e-6];
mdl = fitnlm(tbl,modelfun,beta0,'CoefficientNames',{'R0';'alpha';'beta'})
  1 件のコメント
Erin Evans
Erin Evans 2023 年 6 月 7 日
Would you be able to help me write this such that there is a conditional statement in it? I essentially need two connected trendlines such that the statistics are for both sections together.
my equation is:
log10(Qs) = log10(a) + b*log(Qr) + c*log10(Max(1, Qr / Qc)) + d*log10(Qr(i) / Qr(i - 1))
the code I have so far is:
AgaDisc = readtable("File Path");
%% Bring in data columns
AgaDischargeArray = table2array(AgaDisc(:,"Discharge"));
AgaLoadArray = table2array(AgaDisc(:,"SedimentLoad"));
%% Normalize the discharge and sediment data
meanDisc = mean(AgaDischargeArray);
medianLoad = median(AgaLoadArray);
AgaDischargeArray = AgaDischargeArray/meanDisc;
AgaLoadArray = AgaLoadArray/medianLoad;
%% log10 sediment
logQs = log10(AgaDischargeArray);
%% log10 discharge
logQt = log10(AgaLoadArray);
%% Create new table of log-normalized data
tbl = table(logQs, logQt);
%% Add weighting factor
weights = zeros(length(logQt), 1);
weightFactor = [0.5, 1, 5, 10];
Q = quantile(logQt, 3);
for i = 1:length(logQt)
if logQt(i) > Q(3)
weights(i) = weightFactor(4);
elseif logQt(i) > Q(2)
weights(i) = weightFactor(3);
elseif logQt(i) > Q(1)
weights(i) = weightFactor(2);
else
weights(i) = weightFactor(1);
end
end
m = fitlm(tbl, 'logQs ~ logQt', 'RobustOpts', 'on', 'weight', weights);

サインインしてコメントする。


laurent jalabert
laurent jalabert 2023 年 6 月 8 日
please check carefully your expression, cause you use log10 and log (I guess neperian log here)
log10(Qs) in equation and logQs = log10(AgaDischargeArray) in your program; is it same ?
d*log(Qr(i)/Qr(i-1) might be similar to d* diff(log(Qr))
log10(Qs) = log10(a) + b*log(Qr) + c*log10(Max(1, Qr / Qc)) + d*log10(Qr(i) / Qr(i - 1))
log10Qs is x(:,1) as first column of tbl.
logQt is x(:,2) as the second column of tbl.
a,b,c,d are unknown
Qc is not defined
d* diff(log(Qr)) will lead to problem because its length is length(Qr) -1

カテゴリ

Help Center および File ExchangeLinear and Nonlinear Regression についてさらに検索

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!

Translated by