applying a function to a datatable using rowfun

Emily on 27 Sep 2023
Answered: dpb on 28 Sep 2023
I'm attempting to apply a function to a table using rowfun. The table contains two variables of interest, "Date" and "d18O", which are grouped by site and depth (the site name is also included).
I want to fit a sinusoidal function, "iso_fcn", to the date (x) and d18O (y), with a separate fit for each depth at each site. My current version below does only one fit per site.
I also think it might be possible to clean this up so that I need only one function below (i.e., combine the functions "doit" and "iso_fcn"), but I'm a bit lost.
Any help would be much appreciated.
*Note: the sample data table contains only a tiny subset of the data, so the uncertainty on the sine fits will probably be extremely high. This issue should resolve when I use all the data.
tT=readtable("data1.csv", "VariableNamingRule","preserve");
tT.Properties.VariableNames(2)={'Site'}; % shorten to be more convenient to use
G=grpstats(tT,{'Site','Depth'},{'mean','median','std'},'DataVars',{'Date','d18O'});
hSc=rowfun(@doit,tT,'GroupingVariables',{'Site'},'InputVariables',{'Date','d18O','Site','Depth','name'},'OutputFormat','uniform');
Warning: The model is overparameterized, and model parameters are not identifiable. You will not be able to compute confidence or prediction intervals, and you should use caution in making predictions.
(The same warning was printed seven more times, eight in total.)
function out=doit(x,y,s,d,n)
soil_params_guess= zeros(3,1);
mdl_soil = fitnlm(x,y,@iso_fcn,soil_params_guess(:,1));
soil_params_fit = table2array(mdl_soil.Coefficients(:,1));
out = {soil_params_fit};
end
function F = iso_fcn(isofcn_params,date)
F = isofcn_params(1).*(cos(2*pi.*(1/365).*date)) + isofcn_params(2).*(sin(2*pi.*(1/365).*date)) + isofcn_params(3);
end

Accepted Answer

dpb on 28 Sep 2023
Seems as though you could have given the actual problem in the beginning when we did this last...
tT=readtable("data1.csv", "VariableNamingRule","preserve");
tT.Properties.VariableNames(2)={'Site'}; % shorten to be more convenient to use
hSc=rowfun(@doit,tT,'GroupingVariables',{'Site','Depth'}, ...
'InputVariables',{'Date','d18O'}, ...
'OutputFormat','cell');
hSc
hSc = 16×1 cell array
    {0×0 double        }
    {0×0 double        }
    {0×0 double        }
    {0×0 double        }
    {0×0 double        }
    {0×0 double        }
    {0×0 double        }
    {0×0 double        }
    {1×1 NonLinearModel}
    {1×1 NonLinearModel}
    {1×1 NonLinearModel}
    {0×0 double        }
    {0×0 double        }
    {0×0 double        }
    {1×1 NonLinearModel}
    {0×0 double        }
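function mdl=doit(x,y)
% Fit b(1)*cos(w*x) + b(2)*sin(w*x) + b(3), with w=2*pi/365, to one group.
if numel(x)<3, mdl=[]; return, end   % three coefficients need at least three points
w=2*pi/365;
% The handle uses its own x argument (not a captured copy of the data),
% so the returned model can also be evaluated at new dates.
F=@(b,x) b(1)*cos(w*x) + b(2)*sin(w*x) + b(3);
rnge=range(y)/2;
b0=[rnge rnge 0];                    % starting guesses: half the data range, zero offset
mdl=fitnlm(x,y,F,b0);
end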
As you note, there isn't enough data in any one group to really evaluate whether the model is adequate, but I'd suggest looking at the built-in 'fourier1' or 'sin2' models of fit (Curve Fitting Toolbox) as alternatives, or consider adding phase-shift coefficients to this one (which would then be the built-in sin2 with c2 = 90 degrees). I'm not confident that simply adding the constant at the end here is going to be sufficient with zero phase shift in either term.
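As a side note on the phase question: when the period is fixed at 365 days, the standard harmonic-addition identity rewrites the cosine-plus-sine form used in doit as a single phase-shifted sinusoid. The symbols R and \varphi below are introduced only for this illustration; b_1, b_2, b_3 are the coefficients b(1), b(2), b(3) from the code.

```latex
b_1\cos\theta + b_2\sin\theta + b_3 \;=\; R\,\sin(\theta + \varphi) + b_3,
\qquad \theta = \frac{2\pi x}{365},
\qquad R = \sqrt{b_1^2 + b_2^2},
\qquad \varphi = \operatorname{atan2}(b_1,\, b_2)
```

This follows from expanding R sin(θ + φ) = R cos φ sin θ + R sin φ cos θ and matching terms, so b_2 = R cos φ and b_1 = R sin φ. What this form does not do is let the period itself vary; the fourier1 and sin2 models also fit the frequency.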
I simply returned the fit model here as a cell array. There isn't enough data to estimate the coefficients with fewer than three points, so the function returns an empty cell for those groups; one presumes that will go away, except for maybe a case or two, with the whole dataset.
You could dereference the model in the function and return the pieces if you so choose; I'll leave those details as an "exercise for the student".
Your basic problem above was that you didn't have both variables in the 'GroupingVariables' input cell array, only the one, which is why it was grouping by site only. I removed the extraneous input variables; if you want, you can put some of those back and then pass back the one record of the grouping variables. Or, if you do dereference the model so that rowfun can output a table, you'll get the automagic set of grouping variables in the output table; they don't come along for free with the cell output. A wished-for enhancement to rowfun of mine would be to have the grouping variables available for each group in the function as an additional convenience.
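One possible way to do the dereferencing mentioned above, as a minimal sketch rather than the definitive solution: return the coefficient estimates as a 1-by-3 row and let rowfun build its default table output, so Site and Depth come along. The synthetic table, the site names, the depths, the helper names coefRow and fitRow, and the output name 'Coef' are all made up for illustration, and the sketch assumes every group has at least three rows. Using mean(y) as the starting offset is a choice made here, not something taken from the original code.

```matlab
% Synthetic stand-in for the real table; every number below is made up.
rng(0)                                        % reproducible noise
Date  = repmat((15:30:345)', 4, 1);           % 12 dates for each of 4 groups
Site  = repelem(["A";"A";"B";"B"], 12);       % hypothetical site names
Depth = repelem([10;20;10;20], 12);           % hypothetical depths
w     = 2*pi/365;                             % annual angular frequency, as in doit
d18O  = 2*cos(w*Date) - sin(w*Date) - 8 + 0.05*randn(48,1);
tT    = table(Site, Depth, Date, d18O);

% Same model as doit; the handle uses its own x argument.
F = @(b,x) b(1)*cos(w*x) + b(2)*sin(w*x) + b(3);

% Dereference the fitted model: return the estimates as a 1x3 row.
coefRow = @(mdl) mdl.Coefficients.Estimate.';
fitRow  = @(x,y) coefRow(fitnlm(x, y, F, [range(y)/2, range(y)/2, mean(y)]));

% Default table output: the grouping variables and GroupCount come along.
C = rowfun(fitRow, tT, 'GroupingVariables',{'Site','Depth'}, ...
           'InputVariables',{'Date','d18O'}, ...
           'OutputVariableNames','Coef');
disp(C)
```

With the real data, groups with fewer than three rows would still need a guard like the one in doit, for instance returning NaN(1,3) so that every group yields a row of the same size.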
