How can I remove outliers in my data using Cook's Distance?

Fatemah Ebrahim on 29 Jun 2020
Edited: Fatemah Ebrahim on 29 Jun 2020
I have a large dataset: six '.xlsx' files with roughly 400,000 rows each. I want to use Cook's distance to identify the outliers in the fourth column of each file and then delete the corresponding rows. How would I do that?
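For reference, the workflow described in the question might look roughly like the sketch below. It uses a small synthetic matrix in place of one spreadsheet, so `data`, `x`, `flagged`, and `cleaned` are hypothetical names, and the cutoff of three times the mean Cook's distance is only one common rule of thumb, not a definitive threshold.

```matlab
% Synthetic stand-in for one spreadsheet (hypothetical data, not the real files).
rng(0);                                  % reproducible random numbers
n = 200;
x = (1:n)';                              % numeric predictor
data = [rand(n,3), 2*x + randn(n,1)];    % column 4 plays the role of the response
data(50,4) = data(50,4) + 500;           % plant one obvious outlier in row 50

mdl = fitlm(x, data(:,4));               % simple linear regression on column 4
d = mdl.Diagnostics.CooksDistance;       % one Cook's distance per observation
flagged = d > 3*mean(d, 'omitnan');      % rule of thumb: more than 3x the mean distance
cleaned = data(~flagged, :);             % keep only the rows that were not flagged
```

For the real data, the same steps would presumably be repeated for each of the six files after reading it in, with the choice of predictor left to whoever knows what the columns mean.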
  2 Comments
Fatemah Ebrahim on 29 Jun 2020
Edited: Fatemah Ebrahim on 29 Jun 2020
Hi! I'm applying the code they used to one of the '.xlsx' files, like so:
X = A_t; % where this is a datetime value
Y = Adata(:,4); % where we are pulling the fourth column of the table
mdl = fitlm(X,Y);
plotDiagnostics(mdl,'cookd')
find((mdl.Diagnostics.CooksDistance)>3*mean(mdl.Diagnostics.CooksDistance))
And I am getting this error:
Error using classreg.regr.TermsRegression/handleDataArgs (line 550)
Predictor variables must be numeric vectors, numeric matrices, or
categorical vectors.
Error in LinearModel.fit (line 1184)
[X,y,haveDataset,otherArgs] =
LinearModel.handleDataArgs(X,varargin{:});
Error in fitlm (line 121)
model = LinearModel.fit(X,varargin{:});
Please let me know if you have any idea how to address this error. There does not seem to be much information on it. Thanks!
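The error message says that `fitlm` accepts only numeric or categorical predictors, and a datetime vector is neither. One possible workaround, sketched below under the assumption that elapsed time is a sensible predictor, is to convert the timestamps to a numeric count of hours since the first sample. The timestamps and response here are synthetic stand-ins for `A_t` and the fourth column, and `rowsToDrop` is a hypothetical name. Separately, if `Adata` is a table, `Adata(:,4)` returns a one-column table, whereas brace indexing (`Adata{:,4}`) returns the values themselves.

```matlab
% Hypothetical timestamps and readings standing in for A_t and the fourth column.
rng(1);                                      % reproducible random numbers
A_t = datetime(2020,6,1) + hours(0:99)';     % 100-by-1 datetime column vector
y = 0.5*(0:99)' + randn(100,1);              % synthetic response
y(20) = y(20) + 200;                         % plant one obvious outlier in row 20

X = hours(A_t - A_t(1));                     % duration converted to numeric hours elapsed
mdl = fitlm(X, y);                           % predictor is now numeric, so fitlm accepts it
d = mdl.Diagnostics.CooksDistance;           % one Cook's distance per observation
rowsToDrop = find(d > 3*mean(d));            % same 3x-mean rule as in the code above
```

The indices in `rowsToDrop` could then be used to delete the matching rows from the original data, for example with `Adata(rowsToDrop,:) = [];`.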

Answers (0)
