Using rmoutliers without a for loop

Kevin Jansen on 2 Mar 2023
Commented: Kevin Jansen on 22 Mar 2023
Hi!
I am having trouble removing outliers from an array of Gaussian-distributed vectors that have unequal numbers of outliers and different offsets.
For example:
% three Gaussian distributed vectors, with unequal outliers and different
% offsets. (b has 2 outliers, a and c only 1)
a = [60,1, 2,3,3,4,4,4,5,5,6,5,5,4,4,4,3,3,2,1];
b = [60,60,2,3,3,4,4,4,5,5,6,5,5,4,4,4,3,3,2,1] + 500;
c = [60,1, 2,3,3,4,4,4,5,5,6,5,5,4,4,4,3,3,2,1] + 1000;
% array of the vectors above
array = horzcat(a',b',c');
% remove outliers
arrayOutliersRemoved = rmoutliers(array)
arrayOutliersRemoved = 18×3
     2   502  1002
     3   503  1003
     3   503  1003
     4   504  1004
     4   504  1004
     4   504  1004
     5   505  1005
     5   505  1005
     6   506  1006
     5   505  1005
     ⋮
The code almost works; however, vectors a and c each have a value removed that was not an outlier. This is because b has two outliers, so rmoutliers also removes the second row from a and c in order to keep the result a rectangular array.
I currently have the code in the form of a for loop. Is there a way to do this without one?
I have tried using a cell array, but the rmoutliers function does not support it.
Thanks in advance!
  1 Comment
Voss on 2 Mar 2023
Please share your code which is in the form of a loop, and please share the intended result (e.g., a cell array of column vectors of different lengths or something else).


Accepted Answer

Simon Chan on 2 Mar 2023
Consider using filloutliers instead, with the numeric scalar NaN as the fillmethod argument, so that the outliers are replaced with NaN rather than removed.
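A minimal sketch of this suggestion, reusing the example data from the question. The variable names arrayFilled, columnMeans and outliersPerColumn are only illustrative, and the default outlier detection method is assumed:

```matlab
% Example data from the question.
a = [60,1, 2,3,3,4,4,4,5,5,6,5,5,4,4,4,3,3,2,1];
b = [60,60,2,3,3,4,4,4,5,5,6,5,5,4,4,4,3,3,2,1] + 500;
c = [60,1, 2,3,3,4,4,4,5,5,6,5,5,4,4,4,3,3,2,1] + 1000;
array = horzcat(a', b', c');

% Replace each detected outlier with NaN; the array keeps its 20-by-3 size.
arrayFilled = filloutliers(array, NaN);

% Count the NaN entries per column and compute statistics that skip them.
outliersPerColumn = sum(isnan(arrayFilled), 1);
columnMeans = mean(arrayFilled, 1, 'omitnan');
```

Because no rows are dropped, a value that is an outlier in one column does not affect the other columns.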
  1 Comment
Kevin Jansen on 22 Mar 2023
Thanks!


More Answers (1)

Les Beckham on 2 Mar 2023
Edited: Les Beckham on 2 Mar 2023
From the documentation of rmoutliers:
B = rmoutliers(A) detects and removes outliers from the data in A.
  • If A is a matrix, then rmoutliers detects outliers in each column of A separately and removes the entire row.
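The row-wise behavior quoted above can be checked with isoutlier on the data from the question. This is only a sketch; mask and rowsRemoved are illustrative names, and the default detection method is assumed:

```matlab
% Example data from the question.
a = [60,1, 2,3,3,4,4,4,5,5,6,5,5,4,4,4,3,3,2,1];
b = [60,60,2,3,3,4,4,4,5,5,6,5,5,4,4,4,3,3,2,1] + 500;
c = [60,1, 2,3,3,4,4,4,5,5,6,5,5,4,4,4,3,3,2,1] + 1000;
array = horzcat(a', b', c');

% Logical mask of outliers, detected separately in each column.
mask = isoutlier(array);

% A row is dropped by rmoutliers if any column flags it.
rowsRemoved = find(any(mask, 2));
```

Here row 2 is flagged only in column b, yet it is removed from all three columns.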
One way to do this is to remove the outliers from a, b, and c first (separately) and then combine them into a cell array.
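A possible sketch of that first approach, using cellfun so that no explicit for loop is written (cellfun still iterates internally). The names columns, cleaned and lengths are illustrative:

```matlab
% Example data from the question.
a = [60,1, 2,3,3,4,4,4,5,5,6,5,5,4,4,4,3,3,2,1];
b = [60,60,2,3,3,4,4,4,5,5,6,5,5,4,4,4,3,3,2,1] + 500;
c = [60,1, 2,3,3,4,4,4,5,5,6,5,5,4,4,4,3,3,2,1] + 1000;

% Keep each vector as its own column in a cell array.
% (For an existing matrix, num2cell(array, 1) gives the same layout.)
columns = {a', b', c'};

% Remove outliers from each column independently; the results may
% differ in length, so they are returned in a cell array.
cleaned = cellfun(@rmoutliers, columns, 'UniformOutput', false);

% Number of values remaining in each column.
lengths = cellfun(@numel, cleaned);
```

The result is a 1-by-3 cell array of column vectors with different lengths, so it cannot be converted back into a regular matrix without padding.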
Another possibility, if you want to retain the original size of the array and apply the operation after combining a, b, and c, is this, using filloutliers instead of rmoutliers:
% three Gaussian distributed vectors, with unequal outliers and different
% offsets. (b has 2 outliers, a and c only 1)
a = [60,1, 2,3,3,4,4,4,5,5,6,5,5,4,4,4,3,3,2,1];
b = [60,60,2,3,3,4,4,4,5,5,6,5,5,4,4,4,3,3,2,1] + 500;
c = [60,1, 2,3,3,4,4,4,5,5,6,5,5,4,4,4,3,3,2,1] + 1000;
% array of the vectors above
array = horzcat(a',b',c');
% replace outliers (instead of removing rows) so the array keeps its size
arrayOutliersRemoved = filloutliers(array, 'nearest') %<<< there are other options besides 'nearest'
arrayOutliersRemoved = 20×3
     1   502  1001
     1   502  1001
     2   502  1002
     3   503  1003
     3   503  1003
     4   504  1004
     4   504  1004
     4   504  1004
     5   505  1005
     5   505  1005
     ⋮
I would suggest reading the documentation for filloutliers to see whether one of its fill methods works for you.

Release

R2020a
