Problem getting isoutlier to detect anything

Vittorio on 9 Aug 2018
Commented: BAPPA MUKHERJEE on 11 Nov 2019
I have what looks like a very easy problem but I cannot seem to solve it. I have a dataset (attached) that has some obvious (to the human eye) outliers.
I cannot get isoutlier to detect them in any way. My attempt is essentially this:
idx = isoutlier(x(:,2),'movmedian',w);
I put the code in a for loop spanning all possible values of w. I get at most 3 outliers detected, when the window size is 3, and the points detected are not actually outliers.
Using movmean instead of movmedian detects no outliers for any value of w. I have also played with the threshold factor, but without luck. This seemed to me like a straightforward application of outlier detection. What am I missing?

Accepted Answer

Akira Agata on 10 Aug 2018
One possible way to detect this type of outlier would be like this:
load('out1.mat');
% Assume the data follows a 2nd-order polynomial trend
p = polyfit(x(:,1),x(:,2),2);
y = polyval(p,x(:,1));
% Detect outliers in the detrended data
idx = isoutlier(x(:,2)-y);
% Show the result
plot(x(:,1),x(:,2))
hold on
plot(x(idx,1),x(idx,2),'ro')
legend({'Original data','Detected outlier'},'FontSize',14)
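Since out1.mat is not attached here, the same idea can be tried on made-up data. The following is only a minimal sketch: the quadratic trend, the noise level, the run of ten shifted points, and the names t, y, bad, and resid are all assumptions for illustration, not the poster's data.

```matlab
% Synthetic stand-in for out1.mat (hypothetical data, not the poster's file)
rng(0);                                    % reproducible noise
t = (1:200)';                              % sample points
y = 0.002*(t-100).^2 + 0.05*randn(200,1);  % quadratic trend plus small noise
bad = 90:99;                               % a run of 10 consecutive outliers
y(bad) = y(bad) + 3;                       % shift them well off the trend

% Remove the fitted quadratic trend, then test the residuals
p = polyfit(t, y, 2);
resid = y - polyval(p, t);
idx = isoutlier(resid);                    % default: > 3 scaled MAD from the median

fprintf('Flagged %d points, %d of them inside the injected run\n', ...
        nnz(idx), nnz(idx(bad)));
```

Because the run is included in the fit, it pulls the fitted curve slightly toward itself, so a few ordinary points may also be flagged. Inspect the result rather than trusting the count.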
  3 Comments
Chris Turnes on 14 Aug 2018
You can also generalize this approach a little if your data doesn't fit a polynomial globally but does over large local regions: replace the polyfit portion with a call to smoothdata using the loess, lowess, or sgolay methods. For your data, you can get similar results with:
% Locally weighted quadratic fit over a window of length 0.3
y = smoothdata(x(:,2), 'loess', 0.3, 'SamplePoints', x(:,1));
% Find outliers in the difference between the smoothed and original.
tf = isoutlier(y-x(:,2));
% Visualize the detected outliers.
plot(x(:,1), x(:,2), x(tf,1), x(tf,2), 'o')
legend({'Original data','Detected outlier'},'FontSize',14)
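A self-contained variant of the smoothdata approach, again on made-up data, might look like the sketch below. The sinusoidal trend (chosen because a single global polynomial fits it poorly), the noise level, the injected run, and the names t, y, bad, and ys are assumptions for illustration only.

```matlab
% Hypothetical data: a trend that one global polynomial fits poorly
rng(1);                                    % reproducible noise
t = linspace(0, 1, 200)';                  % sorted, unique sample points
y = sin(2*pi*t) + 0.05*randn(200,1);       % smooth trend plus small noise
bad = 90:99;                               % a run of 10 consecutive outliers
y(bad) = y(bad) + 3;                       % shift them well off the trend

% Local quadratic regression over a window of length 0.3 (in units of t)
ys = smoothdata(y, 'loess', 0.3, 'SamplePoints', t);

% Outliers are points whose residual is far from the typical residual
tf = isoutlier(y - ys);

plot(t, y, t, ys, t(tf), y(tf), 'o')
legend({'Data','Loess fit','Flagged'}, 'Location', 'best')
```

The window needs to be clearly longer than the run of outliers; otherwise the local fit follows the run and the residuals stay small. Points right next to the run can also be flagged, because the run pulls the local fit away from them.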
BAPPA MUKHERJEE on 11 Nov 2019
I am currently working on this topic. Can you please help me load the smoothdata function into my directory? In the 2012 version it is reported as an undefined function.


More Answers (2)

Ryan Takatsuka on 9 Aug 2018
Outlier detection generally works best on isolated single-point outliers, not on several in a row. Your data has a large number of consecutive outliers, and they significantly affect the trend (the moving average) of the data.
To detect these outliers, you need a very large moving-average window to minimize the effect the outliers have on it. The threshold also needs to be adjusted. I used a window size of w=50 and a threshold factor of 0.5. This detects the outliers, but it also falsely flags points at the beginning and end of the dataset because the moving average has such a large window.
a = isoutlier(x(:,2), 'movmean', 50, 'ThresholdFactor', 0.5);
It also helps to plot the moving average that is used to calculate the outliers. This is shown in the image:
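The image is not reproduced here. One way such a plot could be produced is sketched below, using the additional outputs of isoutlier (lower threshold, upper threshold, and center value). The data are made up, and the names t, y, lowThr, highThr, and center are assumptions for illustration; with the original data, y would be x(:,2) and t would be x(:,1).

```matlab
% Hypothetical stand-in for the poster's data
rng(0);                                    % reproducible noise
t = (1:200)';
y = 0.002*(t-100).^2 + 0.05*randn(200,1);  % smooth trend plus small noise
y(90:99) = y(90:99) + 3;                   % a run of consecutive outliers

% Same settings as above; also request the thresholds and the moving mean
[tf, lowThr, highThr, center] = isoutlier(y, 'movmean', 50, ...
                                          'ThresholdFactor', 0.5);

% Plot the data, the moving mean, the threshold band, and the flagged points
plot(t, y, 'b-', t, center, 'k-', t, lowThr, 'k--', t, highThr, 'k--', ...
     t(tf), y(tf), 'ro')
legend({'Data','Moving mean','Lower threshold','Upper threshold','Flagged'}, ...
       'Location', 'best')
```

Plotting the band makes it easy to see where the moving mean is dragged toward the run of outliers and where points near the ends fall outside the thresholds.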
  1 Comment
BAPPA MUKHERJEE on 11 Nov 2019
I am unable to reproduce the last figure. Could you please extend this code up to the plotting stage?



BAPPA MUKHERJEE on 11 Nov 2019
I am currently working on this topic. Can anyone help me load the smoothdata function into my directory? In the 2012 version it is reported as an undefined function.
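Both smoothdata and isoutlier were introduced in R2017a, so they are not available in a 2012 release. A rough substitute for the polyfit approach from the accepted answer, using only functions that exist in older releases, could look like the sketch below. It reimplements the default isoutlier rule (more than three scaled median absolute deviations from the median) by hand. The data and the names t, y, r, smad, and idx are assumptions for illustration.

```matlab
% Hypothetical stand-in data; replace t and y with your own columns
rng(0);                                    % reproducible noise
t = (1:200)';
y = 0.002*(t-100).^2 + 0.05*randn(200,1);  % quadratic trend plus small noise
y(90:99) = y(90:99) + 3;                   % a run of consecutive outliers

% Detrend with a 2nd-order polynomial fit
p = polyfit(t, y, 2);
r = y - polyval(p, t);

% Scaled median absolute deviation (MAD) of the residuals
c = -1/(sqrt(2)*erfcinv(3/2));             % approx. 1.4826
smad = c*median(abs(r - median(r)));

% Flag residuals more than 3 scaled MAD away from the median
idx = abs(r - median(r)) > 3*smad;

plot(t, y, 'b-', t(idx), y(idx), 'ro')
legend('Original data', 'Detected outlier')
```

This only covers the global polynomial case. A local fit comparable to smoothdata's loess method would need a separate implementation.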


Release

R2017a
