How to replace outliers with NaN

012786534 on 22 Aug 2019
Answered: Steven Lord on 22 Aug 2019
Hello,
I am trying to replace values above the 99th percentile (outliers) with NaN within each group (both group A and group B) of a table t.
group = repelem(['A' 'B'], 1000)';
val = repelem(1:1000, 2)';
t = table(group, val);
unique_gr = unique(t.group);
for g = 1:length(unique_gr)
    sub = t(strcmp(t.group, unique_gr(g, 1)), :);
    f = filloutliers(sub.val, 'NaN', 'percentiles', [0 99])
end
Any ideas? Please note that I do not have any toolboxes.
2 Comments
Walter Roberson on 22 Aug 2019
Use unique with three outputs and iterate through the group numbers:
[unique_gr, ~, groupnum] = unique(t.group);
for g = 1 : size(unique_gr,1)
    mask = groupnum == g;
    t(mask,:) = filloutliers(t(mask,:), nan, 'percentiles', [0 99]);
end
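For readers unfamiliar with the three-output form of unique, here is a minimal sketch of what groupnum holds and how the mask is built from it. The 5-row group column is a made-up example, not the data from the question. The last lines also show why a strcmp test against a multi-row char array does not produce a per-row mask.

```matlab
% Hypothetical 5-row grouping column (not the data from the question).
group = ['B'; 'A'; 'B'; 'A'; 'A'];

% Third output: for each row of group, the index of its match in unique_gr.
[unique_gr, ~, groupnum] = unique(group);   % unique_gr is ['A'; 'B']

% Logical mask selecting every row that belongs to the first group ('A').
mask = groupnum == 1;
disp(group(mask)');

% strcmp compares a multi-row char array as a whole, so it returns a
% single logical value rather than one value per row.
wholeArrayTest = strcmp(group, 'A');
disp(isscalar(wholeArrayTest));
```

Because groupnum has one entry per row, groupnum == g always has the same height as the table, which is what allows it to be used directly as a row index such as t(mask,:).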
012786534 on 22 Aug 2019
Thank you Walter, works like a charm.


Answers (1)

Steven Lord on 22 Aug 2019
You can use grouptransform with an anonymous function that calls filloutliers. Let's use your sample data.
group = repelem(['A' 'B'], 1000)';
val = repelem(1:1000, 2)';
t = table(group, val);
This grouptransform call uses the variable group from the table t as the grouping variable. The anonymous function makes the same filloutliers call that you and Walter each used in your for loops, though, like Walter, I chose to fill with the double NaN rather than the text 'NaN'.
t2 = grouptransform(t, 'group', ...
    @(x) filloutliers(x, NaN, 'percentiles', [0 99]));
Let's see what values of val in t were replaced by NaN in t2.
t(isnan(t2.val), :)
Given the way you built t, those do look like the top 1% of values for each group.
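As a further check, one could count how many values were replaced in each group. The sketch below rebuilds the sample data and uses findgroups and splitapply for the count. The names G, names, and nanCount are introduced here for illustration, and converting the char column to categorical first is an assumption made to keep the grouping step unambiguous.

```matlab
% Rebuild the sample data and the grouped result from the answer above.
group = repelem(['A' 'B'], 1000)';
val = repelem(1:1000, 2)';
t = table(group, val);
t2 = grouptransform(t, 'group', ...
    @(x) filloutliers(x, NaN, 'percentiles', [0 99]));

% Assumption: convert the char column to categorical before grouping.
[G, names] = findgroups(categorical(cellstr(t2.group)));

% Number of NaN entries in val for each group.
nanCount = splitapply(@(x) sum(isnan(x)), t2.val, G);
disp(table(names, nanCount));
```

If the per-group counts are each roughly 1% of the group size, the replacement was applied within each group rather than across the whole table.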
