How to replace values above a given percentile by nans

Question

Blue on 14 Aug 2019

https://jp.mathworks.com/matlabcentral/answers/476130-how-to-replace-values-above-a-given-percentile-by-nans

Commented: Blue on 16 Aug 2019

Hi,

I am trying to replace values above the 99th percentile in a table with NaNs, using Star Strider's excellent little function (https://www.mathworks.com/matlabcentral/answers/127944-how-to-pick-the-j-th-percentile-of-a-vector). An example of my code looks like this:

% Table with nan
t = array2table(vertcat(horzcat([1:1000]', [1001:2000]'), NaN(10, 2)));
% Replace values above 99th percentile by nan
pct = @(v,p) interp1(linspace(0.5/length(v), 1-0.5/length(v), length(v))', sort(v), p*0.01, 'spline');
t.Var1(t.Var1 > pct(t.Var1, 99)) = nan;
t.Var2(t.Var2 > pct(t.Var2, 99)) = nan;

The problem, of course, is that NaN values are scattered across multiple variables in the table, so I receive the following warning: Warning: Columns of data containing NaN values have been ignored during interpolation.

Does anyone have an idea how I could replace values above the 99th percentile with NaNs in a table that already contains multiple NaNs? Please note that I do not have the Statistics Toolbox.

Answer 1

Adam Danz on 15 Aug 2019

https://jp.mathworks.com/matlabcentral/answers/476130-how-to-replace-values-above-a-given-percentile-by-nans#answer_387661

Edited: Adam Danz on 15 Aug 2019

Here is an alternative function that ignores NaN values. It's not a nice one-liner like Star Strider's, but it does produce the same output as the prctile function in the Statistics Toolbox.

function pctl = percentile(v,p)
% v is a vector of data; example v = [8 9 nan 13 15 11 nan 3 5 7];
% p is a percentile; example p = 75;
% pctl is the exact p_th percentile of v.
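vsort = reshape(sort(v(~isnan(v))),1,[]); % drop NaNs, sort, force a row vector
pd = p/100 * numel(vsort);                % fractional position of the percentile
idx = floor(pd + 0.5) + [0,1];            % the two neighboring sorted samples
md = pd - idx(1);                         % offset that sets the interpolation weights
idx(idx<1) = 1;                           % clamp to the first sample
idx(idx>numel(vsort)) = numel(vsort);     % clamp to the last sample
pctl = sum(vsort(idx) .* (0.5+[-md,md])); % linear interpolation between the neighbors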
end  % <-- only needed if this function is within a script

Comparison

v = [8 9 nan 13 15 11 nan 3 5 7];
p = 82; 
percentile(v,p)      %  = 13.120000   this function
prctile(v,p)         %  = 13.120000   Statistics Toolbox
pct(v(~isnan(v)),p)  %  = 13.115995   the one-liner approximation
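
To see where the 13.12 comes from, the intermediate quantities of the percentile function can be worked through for the same example. This is only a walk-through of the steps above with the same v and p; the variable names n and w are introduced here for illustration and are not part of the original function.

```matlab
% Walk through percentile(v, 82) step by step for the example vector.
v = [8 9 nan 13 15 11 nan 3 5 7];
p = 82;

vsort = sort(v(~isnan(v)));      % [3 5 7 8 9 11 13 15]
n     = numel(vsort);            % 8 valid samples (illustrative name)
pd    = p/100 * n;               % 6.56
idx   = floor(pd + 0.5) + [0,1]; % [7 8]
md    = pd - idx(1);             % -0.44
w     = 0.5 + [-md, md];         % [0.94 0.06] (illustrative name)
idx   = min(max(idx, 1), n);     % clamping, no effect for this p
pctl  = sum(vsort(idx) .* w);    % 0.94*13 + 0.06*15 = 13.12

fprintf('pd = %.2f, idx = [%d %d], w = [%.2f %.2f], pctl = %.2f\n', ...
        pd, idx, w, pctl);
```

The two weights always sum to 1, so the result is a linear interpolation between the 7th and 8th sorted values.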

To apply it to your data,

t = array2table(vertcat(horzcat((1:1000)', (1001:2000)'), NaN(10, 2)));
t.Var1(t.Var1 > percentile(t.Var1, 99)) = nan;
t.Var2(t.Var2 > percentile(t.Var2, 99)) = nan;

1 Comment

Blue on 16 Aug 2019

Thank you Adam

Answer 2

per isakson on 15 Aug 2019

https://jp.mathworks.com/matlabcentral/answers/476130-how-to-replace-values-above-a-given-percentile-by-nans#answer_387660

Edited: per isakson on 15 Aug 2019

Ignore the NaNs explicitly when calculating the percentile. Try:

%%
% Table with nan
t = array2table(vertcat(horzcat([1:1000]', [1001:2000]'), NaN(10, 2)));
% Replace values above 99th percentile by nan
t.Var1(t.Var1 > pct(t.Var1, 99)) = nan;
t.Var2(t.Var2 > pct(t.Var2, 99)) = nan;
function z = pct(v,p)
    v(isnan(v)) = [];   % drop NaNs before interpolating
    z = interp1(linspace(0.5/length(v), 1-0.5/length(v), length(v))', ...
                sort(v), p*0.01, 'spline');
end
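
Why the NaNs have to be removed before interpolating can be checked on a small vector. The following is only an illustrative sketch (the example vector and the variable names are made up here): sort places NaNs at the end and length counts them, so both the grid and the sample values passed to interp1 are off unless the NaNs are dropped first.

```matlab
% Small example vector with two NaNs (hypothetical data for illustration).
v = [8 9 NaN 13 15 11 NaN 3 5 7]';

sortedAll = sort(v);          % ascending sort puts the NaNs last
nAll      = length(v);        % counts the NaNs too
nValid    = nnz(~isnan(v));   % number of usable samples

vClean = v(~isnan(v));        % same effect as v(isnan(v)) = []

disp(sortedAll')
fprintf('length(v) = %d, valid samples = %d\n', nAll, nValid);
```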

1 Comment

Adam Danz on 15 Aug 2019

Edited: Adam Danz on 15 Aug 2019

If you're going to use the one-liner approximation, you can keep the function handle and explicitly exclude the NaNs in the argument passed to it.

t = array2table(vertcat(horzcat((1:1000)', (1001:2000)'), NaN(10, 2)));
pct = @(v,p) interp1(linspace(0.5/length(v), 1-0.5/length(v), length(v))', sort(v), p*0.01, 'spline');
t.Var1(t.Var1 > pct(t.Var1(~isnan(t.Var1)), 99)) = nan;
t.Var2(t.Var2 > pct(t.Var2(~isnan(t.Var2)), 99)) = nan;
%                         {_____here____}
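
If the table has many variables, the same idea could be wrapped in a second handle and applied in a loop. This is a minimal sketch that assumes every variable in the table is a numeric column; the names pctNoNaN, names and col are hypothetical and only used for illustration.

```matlab
t = array2table(vertcat(horzcat((1:1000)', (1001:2000)'), NaN(10, 2)));
pct = @(v,p) interp1(linspace(0.5/length(v), 1-0.5/length(v), length(v))', sort(v), p*0.01, 'spline');

% Hypothetical wrapper: strip the NaNs, then call the one-liner.
pctNoNaN = @(v,p) pct(v(~isnan(v)), p);

% Assumes all table variables are numeric column vectors.
names = t.Properties.VariableNames;
for k = 1:numel(names)
    col = t.(names{k});
    col(col > pctNoNaN(col, 99)) = NaN;   % existing NaNs stay NaN
    t.(names{k}) = col;
end

fprintf('NaNs in Var1: %d, NaNs in Var2: %d\n', ...
        sum(isnan(t.Var1)), sum(isnan(t.Var2)));
```

Comparisons with NaN are always false, so the rows that were already NaN are left untouched by the logical indexing.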
