Any algorithm to separate very high values from a data set?

1 回表示 (過去 30 日間)
Wajahat Farrukh
Wajahat Farrukh 2019 年 1 月 10 日
回答済み: Akira Agata 2019 年 1 月 10 日
Hello everybody,
i have a column of dimensions 10317x1, which contains the reflected voltage values.
We can say out of 10,317 reflected voltages more than 95% of values are diffused reflections which has reflected voltage very low, or around average. Only small percentage (less than 5%) of values are too large because those are specular reflections.
The main goal is to split this 2 type of data. I am looking for some algorithm or any mathematical separation function, which can give me a threshold. A threshold which separates the very high values from rest of the values.
I have attached a normalised histogram of the data. Which shows how my data looks like.
I have highlighted with circle which shows that very few values are of high amplitude.
What i can do is pick out the highest 500 values and separate them, but that would be a manual approach. what i am looking for is a mathematical or algorithm based approach.

採用された回答

Akira Agata
Akira Agata 2019 年 1 月 10 日
If you have percentage of outlier (say, 5%), I think you can assume 95th percentiles of a data set as a threshold, like:
% Assuming x is your 10317x1 data array
th = prctile(x,95);
% Index of outlier
idx = x > th;
% Separate the data
xOutlier = x(idx);
xNormal = x(~idx);

その他の回答 (0 件)

カテゴリ

Help Center および File ExchangeStructures についてさらに検索

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!

Translated by