Defining the 95% of data which are around the mean value

Giorgos Papakonstantinou on 31 Jul 2013
For a given set of data, how can I identify the 95% of the data points that lie closest to the mean value?

Accepted Answer

Jan on 1 Aug 2013
Edited: Jan on 1 Aug 2013
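x = rand(1, 1000) - 0.5;                             % example data, uniform on [-0.5, 0.5]
m = mean(x);
dist = abs(x - m);                                   % distance of each point from the mean
[sortDist, sortIndex] = sort(dist);                  % ascending, so the closest points come first
index_95perc = sortIndex(1:floor(0.95 * numel(x)));  % indices of the closest 95% of the points
x_95percent = x(index_95perc);                       % values, ordered by distance from the mean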
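
If the selected points are needed in their original order, or the excluded 5% is also of interest, the same idea can be expressed with a logical mask. This is only a sketch of a variation on the code above; the names `isInside`, `x_inside` and `x_outside` are made up for illustration.

```matlab
x = rand(1, 1000) - 0.5;          % same kind of example data as above
d = abs(x - mean(x));             % distance of each point from the mean
[~, order] = sort(d);             % ascending, closest points first
nKeep = floor(0.95 * numel(x));   % number of points to keep

isInside = false(size(x));        % hypothetical name: mask of the kept points
isInside(order(1:nKeep)) = true;

x_inside  = x(isInside);          % closest 95%, in the original order of x
x_outside = x(~isInside);         % remaining points, farthest from the mean
```

Indexing with the mask keeps the order of `x`, whereas `x(index_95perc)` returns the values ordered by their distance from the mean.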
  1 Comment
Giorgos Papakonstantinou on 1 Aug 2013
Thank you, Jan. It was easier than I expected. Before your answer I was doing the following:
vals = abs(slope);
[CdfY, CdfX] = ecdf(vals, 'Function', 'cdf'); % compute the empirical CDF
cr = CdfY < 0.95;                             % logical mask of the CDF points below 0.95
where vals is my dataset.


More Answers (2)

Image Analyst on 31 Jul 2013
I'd sort the data using sort(), then use cumsum() to get the CDF. Normalize the CDF, then use find() to locate the 2.5% and 97.5% elements, i.e. the values where the central 95% of the data starts and stops. It's pretty easy, but let me know if you can't figure it out.
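
One possible reading of these steps is sketched below. It assumes the cumsum is taken over equal per-point weights (each of the N points counts once) and not over the data values themselves, so negative data values do not affect the normalization. The example data and all variable names are hypothetical. Note that this selects the central 95% between two quantiles, which is not necessarily the same set as the 95% of points closest to the mean.

```matlab
x = randn(1, 1000);                   % hypothetical example data, includes negatives
xs = sort(x);                         % ascending order
N = numel(xs);

cdfVals = cumsum(ones(1, N)) / N;     % normalized empirical CDF, runs from 1/N to 1

iLo = find(cdfVals >= 0.025, 1, 'first');  % where the central 95% starts
iHi = find(cdfVals <= 0.975, 1, 'last');   % where the central 95% stops

x_central = xs(iLo:iHi);              % sorted values between the two limits
```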

Giorgos Papakonstantinou on 31 Jul 2013
Thank you for your answer, Image Analyst. The data also contain negative values. I am not sure, but I think that poses a problem when I normalize the data after the cumsum.
  1 Comment
Tom Lane on 1 Aug 2013
It sounds like Image Analyst is talking about the cumsum of a vector that assigns probability 1/N to each of N points. However, you could take the 0.025*N and 0.975*N values from the sorted vector directly, converting the index to an integer as you see fit.
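
A minimal sketch of the direct-indexing idea, assuming ceil for the lower index and floor for the upper one (that rounding is just one choice, and the variable names and example data are hypothetical):

```matlab
x = randn(1, 1000);                   % hypothetical example data, includes negatives
xs = sort(x);                         % ascending order
N = numel(xs);

iLo = max(1, ceil(0.025 * N));        % lower index, rounding choice is an assumption
iHi = min(N, floor(0.975 * N));       % upper index, rounding choice is an assumption

lowerBound = xs(iLo);                 % value at roughly the 2.5% position
upperBound = xs(iHi);                 % value at roughly the 97.5% position

inRange = (x >= lowerBound) & (x <= upperBound);  % mask in the original order
x_central = x(inRange);               % central ~95% of the data
```

Because only the sorted positions are used, negative values in the data make no difference here.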
