How to reduce the number of unique values in a matrix?

Nisha on 27 Apr 2017
Edited: Stephen23 on 27 Apr 2017
I would like to reduce the number of unique values in my matrix to a fixed number. If I just round my values, I still get too many unique values. For instance, I would like to be able to group the matrix values into maybe 10 groups (= 10 unique values). I would like the value of each group to relate to the original values, for instance as the mean of all the values in the group. My original idea was to do something like k-means clustering, but I don't think this can be done with data in a matrix.
Is there a way to do this?

Accepted Answer

Stephen23 on 27 Apr 2017
Edited: Stephen23 on 27 Apr 2017
Although your data is arranged in a matrix, the matrix is a red herring, because what you actually want is a simple 1D clustering of the values themselves, irrespective of their position in the matrix. This is simple, as k-means clustering can be done on any number of dimensions, including on 1D data. So convert your matrix to a vector, apply kmeans, and then use the indices to allocate the values into the clusters. Then simply reshape to get back the matrix shape.
Here is a complete working example, with just two clusters for clarity:
>> inp = [1,9,8,8;9,8,8,1;1,8,1,9;7,8,2,1]
inp =
     1     9     8     8
     9     8     8     1
     1     8     1     9
     7     8     2     1
>> [idx,vec] = kmeans(inp(:),2);
>> out = reshape(vec(idx),size(inp))
out =
    1.1667    8.2000    8.2000    8.2000
    8.2000    8.2000    8.2000    1.1667
    1.1667    8.2000    1.1667    8.2000
    8.2000    8.2000    1.1667    1.1667
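The same approach extends to the ten groups asked for in the question. The following is a minimal sketch, assuming the Statistics and Machine Learning Toolbox is available for kmeans. The random test matrix, the seed, and the 'Replicates' setting are illustrative assumptions, not part of the original answer:

```matlab
% Sketch only: the data, seed, and option values below are assumptions.
rng(0);                          % fix the seed so the run is repeatable
inp = 100 * rand(20, 30);        % hypothetical matrix with many unique values
k   = 10;                        % desired number of unique values

% Cluster the values as a column vector; position in the matrix is ignored.
[idx, vec] = kmeans(inp(:), k, 'Replicates', 5);

% Replace each element by its cluster centroid (the mean of its group),
% then restore the original matrix shape.
out = reshape(vec(idx), size(inp));

fprintf('unique values before: %d, after: %d\n', ...
        numel(unique(inp)), numel(unique(out)));
```

Because kmeans starts from random initial centroids, several replicates tend to give a more stable grouping than a single run, at the cost of extra computation.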
1 Comment
Nisha on 27 Apr 2017
Thank you very much!


More Answers (1)

Adam on 27 Apr 2017
Edited: Adam on 27 Apr 2017
vals = ceil( 10 * vals / max( vals(:) ) );
3 Comments
Adam on 27 Apr 2017
Well, once you have your 10 unique labels you can use them as indices into the original values and replace the labels with the average of those values e.g.
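% Keep the labels in their own array, so that a group mean that happens to
% equal a later label n is not overwritten on a later iteration.
labels = ceil( 10 * vals / max( vals(:) ) );   % labels 1..10 (assumes positive values)
newVals = vals;
for n = 1:10
    mask = labels == n;                        % elements belonging to group n
    newVals( mask ) = mean( vals( mask ) );    % replace them by the group mean
end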
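The label-then-average idea can also be written without a loop. This is a minimal sketch using accumarray, assuming every element of vals is strictly positive so that all labels fall in 1..nBins. The sample matrix and the names nBins and binMeans are hypothetical:

```matlab
% Sketch only: assumes strictly positive values; the sample data is made up.
vals  = [12 47 3; 88 51 9; 95 14 50];
nBins = 10;

% Label each element with the index of its equal-width bin.
labels = ceil(nBins * vals / max(vals(:)));

% Mean of the original values in each bin (empty bins get 0 and are unused).
binMeans = accumarray(labels(:), vals(:), [nBins 1], @mean);

% Look up each element's bin mean and restore the matrix shape.
newVals = reshape(binMeans(labels), size(vals));
```

Keeping the labels in a separate array means the bin means can never be confused with the labels themselves.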
Stephen23 on 27 Apr 2017
Edited: Stephen23 on 27 Apr 2017
I also considered rounding as per Adam's answer, but this has the disadvantage that the cluster values are then linearly spaced, which might not best represent the actual cluster values. Consider clusters centered around 0, 3, and 10: rounding would split the cluster around 3 between the levels 0 and 5, which might not be the desired effect.
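A small sketch of that effect, using made-up sample values clustered around 0, 3, and 10 and an assumed rounding step of 5. The kmeans call requires the Statistics and Machine Learning Toolbox, and the seed and 'Replicates' value are illustrative choices:

```matlab
% Sketch only: the sample values and the step of 5 are assumptions chosen
% to mirror the 0 / 3 / 10 example.
vals = [0.1 0.3 2.4 2.6 3.0 3.4 9.7 10.0];

% Linearly spaced levels 0, 5, 10: round to the nearest multiple of 5.
linearLevels = 5 * round(vals / 5);

% The four values near 3 end up at two different levels (0 and 5).
midLevels = unique(linearLevels(3:6));

% kmeans instead places the levels at the centroids of the groups it finds.
rng(0);
[idx, vec] = kmeans(vals(:), 3, 'Replicates', 10);
kmeansLevels = vec(idx).';

disp(linearLevels);
disp(kmeansLevels);
```

With groups this well separated, one would expect kmeans to keep the values near 3 together, whereas the fixed grid cuts through them at 2.5.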

