How can I extract a certain 'cluster' of elements according to a particular condition on the elements?

Ansh on 21 Jan 2016
Edited: Ansh on 28 Jan 2016
I have a matrix (about 342 by 342) denoted by C(k,l), and I want to identify all clusters of indices of the original matrix that satisfy the condition C(k,l) > rho. That is, I want all square matrices C'(a,b) extracted from C(k,l) such that C'(a,b) > rho for all pairs of indices a and b.
For example, if I have the matrix C(i,j) as:
C = 1    0.8  0.7
    0.8  1    0.5
    0.7  0.5  1
And rho = 0.6, then a correct square matrix I want my code to identify is:
C' = 1    0.7
     0.7  1
This is not unique, of course, and the result given in the example above is not necessarily a submatrix. I am not sure of the best way to do this in MATLAB. If possible, I would also like to identify what a and b are for each possible matrix; e.g. for my example above, a and b can be 1 or 3. The matrices are always symmetric and the diagonal entries are always 1.
8 Comments
Ansh on 22 Jan 2016
Edited: Ansh on 22 Jan 2016
Kirby Fears,
Yes each submatrix is required to have 1's on the diagonal. What do you mean exactly by taking submatrices along the diagonal?
Many thanks
Ansh on 22 Jan 2016
Image Analyst,
The matrices being considered are correlation matrices. I am using a clustering procedure to find a cluster of indices (in this case the indices are stocks) that are highly correlated with each other. This involves extracting a square submatrix from the original matrix such that all of its entries are >= rho, as described in the problem. In the actual correlation matrices to be used (which are taken from empirical data), this may not give a unique cluster (probably not, given the size), which is why I have asked for all such submatrices. Does this help in any way?
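The condition described here can be written as a one-line check on a candidate index set. This is only a sketch: `isCluster` is a hypothetical helper name, and it uses `>=` as in this comment (the original question uses a strict `>`).

```matlab
% Hypothetical helper: true when every entry of C(idx,idx) is at least rho.
isCluster = @(C, idx, rho) all(all(C(idx,idx) >= rho));

C   = [1 0.8 0.7; 0.8 1 0.5; 0.7 0.5 1];   % example matrix from the question
rho = 0.6;

okPair   = isCluster(C, [1 3], rho);     % indices 1 and 3 form a cluster
badPair  = isCluster(C, [2 3], rho);     % C(2,3) = 0.5 is below rho
badTrio  = isCluster(C, [1 2 3], rho);   % fails for the same reason
```

Because the index set need not be contiguous, `C(idx,idx)` picks out rows and columns 1 and 3 directly, which is the C' shown in the question.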


Answers (2)

Kirby Fears on 21 Jan 2016
Edited: Kirby Fears on 21 Jan 2016
Assuming you only want to find submatrices along the diagonal of C, the following code extracts all square submatrices (>rho) into a table S. This should be a good starting point for whatever assumptions you end up deciding on.
% make data
sizeC = 342;
rho = 0.6;
c = rand(sizeC);
c(1:(sizeC+1):end) = 1;   % ones on the diagonal
% prep
S = cell((sizeC-2)*(sizeC-1),3);   % upper bound on the number of blocks
varNames = {'S','sizeS','diagC'};
idxRho = c>rho;
counterS = 1;
% traverse submatrix size, largest first
for sizeS = (sizeC-1):-1:2
    % traverse diagonal of c; the last valid start index is sizeC-sizeS+1
    for d = 1:(sizeC-sizeS+1)
        idx = d:(d+sizeS-1);
        block = idxRho(idx,idx);
        % store valid submatrix with meta info
        if all(block(:))
            S(counterS,:) = {c(idx,idx),sizeS,d};
            counterS = counterS + 1;
        end
    end
end
% drop extra rows of S
if counterS<=size(S,1)
    S(counterS:end,:)=[];
end
% convert S to table
S = array2table(S,'VariableNames',varNames);
Hope this helps.
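To see what the diagonal-block assumption means in practice, here is a minimal sketch that applies the same idea to the 3-by-3 example from the question. The variable names `found` and `idx` are illustrative, and unlike the code above it also tests the full matrix.

```matlab
C   = [1 0.8 0.7; 0.8 1 0.5; 0.7 0.5 1];   % example matrix from the question
rho = 0.6;
n   = size(C,1);
found = {};                                 % index ranges of valid diagonal blocks
for s = n:-1:2                              % block size, largest first
    for d = 1:(n-s+1)                       % start position along the diagonal
        idx = d:(d+s-1);
        block = C(idx,idx);
        if all(block(:) > rho)
            found{end+1} = idx;             %#ok<SAGROW>
        end
    end
end
```

Only the contiguous index range [1 2] passes here. The pair {1,3} from the question is not contiguous, so a diagonal-block search cannot return it.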
9 Comments
Ansh on 27 Jan 2016
Edited: Ansh on 27 Jan 2016
Hi Stephen, but this doesn't produce the 3 by 3 cluster I have detailed in the comment to your code. (There could have been a delay in posting that further comment - sorry).
Ansh on 27 Jan 2016
Edited: Ansh on 27 Jan 2016
Hi Kirby Fears,
I was attempting to use the sets of indices to clarify what I wanted, but it seems to have had the opposite effect. Thank you for taking the time to answer anyway. In hindsight, this question could have been posed better from the start, which would have avoided the later confusion. Many thanks for the best wishes.



Stephen23 on 23 Jan 2016
Edited: Stephen23 on 24 Jan 2016
Assuming that the input matrix is always square and symmetric:
>> D = [1,0.8,0.9,0.5;0.8,1,0.6,0.1;0.9,0.6,1,0.7;0.5,0.1,0.7,1]
D =
    1    0.8  0.9  0.5
    0.8  1    0.6  0.1
    0.9  0.6  1    0.7
    0.5  0.1  0.7  1
>> rho = 0.6;
>> [R,C] = find(tril(D,-1)>rho);
>> out = arrayfun(@(r,c)D([r,c],[r,c]),R,C,'UniformOutput',false);
>> out{:}
ans =
    1    0.8
    0.8  1
ans =
    1    0.9
    0.9  1
ans =
    1    0.7
    0.7  1
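The pairs above can be viewed as edges of a graph in which i and j are connected whenever D(i,j) > rho; a larger cluster is then a clique in that graph. The following is a minimal sketch, not part of the original answer, that lists only the maximal cliques using a basic Bron-Kerbosch recursion without pivoting. The function name `maximalClusters` is hypothetical, and the running time can grow exponentially with the matrix size.

```matlab
function cliques = maximalClusters(C, rho)
% maximalClusters  Maximal index sets whose pairwise entries all exceed rho.
%   Hypothetical helper; assumes C is square and symmetric.
n = size(C,1);
A = C > rho;                 % adjacency matrix of the threshold graph
A(1:n+1:end) = false;        % ignore the diagonal
cliques = {};
expand(zeros(1,0), true(1,n), false(1,n));

    function expand(R, P, X)
        % R: current clique (indices); P: candidates; X: excluded (masks)
        if ~any(P) && ~any(X)
            cliques{end+1} = R;          % R cannot be extended: maximal
            return
        end
        for v = find(P)                  % iterates over P as it was on entry
            expand([R v], P & A(v,:), X & A(v,:));
            P(v) = false;                % v has been fully explored
            X(v) = true;
        end
    end
end
```

Saved as maximalClusters.m and called as `maximalClusters(D, 0.6)` on the matrix above, this returns the index sets [1 2], [1 3] and [3 4], matching the three pairs found by `find`. Each returned set `idx` gives its matrix as `D(idx,idx)`.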
5 Comments
Stephen23 on 27 Jan 2016
This task might not be solvable using a standard PC: there are potentially a lot of such matrices:
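As a rough sense of scale, assuming the worst case in which every off-diagonal entry exceeds rho, every index subset with at least two elements is a valid cluster. A short sketch of that count for the 342-by-342 case (the variable names are illustrative):

```matlab
n = 342;                      % matrix size from the question
% Worst case (assumption): every off-diagonal entry exceeds rho, so every
% index subset with at least two elements is a valid cluster.
worstCase = 2^n - n - 1;      % all subsets minus the empty set and singletons
fprintf('%.3g candidate clusters in the worst case\n', worstCase);
```

Real correlation matrices will usually be far from this worst case, but the count illustrates why enumerating all such matrices can become infeasible.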
Ansh on 28 Jan 2016
Edited: Ansh on 28 Jan 2016
Thank you for your answer, Stephen. That is partly why I first posted here. I shall go away and think of an alternative procedure. If I can analyse the data I have and come up with an upper bound on the size of the cluster, that could help.
