I want to reshape a matrix so that its first and second columns become the row and column indices of the final matrix, and the third column supplies the values.
That is, I want to transform this
1 1 5
1 2 6
1 3 7
2 1 8
2 2 9
into this
5 6 7
8 9 NaN
I know how to do it brute force with a loop:
d = [1 1 1 2 2]';
g = [1 2 3 1 2]';
v = [5 6 7 8 9]';
A = [d g v];
for i = 1:max(d)
    M1 = A(d == i,:);
    for j = 1:max(g)
        M2 = M1(M1(:,2) == j,:);
        B(i,j) = mean(M2(:,3));
    end
end
However, are there more time-saving ways?

1 Comment

Stephen23 on 9 Aug 2018
"However, are there more time-saving ways?"
Yes, see Jos's answer.

Accepted Answer

Jos (10584) on 9 Aug 2018

2 votes

A simple one-liner would do, I think, letting accumarray do all the work:
A = [ 1 1 5
1 2 6
1 3 7
2 1 8
2 2 9 ]
B = accumarray(A(:,[1 2]), A(:,3), [], [], NaN)
% 5 6 7
% 8 9 NaN
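To make the one-liner easier to follow, here is a minimal sketch that names each accumarray argument. The variable names subs, vals and sz are illustrative only, and passing an explicit output size is an optional choice of this sketch; with [] the size is inferred from the largest subscripts.

```matlab
A = [1 1 5
     1 2 6
     1 3 7
     2 1 8
     2 2 9];
subs = A(:,[1 2]);       % row and column subscript of each value
vals = A(:,3);           % values to place at those subscripts
sz   = max(subs,[],1);   % output size, here [2 3]
% An empty function argument means the default (@sum);
% positions that receive no value get the fill value NaN.
B = accumarray(subs, vals, sz, [], NaN);
```

Because every (row, column) pair occurs only once in this data, summing leaves the values unchanged.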

3 Comments

Stephen23 on 9 Aug 2018
Edited: Stephen23 on 9 Aug 2018
+1 that is exactly how to use accumarray!
Rik on 9 Aug 2018
True, except for the requirement of calculating the mean for any duplicate, so it becomes this:
A = [ 1 1 5
1 2 6
1 3 7
2 1 8
2 2 7
2 2 9 ]
B = accumarray(A(:,[1 2]), A(:,3), [], @mean, NaN)
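As a quick check of what the function argument changes, the sketch below runs the same data through the default reduction and through @mean. The names Bsum and Bmean are illustrative, not from the thread.

```matlab
A = [1 1 5; 1 2 6; 1 3 7; 2 1 8; 2 2 7; 2 2 9];
% Default reduction is a sum, so position (2,2) gets 7 + 9 = 16
Bsum  = accumarray(A(:,[1 2]), A(:,3), [], [], NaN);
% With @mean, position (2,2) gets the average of 7 and 9, which is 8
Bmean = accumarray(A(:,[1 2]), A(:,3), [], @mean, NaN);
% Bsum  is [5 6 7; 8 16 NaN]
% Bmean is [5 6 7; 8  8 NaN]
```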
Rahel Braun on 9 Aug 2018
Very elegant, that reduces the size of my m-file a lot :) Thank you all for the quick input.

More Answers (2)

Fangjun Jiang on 7 Aug 2018

2 votes

a=[ 1 1 5
1 2 6
1 3 7
2 1 8
2 2 9];
MatrixSize=max(a(:,1:2));
b=nan(MatrixSize);
b(sub2ind(MatrixSize,a(:,1),a(:,2)))=a(:,3)
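The key step above is the conversion from (row, column) subscripts to linear indices. A small sketch of what sub2ind returns for this data follows; the names sz, r, c and idx are illustrative.

```matlab
sz  = [2 3];
r   = [1 1 1 2 2]';
c   = [1 2 3 1 2]';
idx = sub2ind(sz, r, c);   % column-major linear indices: [1 3 5 2 4]'
% The same numbers follow from the arithmetic r + (c-1)*sz(1)
idxManual = r + (c-1)*sz(1);
b = nan(sz);               % positions that are never assigned stay NaN
b(idx) = [5 6 7 8 9]';
```

Since each position is assigned at most once here, the result matches the matrix requested in the question.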

1 Comment

Rahel Braun on 7 Aug 2018
Perfect, thank you. That's what I wanted.

Rik on 7 Aug 2018
Edited: Rik on 7 Aug 2018

1 vote

I have now rewritten it with accumarray, without needing the call to unique. I also added a part that handles empty positions (to demonstrate this, I removed the (1,2) position from the example data).
d = [1 1 2 2 2]';
g = [1 3 1 2 2]';
v = [5 7 8 9 7]';
%pre-allocate correct size output as NaN
out=NaN(max(d),max(g));
%convert subs to linear indices
ind=sub2ind(size(out),d,g);
%compute mean for each position (taking care of duplicates)
means = accumarray(ind,v,[],@nanmean);
%paste into output array
out(1:numel(means))=means;
%take care of skipped values (replace 0 by NaN)
%(ismembc is way faster than ismember, and works best with 2 sorted arrays)
missing=find(~ismembc(1:numel(means),sort(ind)));
out(missing)=NaN;
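A possible shorter variant of the same idea, offered as a sketch rather than a replacement: it passes NaN as the accumarray fill value, so no separate pass over missing positions is needed. It swaps in the base function @mean for @nanmean (which ships with the Statistics and Machine Learning Toolbox) and avoids the undocumented ismembc. Note that @mean, unlike @nanmean, returns NaN for any position whose values include a NaN.

```matlab
d = [1 1 2 2 2]';
g = [1 3 1 2 2]';
v = [5 7 8 9 7]';
sz  = [max(d) max(g)];
ind = sub2ind(sz, d, g);
% One output element per matrix position; positions without data get NaN
means = accumarray(ind, v, [prod(sz) 1], @mean, NaN);
out = reshape(means, sz);
% out is [5 NaN 7; 8 8 NaN]
```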
Original post:
I'm assuming you want to calculate the mean for any duplicates. The code to remove duplicates could be further optimized.
d = [1 1 1 2 2]';
g = [1 2 3 1 2]';
v = [5 6 7 8 9]';
%pre-allocate correct size output as NaN
out=NaN(max(d),max(g));
%convert subs to linear indices
ind=sub2ind(size(out),d,g);
%sort indices and values
[ind,order]=sort(ind);v=v(order);
%check for double assignments
while any(diff(ind)==0)
    %compute mean
    current_index=ind(find(diff(ind)==0,1));
    L=ind==current_index;
    new_value=mean(v(L));
    %remove old values and put back the new one
    ind(L)=[];v(L)=[];
    ind=[ind;current_index];v=[v;new_value]; %#ok<AGROW>
end
%write to matrix
out(ind)=v;

6 Comments

Rahel Braun on 7 Aug 2018
This also looks like a good solution; however, I really need to avoid loops because I have a big dataset. But thank you anyway.
Rik on 7 Aug 2018
This only includes a loop for duplicates. The answer you have currently accepted will return an incorrect result for duplicates. Try the input below.
d = [1 1 1 2 2 2]';
g = [1 2 3 1 2 2]';
v = [5 6 7 8 9 7]';
a = [d g v];
Fangjun Jiang's code result:
b =
5 6 7
8 7 NaN
Result with my code:
out =
5 6 7
8 8 NaN
Rahel Braun on 7 Aug 2018
Yes, you are absolutely right. One needs to take care of duplicates. A solution without a loop would be to calculate the mean of the duplicates first and then use Fangjun Jiang's code:
d = [1 1 1 2 2 2]';
g = [1 2 3 1 2 2]';
v = [5 6 7 8 9 7]';
a = [d g v];
%Check for duplicates and take the mean if there are any
[~,~,idxA] = unique(a(:,1:2),'rows');
means = accumarray(idxA,a(:,3),[],@nanmean);
z= means(idxA);
s=[d g z];
%Resize matrix
MatrixSize=max(s(:,1:2));
b=nan(MatrixSize);
b(sub2ind(MatrixSize,s(:,1),s(:,2)))=s(:,3)
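A slightly more compact sketch of the same unique-then-average idea: the first output of unique already lists each (row, column) pair once, so the means can be assigned directly without expanding them back to the original length. The names u, idx, m and sz are illustrative, and the base function @mean is used here in place of @nanmean, which assumes v contains no NaN values.

```matlab
d = [1 1 1 2 2 2]';
g = [1 2 3 1 2 2]';
v = [5 6 7 8 9 7]';
% u holds each distinct (row, column) pair once;
% idx maps every input row to its pair in u
[u, ~, idx] = unique([d g], 'rows');
m  = accumarray(idx, v, [], @mean);   % one mean per distinct pair
sz = max(u, [], 1);
b  = nan(sz);
b(sub2ind(sz, u(:,1), u(:,2))) = m;
% b is [5 6 7; 8 8 NaN]
```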
Rik on 7 Aug 2018
Have you read my code? It is functionally equivalent to that of Fangjun Jiang, except for the sort & check for duplicates. But yeah, your method with unique and accumarray might be faster than my code. You could check with actual data, because it also likely depends on the 'density' of duplicates. The loop might be faster if there are very few duplicates.
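One way to run such a check is with timeit. The sketch below is only an illustration: the data is synthetic, the sizes (1e5 values on a 200-by-300 grid) are arbitrary assumptions, and the two candidates timed are subscript-based and linear-index-based accumarray calls rather than any specific poster's code. Real data with a different density of duplicates may rank them differently.

```matlab
% Synthetic test data; all sizes are arbitrary assumptions
n = 1e5;
nRows = 200;
nCols = 300;
d = randi(nRows, n, 1);
g = randi(nCols, n, 1);
v = rand(n, 1);
% Candidate 1: two-column subscripts
fSubs = @() accumarray([d g], v, [nRows nCols], @mean, NaN);
% Candidate 2: linear indices, reshaped afterwards
fLin  = @() reshape(accumarray(sub2ind([nRows nCols], d, g), v, ...
                    [nRows*nCols 1], @mean, NaN), [nRows nCols]);
tSubs = timeit(fSubs);
tLin  = timeit(fLin);
fprintf('subscripts: %.4f s, linear indices: %.4f s\n', tSubs, tLin);
```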
Rahel Braun on 8 Aug 2018
Yes, I ran it with my data, but it takes ages. However, the new one you posted without a loop works quickly and well. Thank you.
Rik on 8 Aug 2018
You're welcome. If this solves your issue better than Fangjun's answer, you can un-accept that one and accept this one. (he will still keep the reputation points)
