How to efficiently zero-pad a matrix with varying lengths

7 ビュー (過去 30 日間)
William
William 2012 年 12 月 30 日
Here's my modified code after integrating advice from the 3 answers below (and if I could I would accept all three of the answers, I only accepted the first because it was the first. There should be some way to accept multiple answers because each one was helpful!):
num_points = 100 * 100 * 75;
for i = 1:num_points
% zero_pad_length only changes every 3 "s"
zero_pad = repmat(zero_pad_length(:,i), 1, 3)';
offset_trace(1:end-zero_pad,:) = trace_to_offset(zero_pad+1:end,:);
results(i) = max(prod(offset_trace));
offset_trace(:) = 0;
end
It's 50% faster than it was before. Thanks everyone for your help.
ORIGINAL QUESTION: I've been working at this for a bit, and I'm not really sure where to go from here. The basic structure of what I want to do is (results and offset_trace are preallocated with zeros()):
for i = 1:100
for j = 1:100
for k = 1:75
for s = 1:30
% zero-pad a vector with different lengths of zeros
offset_trace(s,:) = [trace_to_offset(s,1+zero_pad_length(s,i,j,k):end) zeros(1,zero_pad_length(s,i,j,k))];
end
% then take the max of its product
results(i,j,k) = max(prod(offset_trace));
end
end
end
I don't see any way to vectorize the first three for loops (those loop over a 3D grid) because for each point of my grid there's a different pad_length for each of the 30 different rows of trace_to_offset.
I've tried to vectorize the for loop over s with:
str_to_eval = sprintf('[trace_to_offset(%i,1+zero_pad_length(%i,i,j,k):end) zeros(1,zero_pad_length(%i,i,j,k))],', stats');
offset_trace = eval(['vertcat(' str_to_eval(1:end-1) ')']);
but it takes exactly as much time to run as the first option. Are there any other solutions that could help me speed up this zero-padding?
Or even better, if it's possible to vectorize the first 3 for loops?
EDIT: For the comments belows and stat should have been "s"! Sorry for the mistakes.
EDIT_2: I've included my modified code that integrates the 3 answers below. Thanks to everyone for your help!
  2 件のコメント
Jan
Jan 2012 年 12 月 30 日
編集済み: Jan 2012 年 12 月 30 日
Brrr, a vectorization by eval... a bad idea! This disables the possibilities of the JIT to accelerate the loop.
The posted code is incomplete. A "[" is missing. Is "offset_trace" and "result" pre-allocated? A pre-allocation is fundamental for speed, while a vectorization is not necessarily.
William
William 2012 年 12 月 30 日
I must have lost a "[" while copy-paste-ing. And yes, offset_trace and result are both pre-allocated.
I've edited the question to reflect those two things.

サインインしてコメントする。

採用された回答

Jan
Jan 2012 年 12 月 30 日
Pre-allocation:
results = zeros(100, 100, 75);
offset_trace = zeros();
c = trace_to_offset(stat, :);
n = length(c);
for i = 1:100
for j = 1:100
for k = 1:75
d = zero_pad_length(:,i,j,k);
for s = 1:30
% Fill in initial part, equivalent to zero padding:
e = 1 + d(s);
offset_trace(s, 1:n-e+1) = c(e:n);
end
% then take the max of its product
results(i,j,k) = max(prod(offset_trace));
end
end
end
  2 件のコメント
Matt J
Matt J 2012 年 12 月 30 日
offset_trace needs to be reset to zero after the s-loop finishes
offset_trace(:)=0;
William
William 2012 年 12 月 30 日
First off, thanks for taking the time to respond. I just tried this out and it's basically what I was doing before but now instead of concatenating two vectors together (the truncated trace_to_offset and zeros()), I'm just assigning directly the truncated trace. I'm not sure why I didn't think of that before...
But unfortunately it looks like it takes the same amount of time...

サインインしてコメントする。

その他の回答 (2 件)

Matt J
Matt J 2012 年 12 月 30 日
編集済み: Matt J 2012 年 12 月 30 日
Well, first a few general points,
  • There's no need to have a triple loop over i,j,k. You can just use linear indexing for those. And I'm guessing the reduction in loop-nesting will make things faster.
  • If you pre-allocate offset_trace there should be no need to zero-pad. Just assign trace_to_offset to the appropriate sublock of offset_trace.
  • It might be faster to load offset_trace column-wise, as opposed to row-wise, so that memory-access operations will be more contiguous. Similarly, it might be good to pre-transpose and read column-wise from trace_to_offset.
With these things in mind:
offset_trace=zeros(P,30);
trace_to_offset = trace_to_offset.';
results=zeros(size(zero_pad_length));
for idx=1:75e4
for s=1:30
N=zero_pad_length(s,idx);
offset_trace(1:P-N,s)= trace_to_offset(N+1:P,stat);
end
results(idx)=max(prod(offset_trace,2));
offset_trace(:)=0; %reset
end

Matt J
Matt J 2012 年 12 月 30 日
編集済み: Matt J 2012 年 12 月 30 日
A fully vectorized solution, if you've got the RAM for it,
v=trace_to_offset(stat,:);
n=length(v);
Table= fliplr(toeplitz([v(n),zeros(1,n-1)], v(n:-1:1) ));
Table(end, end+1)=0;
offset_trace=Table(:,zero_pad_length+1);
offset_trace=reshape( offset_trace, n,30,[]);
results = max(prod(offset_trace,2),[],1);
results=reshape(results, 100, 100, 75 );
  3 件のコメント
Matt J
Matt J 2012 年 12 月 30 日
編集済み: Matt J 2012 年 12 月 30 日
Well, you could still probably salvage a lot of the vectorization using a single short loop over s,
accum=1;
for s=1:30
v=trace_to_offset(s,:);
n=length(v);
Table= fliplr(toeplitz([v(n),zeros(1,n-1)], v(n:-1:1) ));
Table(end, end+1)=0;
offset_trace=Table(:,zero_pad_length(s,:)+1);
offset_trace=reshape( offset_trace, n,1,[]);
accum=accum.*offset_trace;
end
results = max(accum,[],1);
results=reshape(results, 100, 100, 75 );
William
William 2012 年 12 月 30 日
Sorry about that mistake, I realize that changes the problem quite a bit.
The "s" loop is actually two loops combined here:
for s = 1:10
for c = 1:3
stat = (s - 1)*3 + 1;
end
end
I do this because the zero_pad_length only varies every 3 rows of trace_to_offset. I simplified the code above to make it as generalized/simple as possible.
I do appreciate the time you've taken to respond. Thanks.

サインインしてコメントする。

カテゴリ

Help Center および File ExchangeLoops and Conditional Statements についてさらに検索

製品

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!

Translated by