problems with the computational speed of a sliding kernel window
Hello everyone,
I have been facing an issue with my code being unoptimised for speed. The MATLAB code finds the neighbours of each pixel and stores them as linear indices in a matrix. This is repeated for every pixel. Zero padding is used so that the edge pixels also have a full set of neighbours. The for loop acts like a convolution kernel that scans the neighbouring points and stores their indices. These indices are then used to look up the corresponding neighbours in the MatPC matrix, which are then averaged.
clear all;
%Man.png is a 284x284 depth image of a mannequin face
I = double(imread('D:\Fezan\images\Man.png'));
r =1;
%preallocation
% Ind = zeros(size(I,1),(r+2)^2);
I = padarray(I,[r,r],0,'both');
[row,col] = size(I); % matrix size for original data
n =1;
tic
for i = r + 1 : row - r
for j = r + 1 : col - r
LocationMat=zeros(row,col);
LocationMat(i-r:i+r,j-r:j+r) = 1;
Ind(n,:)=find(LocationMat); %finds the indices of the box kernel
n = n + 1;
end
end
toc
SizInd = size(Ind,1);
for loop = 1:SizInd
PCIndexing = MatPC(Ind(loop,:),:,:); %using the index matrix to find the neighbours of a point in the Nx3 PC matrix
AverageX(loop,1) = median(PCIndexing(:,1)); %median of the x values of all neighbouring points
AverageY(loop,1) = median(PCIndexing(:,2)); %median of the y values of all neighbouring points
AverageZ(loop,1) = median(PCIndexing(:,3)); %median of the z values of all neighbouring points
end
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
AveragePC = [AverageX, AverageY, AverageZ];
The problem with the code is the computational speed of the for loop that builds the indices:
for i = r + 1 : row - r
for j = r + 1 : col - r
LocationMat=zeros(row,col); %creates a zeros matrix of the Image
LocationMat(i-r:i+r,j-r:j+r) = 1; % the kernel window
Ind(n,:)=find(LocationMat); %finds the indices of the kernel and stores into variable
n = n + 1;
end
end
When the radius is increased, the time needed to find the indices becomes ludicrous, which makes the approach unviable for kernels larger than 3x3. I was wondering if anyone has suggestions for tackling this problem. I tried using a parfor loop, but I ran into an error.
I am unsure how to proceed, and any help would be appreciated.
Accepted Answer
Jan
20 Mar 2021
Edited: Jan
21 Mar 2021
Do not start serious code with "clear all". This removes all loaded functions from the memory and the reloading from the slow disk is a waste of time. If you want to keep your workspace clean, use "clear variables" or even better: store the code inside a function.
You have commented out these lines:
%preallocation
% Ind = zeros(size(I,1),(r+2)^2);
A pre-allocation would be very useful. The correct size is:
Ind = zeros((row - 2*r) * (col - 2*r), (2*r+1)^2);
Remember that letting an array grow iteratively consumes a lot of resources:
x = [];
for i = 1:1e6
x(i) = i * rand;
end
Although the final vector needs only 8 MB (1e6 * 8 bytes per double), MATLAB has to allocate sum(1:1e6)*8 bytes in total: more than 4 TB! In each iteration a new array is allocated, the existing elements are copied, and a new element is appended. This is a massive waste of time. With a pre-allocation only the final 8 MB are allocated and no memory transfers are required:
x = zeros(1, 1e6);
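To see the effect on your own machine, a minimal timing sketch could look like the following. The variable names `n`, `tGrow` and `tPrealloc` are only illustrative, and `n` is kept smaller than in the example above so that the growing variant finishes quickly. Newer MATLAB releases handle growing arrays more cleverly than the worst case described here, so the measured gap may be smaller than the byte count suggests.

```matlab
% Sketch: time a growing array against a pre-allocated one.
n = 1e5;                         % illustrative size, smaller than 1e6

tic
x = [];                          % grows by one element per iteration
for k = 1:n
    x(k) = k * rand;
end
tGrow = toc;

tic
y = zeros(1, n);                 % allocated once, then only filled
for k = 1:n
    y(k) = k * rand;
end
tPrealloc = toc;

% Worst-case number of bytes allocated when growing to 1e6 doubles
bytesWorstCase = sum(1:1e6) * 8;

fprintf('growing: %.4f s, pre-allocated: %.4f s\n', tGrow, tPrealloc);
fprintf('worst case for 1e6 elements: %.3g bytes\n', bytesWorstCase);
```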
Let's take a look at the computations:
LocationMat=zeros(row,col);
LocationMat(i-r:i+r,j-r:j+r) = 1;
Ind(n,:)=find(LocationMat);
Wow, this is cruel. For each pixel you create a matrix with the same size as the image, only to insert some 1s at specific positions and let FIND locate these specific positions. Why not use these positions directly?
tic
n = 1;
Ind = zeros((row - 2*r) * (col - 2*r), (2*r+1)^2); % Pre-allocation
box = reshape((1:2*r+1).' + (0:2*r) * row, 1, []); % Indices of the (2*r+1) x (2*r+1) box at (1,1)
% [EDITED] 2 instead of 3, typos fixed
for i = r + 1 : row - r
for j = r + 1 : col - r
Ind(n, :) = box + (i - 1 - r + row * (j - 1 - r));
n = n + 1;
end
end
toc
For a [300 x 400] pixel image your method needs 80.0 seconds on my i5 with MATLAB R2018b, and my code needs 0.027 seconds. That is a speed-up by a factor of about 3000. Nice.
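If you want to convince yourself that the index arithmetic gives the same result as the FIND approach, a small check along these lines might help. It is only a sketch: `row`, `col` and `r` are arbitrary test values, `IndFast`, `IndFind`, `mask` and `w` are names introduced for this check, and it assumes the box covers the full (2*r+1) x (2*r+1) window. The implicit expansion in the `box` line needs R2016b or newer.

```matlab
% Sketch: compare the direct index arithmetic with the FIND-based approach.
row = 7;  col = 9;  r = 2;       % arbitrary small test size
w   = 2*r + 1;                   % window width

% Linear indices of a w-by-w window whose top-left corner is at (1,1)
box = reshape((1:w).' + (0:w-1) * row, 1, []);

nWin    = (row - 2*r) * (col - 2*r);
IndFast = zeros(nWin, w^2);      % pre-allocation
IndFind = zeros(nWin, w^2);
n = 1;
for i = r + 1 : row - r
    for j = r + 1 : col - r
        % Shift the box so that its centre is at (i,j)
        IndFast(n, :) = box + (i - 1 - r + row * (j - 1 - r));

        % Reference: mark the window in a mask and let FIND locate it
        mask = false(row, col);
        mask(i-r:i+r, j-r:j+r) = true;
        IndFind(n, :) = find(mask).';
        n = n + 1;
    end
end

assert(isequal(IndFast, IndFind), 'Index matrices differ');
```

Both methods list the window elements in column-major order, which is why the rows can be compared directly.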
If you smooth a 12 MPixel image with a 10*10 kernel, the index matrix needs about 12e6*10*10*8 bytes, which is 9.6 GB already. So this approach is rather demanding on memory. But of course, a moving median filter does not need to create a huge index matrix first. On the one hand, it is cheaper to calculate the median directly inside the loops instead of collecting the indices first. On the other hand, MATLAB's toolbox function medfilt2 can solve this as well.
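A minimal sketch of the first idea, calculating the median inside the loops without any index matrix, could look like this. The matrix `I` is just a small stand-in for the depth image, and `P`, `J`, `J2`, `win` and `w` are names chosen for this example. The comparison with medfilt2 only runs if the Image Processing Toolbox is available; it assumes that the default zero padding of medfilt2 matches the zero padding used in the question.

```matlab
% Sketch: moving median computed directly inside the loops.
I = magic(4);                    % small stand-in for the depth image
r = 1;                           % window radius
w = 2*r + 1;                     % window width

% Zero padding by hand, same effect as padarray(I, [r r], 0, 'both')
P = zeros(size(I) + 2*r);
P(r+1:end-r, r+1:end-r) = I;

J = zeros(size(I));              % pre-allocated output
for i = 1:size(I, 1)
    for j = 1:size(I, 2)
        win     = P(i:i+2*r, j:j+2*r);   % window centred on pixel (i,j) of I
        J(i, j) = median(win(:));
    end
end

% Optional comparison with medfilt2 (Image Processing Toolbox)
if exist('medfilt2', 'file')
    J2 = medfilt2(I, [w w]);
    fprintf('max abs difference to medfilt2: %g\n', max(abs(J(:) - J2(:))));
end
```

For the point cloud in the question, the same filter could be applied to each coordinate separately, provided the rows of MatPC can be reshaped to the image size. Whether that holds depends on how MatPC was created, which the question does not show.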
Remember to pre-allocate the output of AverageX/Y/Z also, if you really want to use the slower loops.
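As a sketch of what this could look like: the block below builds a small index matrix, runs the loop with pre-allocated outputs, and compares it with a variant that indexes each coordinate with the whole index matrix at once. `MatPC` is filled with random numbers here as a hypothetical stand-in for the real N x 3 point cloud (one row per pixel of the padded image is an assumption), and `nWin`, `V`, `Pk` and `AveragePC2` are names introduced only for this example.

```matlab
% Sketch: pre-allocated outputs and a variant without the per-pixel loop.
row = 6;  col = 5;  r = 1;       % small test size (padded image)
w   = 2*r + 1;
MatPC = rand(row * col, 3);      % hypothetical N-by-3 point cloud

% Index matrix: one row of window indices per inner pixel
box  = reshape((1:w).' + (0:w-1) * row, 1, []);
nWin = (row - 2*r) * (col - 2*r);
Ind  = zeros(nWin, w^2);
n = 1;
for i = r + 1 : row - r
    for j = r + 1 : col - r
        Ind(n, :) = box + (i - 1 - r + row * (j - 1 - r));
        n = n + 1;
    end
end

% Loop version with pre-allocated outputs
AverageX = zeros(nWin, 1);
AverageY = zeros(nWin, 1);
AverageZ = zeros(nWin, 1);
for k = 1:nWin
    Pk = MatPC(Ind(k, :), :);    % neighbours of pixel k, (w^2)-by-3
    AverageX(k) = median(Pk(:, 1));
    AverageY(k) = median(Pk(:, 2));
    AverageZ(k) = median(Pk(:, 3));
end
AveragePC = [AverageX, AverageY, AverageZ];

% Variant: gather all neighbours of one coordinate at once,
% then take the median along each row
AveragePC2 = zeros(nWin, 3);
for c = 1:3
    V = MatPC(:, c);
    AveragePC2(:, c) = median(reshape(V(Ind), size(Ind)), 2);
end

assert(isequal(AveragePC, AveragePC2), 'Results differ');
```

The second variant still needs the full index matrix in memory, so the memory concern from above applies to it as well.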