Detect the vertical and horizontal lines and then crop the area that is handwritten

Hello, I have a complete data set of more than 1000 images like this one. I want to detect the vertical and horizontal lines and then crop the handwritten area. After cropping, each handwritten character should be saved as an individual image. Any help will be appreciated. Thanks

Accepted Answer

Image Analyst on 19 Jul 2017
This code will do it. Try it and let me know. Then adapt it by making it a function and putting it inside a loop, as shown in the FAQ, to process the other thousand images. It saves each cropped image, named with the row and column number it came from, into the folder of the original image. Once you verify that it works, you can comment out the questdlg() call so that it runs without showing you each small cropped image and prompting you to continue.
clc; % Clear the command window.
close all; % Close all figures (except those of imtool).
clear; % Erase all existing variables. Or clearvars if you want.
workspace; % Make sure the workspace panel is showing.
format long g;
format compact;
fontSize = 15;
%===============================================================================
% Get the name of the image the user wants to use.
baseFileName = '013.jpg';
% Get the full filename, with path prepended.
folder = pwd;
fullFileName = fullfile(folder, baseFileName);
% Check if file exists.
if ~exist(fullFileName, 'file')
    % The file doesn't exist -- didn't find it there in that folder.
    % Check the entire search path (other folders) for the file by stripping off the folder.
    fullFileNameOnSearchPath = baseFileName; % No path this time.
    if ~exist(fullFileNameOnSearchPath, 'file')
        % Still didn't find it. Alert user.
        errorMessage = sprintf('Error: %s does not exist in the search path folders.', fullFileName);
        uiwait(warndlg(errorMessage));
        return;
    end
    % Found it on the search path, so read it from there.
    fullFileName = fullFileNameOnSearchPath;
end
%===============================================================================
% Read in demo image.
rgbImage = imread(fullFileName);
% Get the dimensions of the image.
[imageRows, imageColumns, numberOfColorChannels] = size(rgbImage);
% Display the original image.
subplot(2, 2, 1);
imshow(rgbImage, []);
axis on;
caption = sprintf('Original Color Image, %s', baseFileName);
title(caption, 'FontSize', fontSize, 'Interpreter', 'None');
% Set up figure properties:
% Enlarge figure to full screen.
set(gcf, 'Units', 'Normalized', 'OuterPosition', [0 0 1 1]);
% Get rid of tool bar and pulldown menus that are along top of figure.
% set(gcf, 'Toolbar', 'none', 'Menu', 'none');
% Give a name to the title bar.
set(gcf, 'Name', 'Demo by ImageAnalyst', 'NumberTitle', 'Off')
drawnow;
hp = impixelinfo(); % Set up status line to see values when you mouse over the image.
% Extract the red color channel (the only one needed for the threshold).
redChannel = rgbImage(:, :, 1);
% greenChannel = rgbImage(:, :, 2);
% blueChannel = rgbImage(:, :, 3);
% Threshold to get the mask.
mask = redChannel > 224;
% Get rid of the white surround.
mask = imclearborder(mask);
% Fill holes.
mask = imfill(mask, 'holes');
% Keep only blobs with an area of at least 150*150 pixels.
mask = bwareafilt(mask, [150*150, inf]);
% Erode with a 10-by-10 structuring element to get rid of any black lines
% in the bounding box due to it being rotated.
mask = imerode(mask, ones(10));
% Display the image.
subplot(2, 2, 2);
imshow(mask);
grid on;
axis on;
hold on;
title('Mask Image', 'FontSize', fontSize);
drawnow;
% Label the image
[labeledImage, numBlobs] = bwlabel(mask);
% Measure bounding boxes:
props = regionprops(labeledImage, 'BoundingBox', 'Centroid');
allCentroids = [props.Centroid];
xCentroids = allCentroids(1:2:end);
yCentroids = allCentroids(2:2:end);
% There are 14 rows and 10 columns.
% Find the average position of each column and row using kmeans
% (kmeans is in the Statistics Toolbox).
[~, xClusterCtr] = kmeans(xCentroids', 10);
[~, yClusterCtr] = kmeans(yCentroids', 14);
% Sort them in ascending order (no semicolon, so the values are printed).
xClusterCtr = sort(xClusterCtr, 'ascend')
yClusterCtr = sort(yClusterCtr, 'ascend')
% Loop through them
for k = 1 : numBlobs
    % Get the bounding box of this blob.
    thisBB = props(k).BoundingBox;
    % Crop out the small box.
    thisCroppedImage = imcrop(rgbImage, thisBB);
    % Display the image.
    subplot(2, 2, 3);
    imshow(thisCroppedImage, []);
    axis on;
    caption = sprintf('Blob #%d', k);
    title(caption, 'FontSize', fontSize);
    % Find out which row and column this is in:
    % the nearest cluster center in x gives the column, in y the row.
    thisX = xCentroids(k);
    thisY = yCentroids(k);
    distancesX = abs(thisX - xClusterCtr);
    distancesY = abs(thisY - yClusterCtr);
    [~, column] = min(distancesX);
    [~, row] = min(distancesY);
    fprintf('Blob #%d is at (%.1f, %.1f) in row %d, column %d\n', ...
        k, thisX, thisY, row, column);
    % Plot a star there.
    subplot(2, 2, 2);
    % Plot actual centroid.
    plot(thisX, thisY, 'r*', 'MarkerSize', 10, 'LineWidth', 2);
    % Plot star at grid crossing lines.
    plot(xClusterCtr(column), yClusterCtr(row), 'b*', 'MarkerSize', 10, 'LineWidth', 2);
    % Prepare filename.
    baseFileName = sprintf('Row %d, Column %d.png', row, column);
    fullFileName = fullfile(folder, baseFileName);
    imwrite(thisCroppedImage, fullFileName);
    % Pause to prompt user.
    promptMessage = sprintf('Saved image as %s\nDo you want to Continue processing,\nor Quit processing?', fullFileName);
    titleBarCaption = 'Continue?';
    buttonText = questdlg(promptMessage, titleBarCaption, 'Continue', 'Quit', 'Continue');
    if strcmpi(buttonText, 'Quit')
        break;
    end
end
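The answer suggests wrapping the script in a function and calling it in a loop over all the images. A minimal sketch of that outer loop follows, assuming the images are .jpg files in the current folder. The function name cropBoxes and the 'cropped' subfolder are hypothetical and do not come from the thread; the call is left commented out so the loop runs on its own.

```matlab
inputFolder = pwd;                               % assumed location of the images
imageFiles = dir(fullfile(inputFolder, '*.jpg'));
outputFolders = cell(1, numel(imageFiles));
for f = 1 : numel(imageFiles)
    fullFileName = fullfile(inputFolder, imageFiles(f).name);
    [~, baseName] = fileparts(fullFileName);
    % One output folder per source image, so 'Row 1, Column 1.png' from one
    % sheet cannot overwrite the file of the same name from another sheet.
    outputFolders{f} = fullfile(inputFolder, 'cropped', baseName);
    if ~exist(outputFolders{f}, 'dir')
        mkdir(outputFolders{f});
    end
    % cropBoxes is a hypothetical function wrapping the script above:
    % cropBoxes(fullFileName, outputFolders{f});
    fprintf('Would process %s into %s\n', fullFileName, outputFolders{f});
end
```

The per-image output folder is a design choice of this sketch: the script names its files only by row and column, so writing every sheet into one folder would overwrite earlier results.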

18 Comments

Raja Bilal Rsb on 19 Jul 2017
Hi Image Analyst, thank you for your response. When I try to run your code, it gives an error:
Undefined function 'bwareafilt' for input arguments of type 'double'.
Error in urdudbnew (line 59)
mask = bwareafilt(mask, [150*150, inf]);
How do I resolve this?
You have a really old version of MATLAB. Please upgrade, or else replace the line with
mask = bwareaopen(mask, 150*150);
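If the same script has to run on both old and new releases, one option is to test for the function at run time. This is only a sketch: the test mask and the variable minArea are made up for illustration, and both branches need the Image Processing Toolbox. With the range [minArea, inf], both calls keep the blobs that have at least minArea pixels.

```matlab
% Hypothetical test mask: one large blob and one small blob.
mask = false(400, 400);
mask(50:250, 50:250) = true;     % 201x201 pixels, larger than 150*150
mask(300:320, 300:320) = true;   % 21x21 pixels, should be removed
minArea = 150 * 150;

if exist('bwareafilt', 'file')
    % Newer releases: keep blobs whose area lies in [minArea, inf].
    mask = bwareafilt(mask, [minArea, inf]);
else
    % Older releases: remove blobs with fewer than minArea pixels.
    mask = bwareaopen(mask, minArea);
end
```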
Raja Bilal Rsb on 19 Jul 2017
I have MATLAB R2013a, and I have checked that it has the Image Processing Toolbox. Still, I will do as you have suggested.
Raja Bilal Rsb on 19 Jul 2017
The previous error is solved, but now I am getting a bunch of other errors.
Image Analyst on 19 Jul 2017
Edited: Image Analyst on 19 Jul 2017
Is this the same image I used? You have the stats toolbox, correct?
Raja, I'm on a different computer now than when I developed the code for you. I downloaded the image and code fresh to this machine and ran it. It ran perfectly. Please attach the actual image you used along with your code with any changes to my code that you made.
Raja Bilal Rsb on 19 Jul 2017
Edited: Raja Bilal Rsb on 19 Jul 2017
Yes, I used the same image that you used; I am attaching the actual image again. I have the stats toolbox installed.
I only changed this line
mask = bwareafilt(mask, [150*150, inf]);
to this
mask = bwareaopen(mask, 150*150);
Image Analyst on 19 Jul 2017
Raja:
I ran that code with your newly posted image and it ran great. No problems whatsoever. Set a breakpoint at line 80 and see what you're sending in to kmeans(). There must be a problem with the x and y centroids arrays.
Raja Bilal Rsb on 20 Jul 2017
I tried putting a breakpoint on line 80 and running the code, but there was still an error. Next I put one more breakpoint on line 81, and it worked fine, but when I remove the breakpoint from line 80 it shows the error again. I don't understand this. I am attaching screenshots for these breakpoints.
Breakpoint on line 80:
Breakpoint on line 81:
Image Analyst on 20 Jul 2017
That all looks exactly right. Because it has xClusterCtr, that means that kmeans() ran successfully this time. You should be able to run the next kmeans() to get yClusterCtr and everything should work fine from then on. Step through it a line at a time and see where, if anywhere, it errors out.
One more thing: the output image is very large. I want it to be no more than 28x28, so should I write separate code for that? Or can we do it with imresize, by modifying this line, or by using imclearborder()?
thisCroppedImage = imcrop(rgbImage, thisBB);
Raja Bilal Rsb on 20 Jul 2017
Edited: Raja Bilal Rsb on 20 Jul 2017
Using the breakpoints, I cropped the whole image into individual images, but the output folder has only 91 images. There should be 140 images, as there are 140 boxes. Every time I crop the images, rows 4 and 10 are missing.
It will do 140. Either get rid of the prompt, or click Continue at all 140 prompts.
28x28 is not enough resolution to distinguish the different characters, I'd think. So you have to decide whether you want to crop to the red letter and then resize it to 28x28, or take the whole white box with the red letter inside and resize that.
% Reduce/degrade the resolution to something really poor:
thisCroppedImage = imresize(thisCroppedImage, [28, 28]);
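The first option, cropping to the letter before resizing, could look roughly like the sketch below. It rests on assumptions that are not in the thread: the ink is taken to be much darker than the paper in the green channel (threshold 128), and the test image is a synthetic stand-in for one cropped box. The letter is padded to a square first so that the resize does not stretch it.

```matlab
% Hypothetical stand-in for one cropped box: white paper with a red stroke.
thisCroppedImage = 255 * ones(200, 200, 3, 'uint8');
thisCroppedImage(60:150, 90:110, 2:3) = 0;      % red ink: green and blue set to 0

% Assumed ink test: ink is much darker than paper in the green channel.
inkMask = thisCroppedImage(:, :, 2) < 128;

% Tight bounding box of the ink, found from the occupied rows and columns.
rows = find(any(inkMask, 2));
cols = find(any(inkMask, 1));
letter = inkMask(rows(1):rows(end), cols(1):cols(end));

% Pad to a square so the resize does not distort the aspect ratio.
[h, w] = size(letter);
side = max(h, w);
squareLetter = false(side, side);
rowOffset = floor((side - h) / 2);
colOffset = floor((side - w) / 2);
squareLetter(rowOffset + (1:h), colOffset + (1:w)) = letter;

% Resize the binary letter to 28x28 with imresize, as in the line above.
smallLetter = imresize(squareLetter, [28, 28]);
```

An empty box has no ink pixels, so rows and cols would be empty; a real loop would need to skip such boxes before indexing.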
Raja Bilal Rsb on 20 Jul 2017
The code is working as I need. The only issue that remains is that the output image size is very large, but that's not a problem; I will modify my CNN code. Thank you so much. Stay Blessed :)
Raja Bilal Rsb on 20 Jul 2017
Edited: Raja Bilal Rsb on 20 Jul 2017
Sir, I have removed the prompt dialog box, but I am still not getting all 140 images; I am getting only 96. In the screenshot below, columns 6 and 8 are missing; they are not cropped. Is there any solution for this?
Image Analyst on 21 Jul 2017
Raja, I don't know what to say. It works perfectly fine for me. Here's proof it worked for me:
The file count is 140. Windows' File Explorer even says so.
Raja Bilal Rsb on 21 Jul 2017
I have upgraded my MATLAB to the latest version, and it's working fine now. I am getting all 140 files, and the code also works with a directory path as input. Thank you so much.
Image Analyst on 21 Jul 2017
Thanks for getting back. It's really strange, though, because I don't see any reason why an earlier version would give fewer files. If you had an old version, prior to bwareafilt(), it would have thrown an error, but to complete without any errors and just not have the right number of files is really weird.
Raja Bilal Rsb on 22 Jul 2017
In the screenshot of the mask image, the boxes whose stars are red were not cropped. But the issue is solved now that I have upgraded my MATLAB. I also don't understand this weird behavior. Thank you.
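One possible explanation, which the thread does not confirm: kmeans() starts from randomly chosen centers, so it can settle on a solution in which two neighboring rows or columns share a single cluster center. Two boxes would then get the same row and column numbers, and because the file name is built only from those numbers, the second imwrite() would silently overwrite the first. The sketch below avoids kmeans() by sorting the centroids and starting a new row or column at every large gap. The grid values and the 50-pixel minGap are assumptions for illustration, and the approach only works if the sheet is not strongly rotated. Another option to consider would be the 'Replicates' parameter of kmeans.

```matlab
% Hypothetical centroids: a 3-row by 4-column grid with a few pixels of jitter.
[gridX, gridY] = meshgrid([100 300 500 700], [120 320 520]);
jitter = mod(1:numel(gridX), 5) - 2;            % deterministic offsets in [-2, 2]
xCentroids = gridX(:)' + jitter;
yCentroids = gridY(:)' - jitter;

minGap = 50;   % assumed: rows/columns are more than 50 pixels apart

% Sort the coordinates; a jump larger than minGap starts a new group.
[sortedX, orderX] = sort(xCentroids);
column = zeros(size(xCentroids));
column(orderX) = cumsum([1, diff(sortedX) > minGap]);

[sortedY, orderY] = sort(yCentroids);
row = zeros(size(yCentroids));
row(orderY) = cumsum([1, diff(sortedY) > minGap]);

% Count the distinct (row, column) pairs; a count below the number of
% blobs would mean that some output file names repeat.
pairs = unique([row(:), column(:)], 'rows');
fprintf('%d blobs, %d distinct (row, column) pairs\n', numel(row), size(pairs, 1));
```

Comparing the number of distinct pairs with numBlobs before saving is a cheap way to detect overwritten files, whichever grouping method is used.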


More Answers (1)

Kevin Xia on 18 Jul 2017
Edited: Kevin Xia on 18 Jul 2017
You can use bwconncomp and regionprops to find the centroids of each whitespace box. The centroids can be used to generate subimages, "cropping" the target image. Here is an example:
Read the image from file and convert it to a black and white image using imbinarize. Note that the threshold used here is 0.8; the right threshold depends on the image. You can use the graythresh function to generate the threshold automatically, but it may still have to be tuned.
I=imread('013.jpg'); %insert image name in place of '013.jpg'
greyIm=rgb2gray(I);
bwIm=imbinarize(greyIm,0.8);
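Find the connected regions in the array using bwconncomp, and keep the indices of the regions large enough to be boxes:
CC=bwconncomp(bwIm); %Connected components of the black and white image.
numPixels=cellfun(@numel,CC.PixelIdxList); %For each connected component, calculate the number of pixels.
boxIndices=find(numPixels>22500); %Boxes are larger than 150x150 pixels.
Calculate the centroids of all connected regions, then select those of the boxes. Refer to the documentation of regionprops for more detail:
S=regionprops(CC,'Centroid');
centroids=cat(1,S.Centroid);
boxCentroids=centroids(boxIndices,:); %One [x y] row per box.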
Verify that all whitespace box centroids have been found:
figure;
imshow(bwIm);
hold on
plot(boxCentroids(:,1),boxCentroids(:,2),'b*')
hold off
Create subimages from the whitespace box centroids using matrix indexing on the black and white image. In this case, I created a subimage of about 200x200 pixels around the second box centroid. Note that the centroids are floating point numbers and have to be rounded to integers, here using ceil. This can be automated with a loop:
%obtaining one box:
Xrange=ceil(boxCentroids(2,1))-100:ceil(boxCentroids(2,1))+100; %each box is approximately 200x200 pixels.
Yrange=ceil(boxCentroids(2,2))-100:ceil(boxCentroids(2,2))+100;
boxIm=bwIm(Yrange,Xrange);
figure;
imshow(boxIm)
One of the box centroids (the first one, in the sample image) is actually the centroid of the grid, and thus will produce a noncentered image. All other centroids should be the centroids of the whitespace boxes. imwrite can be used to save the image.
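A sketch of the loop mentioned above follows. The image and the centroid list are synthetic stand-ins so that the block runs on its own, and halfSize, outFolder and the 'box_%03d.png' naming are assumptions rather than part of the answer. The index ranges are clamped to the image size so that a centroid near the border does not cause an out-of-range error.

```matlab
% Stand-ins for the variables computed above (hypothetical values).
bwIm = rand(600, 800) > 0.5;                 % placeholder binary image
boxCentroids = [150.3 160.7; 400.2 160.1; 790.0 590.0];
halfSize = 100;                              % boxes are roughly 200x200 pixels
outFolder = tempdir;                         % assumed output location

[numRows, numCols] = size(bwIm);
for k = 1 : size(boxCentroids, 1)
    cx = round(boxCentroids(k, 1));
    cy = round(boxCentroids(k, 2));
    % Clamp the ranges so a box near the edge does not index outside the image.
    xRange = max(1, cx - halfSize) : min(numCols, cx + halfSize);
    yRange = max(1, cy - halfSize) : min(numRows, cy + halfSize);
    boxIm = bwIm(yRange, xRange);
    outName = fullfile(outFolder, sprintf('box_%03d.png', k));
    imwrite(boxIm, outName);
end
```

With the real data, the centroid that belongs to the grid itself would still have to be skipped, as noted above.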

5 Comments

Raja Bilal Rsb on 18 Jul 2017
Thank you, Kevin Xia, for your response. If I write the code as you have explained, will it work for all the images (I will provide a directory path where all the images are stored)? I don't want to do any manual work, as it will take too much time. Also, can I resize the subimages so that they do not exceed 28x28 pixels? This data set will be used to train a convolutional neural network. I am a bit new to image processing, so more help in this regard will be appreciated.
The code will work for single files, and will not work for a directory path. However, you can place the entire code body in a for loop so that it does. You can pull image names from the directory path using:
imageFiles=dir('*.jpg') %or any other desired image extension
imageFileNames={imageFiles.name};
See the documentation of dir for more details. You'll have to be in the directory with the image files for the above to work. To get smaller subimages, you can either change the half-width in the indexing from 100 to 14, as follows:
Xrange=ceil(boxCentroids(2,1))-14:ceil(boxCentroids(2,1))+14; %29 pixels wide, centered on the box centroid
Though this might cut off the letter. Alternatively, you can use imresize:
newIm=imresize(boxIm,scale)
See the documentation on imresize for more details.
Raja Bilal Rsb on 19 Jul 2017
Edited: Raja Bilal Rsb on 19 Jul 2017
I worked on this a bit and wrote the code below, which gets the images from a directory path, crops the cells, and then saves the cropped images in another folder. Can you please modify it to do the job? Thanks
clc;
clear;
close all;
listdir = dir(['D:\URDUDDB\' '*.jpg']);
for fileCtr = 1:1:length(listdir)
    fname = ['D:\URDUDDB\' listdir(fileCtr).name];
    img = imread(fname);
    greyIm=rgb2gray(img);
    bwIm=imbinarize(greyIm,0.8);
    numPixels=cellfun(@numel,CC.PixelIdxList); %For each connected component, calculate the number of pixels.
    boxIndices=find(numPixels>22500);
    S=regionprops(CC,'Centroid');
    centroids=cat(1,S.Centroid);
    figure;
    imshow(bwIm);
    hold on
    plot(boxCentroids(:,1),boxCentroids(:,2),'b*')
    hold off
    %obtaining one box:
    Xrange=ceil(boxCentroids(2,1))-100:ceil(boxCentroids(2,1))+100; %each box is approximately 200x200 pixels.
    Yrange=ceil(boxCentroids(2,2))-100:ceil(boxCentroids(2,2))+100;
    boxIm=bwIm(Yrange,Xrange);
    figure;
    imshow(boxIm)
    cname = ['D:\URDUDDB\cimgs\' num2str(randi(100)) '_' listdir(fileCtr).name];
    imwrite(boxIm, cname);
end
Image Analyst on 19 Jul 2017
Edited: Image Analyst on 19 Jul 2017
After I fixed the first few problems, there were more. And the more I fixed it, the more it started to approach my code; for example, you'd need to call imclearborder() and imfill(), fix the output filename, get the cropped image size correct, etc. So you might as well just use my code, which already works. There is a reason my code is usually longer than others': it's flexible, general, robust, and extensively commented.
Raja Bilal Rsb on 20 Jul 2017
Yes Sir, your code is much better than this one; I was just trying to work through this problem. I will surely use your code.
