MATLAB Answers

Why doesn't the ocr function recognize the numbers?

Adriano on 16 Jan 2018
Commented: ISAAC DOUGHAN on 9 Aug 2020
Hi,
I have the image below:
I need to extract the numbers with the ocr function, so I use this code:
capture = imread('Capture.png');
my_image = imresize(capture, 1.4);
ocrResults = ocr(my_image,'CharacterSet','.0123456789');
recognizedText = ocrResults.Words;
However, the ocr function recognizes only some of the numbers. The output is a 37x1 cell array, while I should get 39 rows:
{'18.33'}
{'1423' }
{'1' }
{'6.55' }
{'1' }
{'5.65' }
{'12.54'}
{'14.77'}
{'10.33'}
{'13.79'}
{'12.94'}
{'1255' }
{'1' }
{'1.70' }
{'9.84' }
{'10.71'}
{'9.74' }
{'9.98' }
{'933' }
{'9.00' }
{'7.22' }
{'3.02' }
{'7.45' }
{'7.10' }
{'6.56' }
{'6.28' }
{'5.86' }
{'5.40' }
{'5.01' }
{'4.57' }
{'4.10' }
{'174' }
{'3.39' }
{'3.011'}
{'2.71' }
{'2.33' }
{'2.118'}
Also, many of the numbers are wrong. Can someone please help me? Thanks!


Accepted Answer

Birju Patel on 17 Jan 2018
Edited: Birju Patel on 17 Jan 2018
Hi,
A little pre-processing, plus ROIs that tell ocr where the words are, will help. By default, ocr uses page layout analysis to determine blocks of text, and this image doesn't look like a normal page of text (such as a PDF article).
To make it easier for ocr, first find the location of the words using regionprops, then pass those locations (as bounding boxes) to the ocr function. See the code and results below; they look accurate. You may have to experiment further with the pre-processing to make this robust across a collection of different images, but hopefully this gives you an idea of how to proceed:
capture = imread('Captura.PNG');
% Increase image size by 3x
my_image = imresize(capture, 3);
figure
imshow(my_image)
% Localize words
BW = imbinarize(rgb2gray(my_image));
BW1 = imdilate(BW,strel('disk',6));
s = regionprops(BW1,'BoundingBox');
bboxes = vertcat(s(:).BoundingBox);
% Sort boxes top to bottom by the y-coordinate of their upper-left corner
[~,ord] = sort(bboxes(:,2));
bboxes = bboxes(ord,:);
% Pre-process image to make letters thicker
BW = imdilate(BW,strel('disk',1));
% Call OCR and pass in location of words. Also, set TextLayout to 'word'
ocrResults = ocr(BW,bboxes,'CharacterSet','.0123456789','TextLayout','word');
words = {ocrResults(:).Text}';
words = deblank(words)
words =
39×1 cell array
{'0' }
{'18.33'}
{'0' }
{'14.23'}
{'0' }
{'16.55'}
{'0' }
{'15.65'}
{'12.64'}
{'14.77'}
{'10.83'}
{'13.79'}
{'12.94'}
{'12.55'}
{'0' }
{'11.70'}
{'9.84' }
{'10.71'}
{'9.74' }
{'9.98' }
{'9.33' }
{'9.00' }
{'7.22' }
{'8.02' }
{'7.45' }
{'7.10' }
{'6.56' }
{'6.28' }
{'5.86' }
{'5.40' }
{'5.02' }
{'4.57' }
{'4.10' }
{'3.74' }
{'3.39' }
{'3.00' }
{'2.71' }
{'2.33' }
{'2.08' }
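
When trying this on other images, one way to spot the entries that deserve a second look is to check the confidence that ocr reports for each ROI. The following is only a sketch: flagLowConfidence is a hypothetical helper name (it is not part of any toolbox), and the threshold passed to it is an assumption you would need to tune. It relies only on the Text and WordConfidences properties of the ocrText results.

```matlab
function idx = flagLowConfidence(ocrResults, minConfidence)
% flagLowConfidence  Hypothetical helper (not a toolbox function).
% Returns the indices of the ROI results that are empty or that contain a
% word whose confidence is below minConfidence (confidences lie in [0, 1]).
idx = [];
for k = 1:numel(ocrResults)
    conf = ocrResults(k).WordConfidences;
    % An empty confidence vector means nothing was recognized in this ROI
    if isempty(conf) || any(conf < minConfidence)
        idx(end+1) = k; %#ok<AGROW>
    end
end
end
```

For example, suspect = flagLowConfidence(ocrResults, 0.5) followed by words(suspect) would list the strings to verify by eye; the value 0.5 is purely illustrative.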

5 Comments (2 older comments not shown)
Robert Cadavos on 26 Apr 2020
You're my hero, man!
José Luis Sandoval on 8 Jun 2020
It still does not work for me.
ISAAC DOUGHAN on 9 Aug 2020
Hi Birju,
I tried training ocr to recognise the numbers from 1 to 50.
The font is Consolas. However, zero (0) and two (2) are consistently confused.
Zero (0) is always read as 2, and sometimes as a forward slash, when I run it in MATLAB.
This is the OneDrive link to a copy of the data. I have also added the code to it:
https://studentuef-my.sharepoint.com/:u:/g/personal/isaacd_uef_fi/EdC1xH4AGJ5PnukqZNahscYBFKuJ_WoS98uHgzi3NSIwLA?e=Pc90ys
What am I doing wrong?
Best regards,
Isaac
