How to automatically identify text lines in curved document images?
For document images with a fronto-parallel view, the projection method is effective for identifying text lines. For curved document images, where the text lines are not straight, projection is not effective. I need help solving this issue.
As an example, consider the image below, which shows Arabic text.
![](https://www.mathworks.com/matlabcentral/answers/uploaded_files/145526/image.jpeg)
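For reference, the projection method mentioned in the question can be sketched as below for the straight-line case. This is a minimal illustration on a synthetic binary image, not the poster's code; the image size, line positions, and the zero threshold are all assumptions chosen for a clean example.

```matlab
% Synthetic binary page with three straight "text lines" (true = ink).
bw = false(60, 100);
bw(10:14, 10:90) = true;
bw(28:32, 10:90) = true;
bw(46:50, 10:90) = true;

% Horizontal projection profile: number of ink pixels in each row.
rowProfile = sum(bw, 2);

% Rows that contain ink. A threshold of 0 is an assumption that only
% holds for clean, noise-free input.
isText = rowProfile > 0;

% A line starts where isText switches on and ends where it switches off.
d = diff([0; double(isText); 0]);
lineStart = find(d == 1);
lineEnd   = find(d == -1) - 1;

fprintf('Found %d text lines\n', numel(lineStart));
```

On a curved page the rows of neighbouring lines overlap, so the empty gaps in rowProfile fill in and this row-wise split no longer separates the lines.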
Answers (2)
Birju Patel
29 Sep 2014
Hi,
You might be able to use CPSELECT and FITGEOTRANS with the polynomial transform to remove the distortion of the text lines.
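A rough sketch of that idea, assuming the Image Processing Toolbox is available. To keep it self-contained, it uses a synthetic image whose lines sag along a parabola, and it computes the control points from that known distortion instead of picking them interactively with cpselect. The image, the sag amount, and the point grid are all made up for illustration; on a real page you would select point pairs by hand and may need a higher polynomial degree.

```matlab
% Synthetic "curved" page: horizontal stripes that sag along a parabola.
[X, Y] = meshgrid(1:200, 1:120);
sag = 0.002 * (X - 100).^2;                 % assumed distortion, in pixels
curved = double(mod(round(Y - sag), 30) < 6);

% Control point pairs. fixedPts are where the points should end up on the
% straightened page; movingPts are where they sit on the curved page.
% With a real image these would come from cpselect instead.
[xf, yf] = meshgrid([1 50 100 150 200], [15 60 105]);
fixedPts  = [xf(:), yf(:)];
movingPts = [xf(:), yf(:) + 0.002 * (xf(:) - 100).^2];

% Fit a second-degree polynomial transformation (needs at least 6 pairs).
tform = fitgeotrans(movingPts, fixedPts, 'polynomial', 2);

% Warp the curved image into the same pixel grid as the input.
straight = imwarp(curved, tform, 'OutputView', imref2d(size(curved)));

% Sanity check: the fitted mapping should reproduce the control points.
err = transformPointsInverse(tform, fixedPts) - movingPts;
fprintf('Max control point error: %.2g pixels\n', max(abs(err(:))));
```

Once the lines are roughly straight, a row-wise projection on the warped image should behave as it does on a fronto-parallel page.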
As for OCR, the OCR function in the Computer Vision System Toolbox supports many languages, including Arabic. You can try downloading the Arabic language data file and using it with the OCR function.
Download the Arabic language data file:
https://code.google.com/p/tesseract-ocr/downloads/detail?name=tesseract-ocr-3.02.ara.tar.gz&can=2&q=
Then pass the location of the ara.traineddata file via the OCR function's 'Language' parameter and try it out on your image. See the OCR reference page for details.
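Something along these lines, as a sketch only. Both file paths are hypothetical placeholders, and the tessdata folder name is an assumption based on how I recall the reference page describing custom language files, so check it against your release.

```matlab
% Hypothetical paths: replace with your own image and the extracted
% language file.
imageFile = 'arabic_page.jpg';
langFile  = fullfile('tessdata', 'ara.traineddata');

if exist(imageFile, 'file') == 2 && exist(langFile, 'file') == 2
    img = imread(imageFile);
    % Pass the full path of the .traineddata file as the language.
    results = ocr(img, 'Language', langFile);
    disp(results.Text);
else
    disp('Set imageFile and langFile to real paths before running.');
end
```

The returned object also carries word locations and confidences, which can help judge whether the straightening step improved recognition.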
Image Analyst
25 Sep 2014
I don't think the OCR built into the Computer Vision System Toolbox handles that language, but it might, so check it out. If it does not, there is no built-in functionality for it and you'll have to write your own. In that case, look online for published algorithms here: http://www.visionbib.com/bibliography/contentschar.html#OCR,%20Document%20Analysis%20and%20Character%20Recognition%20Systems