detectTextCRAFT
Syntax
bboxes = detectTextCRAFT(I)
bboxes = detectTextCRAFT(I,roi)
bboxes = detectTextCRAFT(___,Name=Value)
Description
bboxes = detectTextCRAFT(I) detects texts in images by using the character region awareness for text detection (CRAFT) deep learning model. The detectTextCRAFT function uses a pretrained CRAFT deep learning model to detect texts in an image. The pretrained CRAFT model supports nine languages: Chinese, Japanese, Korean, Italian, English, French, Arabic, German, and Bangla (Indian).
Note
To use the pretrained CRAFT model, you must install the Computer Vision Toolbox™ Model for Text Detection. You can download and install the Computer Vision Toolbox Model for Text Detection from Add-On Explorer. For more information about installing add-ons, see Get and Manage Add-Ons. Running this function also requires Deep Learning Toolbox™.
bboxes = detectTextCRAFT(I,roi) detects texts within a region-of-interest (ROI) in the image.
bboxes = detectTextCRAFT(___,Name=Value) specifies additional options by using name-value pair arguments. You can use the name-value pair arguments to fine-tune the detection results.
Examples
Detect Texts in Images by Using CRAFT Model
Read an input image into the MATLAB workspace.
I = imread("handicapSign.jpg");
Compute the text detection results by using the detectTextCRAFT function. The region and the affinity thresholds are set to default values. The output is a set of bounding boxes that contain the detected text regions.
bboxes = detectTextCRAFT(I);
Draw the output bounding boxes on the image by using the insertShape function.
Iout = insertShape(I,"rectangle",bboxes,LineWidth=3);
Display the text detection results.
figure
imshow(Iout)
Detect Texts in ROI by Using CRAFT
Read an input image into the MATLAB workspace.
visiondatadir = fullfile(toolboxdir('vision'),'visiondata');
I = imread(fullfile(visiondatadir,'imageSets','books','pairOfBooks.jpg'));
Specify a region of interest (ROI) within the input image.
roi = [120,80,250,200];
Detect texts within the specified ROI by using the detectTextCRAFT function. The region and affinity thresholds are set to default values. The output is a set of bounding boxes that contain the detected text regions.
bboxes = detectTextCRAFT(I,roi);
Draw the ROI and the output bounding boxes on the input image. Display the text detection results.
I = insertObjectAnnotation(I,"rectangle",roi,"ROI",Color="green");
Iout = insertShape(I,"rectangle",bboxes,LineWidth=3);
figure
imshow(Iout)
Detect Characters by Modifying Affinity Threshold
This example shows how to detect each character in the text regions of an input image by using the CRAFT model. You can achieve this by modifying the affinity threshold. This example also demonstrates the effect of different affinity threshold values on the detection results.
Read an input image into the MATLAB workspace.
visiondatadir = fullfile(toolboxdir('vision'),'visiondata');
I = imread(fullfile(visiondatadir,'bookCovers','book27.jpg'));
Specify the affinity threshold values to consider for detecting the text regions in the image.
threshold = [1 0.1 0.01 0.001 0.0004];
Preallocate a 4-D array Iout to store the output images with detection results.
Iout = zeros(size(I,1),size(I,2),size(I,3),length(threshold));
Compute the output for each specified affinity threshold value. The output is a set of bounding boxes that contain the detected text regions. Draw the output bounding boxes on the image by using the insertShape function. The region threshold is set to the default value, 0.4.
for cnt = 1:length(threshold)
    bboxes = detectTextCRAFT(I,LinkThreshold=threshold(cnt));
    Iout(:,:,:,cnt) = insertShape(I,"rectangle",bboxes,LineWidth=3);
end
Display the text detection results obtained for the different affinity threshold values. As the affinity threshold decreases, characters with lower affinity scores are treated as connected components and grouped into a single instance. For good localization and detection results, the affinity threshold must be greater than zero.
figure
montage(uint8(Iout),Size=[1 5],BackgroundColor="white");
title(['LinkThreshold = ' num2str(threshold(1)) ' | LinkThreshold = ' num2str(threshold(2)) ' | LinkThreshold = ' num2str(threshold(3)) ...
    ' | LinkThreshold = ' num2str(threshold(4)) ' | LinkThreshold = ' num2str(threshold(5))]);
Input Arguments
I
— Input image
2-D grayscale image | 2-D color image
Input image, specified as a 2-D grayscale image or 2-D color image.
Data Types: single | double | int16 | uint8 | uint16 | logical
roi
— Search rectangular region-of-interest
four-element vector
Rectangular region of interest to search in the image, specified as a four-element vector of the form [x y width height]. The vector specifies the upper-left corner and size of a rectangular region in pixels. The region must be fully contained in the image.
When you specify this value, the detectTextCRAFT function detects only texts that are present within this ROI.
Data Types: single | double | int8 | int16 | int32 | int64 | uint8 | uint16 | uint32 | uint64
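Because the ROI must lie fully inside the image, it can help to clip a candidate rectangle before passing it to the function. The following is a minimal sketch, not part of the function itself. The image size and ROI values are hypothetical, and the sketch assumes the pixel convention in which a box of width w starting at column x covers columns x through x+w-1.

```matlab
% Hypothetical image size and candidate ROI (not taken from a real image).
imgSize = [480 640];          % [rows cols]
roi = [600 400 100 120];      % [x y width height], extends past the image

% Clip the corners to the image bounds.
% Assumption: a box of width w starting at x covers columns x .. x+w-1.
x1 = max(roi(1),1);
y1 = max(roi(2),1);
x2 = min(roi(1)+roi(3)-1,imgSize(2));
y2 = min(roi(2)+roi(4)-1,imgSize(1));

% Rebuild the [x y width height] vector from the clipped corners.
roiClipped = [x1 y1 x2-x1+1 y2-y1+1];   % [600 400 41 81]
```

The clipped vector can then be used as the roi argument for an image of that size.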
Name-Value Arguments
Specify optional pairs of arguments as Name1=Value1,...,NameN=ValueN, where Name is the argument name and Value is the corresponding value. Name-value arguments must appear after other arguments, but the order of the pairs does not matter.
Example: bboxes = detectTextCRAFT(I,MaxSize=[10,10]) specifies the maximum size of the text region to detect in the input image.
CharacterThreshold
— Region threshold for characters
0.4 (default) | positive scalar
Region threshold for localizing each character in the image, specified as a positive scalar in the range [0, 1]. To increase the number of detections, lower the region threshold value. However, a lower value can also produce more false positives. To reduce the number of false positives, increase the region threshold value.
Data Types: single | double
LinkThreshold
— Link threshold
0.4 (default) | positive scalar
Link threshold for grouping adjacent characters into a word, specified as a positive scalar in the range [0, 1]. You can increase the number of character-level detections by increasing the link threshold. To detect each character in the image, set this value to 1. For good localization and detection results, the link threshold must be greater than zero.
Data Types: single | double
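As a rough illustration of how the two thresholds interact, the sketch below runs the detector three times on the handicapSign.jpg image used in the examples and compares the number of returned boxes. The value 0.7 is an arbitrary choice for illustration, and the resulting counts depend on the image, so none are shown.

```matlab
I = imread("handicapSign.jpg");

% Default thresholds (0.4 for both the region and the link threshold).
bboxesDefault = detectTextCRAFT(I);

% Stricter region threshold: intended to reduce false positives.
bboxesStrict = detectTextCRAFT(I,CharacterThreshold=0.7);

% Link threshold of 1: intended to return character-level boxes.
bboxesChars = detectTextCRAFT(I,LinkThreshold=1);

fprintf("Default: %d boxes\n",size(bboxesDefault,1));
fprintf("CharacterThreshold=0.7: %d boxes\n",size(bboxesStrict,1));
fprintf("LinkThreshold=1: %d boxes\n",size(bboxesChars,1));
```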
MinSize
— Size of smallest detectable text region
[6,6] (default) | two-element vector
Size of smallest detectable text region in the image, specified as a two-element vector of the form [height width].
Data Types: single | double | int8 | int16 | int32 | int64 | uint8 | uint16 | uint32 | uint64
MaxSize
— Size of largest detectable text region
size of input image (default) | two-element vector
Size of largest detectable text region in the image, specified as a two-element vector of the form [height width]. By default, this value is set to the height and width of the input image.
Data Types: single | double | int8 | int16 | int32 | int64 | uint8 | uint16 | uint32 | uint64
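The size limits use [height width] order, whereas the returned boxes use [x y width height] order, which is easy to mix up. The sketch below shows one possible call. The limits are arbitrary values derived from the image size for illustration only.

```matlab
I = imread("handicapSign.jpg");

% Size limits in [height width] order (illustrative values).
minSize = [10 10];
maxSize = floor([size(I,1) size(I,2)]/2);   % half the image height and width

bboxes = detectTextCRAFT(I,MinSize=minSize,MaxSize=maxSize);

% The boxes are [x y width height], so height is column 4 and width is column 3.
heights = bboxes(:,4);
widths = bboxes(:,3);
```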
ExecutionEnvironment
— Hardware resource
"auto"
(default) | "cpu" | "gpu"
Hardware resource for processing images with the CRAFT model, specified as "auto", "gpu", or "cpu".
ExecutionEnvironment | Description |
---|---|
"auto" | Use a GPU if available. Otherwise, use the CPU. The use of GPU requires Parallel Computing Toolbox™ and a CUDA® enabled NVIDIA® GPU. For information about the supported compute capabilities, see GPU Computing Requirements (Parallel Computing Toolbox). |
"gpu" | Use the GPU. If a suitable GPU is not available, the function returns an error message. |
"cpu" | Use the CPU. |
Data Types: char | string
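If you prefer to make the hardware choice explicit instead of relying on "auto", one possible pattern is to query GPU availability first. This sketch assumes the MATLAB canUseGPU function is available in your release. The variable env is a hypothetical name.

```matlab
I = imread("handicapSign.jpg");

% Choose the execution environment explicitly.
if canUseGPU
    env = "gpu";
else
    env = "cpu";
end

bboxes = detectTextCRAFT(I,ExecutionEnvironment=env);
```

Selecting "gpu" only after the check avoids the error that the function returns when no suitable GPU is available.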
Acceleration
— Performance optimization
"auto"
(default) | "mex" | "none"
Performance optimization, specified as "auto", "mex", or "none".
Acceleration | Description |
---|---|
"auto" | Automatically apply a number of optimizations suitable for the input network and hardware resource. |
"mex" | Compile and execute a MEX function. This option is available when using a GPU only. You must also have a C/C++ compiler installed. For setup instructions, see MEX Setup (GPU Coder). |
"none" | Disable all acceleration. |
The default option is "auto". If you use the "auto" option, MATLAB® never generates a MEX function.
Using the "Acceleration" options "auto" and "mex" can offer performance benefits, but at the expense of an increased initial run time. Subsequent calls with compatible parameters are faster. Use performance optimization when you plan to call the function multiple times using new input data.
The "mex" option generates and executes a MEX function based on the network and parameters used in the function call. You can have several MEX functions associated with a single network at one time. Clearing the network variable also clears any MEX functions associated with that network.
The "mex" option is only available when you are using a GPU. Using a GPU requires Parallel Computing Toolbox and a CUDA enabled NVIDIA GPU. For information about the supported compute capabilities, see GPU Computing Requirements (Parallel Computing Toolbox). If Parallel Computing Toolbox or a suitable GPU is not available, then the function returns an error.
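To see the initial run-time overhead for yourself, you can time two consecutive calls with the same parameters. This is only a rough sketch: the timings depend on your hardware and on what has already been loaded in the session, so no expected values are given.

```matlab
I = imread("handicapSign.jpg");

% First call: includes any one-time setup and optimization cost.
tic
bboxes = detectTextCRAFT(I,Acceleration="auto");
tFirst = toc;

% Second call with compatible parameters: may reuse the earlier setup.
tic
bboxes = detectTextCRAFT(I,Acceleration="auto");
tSecond = toc;

fprintf("First call: %.2f s, second call: %.2f s\n",tFirst,tSecond);
```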
Output Arguments
bboxes
— Bounding boxes for detected text regions
M-by-4 matrix
Bounding boxes specifying the detected text regions, returned as an M-by-4 matrix. M is the number of detected text regions. Each row in the matrix is a vector of the form [x y width height]. The vector specifies the upper-left corner and size of the detected region in pixels.
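A common next step is to crop each detected region, for example before passing it to a recognizer. The sketch below uses a synthetic image and a hand-written bboxes matrix as stand-ins for real detector output, so it runs without the CRAFT model. It assumes integer-valued boxes and the convention that a box of width w starting at x covers columns x through x+w-1.

```matlab
% Synthetic stand-ins for an image and for the detector output.
I = uint8(magic(100));                 % 100-by-100 grayscale test image
bboxes = [10 20 30 15; 50 60 25 40];   % rows of [x y width height]

crops = cell(size(bboxes,1),1);
for k = 1:size(bboxes,1)
    x = bboxes(k,1);
    y = bboxes(k,2);
    w = bboxes(k,3);
    h = bboxes(k,4);
    % Rows index the vertical direction (y), columns the horizontal (x).
    crops{k} = I(y:y+h-1,x:x+w-1,:);
end
```

Each cropped patch has h rows and w columns, which is the reverse of the order in which the box stores width and height.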
Extended Capabilities
C/C++ Code Generation
Generate C and C++ code using MATLAB® Coder™.
Usage notes and limitations:
- The roi argument must be a code generation constant (coder.const()) and a 1-by-4 vector.
- Code generation does not support variable-size data for the input argument I.
- Only the CharacterThreshold, LinkThreshold, MinSize, and MaxSize name-value arguments are supported.
- When you set the build_type argument of the coder.config object to dll to generate code that does not use any third-party library, the DynamicMemoryAllocationForFixedSizeArrays property of the coder.CodeConfig object must be set to true.
- To avoid a memory allocation error during library-free C++ code generation on the Windows platform, the cfg.DynamicMemoryAllocationForFixedSizeArrays property of the coder.CodeConfig object must be set to true.
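An entry-point function that respects these limitations might look like the sketch below. The function name detectTextEntry and the ROI values are hypothetical, and the sketch assumes a fixed-size input image whose dimensions contain the ROI.

```matlab
function bboxes = detectTextEntry(I) %#codegen
% detectTextEntry  Hypothetical entry-point function for code generation.
% I is assumed to be a fixed-size image, for example 480-by-640-by-3 uint8.

% The ROI must be a compile-time constant 1-by-4 vector.
roi = coder.const([120 80 250 200]);

% Use only the name-value arguments supported for code generation.
bboxes = detectTextCRAFT(I,roi,CharacterThreshold=0.4,LinkThreshold=0.4);
end
```

When you build this function as a DLL without third-party libraries, the usage notes above imply creating the configuration with coder.config("dll") and setting its DynamicMemoryAllocationForFixedSizeArrays property to true before running codegen.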
GPU Code Generation
Generate CUDA® code for NVIDIA® GPUs using GPU Coder™.
Usage notes and limitations:
- The roi argument must be a code generation constant (coder.const()) and a 1-by-4 vector.
- Code generation does not support variable-size data for the input argument I.
- Only the CharacterThreshold, LinkThreshold, MinSize, and MaxSize name-value arguments are supported.
Version History
Introduced in R2022a
R2024a: Generate CUDA code using GPU Coder
detectTextCRAFT now supports the generation of optimized CUDA code (requires GPU Coder™).
R2024a: Generate C/C++ code using MATLAB Coder
detectTextCRAFT now supports the generation of C/C++ code (requires MATLAB Coder™).