ocrTrainingData

Create training data for OCR from ground truth

Since R2023a

Syntax

[imds,boxds,txtds] = ocrTrainingData(gTruth,labelName,attributeName)

Description

[imds,boxds,txtds] = ocrTrainingData(gTruth,labelName,attributeName) creates datastores for loading images, bounding boxes, and image text from ground truth.

ocrTrainingData creates training data from ground truth data that you can use to train and evaluate an optical character recognition (OCR) model. Use the trainOCR function to train an OCR model and the evaluateOCR function to evaluate the model.

Examples

Analyze OCR Ground Truth Data

This example shows how to analyze OCR ground truth data to identify its character set and to understand the class distribution.

Load the ground truth data and then extract text labels.

ld = load("14SegmentGtruth.mat");
gTruth = ld.gTruth;
[~,~,txtds] = ocrTrainingData(gTruth,"Text","Word");

Read all ground truth text corresponding to each image and combine them.

allImagesText = txtds.readall;
allText = strjoin([allImagesText{:}], "");

Find the unique set of characters in the ground truth text.

[characterSet, ~, idx] = unique(char(allText));

Display the ground truth character set.

disp("Ground Character Set: " + string(characterSet))

Ground Character Set: +,-./3ABCDEFGHIJKLMNOPQRSTUVWXYZ

The ground truth data contains all 26 uppercase letters of the English alphabet, the digit 3, and five special characters: +,-./.

To understand the class distribution, count the character occurrences and tabulate the character count.

characterSet = cellstr(characterSet');
characterCount = accumarray(idx,1);
characterCountTbl = table(characterSet, characterCount, ...
    VariableNames=["Character", "CharacterCount"]);
characterCountTbl = sortrows(characterCountTbl, ...
    "CharacterCount", "descend");
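The counting step relies on the third output of unique, which maps every character back to its position in the unique set, so accumarray can tally occurrences per character. A minimal sketch of that idiom on a short, made-up string (sampleText is hypothetical and not part of the ground truth data):

```matlab
% Hypothetical sample text, used only to illustrate the counting idiom.
sampleText = 'ABBA+A';

% chars holds the sorted unique characters; idx maps each character of
% sampleText to its position in chars.
[chars, ~, idx] = unique(sampleText);

% Add 1 for every occurrence of each index to get per-character counts.
counts = accumarray(idx(:), 1);

disp(chars)    % +AB
disp(counts')  % 1 3 2
```

The unique set is sorted by character code, which is why the special characters appear before the digit and the letters in the ground truth character set.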

Visualize the character count with a word cloud chart.

wordcloud(characterCountTbl, "Character", "CharacterCount")

Figure contains an object of type wordcloud. The chart of type wordcloud has title CharacterCount.

The characters O, E, T, N, and A have the highest counts, and the characters -, +, /, . , 3 have the lowest counts.

Visualize the class distribution with a bar graph.

figure
numCharacters = numel(characterSet);
bar(1:numCharacters, characterCountTbl.CharacterCount)
xticks(1:numCharacters)
xticklabels(characterCountTbl.Character)
xlabel("Character")
ylabel("Number of samples")

Figure contains an axes object. The axes object with xlabel Character, ylabel Number of samples contains an object of type bar.
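Beyond the plots, the same count table can be used to list under-represented characters numerically. The following is a sketch only: the counts and the minSamples threshold are hypothetical values chosen for illustration, not values taken from the ground truth data.

```matlab
% Hypothetical counts for illustration only (not from 14SegmentGtruth.mat).
tbl = table(["O"; "E"; "3"; "+"], [120; 95; 4; 2], ...
    VariableNames=["Character", "CharacterCount"]);

% Fraction of all samples contributed by each character.
tbl.Fraction = tbl.CharacterCount / sum(tbl.CharacterCount);

% Flag characters with fewer samples than a hypothetical threshold.
minSamples = 5;
rareCharacters = tbl.Character(tbl.CharacterCount < minSamples);
disp(rareCharacters')
```

With real data, replace tbl with characterCountTbl from the example above and choose a threshold that suits the size of your data set.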

Prepare Data for OCR Training

This example shows how to prepare data to train an OCR model that can recognize fourteen-segment characters.

The training data contains word samples of fourteen-segment characters from a page of text. Read the training image and display it.

I = imread("CVT-DSEG14.jpg");
imshow(I)

Figure contains an axes object. The hidden axes object contains an object of type image.

This image was annotated in the Image Labeler app with bounding boxes around words, and text labels were added to these bounding boxes as an attribute. To learn more about labeling images for OCR training, see Train Custom OCR Model. The labels were exported from the app as a groundTruth object and saved in the 14SegmentGtruth.mat file.

ld = load("14SegmentGtruth.mat");
gTruth = ld.gTruth;

Use the ocrTrainingData function to create datastores that contain the images, bounding boxes, and text labels from the groundTruth object. Specify the label and attribute names used during labeling.

labelName = "Text";
attributeName = "Word";
[imds,boxds,txtds] = ocrTrainingData(gTruth,labelName,attributeName);

Combine the datastores.

cds = combine(imds,boxds,txtds);

You can use the combined datastore to train an OCR model with the trainOCR function.
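Before training, it can help to read one sample from the combined datastore and check that the boxes and text line up with the image. The sketch below assumes that each read returns a 1-by-3 cell array holding the image, an M-by-4 matrix of [x y width height] boxes, and the M text labels; confirm these formats on your own data. It requires Computer Vision Toolbox and the 14SegmentGtruth.mat file used above.

```matlab
% Recreate the combined datastore from the example ground truth.
ld = load("14SegmentGtruth.mat");
[imds,boxds,txtds] = ocrTrainingData(ld.gTruth,"Text","Word");
cds = combine(imds,boxds,txtds);

% Read the first sample. Assumed layout: {image, boxes, text}.
sample = read(cds);
img = sample{1};
boxes = sample{2};          % assumed M-by-4, [x y width height]
words = string(sample{3});  % convert to string in case of char labels

% Overlay each bounding box with its text label and display the result.
annotated = insertObjectAnnotation(img,"rectangle",boxes,words);
imshow(annotated)

% Rewind the datastore so later reads start from the first sample.
reset(cds)
```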

Input Arguments

`gTruth` — Ground truth data
`groundTruth` object | M-by-1 array of `groundTruth` objects

Ground truth data, specified as a groundTruth object or an M-by-1 array of groundTruth objects exported from the Image Labeler app.

`labelName` — Name of rectangular ROI label
string scalar | character vector

Name of the rectangular ROI label used for labeling ground truth, specified as a string scalar or character vector. You must use the Rectangle label type for OCR ground truth labeling.

Use the Image Labeler app to label ground truth data. After loading your images into the app, select Label from the toolbar, then select Rectangle. A dialog box appears that provides the field for entering the label name.

`attributeName` — Attribute name
string scalar | character vector

Attribute name that corresponds to the label name, specified as a string scalar or character vector. The attribute identifies what the OCR detects in the ROI specified by labelName, for example, a word. To name an attribute in Image Labeler, after creating the ROI, select Attribute from the toolbar. A dialog box appears that provides the field for entering the attribute name.

Output Arguments

`imds` — Image datastore
`imageDatastore` object

Image datastore, returned as an imageDatastore object that contains the images extracted from the specified groundTruth object or objects.

`boxds` — Bounding box label datastore
`arrayDatastore` object

Bounding box label datastore associated with the ground truth images, returned as an arrayDatastore object.

`txtds` — Text label datastore
`arrayDatastore` object

Text label datastore that corresponds to the attribute name input, returned as an arrayDatastore object.

Version History

Introduced in R2023a
