classify
Syntax
classes = classify(clip,I,classNames)
[___] = classify(___,Name=Value)
Description
Add-On Required: This feature requires the Computer Vision Toolbox Model for OpenAI CLIP Network add-on.
classes = classify(clip,I,classNames) assigns each image in I to one of the suggested classes classNames using a Contrastive Language-Image Pre-Training (CLIP) network.
Note
This functionality requires Deep Learning Toolbox™.
[___] = classify(___,Name=Value) specifies options using one or more name-value arguments in addition to any combination of arguments from previous syntaxes. For example, MiniBatchSize=32 limits the batch size to 32 images.
Examples
Create a pretrained CLIP network.
clip = clipNetwork("vit-b-16");
Create a datastore of test images.
imageFiles = ["kobi.png","baby.jpg","flamingos.jpg","saturn.png"];
imds = imageDatastore(imageFiles);
Define the list of class suggestions for the test images.
classNames = ["baby","dog","flamingo","planet"];
Obtain the predicted classes for each image in the datastore.
class = classify(clip,imds,classNames);
Display the images along with their predicted classes.
figure
tiledlayout(2,2)
for i = 1:length(imageFiles)
    nexttile
    imshow(read(imds))
    title(class(i))
end

Create a pretrained CLIP network with a ResNet-50 backbone.
clip = clipNetwork("resnet50")
clip =
clipNetwork with properties:
ModelName: "resnet50"
ImageEncoderNetwork: [1×1 dlnetwork]
TextEncoderNetwork: [1×1 dlnetwork]
ImageNormalizationStatistics: [1×1 struct]
Load an image that contains the object to classify into the workspace, and display the image.
I = imread("kobi.png");
imshow(I)
Define the list of potential classes for the image.
classNames = ["aardvark","bee","cat","dog"];
Obtain the predicted class and prediction scores from the image.
[class,scores] = classify(clip,I,classNames)
class = categorical
dog
scores = 1×4 single row vector
0.5309 0.5131 0.5337 0.7217
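The relationship between the scores and the predicted class in the output above can be sketched as follows. This is a minimal illustration that assumes the predicted class is the one with the highest score, which is consistent with the output shown but is not stated explicitly here. The variable names topScore, idx, and predicted are illustrative only.

```matlab
% Scores and class names copied from the example output above.
classNames = ["aardvark","bee","cat","dog"];
scores = single([0.5309 0.5131 0.5337 0.7217]);

% Assumption: the predicted class is the column with the highest score.
[topScore,idx] = max(scores,[],2);

% Build a categorical label whose categories are the suggested classes.
predicted = categorical(classNames(idx),classNames)
```

For a batch of B images, scores has B rows, and the same max call along the second dimension returns one index per image.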
Create a pretrained CLIP network.
clip = clipNetwork("vit-l-14");
Load a satellite photo of the town of Concord, Massachusetts into the workspace, and display the image.
I = imread("concordaerial.png");
imshow(I)
Define the list of class suggestions for the image. These classes are town or city names.
classNames = ["Boston","Concord","Plymouth","Falmouth"];
Define class descriptions that provide more context to the CLIP model for more accurate classification.
classDescriptions = [ ...
    "A satellite photo of Boston, a city in Massachusetts."
    "A satellite photo of Concord, a suburb in Massachusetts."
    "A satellite photo of Plymouth, a town on the coast of Massachusetts."
    "A satellite photo of Falmouth, a town on Cape Cod in Massachusetts."
    ];
Specify the suggested class names for each of the towns, as well as the more detailed class descriptions, to predict the town shown in the image using the CLIP network.
class = classify(clip,I,classNames,ClassDescriptions=classDescriptions)
class = categorical
Concord
Input Arguments
CLIP network, specified as a clipNetwork object.
Image data, specified in one of these formats:
- H-by-W-by-3-by-B numeric array representing a batch of B truecolor images.
- H-by-W-by-1-by-B numeric array representing a batch of B grayscale images.
- Datastore that reads and returns truecolor images.
- Formatted dlarray (Deep Learning Toolbox) object with two spatial dimensions of the format "SSCB". You can specify multiple test images by including a batch dimension.
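As a minimal sketch of the array formats listed above, the following code stacks two images along the fourth dimension to form an H-by-W-by-3-by-B batch and then wraps the batch in a formatted dlarray. The random arrays are placeholders for real images, and the 224-by-224 target size is an arbitrary assumption: images must share one size to be concatenated, but the size that classify expects, if any, is not stated here.

```matlab
% Placeholder data standing in for two truecolor images of different sizes.
img1 = rand(300,400,3,"single");
img2 = rand(224,224,3,"single");

% Assumed common size so that the images can be concatenated.
targetSize = [224 224];

% Stack along the fourth (batch) dimension: H-by-W-by-3-by-B with B = 2.
batch = cat(4,imresize(img1,targetSize),imresize(img2,targetSize));

% Optional: label the dimensions as spatial, spatial, channel, batch.
% dlarray requires Deep Learning Toolbox.
dlI = dlarray(batch,"SSCB");
```

Either batch or dlI could then be passed as the image input I.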
Names of class suggestions, specified as a vector of strings or a categorical vector. You must specify class names in English using ASCII characters. The function automatically pads or truncates each text input so that it contains exactly 77 tokens.
Name-Value Arguments
Specify optional pairs of arguments as Name1=Value1,...,NameN=ValueN, where Name is the argument name and Value is the corresponding value. Name-value arguments must appear after other arguments, but the order of the pairs does not matter.
Example: classify(clip,I,classNames,MiniBatchSize=32) limits the batch size to 32 images.
Size of batches for processing large collections of images, specified as a positive integer. Larger batch sizes reduce processing time, but require more memory.
Class descriptions used for classification by the CLIP network, specified as a C-element string array. C is the number of classes in classNames. By default, the CLIP model generates class descriptions from the labels specified by the classNames input argument.
Use the ClassDescriptions name-value argument to create custom class descriptions. The classify function pads each description string with zeros or shortens it so that it contains exactly 77 tokens.
Hardware resource on which to run the network, specified as "auto", "gpu", or "cpu". The table shows the valid hardware resource values.
| Resource | Action |
|---|---|
| "auto" | Use a GPU if it is available. Otherwise, use the CPU. |
| "gpu" | Use the GPU. To use a GPU, you must have Parallel Computing Toolbox™ and a CUDA® enabled NVIDIA® GPU. If a suitable GPU is not available, the function returns an error. For information about the supported compute capabilities, see GPU Computing Requirements (Parallel Computing Toolbox). |
| "cpu" | Use the CPU. |
Output Arguments
Predicted classes, returned as a B-element categorical vector. B is the number of images in the batch.
Prediction scores, returned as a B-by-C numeric matrix. B is the number of images in the batch, and C is the number of suggested classes specified using the classNames input argument.
The classify function computes the scores using the CLIPScore algorithm, which computes one score for each pairing of an input image I with associated text T.
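The equation itself does not appear in this text. As a hedged sketch only, the following code uses the CLIPScore definition from the published CLIPScore paper, score = w*max(cos(E_I,E_T),0) with w = 2.5, where E_I and E_T are the image and text embeddings. Whether classify uses exactly this form and weight is an assumption, although the scores in the example output are in a range consistent with it. The embeddings below are made-up vectors, not the output of a CLIP model, and all variable names are illustrative.

```matlab
% Hypothetical embeddings (not produced by a real CLIP network).
imageEmbedding = [1 0 0];        % 1-by-D image embedding
textEmbeddings = [ 1 0 0         % same direction as the image
                   0 1 0         % orthogonal to the image
                  -1 0 0         % opposite to the image
                   1 1 0];       % 45 degrees from the image

w = 2.5;  % assumed rescaling weight, taken from the CLIPScore paper

% Normalize each row to unit length so that dot products are cosines.
imageUnit = imageEmbedding ./ vecnorm(imageEmbedding,2,2);
textUnit = textEmbeddings ./ vecnorm(textEmbeddings,2,2);

% Cosine similarity between the image and each text, as a 1-by-C row.
cosSim = imageUnit * textUnit';

% Clamp negative similarities to zero, then rescale.
scores = w * max(cosSim,0);

% Index of the best-matching text under this sketch.
[~,idx] = max(scores,[],2);
```

Under this definition, a text embedding that points away from the image embedding receives a score of zero rather than a negative value.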
Version History
Introduced in R2026a