This is machine translation

Translated by Microsoft
Mouseover text to see original. Click the button below to return to the English version of the page.

Note: This page has been translated by MathWorks. Click here to see
To view all translated materials including this page, select Country from the country navigator on the bottom of this page.

Classify Image Using GoogLeNet

This example shows how to classify an image using the pretrained deep convolutional neural network GoogLeNet.

GoogLeNet has been trained on over a million images and can classify images into 1000 object categories (such as keyboard, coffee mug, pencil, and many animals). The network has learned rich feature representations for a wide range of images. The network takes an image as input, and then outputs a label for the object in the image together with the probabilities for each of the object categories.

Load Pretrained Network

Load the pretrained GoogLeNet network. You can also choose to load a different pretrained network for image classification. This step requires the Deep Learning Toolbox™ Model for GoogLeNet Network support package. If you do not have the required support packages installed, then the software provides a download link.

net = googlenet;

The image that you want to classify must have the same size as the input size of the network. For GoogLeNet, the first element of the Layers property of the network is the image input layer. The network input size is the InputSize property of the image input layer.

inputSize = net.Layers(1).InputSize
inputSize = 1×3

   224   224     3

The final element of the Layers property is the classification output layer. The ClassNames property of this layer contains the names of the classes learned by the network. View 10 random class names out of the total of 1000.

classNames = net.Layers(end).ClassNames;
numClasses = numel(classNames);
disp(classNames(randperm(numClasses,10)))
    'speedboat'
    'window screen'
    'isopod'
    'wooden spoon'
    'lipstick'
    'drake'
    'hyena'
    'dumbbell'
    'strawberry'
    'custard apple'

Read and Resize Image

Read and show the image that you want to classify.

I = imread('peppers.png');
figure
imshow(I)

Display the size of the image. The image is 384-by-512 pixels and has three color channels (RGB).

size(I)
ans = 1×3

   384   512     3

Resize the image to the input size of the network by using imresize. This resizing slightly changes the aspect ratio of the image.

I = imresize(I,inputSize(1:2));
figure
imshow(I)

Depending on your application, you might want to resize the image in a different way. For example, you can crop the top left corner of the image by using I(1:inputSize(1),1:inputSize(2),:). If you have Image Processing Toolbox™, then you can use the imcrop function.

Classify Image

Classify the image and calculate the class probabilities using classify. The network correctly classifies the image as a bell pepper. A network for classification is trained to output a single label for each input image, even when the image contains multiple objects.

[label,scores] = classify(net,I);
label
label = categorical
     bell pepper 

Display the image with the predicted label and the predicted probability of the image having that label.

figure
imshow(I)
title(string(label) + ", " + num2str(100*scores(classNames == label),3) + "%");

Display Top Predictions

Display the top five predicted labels and their associated probabilities as a histogram. Because the network classifies images into so many object categories, and many categories are similar, it is common to consider the top-five accuracy when evaluating networks. The network classifies the image as a bell pepper with a high probability.

[~,idx] = sort(scores,'descend');
idx = idx(5:-1:1);
classNamesTop = net.Layers(end).ClassNames(idx);
scoresTop = scores(idx);

figure
barh(scoresTop)
xlim([0 1])
title('Top 5 Predictions')
xlabel('Probability')
yticklabels(classNamesTop)

References

[1] Szegedy, Christian, Wei Liu, Yangqing Jia, Pierre Sermanet, Scott Reed, Dragomir Anguelov, Dumitru Erhan, Vincent Vanhoucke, and Andrew Rabinovich. "Going deeper with convolutions." In Proceedings of the IEEE conference on computer vision and pattern recognition, pp. 1-9. 2015.

See Also

| | | |

Related Topics