segnetLayers

(To be removed) Create SegNet layer graph for semantic segmentation

segnetLayers will be removed in a future release. Create a SegNet network using a dlnetwork (Deep Learning Toolbox) object instead. For more information, see Version History.

Syntax

lgraph = segnetLayers(imageSize,numClasses,model)

lgraph = segnetLayers(imageSize,numClasses,encoderDepth)

lgraph = segnetLayers(imageSize,numClasses,encoderDepth,Name,Value)

Description

lgraph = segnetLayers(imageSize,numClasses,model) returns SegNet layers, lgraph, that are preinitialized with layers and weights from a pretrained model.

SegNet is a convolutional neural network for semantic image segmentation. The network uses a pixelClassificationLayer to predict the categorical label for every pixel in an input image.

Use segnetLayers to create the network architecture for SegNet. You must train the network using the Deep Learning Toolbox™ function trainNetwork (Deep Learning Toolbox).

lgraph = segnetLayers(imageSize,numClasses,encoderDepth) returns uninitialized SegNet layers configured using the specified encoder depth.

lgraph = segnetLayers(imageSize,numClasses,encoderDepth,Name,Value) returns SegNet layers with additional options specified by one or more Name,Value pair arguments.

Examples

Create SegNet With Custom Encoder-Decoder Depth

Create SegNet layers with an encoder/decoder depth of 4.

imageSize = [480 640 3];
numClasses = 5;
encoderDepth = 4;
lgraph = segnetLayers(imageSize,numClasses,encoderDepth)

lgraph = 
  LayerGraph with properties:

     InputNames: {'inputImage'}
    OutputNames: {'pixelLabels'}
         Layers: [59x1 nnet.cnn.layer.Layer]
    Connections: [66x2 table]

Display the network.

figure
plot(lgraph)

Train SegNet

Load training images and pixel labels.

dataSetDir = fullfile(toolboxdir('vision'),'visiondata','triangleImages');
imageDir = fullfile(dataSetDir,'trainingImages');
labelDir = fullfile(dataSetDir,'trainingLabels');

Create an image datastore holding the training images.

imds = imageDatastore(imageDir);

Define the class names and their associated label IDs.

classNames = ["triangle", "background"];
labelIDs   = [255 0];

Create a pixel label datastore holding the ground truth pixel labels for the training images.

pxds = pixelLabelDatastore(labelDir,classNames,labelIDs);

Combine image and pixel label data for training a semantic segmentation network.

ds = combine(imds,pxds);

Create SegNet layers.

imageSize = [32 32];
numClasses = 2;
lgraph = segnetLayers(imageSize,numClasses,2)

lgraph = 
  LayerGraph with properties:

     InputNames: {'inputImage'}
    OutputNames: {'pixelLabels'}
         Layers: [31x1 nnet.cnn.layer.Layer]
    Connections: [34x2 table]

Set up training options.

options = trainingOptions('sgdm','InitialLearnRate',1e-3, ...
      'MaxEpochs',20,'VerboseFrequency',10);

Train the network.

net = trainNetwork(ds,lgraph,options)

Training on single CPU.
Initializing input data normalization.
|========================================================================================|
|  Epoch  |  Iteration  |  Time Elapsed  |  Mini-batch  |  Mini-batch  |  Base Learning  |
|         |             |   (hh:mm:ss)   |   Accuracy   |     Loss     |      Rate       |
|========================================================================================|
|       1 |           1 |       00:00:03 |       39.75% |       0.7658 |          0.0010 |
|      10 |          10 |       00:00:25 |       49.98% |       0.7388 |          0.0010 |
|      20 |          20 |       00:00:49 |       66.39% |       0.6910 |          0.0010 |
|========================================================================================|
Training finished: Max epochs completed.


net = 
  DAGNetwork with properties:

         Layers: [31x1 nnet.cnn.layer.Layer]
    Connections: [34x2 table]
     InputNames: {'inputImage'}
    OutputNames: {'pixelLabels'}

Display the network.

plot(lgraph)

Input Arguments

`imageSize` — Network input image size
2-element vector | 3-element vector

Network input image size, specified as a:

2-element vector in the format [height, width].
3-element vector in the format [height, width, depth]. depth is the number of image channels. Set depth to 3 for RGB images, 1 for grayscale images, or to the number of channels for multispectral and hyperspectral images.

`numClasses` — Number of classes
integer greater than 1

Number of classes in the semantic segmentation, specified as an integer greater than 1.

`model` — Pretrained network model
`'vgg16'` | `'vgg19'`

Pretrained network model, specified as 'vgg16' or 'vgg19'. These models have an encoder depth of 5. When you use a 'vgg16' model, you must specify RGB inputs. You can convert a grayscale image I to a three-channel image by replicating it along the third dimension, for example with cat(3,I,I,I).
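As an illustration of meeting the RGB input requirement, the following sketch replicates a single-channel image across three channels with cat and then passes the resulting size to the pretrained-model syntax. The zero-filled image and the class count are placeholder values chosen for this example, and the segnetLayers call assumes that Computer Vision Toolbox and the VGG-16 support package are installed.

```matlab
% Hypothetical single-channel image standing in for real grayscale data.
grayImage = zeros(480,640,'uint8');

% Replicate the single channel three times to get an H-by-W-by-3 array.
rgbImage = cat(3,grayImage,grayImage,grayImage);

% Take the network input size from the converted image.
imageSize = size(rgbImage);
numClasses = 5;   % illustrative value, not taken from a real dataset

% Assumes Computer Vision Toolbox and the VGG-16 support package are available.
lgraph = segnetLayers(imageSize,numClasses,'vgg16');
```

Replicating the channel does not add color information. It only gives the image the three-channel shape that the pretrained encoder expects.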

`encoderDepth` — Encoder depth
positive integer

Encoder depth, specified as a positive integer.

SegNet is composed of an encoder and corresponding decoder subnetwork. The depth of these networks determines the number of times the input image is downsampled or upsampled as it is processed. The encoder network downsamples the input image by a factor of 2^D, where D is the value of encoderDepth. The decoder network upsamples the encoder network output by a factor of 2^D.
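The scaling described above can be worked through with plain arithmetic. This sketch uses the image size and encoder depth from the first example on this page and computes the spatial size at the end of the encoder. The divisibility check is included only as a convenience for choosing sizes. Whether segnetLayers itself requires it is not stated here.

```matlab
imageSize = [480 640 3];
encoderDepth = 4;

% Each encoder section halves the height and width, so D sections scale by 2^D.
scale = 2^encoderDepth;

% Spatial size of the feature map at the end of the encoder.
encodedSize = imageSize(1:2) / scale;

% Check whether the height and width divide evenly by the scale factor.
dividesEvenly = all(mod(imageSize(1:2),scale) == 0);

fprintf('Scale factor: %d, encoded size: %d-by-%d\n', ...
    scale,encodedSize(1),encodedSize(2));
```

For these values the scale factor is 16 and the encoded feature map is 30-by-40, which the decoder then upsamples back to 480-by-640.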

Name-Value Arguments

Specify optional pairs of arguments as Name1=Value1,...,NameN=ValueN, where Name is the argument name and Value is the corresponding value. Name-value arguments must appear after other arguments, but the order of the pairs does not matter.

Before R2021a, use commas to separate each name and value, and enclose Name in quotes.

Example: 'NumConvolutionLayers',1

`NumConvolutionLayers` — Number of convolutional layer sections
`2` (default) | positive integer | vector of positive integers

Number of convolutional layers in each encoder and decoder section, specified as a positive integer or vector of positive integers.

`NumConvolutionLayers`	Description
scalar	The same number of layers is used for all encoder and decoder sections.
vector	The kth element of `NumConvolutionLayers` is the number of convolution layers in the kth encoder section and corresponding decoder section. Typical values are in the range [1, 3].

`NumOutputChannels` — Number of output channels
`64` (default) | positive integer | vector of positive integers

Number of output channels for each section in the SegNet encoder network, specified as a positive integer or vector of positive integers. segnetLayers sets the number of output channels in the decoder to match the corresponding encoder section.

`NumOutputChannels`	Description
scalar	The same number of output channels is used for all encoder and decoder sections.
vector	The kth element of `NumOutputChannels` is the number of output channels of the kth encoder section and corresponding decoder section.

`FilterSize` — Convolutional layer filter size
`3` (default) | positive odd integer | 2-element row vector of positive odd integers

Convolutional layer filter size, specified as a positive odd integer or a 2-element row vector of positive odd integers. Typical values are in the range [3, 7].

`FilterSize`	Description
scalar	The filter is square.
2-element row vector	The filter has the size [height width].
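The name-value arguments above can be combined in a single call. The following sketch assumes an encoder depth of 3 and supplies one vector element per encoder section for NumConvolutionLayers and NumOutputChannels. All numeric values are illustrative choices, not recommendations from this page.

```matlab
imageSize = [128 128 3];
numClasses = 3;
encoderDepth = 3;

% One vector element per encoder section (and its matching decoder section):
% section 1 has 1 convolution layer with 32 channels, section 2 has 2 with 64,
% and section 3 has 2 with 128. A scalar FilterSize gives square 5-by-5 filters.
lgraph = segnetLayers(imageSize,numClasses,encoderDepth, ...
    'NumConvolutionLayers',[1 2 2], ...
    'NumOutputChannels',[32 64 128], ...
    'FilterSize',5);
```

Passing a scalar instead of a vector for either of the first two options applies the same value to every section.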

Output Arguments

`lgraph` — Layers
`LayerGraph` object

Layers that represent the SegNet network architecture, returned as a layerGraph (Deep Learning Toolbox) object.

Tips

The sections within the SegNet encoder and decoder subnetworks are made up of convolutional, batch normalization, and ReLU layers.
All convolutional layers are configured such that the bias term is fixed to zero.
Convolution layer weights in the encoder and decoder subnetworks are initialized using the 'MSRA' weight initialization method. For 'vgg16' or 'vgg19' models, only the decoder subnetwork is initialized using MSRA.[1]
Networks produced by segnetLayers support GPU code generation for deep learning once they are trained with trainNetwork (Deep Learning Toolbox). See Code Generation (Deep Learning Toolbox) for details and examples.
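To make the initialization tips above concrete, here is a minimal sketch of the scheme described in [1], written in base MATLAB. It assumes the common formulation in which weights are drawn from a zero-mean normal distribution with variance 2/fanIn, where fanIn is the filter height times the filter width times the number of input channels. The layer dimensions are illustrative, and the exact scaling that segnetLayers applies internally is not confirmed by this page.

```matlab
rng(0);                      % fix the random seed so the sketch is repeatable

filterSize  = [3 3];         % illustrative 3-by-3 filters
numChannels = 64;            % input channels to the convolution layer
numFilters  = 64;            % output channels of the convolution layer

% Number of inputs feeding each output element of the convolution.
fanIn = prod(filterSize)*numChannels;

% Standard deviation sqrt(2/fanIn), following the formulation in [1].
sigma = sqrt(2/fanIn);
weights = sigma*randn([filterSize numChannels numFilters]);

% The tips state that the bias term is fixed to zero.
bias = zeros(1,1,numFilters);
```

Scaling by the fan-in is intended to keep the variance of activations roughly constant through stacks of ReLU layers.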

References

[1] He, K., X. Zhang, S. Ren, and J. Sun. "Delving Deep Into Rectifiers: Surpassing Human-Level Performance on ImageNet Classification." Proceedings of the IEEE International Conference on Computer Vision. 2015, 1026–1034.

[2] Badrinarayanan, V., A. Kendall, and R. Cipolla. "SegNet: A Deep Convolutional Encoder-Decoder Architecture for Image Segmentation." arXiv. Preprint arXiv:1511.00561, 2015.

Extended Capabilities

GPU Code Generation
Generate CUDA® code for NVIDIA® GPUs using GPU Coder™.

Usage notes and limitations:

For code generation, you must first create a SegNet network by using the segnetLayers function. Then, use the trainNetwork function on the resulting lgraph object to train the network for segmentation. Once the network is trained and evaluated, you can generate code for the DAGNetwork object using GPU Coder™.

Version History

Introduced in R2017b

R2024a: `segnetLayers` will be removed

The segnetLayers function will be removed in a future release. To update your code, create a dlnetwork (Deep Learning Toolbox) instead. You can use functions such as addLayers (Deep Learning Toolbox) and connectLayers (Deep Learning Toolbox) to build the network.
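As a rough sketch of what building such a network with addLayers and connectLayers might look like, the code below assembles a single encoder section and a single decoder section in the SegNet style, where the max pooling layer passes its indices and size to the matching unpooling layer. This is not the architecture that segnetLayers produces. The layer names, channel counts, and the 32-by-32 single-channel input are assumptions made for this example, and it assumes a release in which an empty dlnetwork can be created and then extended.

```matlab
numClasses = 2;   % illustrative value

layers = [
    imageInputLayer([32 32 1],'Name','input')
    % Encoder section: convolution, batch normalization, ReLU, then pooling.
    convolution2dLayer(3,16,'Padding','same','Name','enc_conv')
    batchNormalizationLayer('Name','enc_bn')
    reluLayer('Name','enc_relu')
    maxPooling2dLayer(2,'Stride',2,'HasUnpoolingOutputs',true,'Name','pool')
    % Decoder section: unpooling, then convolution, batch normalization, ReLU.
    maxUnpooling2dLayer('Name','unpool')
    convolution2dLayer(3,16,'Padding','same','Name','dec_conv')
    batchNormalizationLayer('Name','dec_bn')
    reluLayer('Name','dec_relu')
    % Per-pixel class scores. No pixel classification output layer is added.
    convolution2dLayer(1,numClasses,'Name','classifier')
    softmaxLayer('Name','softmax')];

% Start from an empty dlnetwork. Adding a layer array connects it in sequence.
net = dlnetwork;
net = addLayers(net,layers);

% Route the pooling indices and input size to the unpooling layer.
net = connectLayers(net,'pool/indices','unpool/indices');
net = connectLayers(net,'pool/size','unpool/size');

% Initialize the learnable parameters so the network is ready for training.
net = initialize(net);
```

A deeper network would repeat the encoder and decoder sections, pairing each pooling layer with the unpooling layer at the same depth.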

Do not include output layers in the network. Instead, define a loss function. Here are some sample loss functions appropriate for pixel classification:

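% Option 1: loss based on the generalized Dice similarity.
function loss = modelLoss(Y,T)
  z = generalizedDice(Y,T);
  loss = 1 - mean(z,"all");
end

% Option 2: cross-entropy loss that ignores pixels whose target is NaN.
function loss = modelLoss(Y,T)
  mask = ~isnan(T);
  T(isnan(T)) = 0;
  loss = crossentropy(Y,T,Mask=mask);
end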

Specify the loss function when you train the network using the trainnet (Deep Learning Toolbox) function. For example, this code trains a dlnetwork network called net using the training data images and the loss function modelLoss.

netTrained = trainnet(images,net,@modelLoss,options);
