
fcnLayers

Create fully convolutional network layers for semantic segmentation

Description


lgraph = fcnLayers(imageSize,numClasses) returns a fully convolutional network (FCN), configured as FCN 8s, for semantic segmentation. The FCN is preinitialized using layers and weights from the VGG-16 network.

fcnLayers includes a pixelClassificationLayer to predict the categorical label for every pixel in an input image. The pixel classification layer only supports RGB images.
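As a rough illustration of the paragraph above, the following sketch locates the pixel classification layer inside the returned layer graph. It assumes the VGG-16 support package is installed. Matching on the substring 'PixelClassification' in the layer class name is an assumption made for illustration, not a documented lookup method.

```matlab
% Build the default FCN 8s network (requires the VGG-16 support package).
imageSize = [480 640];
numClasses = 5;
lgraph = fcnLayers(imageSize,numClasses);

% Collect the class name of every layer in the graph.
layerClasses = arrayfun(@(l) class(l),lgraph.Layers,'UniformOutput',false);

% Assumption: the class name of the pixel classification layer contains
% the substring 'PixelClassification'.
isPixelLayer = contains(layerClasses,'PixelClassification');

% Display the matching layer and its properties.
pixelLayer = lgraph.Layers(isPixelLayer)
```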

This function requires the Deep Learning Toolbox™ Model for VGG-16 Network support package. If this support package is not installed, then the vgg16 (Deep Learning Toolbox) function provides a download link.
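One way to check for the support package before calling fcnLayers is to call vgg16 directly and catch the error, as in this minimal sketch. The variable names and the printed message are illustrative only.

```matlab
% Calling vgg16 either loads the pretrained network or throws an error
% whose message describes how to obtain the support package.
try
    net = vgg16;
    disp('VGG-16 support package is available.')
catch err
    % Show the message from vgg16, which includes the download link.
    disp(err.message)
end
```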

lgraph = fcnLayers(imageSize,numClasses,'Type',type) returns an FCN in the configuration specified by type: '32s', '16s', or '8s'.

Examples


Define the image size and number of classes, then create the network.

imageSize = [480 640];
numClasses = 5;
lgraph = fcnLayers(imageSize,numClasses)

Display the network.

plot(lgraph)

Create an FCN 16s network.

imageSize = [480 640];
numClasses = 5;
lgraph = fcnLayers(imageSize,numClasses,'Type','16s')

Display the network.

plot(lgraph)

Input Arguments


imageSize: Network input image size, specified as a 2-element vector in the format [height, width]. The minimum image size is [224 224] because the FCN is built on the VGG-16 network, whose input size is 224-by-224.

numClasses: Number of classes in the semantic segmentation, specified as an integer greater than 1.
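The constraints on the image size and the number of classes can be checked before building the network. The following is a minimal sketch of such a pre-check. The variable names sizeOK and classesOK are hypothetical, and the check is independent of whatever validation fcnLayers itself performs.

```matlab
imageSize = [480 640];
numClasses = 5;

% Minimum input size stated in the argument description.
minSize = [224 224];

% imageSize must be a 2-element vector, each element at least 224.
sizeOK = isnumeric(imageSize) && numel(imageSize) == 2 && ...
    all(imageSize >= minSize);

% numClasses must be an integer greater than 1.
classesOK = isscalar(numClasses) && numClasses == floor(numClasses) && ...
    numClasses > 1;

if ~sizeOK
    error('imageSize must be a 2-element vector no smaller than [224 224].')
end
if ~classesOK
    error('numClasses must be an integer greater than 1.')
end
```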

type: Type of FCN model, specified as one of the following:

  • '32s': Upsamples the final feature map by a factor of 32. This option provides coarse segmentation at a lower computational cost.

  • '16s': Upsamples the final feature map by a factor of 16 after fusing the feature map from the fourth max pooling layer. This additional information from earlier layers provides medium-grain segmentation at the cost of additional computation.

  • '8s': Upsamples the final feature map by a factor of 8 after fusing feature maps from the third and fourth max pooling layers. This additional information from earlier layers provides finer-grain segmentation at the cost of additional computation.
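The upsampling factors 8, 16, and 32 match the cumulative downsampling of VGG-16 after its third, fourth, and fifth max pooling layers. The sketch below works through that arithmetic for a 480-by-640 input. The sizes are nominal: they ignore any padding or cropping that the actual FCN layers apply, so the feature maps inside a network returned by fcnLayers may differ.

```matlab
imageSize = [480 640];

% Cumulative downsampling after the third, fourth, and fifth max pooling
% layers of VGG-16 (each pooling layer halves the resolution).
strides = [8 16 32];

for s = strides
    % Nominal feature map size at this stride, ignoring padding and cropping.
    nominalSize = floor(imageSize / s);
    fprintf('stride %d: nominal feature map %d-by-%d\n', ...
        s,nominalSize(1),nominalSize(2));
end
```

For this input the nominal sizes are 60-by-80, 30-by-40, and 15-by-20. The coarser the map that feeds the final upsampling step, the less spatial detail survives in the segmentation, which is why '8s' gives the finest result of the three.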

Output Arguments


lgraph: Layers that represent the FCN architecture, returned as a layerGraph (Deep Learning Toolbox) object.

All transposed convolution layers are initialized using bilinear interpolation weights. All transposed convolution layer bias terms are fixed to zero.
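To give a sense of what bilinear interpolation weights look like, the sketch below builds a 2-D bilinear kernel for a given upsampling factor using a formula commonly used for this kind of initialization. This is an assumption for illustration: the exact kernel sizes and values that fcnLayers uses internally are not stated here, and the variable names are hypothetical.

```matlab
% Upsampling factor of the transposed convolution (illustrative value).
factor = 2;

% Common choice of kernel size for bilinear upsampling by this factor.
kernelSize = 2*factor - mod(factor,2);

% Center of the kernel in 1-based coordinates.
if mod(kernelSize,2) == 1
    center = factor;
else
    center = factor + 0.5;
end

% 1-D triangular (linear interpolation) profile.
x = 1:kernelSize;
w1 = 1 - abs(x - center)/factor;

% The 2-D bilinear kernel is the outer product of the 1-D profile.
W = w1' * w1;
disp(W)
```

For factor = 2 the 1-D profile is [0.25 0.75 0.75 0.25]. With weights of this form, a transposed convolution starts out as plain bilinear upsampling and is then refined during training.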

Tips

  • Networks produced by fcnLayers support GPU code generation for deep learning once they are trained with trainNetwork (Deep Learning Toolbox). See Deep Learning Code Generation (Deep Learning Toolbox) for details and examples.
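The following sketch shows one possible way to train a network produced by fcnLayers with trainNetwork. The folder names, class names, label IDs, and training option values are all hypothetical placeholders, and the images are assumed to be RGB and already sized 480-by-640 to match imageSize.

```matlab
% Hypothetical dataset locations and label definitions; replace with your own.
imageDir = fullfile('data','images');
labelDir = fullfile('data','labels');
classNames = ["road" "sky" "building" "car" "other"];
pixelLabelIDs = 1:5;

% Datastores for the images and their per-pixel labels.
imds = imageDatastore(imageDir);
pxds = pixelLabelDatastore(labelDir,classNames,pixelLabelIDs);

% Pair each image with its pixel labels for training.
trainingData = pixelLabelImageDatastore(imds,pxds);

% Create the FCN 8s layer graph for the chosen image size and classes.
imageSize = [480 640];
lgraph = fcnLayers(imageSize,numel(classNames));

% Illustrative training options, not tuned values.
options = trainingOptions('sgdm', ...
    'InitialLearnRate',1e-3, ...
    'MaxEpochs',10, ...
    'MiniBatchSize',4);

% Train the network; the result can then be used for code generation.
net = trainNetwork(trainingData,lgraph,options);
```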


Introduced in R2017b