
bisenetv2

Create BiSeNet v2 convolutional neural network for semantic segmentation

Since R2025a

Description

Use the bisenetv2 function to semantically segment images using the BiSeNet v2 convolutional neural network. Using the pretrained network, trained on 171 image classes of the COCO-Stuff data set [2], you can perform inference on test images that contain these classes.

To perform semantic segmentation on a custom data set, you must train the network on your data set using the trainnet (Deep Learning Toolbox) function.
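This workflow can be sketched as follows. This is a minimal illustration, not a complete recipe: the datastore locations, class names, pixel label IDs, and training options are placeholders for your own data.

```matlab
% Configure the pretrained network for two custom classes.
imageSize = [720 960 3];
classNames = ["road" "background"];
net = bisenetv2(imageSize,numel(classNames));

% Combine images and pixel labels into a training datastore
% (folder names and label IDs are placeholders).
imds = imageDatastore("trainImages");
pxds = pixelLabelDatastore("trainLabels",classNames,[1 2]);
ds = combine(imds,pxds);

% Train with cross-entropy loss; tune the options for your data set.
options = trainingOptions("adam",MaxEpochs=10,MiniBatchSize=8);
net = trainnet(ds,net,"crossentropy",options);
```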

Note

This functionality requires Deep Learning Toolbox™ and the Computer Vision Toolbox™ Model for BiSeNet v2 Semantic Segmentation Network. You can install the Computer Vision Toolbox Model for BiSeNet v2 Semantic Segmentation Network from Add-On Explorer. For more information about installing add-ons, see Get and Manage Add-Ons.

[net,classes] = bisenetv2 returns a pretrained BiSeNet v2 network and the names of the 171 classes that it was trained on.


net = bisenetv2(imageSize,numClasses) returns a BiSeNet v2 network pretrained on 171 image classes of the COCO-Stuff data set and configured for transfer learning using input images of the specified size imageSize and the specified number of classes numClasses. Train this network on a custom data set using the trainnet (Deep Learning Toolbox) function.

net = bisenetv2(imageSize,numClasses,Name=Value) specifies options using one or more name-value arguments in addition to the input arguments from the previous syntax. When you specify one or more network options using a name-value argument, you create a custom BiSeNet v2 network with uninitialized weights.

For example, bisenetv2(ChannelRatio=16) specifies the channel ratio between the semantic branch and detail branch as 16.

Examples


Create a BiSeNet v2 convolutional neural network for semantic segmentation.

[net,classes] = bisenetv2;

Read a test image into the workspace.

I = imread("kobi.png");

Resize the image to the input size of the network.

inputSize = net.Layers(1).InputSize(1:2);
img = imresize(I,inputSize);

Create a segmentation map of the test image using the semanticseg function. The segmentation map is a categorical array that relates a label to each pixel in the input image.

segMap = semanticseg(img,net,Classes=classes);

Display the segmentation map overlaid on the image, ordered by the smallest object masks on top, using the labeloverlay function.

segmentedImage = labeloverlay(img,segMap,Transparency=0.4);
imshow(segmentedImage)


Input Arguments


Network input image size, specified as one of these options:

  • Two-element vector of the form [height width].

  • Three-element vector of the form [height width depth]. depth is the number of image channels. Specify depth as 3 for RGB images, as 1 for grayscale images, or as the number of channels for multispectral and hyperspectral images.

Number of classes to segment, specified as an integer greater than 1.
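For example, either form of imageSize works together with a class count of your choosing (the sizes and class counts below are illustrative, not values from this page):

```matlab
% Three-element size: RGB images with 3 channels, 20 classes.
netRGB = bisenetv2([720 960 3],20);

% Two-element size of the form [height width].
netHW = bisenetv2([512 512],11);

% Multispectral input: depth set to the number of channels.
netMS = bisenetv2([256 256 6],4);
```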

Name-Value Arguments


Specify optional pairs of arguments as Name1=Value1,...,NameN=ValueN, where Name is the argument name and Value is the corresponding value. Name-value arguments must appear after other arguments, but the order of the pairs does not matter.

Example: bisenetv2(ChannelRatio=16) specifies the channel ratio between the semantic branch and the detail branch as 16.

Channel ratio between the semantic branch and detail branch, specified as a power of 2 in the range [2, 64].

Tip

Decrease this value to enable the detail branch to learn finer features. Increase the channel ratio to enable the semantic branch to capture more global context and improve the overall performance of the network, at the expense of finer feature segmentation and slower computation. Use empirical analysis to determine the optimal channel ratio for your application.

Channel multiplier, specified as a positive scalar. This value scales the number of channels in all convolutional layers other than the inputs to the semantic, detail, and output branches.

Tip

Increase this value to improve segmentation accuracy at the cost of larger model size and longer training time.

Depending on the ChannelRatio value you specify, the ChannelMultiplier argument supports these values:

  ChannelRatio Value    Supported ChannelMultiplier Values
  2, 4, 8               1, 1.25, 1.5, 1.75, 2
  16                    1, 1.5, 2
  32                    1, 2
  64                    2

Depth multiplier, specified as a positive scalar. This value scales the number of layers of the semantic and detail branches. Increase the depth multiplier to improve learning capacity and accuracy at the cost of longer processing time. For most applications, specify a depth multiplier value between 1 and 4.
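Putting the network options together, a custom network with uninitialized weights might be configured as follows. This is a sketch: the DepthMultiplier argument name is inferred from the description above, and the ChannelMultiplier value is chosen to be valid for ChannelRatio=16.

```matlab
% Custom BiSeNet v2 with uninitialized weights. Specifying any
% network option creates a custom architecture rather than the
% pretrained network.
net = bisenetv2([1024 1024 3],19, ...
    ChannelRatio=16, ...       % channel ratio between the two branches
    ChannelMultiplier=1.5, ... % widen the convolutional layers
    DepthMultiplier=2);        % deepen the semantic and detail branches
```

Train the resulting network from scratch on your own data set before using it for inference.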

Output Arguments


BiSeNet v2 network layers, returned as a dlnetwork (Deep Learning Toolbox) object.

Names of the classes that the BiSeNet v2 network has been pretrained to segment, returned as a categorical array.


References

[1] Yu, Changqian, Changxin Gao, Jingbo Wang, Gang Yu, Chunhua Shen, and Nong Sang. “BiSeNet V2: Bilateral Network with Guided Aggregation for Real-Time Semantic Segmentation.” International Journal of Computer Vision 129, no. 11 (November 2021): 3051–68. https://doi.org/10.1007/s11263-021-01515-2.

[2] Caesar, Holger, Jasper Uijlings, and Vittorio Ferrari. “COCO-Stuff: Thing and Stuff Classes in Context.” In 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 1209–18. Salt Lake City, UT, USA: IEEE, 2018. https://doi.org/10.1109/CVPR.2018.00132.

Version History

Introduced in R2025a