Code Generation for Semantic Segmentation Application on ARM Neon Targets That Uses U-Net
This example shows how to generate code for an image segmentation application that uses deep learning. It uses the codegen
command to generate a static library that performs prediction on a DAG Network object for U-Net. U-Net is a deep learning network for image segmentation.
For a similar example that uses U-Net for image segmentation but does not use the codegen
command, see Semantic Segmentation of Multispectral Images Using Deep Learning (Image Processing Toolbox).
Prerequisites
ARM® processor that supports the NEON extension and has a RAM of at least 3GB
ARM Compute Library (on the target ARM hardware)
Environment variables for the compilers and libraries
MATLAB® Coder™
MATLAB Coder Interface for Deep Learning support package
Deep Learning Toolbox™
The ARM Compute library version that this example uses might not be the latest version that code generation supports. For information about supported versions of libraries and about environment variables, see Prerequisites for Deep Learning with MATLAB Coder.
This example is not supported in MATLAB Online.
Overview of U-Net
U-Net [1] is a type of convolutional neural network (CNN) designed for semantic image segmentation. In U-Net, the initial series of convolutional layers are interspersed with max pooling layers, successively decreasing the resolution of the input image. These layers are followed by a series of convolutional layers interspersed with upsampling operators, successively increasing the resolution of the input image. The combination of these two series paths forms a U-shaped graph. The U-Net network was originally trained to perform prediction on biomedical image segmentation applications. This example demonstrates the ability of the network to track changes in forest cover over time. Environmental agencies track deforestation to assess and qualify the environmental and ecological health of a region.
Deep learning based semantic segmentation can yield a precise measurement of vegetation cover from high-resolution aerial photographs. One of the challenges of such computation is to differentiating classes that have similar visual characteristics, such as classifying a green pixel as grass, shrubbery, or tree. To increase classification accuracy, some data sets contain multispectral images that provide additional information about each pixel. For example, the Hamlin Beach State Park data set supplements the color images with near-infrared channels that provide a clearer separation of the classes.
This example uses the Hamlin Beach State Park Data [2] along with a pretrained U-Net network to correctly classify each pixel.
The U-Net that this example uses is trained to segment pixels belonging to a set of 18 classes which includes:
0. Other Class/Image Border 7. Picnic Table 14. Grass 1. Road Markings 8. Black Wood Panel 15. Sand 2. Tree 9. White Wood Panel 16. Water (Lake) 3. Building 10. Orange Landing Pad 17. Water (Pond) 4. Vehicle (Car, Truck, or Bus) 11. Water Buoy 18. Asphalt (Parking Lot/Walkway) 5. Person 12. Rocks 6. Lifeguard Chair 13. Other Vegetation
The segmentationUnetARM
Entry-Point Function
The segmentationUnetARM
entry-point function performs patchwise semantic segmentation on the input image by using the multispectralUnet network contained in the multispectralUnet.mat
file. The function loads the network object from the multispectralUnet.mat
file into a persistent variable mynet
and reuses the persistent variable on subsequent prediction calls.
type('segmentationUnetARM.m')
% OUT = segmentationUnetARM(IM) returns a semantically segmented % image, which is segmented using the network multispectralUnet. This segmentation % is performed on the input image patchwise on patches of size 256,256. % % Copyright 2019-2020 The MathWorks, Inc. function out = segmentationUnetARM(im) %#codegen persistent mynet; if isempty(mynet) mynet = coder.loadDeepLearningNetwork('trainedUnet/multispectralUnet.mat'); end % The input data has to be padded to the size compatible % with the network Input Size. This input_data is padded inorder to % perform semantic segmentation on each patch of size (Network Input Size) [height, width, nChannel] = size(im); patch = coder.nullcopy(zeros([256, 256, nChannel-1])); % padSize = zeros(1,2); padSize(1) = 256 - mod(height, 256); padSize(2) = 256 - mod(width, 256); % % Pad image must have dimensions as multiples of network input dimensions im_pad = padarray (im, padSize, 0, 'post'); [height_pad, width_pad, ~] = size(im_pad); % out = zeros([size(im_pad,1), size(im_pad,2)], 'uint8'); for i = 1:256:height_pad for j =1:256:width_pad for p = 1:nChannel -1 patch(:,:,p) = squeeze( im( i:i+255,... j:j+255,... p)); end % pass in input segmentedLabels = activations(mynet, patch, 'Segmentation-Layer'); % Takes the max of each channel (6 total at this point) [~,L] = max(segmentedLabels,[],3); patch_seg = uint8(L); % populate section of output out(i:i+255, j:j+255) = patch_seg; end end % Remove the padding out = out(1:height, 1:width);
Get Pretrained U-Net DAG Network Object
Download the multispectralUnet.mat
file and load the U-Net DAG network object.
if ~exist('trainedUnet/multispectralUnet.mat','file') trainedUnet_url = 'https://www.mathworks.com/supportfiles/vision/data/multispectralUnet.mat'; downloadUNet(trainedUnet_url,pwd); end
ld = load("trainedUnet/multispectralUnet.mat");
net = ld.net;
The DAG network contains 58 layers that include convolution, max pooling, depth concatenation, and pixel classification output layers. To display an interactive visualization of the deep learning network architecture, use the analyzeNetwork
(Deep Learning Toolbox) function.
analyzeNetwork(net);
Prepare Input Data
Download the Hamlin Beach State Park data.
if ~exist(fullfile(pwd,'data'),'dir') url = 'http://home.cis.rit.edu/~cnspci/other/data/rit18_data.mat'; downloadHamlinBeachMSIData(url,pwd+"/data/"); end
Load and examine the data in MATLAB.
load(fullfile(pwd,'data','rit18_data','rit18_data.mat'));
Examine data
whos test_data
The image has seven channels. The RGB color channels are the fourth, fifth, and sixth image channels. The first three channels correspond to the near-infrared bands and highlight different components of the image based on their heat signatures. Channel 7 is a mask that indicates the valid segmentation region.
The multispectral image data is arranged as numChannels-by-width-by-height arrays. In MATLAB, multichannel images are arranged as width-by-height-by-numChannels arrays. To reshape the data so that the channels are in the third dimension, use the helper function, switchChannelsToThirdPlane
.
test_data = switchChannelsToThirdPlane(test_data);
Confirm data has the correct structure (channels last).
whos test_data
This example uses a cropped version of the full Hamlin Beach State Park dataset that the test_data
variable contains. Crop the height and width of test_data
to create the variable input_data
that this example uses.
test_datacropRGB = imcrop(test_data(:,:,1:3),[2600, 3000, 2000, 2000]); test_datacropInfrared = imcrop(test_data(:,:,4:6),[2600, 3000, 2000, 2000]); test_datacropMask = imcrop(test_data(:,:,7),[2600, 3000, 2000, 2000]);
input_data(:,:,1:3) = test_datacropRGB; input_data(:,:,4:6) = test_datacropInfrared; input_data(:,:,7) = test_datacropMask;
Examine the input_data
variable.
whos('input_data');
Write the input data into a text file that is passed as input to the generated executable.
WriteInputDatatoTxt(input_data); [height, width, channels] = size(input_data);
Set Up a Code Generation Configuration Object for a Static Library
To generate code that targets an ARM-based device, create a configuration object for a library. Do not create a configuration object for an executable program. Set up the configuration object for generation of C++ source code only.
cfg = coder.config('lib'); cfg.TargetLang = 'C++'; cfg.GenCodeOnly = true;
Set Up a Configuration Object for Deep Learning Code Generation
Create a coder.ARMNEONConfig
object. Specify the library version and the architecture of the target ARM processor. For example, suppose that the target board is a HiKey/Rock960 board with ARMv8 architecture and ARM Compute Library version 20.02.1.
dlcfg = coder.DeepLearningConfig('arm-compute'); dlcfg.ArmComputeVersion = '20.02.1'; dlcfg.ArmArchitecture = 'armv8';
Assign the DeepLearningConfig
property of the code generation configuration object cfg
to the deep learning configuration object dlcfg
.
cfg.DeepLearningConfig = dlcfg;
Generate C++ Source Code by Using codegen
codegen -config cfg segmentationUnetARM -args {ones(size(input_data),'uint16')} -d unet_predict -report
The code gets generated in the unet_predict
folder that is located in the current working directory on the host computer.
Generate Zip File by Using packNGo
The packNGo
function packages all relevant files into a compressed zip file.
zipFileName = 'unet_predict.zip'; bInfo = load(fullfile('unet_predict','buildInfo.mat')); packNGo(bInfo.buildInfo, {'fileName', zipFileName,'minimalHeaders', false, 'ignoreFileMissing',true});
The name of the generated zip file is unet_predict.zip
.
Copy Generated Zip file to the Target Hardware
Copy the zip file into the target hardware board. Extract the contents of the zip file into a folder and delete the zip file from the hardware.
In the following commands, replace:
password
with your passwordusername
with your user nametargetname
with the name of your devicetargetDir
with the destination folder for the files
On the Linux® platform, to transfer and extract the zip file on the target hardware, run these commands:
if isunix, system(['sshpass -p password scp -r ' fullfile(pwd,zipFileName) ' username@targetname:targetDir/']), end if isunix, system('sshpass -p password ssh username@targetname "if [ -d targetDir/unet_predict ]; then rm -rf targetDir/unet_predict; fi"'), end if isunix, system(['sshpass -p password ssh username@targetname "unzip targetDir/' zipFileName ' -d targetDir/unet_predict"']), end if isunix, system(['sshpass -p password ssh username@targetname "rm -rf targetDir/' zipFileName '"']), end
On the Windows® platform, to transfer and extract the zip file on the target hardware, run these commands:
if ispc, system(['pscp.exe -pw password -r ' fullfile(pwd,zipFileName) ' username@targetname:targetDir/']), end if ispc, system('plink.exe -l username -pw password targetname "if [ -d targetDir/unet_predict ]; then rm -rf targetDir/unet_predict; fi"'), end if ispc, system(['plink.exe -l username -pw password targetname "unzip targetDir/' zipFileName ' -d targetDir/unet_predict"']), end if ispc, system(['plink.exe -l username -pw password targetname "rm -rf targetDir/' zipFileName '"']), end
Copy Supporting Files to the Target Hardware
Copy these files from the host computer to the target hardware:
Input data,
input_data.txt
Makefile for creating the library,
unet_predict_rtw.mk
Makefile for building the executable program,
makefile_unet_arm_generic.mk
In the following commands, replace:
password
with your passwordusername
with your user nametargetname
with the name of your devicetargetDir
with the destination folder for the files
On the Linux® platform, to transfer the supporting files to the target hardware, run these commands:
if isunix, system('sshpass -p password scp unet_predict_rtw.mk username@targetname:targetDir/unet_predict/'), end if isunix, system('sshpass -p password scp input_data.txt username@targetname:targetDir/unet_predict/'), end if isunix, system('sshpass -p password scp makefile_unet_arm_generic.mk username@targetname:targetDir/unet_predict/'), end
On the Windows® platform, to transfer the supporting files to the target hardware, run these commands:
if ispc, system('pscp.exe -pw password unet_predict_rtw.mk username@targetname:targetDir/unet_predict/'), end if ispc, system('pscp.exe -pw password input_data.txt username@targetname:targetDir/unet_predict/'), end if ispc, system('pscp.exe -pw password makefile_unet_arm_generic.mk username@targetname:targetDir/unet_predict/'), end
Build the Library on the Target Hardware
To build the library on the target hardware, execute the generated makefile on the ARM hardware.
Make sure that you set the environment variables ARM_COMPUTELIB
and LD_LIBRARY_PATH
on the target hardware. See Prerequisites for Deep Learning with MATLAB Coder. The ARM_ARCH
variable is used in Makefile to pass compiler flags based on the ARM Architecture. The ARM_VER
variable is used in Makefile to compile the code based on the version of the ARM Compute library.
On the Linux host platform, run this command to build the library:
if isunix, system(['sshpass -p password ssh username@targetname "make -C targetDir/unet_predict/ -f unet_predict_rtw.mk ARM_ARCH=' dlcfg.ArmArchitecture ' ARM_VER=' dlcfg.ArmComputeVersion ' "']), end
On the Windows host platform, run this command to build the library:
if ispc, system(['plink.exe -l username -pw password targetname "make -C targetDir/unet_predict/ -f unet_predict_rtw.mk ARM_ARCH=' dlcfg.ArmArchitecture ' ARM_VER=' dlcfg.ArmComputeVersion ' "']), end
Create Executable on the Target
In these commands, replace targetDir
with the destination folder where the library is generated. The variables height
, width
, and channels
represent the dimensions of the input data.
main_unet_arm_generic.cpp
is the C++ main wrapper file which invokes the segmentationUnetARM function and passes the input image to it. Build the library with the wrapper file to create the executable.
On the Linux host platform, to create the executable, run these commands:
if isunix, system('sshpass -p password scp main_unet_arm_generic.cpp username@targetname:targetDir/unet_predict/'), end if isunix, system(['sshpass -p password ssh username@targetname "make -C targetDir/unet_predict/ IM_H=' num2str(height) ' IM_W=' num2str(width) ' IM_C=' num2str(channels) ' -f makefile_unet_arm_generic.mk"']), end
On the Windows host platform, to create the executable, run these commands:
if ispc, system('pscp.exe -pw password main_unet_arm_generic.cpp username@targetname:targetDir/unet_predict/'), end if ispc, system(['plink.exe -l username -pw password targetname "make -C targetDir/unet_predict/ IM_H=' num2str(height) ' IM_W=' num2str(width) ' IM_C=' num2str(channels) ' -f makefile_unet_arm_generic.mk"']), end
Run the Executable on the Target Hardware
Run the Executable on the target hardware with the input image file input_data.txt
.
On the Linux host platform, run this command:
if isunix, system('sshpass -p password ssh username@targetname "cd targetDir/unet_predict/; ./unet input_data.txt output_data.txt"'), end
On the Windows host platform, run this command:
if ispc, system('plink.exe -l username -pw password targetname "cd targetDir/unet_predict/; ./unet input_data.txt output_data.txt"'), end
The unet
executable accepts the input data. Because of the large size of input_data
(2001x2001x7), it is easier to process the input image in patches. The executable splits the input image into multiple patches, each corresponding to network input size. The executable performs prediction on the pixels in one particular patch at a time and then combines all the patches together.
Transfer the Output from Target Hardware to MATLAB
Copy the generated output file output_data.txt
back to the current MATLAB session. On the Linux platform, run:
if isunix, system('sshpass -p password scp username@targetname:targetDir/unet_predict/output_data.txt ./'), end
To perform the same action on the Windows platform, run:
if ispc, system('pscp.exe -pw password username@targetname:targetDir/unet_predict/output_data.txt ./'), end
Store the output data in the variable segmentedImage
:
segmentedImage = uint8(importdata('output_data.txt'));
segmentedImage = reshape(segmentedImage,[height,width]);
To extract only the valid portion of the segmented image, multiply it by the mask channel of the input data.
segmentedImage = uint8(input_data(:,:,7)~=0) .* segmentedImage;
Remove the noise and stray pixels by using the medfilt2
function.
segmentedImageCodegen = medfilt2(segmentedImage,[5,5]);
Display U-Net Segmented data
This line of code creates a vector of the class names.
classNames = net.Layers(end).Classes; disp(classNames);
Overlay the labels on the segmented RGB test image and add a color bar to the segmented image.
Display input data
figure(1);
imshow(histeq(input_data(:,:,1:3)));
title('Input Image');
cmap = jet(numel(classNames)); segmentedImageOut = labeloverlay(imadjust(input_data(:,:,4:6),[0 0.6],[0.1 0.9],0.55),segmentedImage,'Transparency',0,'Colormap',cmap); figure(2); imshow(segmentedImageOut);
Display segmented data
title('Segmented Image using Codegen on ARM'); N = numel(classNames); ticks = 1/(N*2):1/N:1; colorbar('TickLabels',cellstr(classNames),'Ticks',ticks,'TickLength',0,'TickLabelInterpreter','none'); colormap(cmap)
Display segmented overlay Image
segmentedImageOverlay = labeloverlay(imadjust(input_data(:,:,4:6),[0 0.6],[0.1 0.9],0.55),segmentedImage,'Transparency',0.7,'Colormap',cmap); figure(3); imshow(segmentedImageOverlay); title('Segmented Overlayed Image');
References
[1] Ronneberger, Olaf, Philipp Fischer, and Thomas Brox. "U-Net: Convolutional Networks for Biomedical Image Segmentation." arXiv preprint arXiv:1505.04597, 2015.
[2] Kemker, R., C. Salvaggio, and C. Kanan. "High-Resolution Multispectral Dataset for Semantic Segmentation." CoRR, abs/1703.01918, 2017.
[3] Reference Input Data used is part of the Hamlin Beach State Park data. The following steps can be used to download the data for further evaluation.
if ~exist(fullfile(pwd,'data')) url = 'http://home.cis.rit.edu/~cnspci/other/data/rit18_data.mat'; downloadHamlinBeachMSIData(url,pwd+"/data/"); end
[4] Kemker, Ronald, Carl Salvaggio, and Christopher Kanan. "Algorithms for Semantic Segmentation of Multispectral Remote Sensing Imagery Using Deep Learning." ISPRS Journal of Photogrammetry and Remote Sensing, Deep Learning RS Data, 145 (November 1, 2018): 60-77. https://doi.org/10.1016/j.isprsjprs.2018.04.014.
See Also
coder.ARMNEONConfig
| coder.DeepLearningConfig
| coder.hardware
| packNGo