imageInputLayer

Image input layer

Description

An image input layer inputs 2-D images to a neural network and applies data normalization.

For 3-D image input, use image3dInputLayer.

Creation

Syntax

layer = imageInputLayer(inputSize)

layer = imageInputLayer(inputSize,Name=Value)

Description

layer = imageInputLayer(inputSize) returns an image input layer and specifies the InputSize property.

layer = imageInputLayer(inputSize,Name=Value) sets additional options using one or more name-value arguments.

Input Arguments

`inputSize` — Size of the input
row vector of integers

Size of the input data, specified as a row vector of integers [h w c], where h, w, and c correspond to the height, width, and number of channels, respectively.

For grayscale images, specify a two-element vector or a vector with c equal to 1.
For RGB images, specify a vector with c equal to 3.
For multispectral or hyperspectral images, specify a vector with c equal to the number of channels.

For 3-D image or volume input, use image3dInputLayer.

This argument sets the InputSize property.
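
As a brief illustration of the cases listed above, the following sketch creates grayscale, RGB, and multispectral input layers. The image sizes and the 16-channel count are arbitrary values chosen for illustration, and the variable names are hypothetical.

```matlab
% Grayscale: a two-element size or an explicit c = 1 are both accepted.
layerGray2 = imageInputLayer([28 28]);
layerGray3 = imageInputLayer([28 28 1]);

% RGB: three channels.
layerRGB = imageInputLayer([28 28 3]);

% Multispectral: c equals the number of channels (16 is illustrative).
layerMS = imageInputLayer([64 64 16]);

% Inspect the stored size of the RGB layer.
disp(layerRGB.InputSize)
```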

Name-Value Arguments

Specify optional pairs of arguments as Name1=Value1,...,NameN=ValueN, where Name is the argument name and Value is the corresponding value. Name-value arguments must appear after other arguments, but the order of the pairs does not matter.

Before R2021a, use commas to separate each name and value, and enclose Name in quotes.

Example: imageInputLayer([28 28 3],Name="input") creates an image input layer with input size [28 28 3] and name 'input'.
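
The two syntaxes described above can be compared side by side. This is a minimal sketch; the variable names layerNew and layerOld are hypothetical.

```matlab
% R2021a and later: Name=Value syntax.
layerNew = imageInputLayer([28 28 3],Name="input");

% Before R2021a: comma-separated pairs with the name enclosed in quotes.
layerOld = imageInputLayer([28 28 3],"Name","input");

% Both calls are expected to produce a layer with the same name.
isequal(layerNew.Name,layerOld.Name)
```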

`Normalization` — Data normalization
`"zerocenter"` (default) | `"zscore"` | `"rescale-symmetric"` | `"rescale-zero-one"` | `"none"` | function handle

Data normalization to apply every time data is forward propagated through the input layer, specified as one of these values:

"zerocenter" — Subtract the mean specified by Mean.
"zscore" — Subtract the mean specified by Mean and divide by StandardDeviation.
"rescale-symmetric" — Rescale the input to be in the range [-1, 1] using the minimum and maximum values specified by Min and Max, respectively.
"rescale-zero-one" — Rescale the input to be in the range [0, 1] using the minimum and maximum values specified by Min and Max, respectively.
"none" — Do not normalize the input data.
function handle — Normalize the data using the specified function. The function must be of the form Y = f(X), where X is the input data and the output Y is the normalized data.

If the input data is complex-valued and the SplitComplexInputs property is 0 (false), then the Normalization property must be "zerocenter", "zscore", "none", or a function handle. (since R2024a)

Before R2024a: To input complex-valued data into the network, the SplitComplexInputs property must be 1 (true).

Tip

The software, by default, automatically calculates the normalization statistics when you use the trainnet function. To save time when training, specify the required statistics for normalization and set the ResetInputNormalization argument of the trainingOptions function to 0 (false).

This argument sets the Normalization property.

Data Types: char | string | function_handle
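
The following sketch shows two ways to use this argument: a custom function handle, and built-in z-score normalization with statistics supplied up front so that training does not need to recalculate them. The divisor 255, the statistic values, and the "adam" solver are assumptions made for illustration only.

```matlab
% Custom normalization: the function must have the form Y = f(X).
% Dividing by 255 assumes pixel values in the range [0, 255].
layerFcn = imageInputLayer([28 28 1],Normalization=@(X) X./255);

% Built-in z-score normalization with precomputed statistics
% (0.13 and 0.31 are placeholders, not measured values).
layerZ = imageInputLayer([28 28 1], ...
    Normalization="zscore", ...
    Mean=0.13, ...
    StandardDeviation=0.31);

% Keep the supplied statistics instead of recalculating them during training.
options = trainingOptions("adam",ResetInputNormalization=false);
```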

`NormalizationDimension` — Normalization dimension
`"auto"` (default) | `"channel"` | `"element"` | `"all"`

Normalization dimension, specified as one of these values:

"auto" — If the ResetInputNormalization training option is 0 (false) and you specify any of the normalization statistics (Mean, StandardDeviation, Min, or Max), then normalize over the dimensions matching the statistics. Otherwise, recalculate the statistics at training time and apply channel-wise normalization.
"channel" — Channel-wise normalization.
"element" — Element-wise normalization.
"all" — Normalize all values using scalar statistics.

This argument sets the NormalizationDimension property.
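
To make the three explicit options concrete, this sketch computes a mean of the matching shape for each one using plain array arithmetic. It illustrates the shapes of the statistics only and is not a description of how the layer computes them internally. The random data and variable names are assumptions.

```matlab
% Illustrative data: 100 random 28-by-28 RGB images (h-by-w-by-c-by-N).
X = rand(28,28,3,100);

% "channel": one statistic per channel, a 1-by-1-by-c array.
meanChannel = mean(X,[1 2 4]);

% "element": one statistic per pixel and channel, an h-by-w-by-c array.
meanElement = mean(X,4);

% "all": a single scalar statistic.
meanAll = mean(X,"all");

size(meanChannel)   % 1 1 3
size(meanElement)   % 28 28 3
size(meanAll)       % 1 1
```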

`Mean` — Mean for zero-center and z-score normalization
`[]` (default) | 3-D array | numeric scalar

Mean for zero-center and z-score normalization, specified as an h-by-w-by-c array, a 1-by-1-by-c array of means per channel, a numeric scalar, or [], where h, w, and c correspond to the height, width, and the number of channels of the mean, respectively.

To specify the Mean property, the Normalization property value must be "zerocenter" or "zscore".

If Mean is [], then the software updates the property value at initialization time.

The initialize function sets the property value to 0.
If you use the dlnetwork function and the Initialize name-value argument value is 1 (true), then the software sets the property value to 0.
If you use the trainnet function and the ResetInputNormalization training option value is 1 (true), then the software calculates the mean using the training data and uses the resulting value.
If you use the trainnet function and the ResetInputNormalization training option value is 0 (false), then the software sets the property value to 0.

Mean can be complex-valued (since R2024a). If Mean is complex-valued, then the SplitComplexInputs property value must be 0 (false).

Before R2024a: Split the mean into real and imaginary parts and split the input data into real and imaginary parts by setting the SplitComplexInputs property value to 1 (true).

This argument sets the Mean property.
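
A minimal sketch of supplying a per-channel mean follows. The three mean values and the 224-by-224 input size are placeholders chosen for illustration, not statistics taken from this page.

```matlab
% Per-channel means as a 1-by-1-by-c array (placeholder values, c = 3).
mu = reshape([0.485 0.456 0.406],1,1,3);

% Zero-center normalization subtracts the supplied mean from each channel.
layer = imageInputLayer([224 224 3], ...
    Normalization="zerocenter", ...
    Mean=mu);

size(layer.Mean)
```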

`StandardDeviation` — Standard deviation for z-score normalization
`[]` (default) | 3-D array | numeric scalar

Standard deviation for z-score normalization, specified as an h-by-w-by-c array, a 1-by-1-by-c array of standard deviations per channel, a numeric scalar, or [], where h, w, and c correspond to the height, width, and the number of channels of the standard deviation, respectively.

To specify the StandardDeviation property, the Normalization property must be "zscore".

If StandardDeviation is [], then the software updates the property at initialization time.

The initialize function, and the dlnetwork function when the Initialize option is 1 (true), set the property to 1.
If you use the trainnet function and the ResetInputNormalization training option value is 1 (true), then the software calculates the standard deviation using the training data and uses the resulting value.
If you use the trainnet function and the ResetInputNormalization training option value is 0 (false), then the software sets the property to 1.

This argument sets the StandardDeviation property.
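
The z-score operation (subtract the mean, then divide by the standard deviation) can be worked through with plain array arithmetic. This sketch uses channel-wise statistics computed from random data; it illustrates the arithmetic only and is not the layer's internal implementation.

```matlab
% Illustrative batch: 10 random 8-by-8 two-channel images.
X = rand(8,8,2,10);

% Channel-wise statistics, each a 1-by-1-by-c array.
mu = mean(X,[1 2 4]);
sigma = std(X,0,[1 2 4]);

% Z-score: subtract the mean, then divide by the standard deviation.
% Implicit expansion broadcasts the 1-by-1-by-c statistics over the batch.
Y = (X - mu)./sigma;

% Each channel of Y now has mean close to 0 and standard deviation close to 1.
squeeze(mean(Y,[1 2 4]))
squeeze(std(Y,0,[1 2 4]))
```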

`Min` — Minimum value for rescaling
`[]` (default) | 3-D array | numeric scalar

Minimum value for rescaling, specified as an h-by-w-by-c array, a 1-by-1-by-c array of minima per channel, a numeric scalar, or [], where h, w, and c correspond to the height, width, and the number of channels of the minima, respectively.

To specify the Min property, the Normalization property must be "rescale-symmetric" or "rescale-zero-one".

If Min is [], then the software updates the property at initialization time.

The initialize function, and the dlnetwork function when the Initialize option is 1 (true), set the property to -1 when Normalization is "rescale-symmetric" and to 0 when Normalization is "rescale-zero-one".
If you use the trainnet function and the ResetInputNormalization training option value is 1 (true), then the software calculates the minimum value using the training data and uses the resulting value.
If you use the trainnet function and the ResetInputNormalization training option value is 0 (false), then the software sets the property to -1 when Normalization is "rescale-symmetric" and to 0 when Normalization is "rescale-zero-one".

This argument sets the Min property.

`Max` — Maximum value for rescaling
`[]` (default) | 3-D array | numeric scalar

Maximum value for rescaling, specified as an h-by-w-by-c array, a 1-by-1-by-c array of maxima per channel, a numeric scalar, or [], where h, w, and c correspond to the height, width, and the number of channels of the maxima, respectively.

To specify the Max property, the Normalization property must be "rescale-symmetric" or "rescale-zero-one".

If Max is [], then the software updates the property at initialization time.

The initialize function, and the dlnetwork function when the Initialize option is 1 (true), set the property to 1.
If you use the trainnet function and the ResetInputNormalization training option value is 1 (true), then the software calculates the maximum value using the training data and uses the resulting value.
If you use the trainnet function and the ResetInputNormalization training option value is 0 (false), then the software sets the property to 1.

This argument sets the Max property.
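
The sketch below creates a layer that rescales with explicit Min and Max values, then works through the arithmetic on three sample values. This page does not state the rescaling formula, so the linear mapping used here is an assumption shown for illustration. The values 0 and 255 assume 8-bit pixel intensities.

```matlab
% Layer that rescales to [0, 1] using supplied scalar bounds.
layer = imageInputLayer([28 28 1], ...
    Normalization="rescale-zero-one", ...
    Min=0, ...
    Max=255);

% Assumed linear mapping, for illustration only.
xMin = 0;
xMax = 255;
X = [0 51 255];

% Map [xMin, xMax] onto [0, 1].
Y01 = (X - xMin)./(xMax - xMin);

% Map [xMin, xMax] onto [-1, 1].
Ysym = 2.*(X - xMin)./(xMax - xMin) - 1;

disp(Y01)
disp(Ysym)
```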

`SplitComplexInputs` — Flag to split input data into real and imaginary components
`0` (`false`) (default) | `1` (`true`)

Flag to split input data into real and imaginary components, specified as one of these values:

0 (false) — Do not split input data.
1 (true) — Split data into real and imaginary components.

If SplitComplexInputs is 1 (true), then the layer outputs twice as many channels as the input data. For example, if the input data is complex-valued with numChannels channels, then the layer outputs data with 2*numChannels channels, where channels 1 through numChannels contain the real components of the input data and numChannels+1 through 2*numChannels contain the imaginary components of the input data. If the input data is real, then channels numChannels+1 through 2*numChannels are all zero.

If the input data is complex-valued and SplitComplexInputs is 0 (false), then the layer passes the complex-valued data to the next layers (since R2024a).

Before R2024a: To input complex-valued data into a neural network, the SplitComplexInputs option of the input layer must be 1 (true).

For an example showing how to train a network with complex-valued data, see Train Network with Complex-Valued Data.

This argument sets the SplitComplexInputs property.
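
The channel layout described above can be emulated with plain array operations. This sketch only mimics the described ordering (real components first, then imaginary components) and is not the layer's implementation. The 4-by-4 image size and two-channel count are arbitrary.

```matlab
% Illustrative complex-valued image with numChannels = 2.
numChannels = 2;
X = complex(rand(4,4,numChannels),rand(4,4,numChannels));

% Emulate the described split: real parts first, then imaginary parts.
Y = cat(3,real(X),imag(X));

% The channel count doubles.
size(Y,3)

% Channels 1..numChannels hold real(X); the remaining channels hold imag(X).
isequal(Y(:,:,1:numChannels),real(X))
isequal(Y(:,:,numChannels+1:end),imag(X))

% A layer with the flag set would be created like this.
layer = imageInputLayer([4 4 numChannels],SplitComplexInputs=true);
```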

`Name` — Layer name
`""` (default) | character vector | string scalar

Layer name, specified as a character vector or a string scalar. For Layer array input, the trainnet and dlnetwork functions automatically assign names to unnamed layers.

This argument sets the Name property.

Data Types: char | string

Properties

Image Input

`InputSize` — Size of the input
Read-only: row vector of integers

This property is read-only after object creation. To set this property, use the corresponding positional input argument when you create the ImageInputLayer object.

Size of the input data, specified as a row vector of integers [h w c], where h, w, and c correspond to the height, width, and number of channels, respectively.

Data Types: double

`Normalization` — Data normalization
`"zerocenter"` (default) | `"zscore"` | `"rescale-symmetric"` | `"rescale-zero-one"` | `"none"` | function handle

This property is read-only after object creation. To set this property, use the corresponding name-value argument when you create the ImageInputLayer object.

Data normalization to apply every time data is forward propagated through the input layer, specified as one of these values:

"zerocenter" — Subtract the mean specified by Mean.
"zscore" — Subtract the mean specified by Mean and divide by StandardDeviation.
"rescale-symmetric" — Rescale the input to be in the range [-1, 1] using the minimum and maximum values specified by Min and Max, respectively.
"rescale-zero-one" — Rescale the input to be in the range [0, 1] using the minimum and maximum values specified by Min and Max, respectively.
"none" — Do not normalize the input data.
function handle — Normalize the data using the specified function. The function must be of the form Y = f(X), where X is the input data and the output Y is the normalized data.

Before R2024a: To input complex-valued data into the network, the SplitComplexInputs property must be 1 (true).

Tip

The ImageInputLayer object stores this property as a character vector or a function handle.

`NormalizationDimension` — Normalization dimension
`"auto"` (default) | `"channel"` | `"element"` | `"all"`

Normalization dimension, specified as one of these values:

"auto" — If the ResetInputNormalization training option is 0 (false) and you specify any of the normalization statistics (Mean, StandardDeviation, Min, or Max), then normalize over the dimensions matching the statistics. Otherwise, recalculate the statistics at training time and apply channel-wise normalization.
"channel" — Channel-wise normalization.
"element" — Element-wise normalization.
"all" — Normalize all values using scalar statistics.

The ImageInputLayer object stores this property as a character vector.

`Mean` — Mean for zero-center and z-score normalization
`[]` (default) | 3-D array | numeric scalar

To specify the Mean property, the Normalization property value must be "zerocenter" or "zscore".

If Mean is [], then the software updates the property value at initialization time.

The initialize function sets the property value to 0.
If you use the dlnetwork function and the Initialize name-value argument value is 1 (true), then the software sets the property value to 0.
If you use the trainnet function and the ResetInputNormalization training option value is 1 (true), then the software calculates the mean using the training data and uses the resulting value.
If you use the trainnet function and the ResetInputNormalization training option value is 0 (false), then the software sets the property value to 0.

Mean can be complex-valued (since R2024a). If Mean is complex-valued, then the SplitComplexInputs property value must be 0 (false).

Before R2024a: Split the mean into real and imaginary parts and split the input data into real and imaginary parts by setting the SplitComplexInputs property value to 1 (true).

`StandardDeviation` — Standard deviation for z-score normalization
`[]` (default) | 3-D array | numeric scalar

To specify the StandardDeviation property, the Normalization property must be "zscore".

If StandardDeviation is [], then the software updates the property at initialization time.

The initialize function, and the dlnetwork function when the Initialize option is 1 (true), set the property to 1.
If you use the trainnet function and the ResetInputNormalization training option value is 1 (true), then the software calculates the standard deviation using the training data and uses the resulting value.
If you use the trainnet function and the ResetInputNormalization training option value is 0 (false), then the software sets the property to 1.

`Min` — Minimum value for rescaling
`[]` (default) | 3-D array | numeric scalar

To specify the Min property, the Normalization property must be "rescale-symmetric" or "rescale-zero-one".

If Min is [], then the software updates the property at initialization time.

The initialize function, and the dlnetwork function when the Initialize option is 1 (true), set the property to -1 when Normalization is "rescale-symmetric" and to 0 when Normalization is "rescale-zero-one".
If you use the trainnet function and the ResetInputNormalization training option value is 1 (true), then the software calculates the minimum value using the training data and uses the resulting value.
If you use the trainnet function and the ResetInputNormalization training option value is 0 (false), then the software sets the property to -1 when Normalization is "rescale-symmetric" and to 0 when Normalization is "rescale-zero-one".

`Max` — Maximum value for rescaling
`[]` (default) | 3-D array | numeric scalar

To specify the Max property, the Normalization property must be "rescale-symmetric" or "rescale-zero-one".

If Max is [], then the software updates the property at initialization time.

The initialize function, and the dlnetwork function when the Initialize option is 1 (true), set the property to 1.
If you use the trainnet function and the ResetInputNormalization training option value is 1 (true), then the software calculates the maximum value using the training data and uses the resulting value.
If you use the trainnet function and the ResetInputNormalization training option value is 0 (false), then the software sets the property to 1.

`SplitComplexInputs` — Flag to split input data into real and imaginary components
`0` (`false`) (default) | `1` (`true`)

This property is read-only after object creation. To set this property, use the corresponding name-value argument when you create the ImageInputLayer object.

Flag to split input data into real and imaginary components, specified as one of these values:

0 (false) — Do not split input data.
1 (true) — Split data into real and imaginary components.

If the input data is complex-valued and SplitComplexInputs is 0 (false), then the layer passes the complex-valued data to the next layers (since R2024a).

Before R2024a: To input complex-valued data into a neural network, the SplitComplexInputs option of the input layer must be 1 (true).

For an example showing how to train a network with complex-valued data, see Train Network with Complex-Valued Data.

Layer

`Name` — Layer name
`''` (default) | character vector

Layer name, specified as a character vector. For Layer array input, the trainnet and dlnetwork functions automatically assign names to unnamed layers.

Data Types: char

`NumInputs` — Number of inputs
Read-only: `0` (default)

This property is read-only.

Number of inputs of the layer. The layer has no inputs.

Data Types: double

`InputNames` — Input names
Read-only: `{}` (default)

This property is read-only.

Input names of the layer. The layer has no inputs.

Data Types: cell

`NumOutputs` — Number of outputs
Read-only: `1` (default)

This property is read-only.

Number of outputs from the layer, stored as 1. This layer has a single output only.

Data Types: double

`OutputNames` — Output names
Read-only: `{'out'}` (default)

This property is read-only.

Output names, stored as {'out'}. This layer has a single output only.

Data Types: cell

Examples

Create Image Input Layer

Create an image input layer for 28-by-28 color images.

inputlayer = imageInputLayer([28 28 3])

inputlayer = 
  ImageInputLayer with properties:

                      Name: ''
                 InputSize: [28 28 3]
        SplitComplexInputs: 0

   Hyperparameters
          DataAugmentation: 'none'
             Normalization: 'zerocenter'
    NormalizationDimension: 'auto'
                      Mean: []

Include an image input layer in a Layer array.

layers = [
    imageInputLayer([28 28 1])
    convolution2dLayer(5,20)
    reluLayer
    maxPooling2dLayer(2,Stride=2)
    fullyConnectedLayer(10)
    softmaxLayer]

layers = 
  6×1 Layer array with layers:

     1   ''   Image Input       28×28×1 images with 'zerocenter' normalization
     2   ''   2-D Convolution   20 5×5 convolutions with stride [1  1] and padding [0  0  0  0]
     3   ''   ReLU              ReLU
     4   ''   2-D Max Pooling   2×2 max pooling with stride [2  2] and padding [0  0  0  0]
     5   ''   Fully Connected   Fully connected layer with output size 10
     6   ''   Softmax           Softmax
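
As a possible next step, a layer array like the one above can be assembled into a dlnetwork object and given a batch of formatted data. This is a sketch under stated assumptions: it relies on dlnetwork initializing the network by default, it omits the pooling layer for brevity, and the batch of four random single-precision images is illustrative only.

```matlab
% A layer array similar to the one above.
layers = [
    imageInputLayer([28 28 1])
    convolution2dLayer(5,20)
    reluLayer
    fullyConnectedLayer(10)
    softmaxLayer];

% Assemble the network (assumed to initialize learnable parameters by default).
net = dlnetwork(layers);

% Inspect the Mean of the input layer after initialization.
net.Layers(1).Mean

% Pass four random 28-by-28 grayscale images through the network.
X = dlarray(rand(28,28,1,4,"single"),"SSCB");
Y = predict(net,X);

% One score per class (10) for each image (4).
size(Y)
```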

Algorithms

Layer Output Formats

Most layers in a layer array or layer graph pass data to subsequent layers as formatted dlarray objects. The format of a dlarray object is a string of characters in which each character describes the corresponding dimension of the data. The format consists of one or more of these characters:

"S" — Spatial
"C" — Channel
"B" — Batch
"T" — Time
"U" — Unspecified

For example, you can describe 2-D image data that is represented as a 4-D array, where the first two dimensions correspond to the spatial dimensions of the images, the third dimension corresponds to the channels of the images, and the fourth dimension corresponds to the batch dimension, as having the format "SSCB" (spatial, spatial, channel, batch).

The input layer of a network specifies the layout of the data that the network expects. If you have data in a different layout, then specify the layout using the InputDataFormats training option.

The layer inputs h-by-w-by-c-by-N arrays into the network, where h, w, and c are the height, width, and number of channels of the images, respectively, and N is the number of images. Data in this layout has the data format "SSCB" (spatial, spatial, channel, batch).
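
A short sketch of the "SSCB" layout follows, using a batch of four random 28-by-28 RGB images. The sizes are arbitrary and the variable names are hypothetical.

```matlab
% Four illustrative 28-by-28 RGB images in an h-by-w-by-c-by-N array.
X = rand(28,28,3,4);

% Label the dimensions: spatial, spatial, channel, batch.
dlX = dlarray(X,"SSCB");

% Query the format and the size of the formatted array.
dims(dlX)
size(dlX)
```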

Complex Numbers

For complex-valued input to the neural network, when the SplitComplexInputs option is 0 (false), the layer passes complex-valued data to subsequent layers (since R2024a).

Before R2024a: To input complex-valued data into a neural network, the SplitComplexInputs option of the input layer must be 1 (true).

If the input data is complex-valued and the SplitComplexInputs option is 0 (false), then the Normalization option must be "zerocenter", "zscore", "none", or a function handle. The Mean property of the layer also supports complex-valued data for the "zerocenter" and "zscore" normalization options.

For an example showing how to train a network with complex-valued data, see Train Network with Complex-Valued Data.

Extended Capabilities

C/C++ Code Generation
Generate C and C++ code using MATLAB® Coder™.

Usage notes and limitations:

Code generation does not support passing dlarray objects with "U" (unspecified) dimensions to this layer.
Code generation does not support Normalization specified using a function handle.
Code generation does not support complex input and does not support the SplitComplexInputs option.

GPU Code Generation
Generate CUDA® code for NVIDIA® GPUs using GPU Coder™.

Refer to the usage notes and limitations in the C/C++ Code Generation section. The same usage notes and limitations apply to GPU code generation.

Version History

Introduced in R2016a

R2024a: Complex-valued outputs

For complex-valued input to the neural network, when the SplitComplexInputs option is 0 (false), the layer passes complex-valued data to subsequent layers.

If the input data is complex-valued and the SplitComplexInputs option is 0 (false), then the Normalization option must be "zerocenter", "zscore", "none", or a function handle. The Mean property of the layer also supports complex-valued data for the "zerocenter" and "zscore" normalization options.

R2019b: `AverageImage` property will be removed

AverageImage will be removed. Use Mean instead. To update your code, replace all instances of AverageImage with Mean. There are no differences between the properties that require additional updates to your code.

R2019b: `imageInputLayer` and `image3dInputLayer`, by default, use channel-wise normalization

Starting in R2019b, imageInputLayer and image3dInputLayer, by default, use channel-wise normalization. In previous versions, these layers use element-wise normalization. To reproduce this behavior, set the NormalizationDimension option of these layers to 'element'.
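
For instance, a layer that uses the earlier element-wise behavior could be created as follows. The 28-by-28 grayscale input size is an arbitrary choice for illustration.

```matlab
% Request element-wise normalization explicitly (the default before R2019b).
layer = imageInputLayer([28 28 1],NormalizationDimension="element");

layer.NormalizationDimension
```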

R2018a: `DataAugmentation` is not recommended

The DataAugmentation property is not recommended. To preprocess images with cropping, reflection, and other geometric transformations, use augmentedImageDatastore instead.
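
A minimal sketch of the recommended replacement follows, assuming in-memory image data. The random images, the three label classes, and the reflection and translation ranges are all illustrative assumptions.

```matlab
% Illustrative data: 20 random 28-by-28 grayscale images with random labels.
X = rand(28,28,1,20);
Y = categorical(randi(3,20,1));

% Random horizontal reflection and small translations (ranges are arbitrary).
augmenter = imageDataAugmenter( ...
    RandXReflection=true, ...
    RandXTranslation=[-3 3], ...
    RandYTranslation=[-3 3]);

% Datastore that applies the augmentation when data is read.
augimds = augmentedImageDatastore([28 28],X,Y,DataAugmentation=augmenter);

% The input layer itself needs no augmentation setting.
layer = imageInputLayer([28 28 1]);
```

Keeping augmentation in the datastore means the same layer definition can be reused for training with augmented data and for inference with unmodified data.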
