gelu
Description
The Gaussian error linear unit (GELU) activation operation weights the input by its probability under a Gaussian distribution.
This operation is given by

GELU(x) = (x/2) · (1 + erf(x/√2)),

where erf denotes the error function.
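As a minimal sketch, this definition can be written directly in MATLAB using erf. This is the bare formula rather than the library implementation, and the sample points are illustrative:

% Evaluate the GELU definition at a few sample points.
x = linspace(-3,3,7);
y = (x./2).*(1 + erf(x./sqrt(2)));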
Note

This function applies the GELU operation to dlarray data. If you want to apply the GELU activation within a layerGraph object or Layer array, use the following layer:

geluLayer
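For example, a GELU layer can appear in a Layer array like any other activation layer. The surrounding layers and sizes in this sketch are illustrative assumptions:

% Example Layer array using geluLayer as the activation.
layers = [
    featureInputLayer(10)
    fullyConnectedLayer(20)
    geluLayer
    fullyConnectedLayer(3)
    softmaxLayer];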
Examples
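Apply GELU Operation

A minimal sketch, assuming a formatted dlarray input; the array size and "SSCB" format labels are illustrative assumptions:

% Create random input data as a formatted dlarray with spatial (S),
% channel (C), and batch (B) dimensions.
X = dlarray(randn(28,28,3,16),"SSCB");

% Apply the GELU activation.
Y = gelu(X);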
Input Arguments

X — Input data, specified as a dlarray object.
Output Arguments

Y — GELU activations, returned as a dlarray object with the same underlying data type as the input X.
Algorithms
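The definition above is the exact, erf-based form of GELU. The paper cited below also proposes a faster tanh-based approximation, which some implementations expose as an option:

GELU(x) ≈ (x/2) · (1 + tanh(√(2/π) · (x + 0.044715·x³)))

As a sketch, this approximation can be written as an anonymous function in MATLAB:

% Tanh approximation of GELU from [1] (illustrative, not the library implementation).
geluApprox = @(x) (x./2).*(1 + tanh(sqrt(2/pi).*(x + 0.044715.*x.^3)));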
References
[1] Hendrycks, Dan, and Kevin Gimpel. "Gaussian error linear units (GELUs)." Preprint, submitted June 27, 2016. https://arxiv.org/abs/1606.08415
Extended Capabilities
Version History
Introduced in R2022b