e3x.nn.activations.gelu

e3x.nn.activations.gelu(x, approximate=True)

Gaussian error linear unit activation function.

Computes the gated linear activation. If approximate=False, the \(\mathrm{gate}\) function is given by:

\[\mathrm{gate}(x) = \frac{1}{2} \left(1 + \mathrm{erf} \left( \frac{x}{\sqrt{2}} \right) \right)\]

For scalar inputs, the activation reduces to \(x \cdot \mathrm{gate}(x)\), that is:

\[\mathrm{gelu}(x) = \frac{x}{2} \left(1 + \mathrm{erf} \left( \frac{x}{\sqrt{2}} \right) \right)\]
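As a rough illustration of the exact form, the following is a minimal sketch for plain Python floats using only the standard library. The names `exact_gate` and `exact_gelu` are hypothetical helpers introduced here; they are not part of e3x and do not handle feature arrays.

```python
import math


def exact_gate(x: float) -> float:
    """Gate for approximate=False: the standard normal CDF evaluated at x."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def exact_gelu(x: float) -> float:
    """Scalar GELU: the input multiplied by its own gate."""
    return x * exact_gate(x)
```

Because the gate tends to 1 for large positive inputs and to 0 for large negative inputs, `exact_gelu(x)` approaches `x` on the right and 0 on the left.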

If approximate=True, the \(\mathrm{gate}\) function is approximated as:

\[\mathrm{gate}(x) = \frac{1}{2} \left(1 + \mathrm{tanh} \left( \sqrt{\frac{2}{\pi}} \left(x + 0.044715 x^3 \right) \right) \right)\]

For scalar inputs, the activation again reduces to \(x \cdot \mathrm{gate}(x)\), that is:

\[\mathrm{gelu}(x) = \frac{x}{2} \left(1 + \mathrm{tanh} \left( \sqrt{\frac{2}{\pi}} \left(x + 0.044715 x^3 \right) \right) \right)\]
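To get a feel for how close the tanh form is to the erf form, the scalar formulas can be compared directly. This is only a sketch in plain Python with the standard library; `tanh_gate`, `tanh_gelu`, `erf_gelu`, and the sample grid are hypothetical choices made for this example and are not part of e3x.

```python
import math


def tanh_gate(x: float) -> float:
    """Gate for approximate=True, written out from the tanh formula."""
    inner = math.sqrt(2.0 / math.pi) * (x + 0.044715 * x**3)
    return 0.5 * (1.0 + math.tanh(inner))


def tanh_gelu(x: float) -> float:
    """Scalar GELU using the tanh-approximated gate."""
    return x * tanh_gate(x)


def erf_gelu(x: float) -> float:
    """Scalar GELU using the exact erf-based gate, for comparison."""
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))


# Largest absolute difference on an evenly spaced grid over [-5, 5].
grid = [i / 10 for i in range(-50, 51)]
max_diff = max(abs(tanh_gelu(x) - erf_gelu(x)) for x in grid)
print(f"max |tanh_gelu - erf_gelu| on grid: {max_diff:.2e}")
```

On this grid the two variants agree closely, which is why the approximation is often acceptable when evaluating `erf` is comparatively expensive.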

For more information, see section 2 of the paper "Gaussian Error Linear Units (GELUs)".

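Parameters:

x – The input features.

approximate (bool) – If True (the default), use the tanh approximation of the gate; if False, use the exact erf-based gate.

Return type: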

Union[Float[Array, '... 1 (max_degree+1)**2 num_features'], Float[Array, '... 2 (max_degree+1)**2 num_features']]

Returns:

The result of applying the nonlinearity to the input features.