Derivative of the Sigmoid Function
For the sigmoid function [Tex]\sigma(x)= \frac{1}{1+e^{-x}}[/Tex], its derivative is given as:
[Tex]\sigma'(x) = \sigma(x)\left(1 - \sigma(x)\right)[/Tex]
Let’s derive the derivative of the sigmoid function step by step:
Let [Tex]y = \sigma(x)= \frac{1}{1+e^{-x}}[/Tex]
Let [Tex]u = 1 + e^{-x}[/Tex]. Thus, [Tex]y = \frac{1}{u}[/Tex].
First, find the derivative of u with respect to x:
[Tex]\frac{du}{dx} = -e^{-x}[/Tex]
Then, find the derivative of y with respect to u:
[Tex]\frac{dy}{du} = -\frac{1}{u^2}[/Tex]
Apply the chain rule:
[Tex]\frac{dy}{dx} = \frac{dy}{du} \cdot \frac{du}{dx} = -\frac{1}{u^2} \cdot (-e^{-x}) = \frac{e^{-x}}{(1 + e^{-x})^2}[/Tex]
Since [Tex]\sigma(x) = \frac{1}{1 + e^{-x}}[/Tex], we also have [Tex]1 - \sigma(x) = \frac{e^{-x}}{1 + e^{-x}}[/Tex].
Thus, [Tex]\sigma'(x) = \frac{e^{-x}}{(1 + e^{-x})^2} = \left( \frac{1}{1 + e^{-x}} \right) \left( \frac{e^{-x}}{1 + e^{-x}} \right) = \sigma(x) \left( 1 - \sigma(x) \right)[/Tex]
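As a quick sanity check of this result, here is a minimal sketch in Python (assuming NumPy as the only dependency; the helper names are illustrative) that compares the closed-form derivative σ(x)(1 - σ(x)) against a central-difference estimate:

```python
import numpy as np

def sigmoid(x):
    # sigma(x) = 1 / (1 + e^(-x))
    return 1.0 / (1.0 + np.exp(-x))

def sigmoid_derivative(x):
    # Closed form derived above: sigma'(x) = sigma(x) * (1 - sigma(x))
    s = sigmoid(x)
    return s * (1.0 - s)

x = np.array([-2.0, 0.0, 3.0])
h = 1e-6  # small step for a central-difference estimate

analytic = sigmoid_derivative(x)
numeric = (sigmoid(x + h) - sigmoid(x - h)) / (2.0 * h)

print(analytic)                        # sigma'(0) = 0.25, for example
print(np.allclose(analytic, numeric))  # True: the two estimates agree closely
```

The numerical gradient matches the analytic formula to within floating-point tolerance, which is exactly what the derivation above predicts.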
The sigmoid function is one of the most commonly used activation functions in machine learning and deep learning. It can be used in the hidden layers, where it takes the output from the previous layer and maps it to a value between 0 and 1. When working with neural networks, it is necessary to calculate the derivative of the activation function.
The sigmoid function is also known as the squashing function, since it takes the input from the previous hidden layer and squeezes it between 0 and 1. Any value fed to the sigmoid function therefore returns an output strictly between 0 and 1, no matter how large or small the input is, as the sketch below illustrates.
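To make this squashing behaviour concrete, the following minimal sketch (again assuming NumPy; names are illustrative) feeds inputs of very different magnitudes through the sigmoid:

```python
import numpy as np

def sigmoid(x):
    # Squashes any real-valued input into the open interval (0, 1)
    return 1.0 / (1.0 + np.exp(-x))

# Large-magnitude inputs are still mapped strictly between 0 and 1
inputs = np.array([-20.0, -5.0, 0.0, 5.0, 20.0])
print(sigmoid(inputs))
# -> approximately [2.1e-09, 6.7e-03, 5.0e-01, 9.9e-01, 1.0e+00]
```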
Table of Content
- What is the Sigmoid Function?
- Mathematical Definition of Sigmoid Function
- Properties of the Sigmoid Function
- Derivative of the Sigmoid Function
- Applications of Sigmoid Function
- FAQs