Activation Functions

Activation functions introduce nonlinearity into a model. You can think of nonlinearity as the ability to fit curves: without it, no matter how many layers the model has, it can only learn straight-line (linear) relationships between its inputs and outputs.
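
To see why stacked layers need a nonlinearity, here is a small sketch in plain Python. The weights and biases (2, 1, 3, -4) are made-up values for illustration only. Two linear layers in a row behave exactly like one linear layer, while putting a ReLU between them changes the result.

```python
def linear(x, w, b):
    """A one-input 'layer': a straight line with slope w and offset b."""
    return w * x + b

def two_linear_layers(x):
    # Layer 1 computes 2x + 1, layer 2 computes 3h - 4 (hypothetical weights).
    return linear(linear(x, 2.0, 1.0), 3.0, -4.0)

def collapsed(x):
    # The same mapping as a single layer: 3 * (2x + 1) - 4 = 6x - 1.
    return linear(x, 6.0, -1.0)

def with_relu(x):
    # Same two layers, but with max(0, h) applied between them.
    h = max(0.0, linear(x, 2.0, 1.0))
    return linear(h, 3.0, -4.0)

for x in (-2.0, 0.0, 1.5):
    print(x, two_linear_layers(x), collapsed(x), with_relu(x))
```

For every input the first two columns match, so the second linear layer added nothing new. The ReLU version differs for x = -2.0, which is where the "curve" comes from.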




  • ReLU

The ReLU (rectified linear unit) activation function can be used almost anywhere in a model. Use it after the final layer only if your output must be a non-negative number. ReLU returns zero if the input is negative, and if the input is positive it passes it through unchanged.

The main disadvantage of ReLU is that its output, and therefore its gradient, is zero whenever the input is negative, so a neuron that only receives negative inputs stops learning.

ReLU Range: 0 to +inf
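
As a minimal sketch in plain Python (the function names `relu` and `relu_grad` are just illustrative), ReLU and its gradient can be written like this:

```python
def relu(x):
    """Return 0 for negative inputs and the input itself otherwise."""
    return max(0.0, x)

def relu_grad(x):
    """Slope of ReLU: 1 for positive inputs, 0 for negative inputs.

    The slope at exactly 0 is undefined; this sketch uses 0 there.
    """
    return 1.0 if x > 0 else 0.0

for x in (-3.0, 0.0, 2.5):
    print(x, relu(x), relu_grad(x))
```

Note that for x = -3.0 both the output and the gradient are zero, which is the disadvantage described above.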


  • LeakyReLU

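The Leaky ReLU activation function is used in the intermediate layers of deep learning models. Instead of returning zero for negative inputs, it multiplies them by a small slope (for example 0.01), so the gradient never becomes exactly zero.

Leaky ReLU Range: -inf to +inf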
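
Here is a small sketch of Leaky ReLU in plain Python. The slope `alpha=0.01` is an assumption for this example; it is a common default, but it is a hyperparameter you can change.

```python
def leaky_relu(x, alpha=0.01):
    """Pass positive inputs through; scale negative inputs by alpha."""
    return x if x > 0 else alpha * x

def leaky_relu_grad(x, alpha=0.01):
    """Slope is 1 for positive inputs and alpha (not 0) for negative ones."""
    return 1.0 if x > 0 else alpha

for x in (-100.0, -1.0, 3.0):
    print(x, leaky_relu(x), leaky_relu_grad(x))
```

Because negative inputs keep a small nonzero slope, the output can be any real number and the neuron can still receive a gradient.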


Some other commonly used activation functions contribute to the vanishing gradient problem because they saturate: they squash large positive inputs to values close to 1 and large negative inputs to values close to 0 (sigmoid) or -1 (tanh). In these flat regions the slope is close to zero, so if you use such an activation function many times in a model, the gradient gets smaller at every layer as it is propagated backward.
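
The shrinking effect can be sketched with the chain rule. The example below is a simplification that assumes a chain of sigmoid activations with no weights in between, which is not how a real network looks, but it shows how the per-layer slopes multiply. The helper name `chained_grad` is made up for this illustration.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def sigmoid_grad(x):
    # Derivative of sigmoid: s * (1 - s). Its largest value is 0.25, at x = 0.
    s = sigmoid(x)
    return s * (1.0 - s)

def chained_grad(x, depth):
    """Gradient of sigmoid applied `depth` times in a row (chain rule)."""
    grad = 1.0
    for _ in range(depth):
        grad *= sigmoid_grad(x)  # multiply by this layer's slope
        x = sigmoid(x)           # feed the output into the next layer
    return grad

for depth in (1, 5, 10):
    print(depth, chained_grad(0.0, depth))
```

Each layer multiplies the gradient by at most 0.25, so after 10 layers it can be no larger than 0.25 ** 10, which is below one in a million.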


  • Sigmoid

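Sigmoid is commonly used after the last layer of a deep learning model, for example to output a probability in binary classification. It is usually avoided in the intermediate layers because it is one of the activation functions that contribute to the vanishing gradient.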

Sigmoid Range: 0 to 1
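
A sketch of sigmoid as a final-layer activation in plain Python. The two-branch form is one common way to avoid overflow in `math.exp` for very negative inputs; the raw scores and the 0.5 decision threshold are assumptions chosen for this example.

```python
import math

def sigmoid(x):
    """Map any real number to the range 0 to 1 without overflowing."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)  # safe: x is negative, so exp(x) cannot overflow
    return z / (1.0 + z)

# Hypothetical raw outputs of a final layer.
for score in (-4.0, 0.0, 4.0):
    prob = sigmoid(score)
    label = "positive" if prob >= 0.5 else "negative"
    print(score, round(prob, 4), label)
```

A score of 0 maps to exactly 0.5, large positive scores approach 1, and large negative scores approach 0.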


  • Tanh

The tanh function is mostly used in the intermediate layers of a model. Like sigmoid, it saturates for large inputs, but its output is centered on zero.

Tanh Range: -1 to +1
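
A quick sketch using `math.tanh` from the Python standard library. The helper `tanh_grad` is an illustrative name; it uses the identity that the derivative of tanh(x) is 1 - tanh(x)^2.

```python
import math

def tanh_grad(x):
    """Slope of tanh: 1 at x = 0, shrinking toward 0 as |x| grows."""
    t = math.tanh(x)
    return 1.0 - t * t

for x in (-5.0, -1.0, 0.0, 1.0, 5.0):
    print(x, math.tanh(x), tanh_grad(x))
```

The outputs stay between -1 and +1 and are symmetric around zero, while the slope at x = 5.0 is already very small.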


  • Softmax

The softmax function is used when the model has to choose between multiple classes, for example whether a picture contains a human, a dog, or a car; in this case we have three different classes. Softmax assumes the classes are mutually exclusive, so a single image can only contain one of these three.

Softmax Range: 0 to 1

In the above example softmax will output 3 numbers, such as [0.2, 0.3, 0.5], and the sum of these numbers will be equal to 1. You can interpret this as a 20% chance it is a human, a 30% chance it is a dog, and a 50% chance it is a car.
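
Here is a minimal softmax sketch in plain Python. The raw scores (`logits`) are hypothetical and were chosen on purpose so that the result matches the [0.2, 0.3, 0.5] example; a real model would produce its own scores. Subtracting the maximum before exponentiating is a common trick to avoid overflow and does not change the result.

```python
import math

def softmax(logits):
    """Turn a list of raw scores into probabilities that sum to 1."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]  # shift by max for stability
    total = sum(exps)
    return [e / total for e in exps]

labels = ["Human", "Dog", "Car"]
# Hypothetical scores: the logs of the target probabilities.
logits = [math.log(0.2), math.log(0.3), math.log(0.5)]
probs = softmax(logits)

for label, p in zip(labels, probs):
    print(label, round(p, 3))
print("Predicted:", labels[probs.index(max(probs))])
```

The class with the highest probability, here Car, is taken as the prediction.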
