Residual Blocks in Deep Learning Models


Residual Block

Residual blocks are very useful in deep learning models because the model can effectively skip them when they are not needed: each block has a skip connection that passes the original input straight through to the output unchanged. They also help address the vanishing gradient problem in deep networks.


A residual block's architecture includes an input layer whose output is passed both to the next layer and, via the skip connection, around it, as shown in the figure above. That way, if the block is needed it is used; otherwise it is effectively skipped. The gradient through the block has two paths to follow: the first goes through the middle layer, and the second is the curved path with no layer in between. The gradient on the first path is modified based on the block's error, while the gradient on the second path remains unchanged. Where the two paths merge, they are added together.
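As a rough sketch of this idea (using plain NumPy rather than a deep learning framework; the weights and function names here are illustrative, not from any library), the forward pass of a one-layer residual block is just the layer's output plus the unchanged input:

```python
import numpy as np

def relu(x):
    return np.maximum(0.0, x)

def residual_block(x, W, b):
    """One-layer residual block: output = relu(x @ W + b) + x.

    The skip connection adds the unchanged input to the layer's output,
    so the gradient has two paths back to x: one through the layer's
    weights and one identity path that passes it through as-is.
    """
    return relu(x @ W + b) + x

x = np.array([[1.0, -2.0, 3.0]])
W = np.zeros((3, 3))   # zero weights simulate a "skipped" block
b = np.zeros(3)
out = residual_block(x, W, b)
print(np.allclose(out, x))  # True: the block passes x through untouched
```

With zero weights the layer contributes nothing and the block reduces to the identity, which is exactly the "skipped" behavior described above.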

This is the simplest architecture of a residual block, and you can use it anywhere in a model; when it sits in the middle of a model, its input layer is simply the previous layer. The figure above shows a residual block built from a single CNN layer, but you can build one from multiple layers of different types as well. You just have to be careful about the output dimensions of those layers: since the two paths merge by addition, they must have matching dimensions.
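When the block changes the dimension, a common workaround is to put a linear projection on the skip path so the two branches match before addition. A minimal NumPy sketch of a two-layer block with such a projection (all shapes and names here are illustrative assumptions, not from the original post):

```python
import numpy as np

rng = np.random.default_rng(0)

def residual_block_proj(x, W1, b1, W2, b2, W_skip=None):
    """Two-layer residual block with an optional linear projection on
    the skip path, so both branches have the same output dimension."""
    h = np.maximum(0.0, x @ W1 + b1)            # first layer + ReLU
    h = h @ W2 + b2                             # second layer
    skip = x if W_skip is None else x @ W_skip  # project input if shapes differ
    return np.maximum(0.0, h + skip)            # merge by addition, then activate

x = rng.standard_normal((2, 4))       # batch of 2, input dim 4
W1 = rng.standard_normal((4, 8)); b1 = np.zeros(8)
W2 = rng.standard_normal((8, 6)); b2 = np.zeros(6)
W_skip = rng.standard_normal((4, 6))  # maps input dim 4 -> output dim 6
out = residual_block_proj(x, W1, b1, W2, b2, W_skip)
print(out.shape)  # (2, 6)
```

Without `W_skip`, adding `h + x` here would fail, because the main path outputs dimension 6 while the input has dimension 4; the projection is what lets the two paths merge.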

