ResNet (Residual Network) is a groundbreaking deep neural network architecture introduced by Microsoft Research in 2015. It was designed to address the vanishing gradient problem and enable the training of very deep networks, which were previously difficult to optimize effectively.
Key Concepts in ResNet
Deep Networks and the Vanishing Gradient Problem:
- As neural networks become deeper, the gradients during backpropagation tend to diminish, making it challenging to update the weights of earlier layers.
- This can lead to a network where additional layers degrade performance rather than improve it (called the degradation problem).
Residual Learning:
- ResNet introduced a concept called residual connections (or skip connections).
- Instead of learning the desired underlying mapping H(x) directly, each block learns the residual function F(x) = H(x) − x, reformulating the problem as H(x) = F(x) + x.
- The residual connection adds the input x directly to the output of a block, so the stacked layers only need to learn the residual F(x); if the identity mapping is already optimal, the layers can simply drive F(x) toward zero.
Residual Block:
A residual block is the fundamental building unit of ResNet. It consists of:
- Two or three convolutional layers.
- A skip connection that bypasses these layers and adds the input to the output.
- Batch normalization (BN) and ReLU activation, applied after each convolution in the original design (pre-activation variants apply them before).
Mathematically:
y = F(x, {W_i}) + x
Where F(x, {W_i}) represents the block's convolutional operations with weights {W_i}.
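The residual computation above can be sketched in a few lines of NumPy. For brevity this uses fully connected layers in place of convolutions (the shapes and weight names here are illustrative, not from the original paper); the essential point is the `+ x` skip connection:

```python
import numpy as np

def relu(z):
    return np.maximum(z, 0.0)

def residual_block(x, W1, W2):
    """y = relu(F(x, {W1, W2}) + x), with F(x) = W2 @ relu(W1 @ x)."""
    f = relu(W1 @ x)   # first layer + activation
    f = W2 @ f         # second layer (activation applied after the addition)
    return relu(f + x) # skip connection: add the input, then activate

rng = np.random.default_rng(0)
d = 8
x = rng.standard_normal(d)
W1 = rng.standard_normal((d, d)) * 0.1
W2 = rng.standard_normal((d, d)) * 0.1

y = residual_block(x, W1, W2)

# With zero weights, F(x) = 0 and the block reduces to the identity
# (up to the final ReLU) -- the property that makes deep stacks trainable:
y_id = residual_block(x, np.zeros((d, d)), np.zeros((d, d)))
assert np.allclose(y_id, relu(x))
```

Note how the block degrades gracefully: untrained or near-zero layers pass the signal through almost unchanged instead of destroying it.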
Bottleneck Architecture:
- For deeper versions of ResNet (e.g., ResNet-50, ResNet-101), a bottleneck block is used to reduce computational cost:
- First, reduce the dimensionality of the input with a 1×1 convolution.
- Apply a 3×3 convolution for feature extraction.
- Restore the dimensionality with another 1×1 convolution.
ResNet Architectures
ResNet comes in various depths, commonly referred to by the number of layers:
- ResNet-18: 18 layers (basic blocks).
- ResNet-34: 34 layers (basic blocks).
- ResNet-50: 50 layers (bottleneck blocks).
- ResNet-101: 101 layers (bottleneck blocks).
- ResNet-152: 152 layers (bottleneck blocks).
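The layer counts follow from simple arithmetic: each architecture stacks blocks in four stages, plus the initial 7×7 convolution and the final fully connected layer. A quick check, using the stage configurations published in the ResNet paper:

```python
# Blocks per stage and convolutions per block for each standard ResNet.
configs = {
    "ResNet-18":  ([2, 2, 2, 2], 2),   # basic blocks: two 3x3 convs
    "ResNet-34":  ([3, 4, 6, 3], 2),
    "ResNet-50":  ([3, 4, 6, 3], 3),   # bottleneck blocks: 1x1, 3x3, 1x1
    "ResNet-101": ([3, 4, 23, 3], 3),
    "ResNet-152": ([3, 8, 36, 3], 3),
}

# Depth = stem conv + weighted layers in all blocks + final FC layer.
depths = {name: 1 + sum(stages) * convs + 1
          for name, (stages, convs) in configs.items()}

for name, depth in depths.items():
    print(name, depth)
```

Each name checks out: for example, ResNet-50 gives 1 + (3+4+6+3)×3 + 1 = 50.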
Basic Block (used in ResNet-18, ResNet-34):
- Two 3×3 convolutions with a skip connection.
Bottleneck Block (used in ResNet-50, ResNet-101, ResNet-152):
- A 1×1 convolution for dimensionality reduction.
- A 3×3 convolution for feature extraction.
- A 1×1 convolution to restore dimensionality.
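The computational saving of the bottleneck design is easy to quantify. The sketch below compares weight counts for a block operating on 256 channels with a 64-channel bottleneck (the standard ResNet-50 stage-2 shapes); biases and BN parameters are ignored for simplicity:

```python
def conv_params(k, c_in, c_out):
    """Weight count of a k x k convolution (ignoring bias)."""
    return k * k * c_in * c_out

# Hypothetical basic block at full width: two 3x3 convs on 256 channels.
basic = 2 * conv_params(3, 256, 256)

# Bottleneck block: 1x1 down to 64, 3x3 at 64, 1x1 back up to 256.
bottleneck = (conv_params(1, 256, 64)
              + conv_params(3, 64, 64)
              + conv_params(1, 64, 256))

print(basic, bottleneck, round(basic / bottleneck, 1))
```

The 3×3 convolution, the expensive part, runs on 64 channels instead of 256, so the bottleneck block uses roughly a sixteenth of the weights of a same-width basic block.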
Strengths of ResNet
- Enabling Very Deep Networks:
- Networks with hundreds or even thousands of layers can be trained effectively.
- Improved Gradient Flow:
- Residual connections ensure that gradients flow directly through the skip paths during backpropagation, reducing the vanishing gradient problem.
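The effect on gradient flow can be illustrated with a toy calculation. Suppose each layer on its own scales gradients by a factor of 0.5 during backpropagation (a made-up number purely for illustration). In a plain chain the factors multiply and vanish; with skip connections each block contributes (0.5 + 1), because d/dx [F(x) + x] = F'(x) + 1, and the identity path always passes the gradient through:

```python
n_layers = 20
layer_grad = 0.5  # assumed per-layer gradient scale (illustrative)

plain = layer_grad ** n_layers             # plain deep chain: factors multiply
residual = (layer_grad + 1.0) ** n_layers  # residual chain: +1 from the skip path

print(f"plain:    {plain:.3e}")    # shrinks toward zero
print(f"residual: {residual:.3e}") # cannot vanish: every factor is > 1

assert plain < 1e-5 < residual
```

The toy model overstates the residual case (real gradients do not simply compound like this), but it captures the key mechanism: the additive identity term guarantees a direct gradient path that cannot be multiplied away.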
- High Accuracy:
- ResNet achieved top results on benchmarks like ImageNet and COCO.
Limitations of ResNet
- Computational Cost:
- Deeper models like ResNet-152 are computationally expensive.
- Inefficiency for Small Networks:
- For small tasks, the residual connections might not provide significant benefits.
Applications of ResNet
- Image Classification:
- Won the ImageNet challenge in 2015.
- Object Detection:
- Backbone for models like Faster R-CNN, Mask R-CNN.
- Semantic Segmentation:
- Used in models like DeepLab.
Variants of ResNet
- ResNeXt:
- Uses grouped convolutions for better accuracy-efficiency trade-off.
- Wide ResNet:
- Increases the width (number of channels) of layers instead of depth, reaching comparable accuracy with far fewer layers.
- ResNet-D:
- Modifies the downsampling path (average pooling before the 1×1 shortcut convolution) so less information is discarded, improving classification and detection accuracy.