📚
DLIP
  • Introduction
  • Prerequisite
  • Image Processing Basics
    • Notes
      • Thresholding
      • Spatial Filtering
      • Masking with Bitwise Operation
      • Model n Calibration
    • Tutorial
      • Tutorial: Install OpenCV C++
      • Tutorial: Create OpenCV Project
      • Tutorial: C++ basics
      • Tutorial: OpenCV Basics
      • Tutorial: Image Watch for Debugging
      • Tutorial: Spatial Filter
      • Tutorial: Thresholding and Morphology
      • Tutorial: Camera Calibration
      • Tutorial: Color Image Processing
      • Tutorial: Edge Line Circle Detection
      • Tutorial: Corner Detection and Optical Flow
      • Tutorial: OpenCV C++ Cheatsheet
      • Tutorial: Installation for Py OpenCV
      • Tutorial: OpenCv (Python) Basics
    • LAB
      • Lab Report Template
      • Lab Report Grading Criteria
      • LAB Report Instruction
      • LAB: Grayscale Image Segmentation
        • LAB: Grayscale Image Segmentation -Gear
        • LAB: Grayscale Image Segmentation - Bolt and Nut
      • LAB: Color Image Segmentation
        • LAB: Facial Temperature Measurement with IR images
        • LAB: Magic Cloak
      • LAB: Straight Lane Detection and Departure Warning
      • LAB: Dimension Measurement with 2D camera
      • LAB: Tension Detection of Rolling Metal Sheet
  • Deep Learning for Perception
    • Notes
      • Lane Detection with Deep Learning
      • Overview of Deep Learning
        • Object Detection
        • Deep Learning Basics: Introduction
        • Deep Learning State of the Art
        • CNN, Object Detection
      • Perceptron
      • Activation Function
      • Optimization
      • Convolution
      • CNN Overview
      • Evaluation Metric
      • LossFunction Regularization
      • Bias vs Variance
      • BottleNeck Unit
      • Object Detection
      • DL Techniques
        • Technical Strategy by A.Ng
    • Tutorial - PyTorch
      • Tutorial: Install PyTorch
      • Tutorial: Python Numpy
      • Tutorial: PyTorch Tutorial List
      • Tutorial: PyTorch Example Code
      • Tutorial: Tensorboard in Pytorch
      • Tutorial: YOLO in PyTorch
        • Tutorial: Yolov8 in PyTorch
        • Tutorial: Train Yolo v8 with custom dataset
          • Tutorial: Train Yolo v5 with custom dataset
        • Tutorial: Yolov5 in Pytorch (VS code)
        • Tutorial: Yolov3 in Keras
    • LAB
      • Assignment: CNN Classification
      • Assignment: Object Detection
      • LAB: CNN Object Detection 1
      • LAB: CNN Object Detection 2
      • LAB Grading Criteria
    • Tutorial- Keras
      • Train Dataset
      • Train custom dataset
      • Test model
      • LeNet-5 Tutorial
      • AlexNet Tutorial
      • VGG Tutorial
      • ResNet Tutorial
    • Resource
      • Online Lecture
      • Programming tutorial
      • Books
      • Hardware
      • Dataset
      • Useful sites
  • Must Read Papers
    • AlexNet
    • VGG
    • ResNet
    • R-CNN, Fast-RCNN, Faster-RCNN
    • YOLOv1-3
    • Inception
    • MobileNet
    • SSD
    • ShuffleNet
    • Recent Methods
  • DLIP Project
    • Report Template
    • DLIP 2021 Projects
      • Digital Door Lock Control with Face Recognition
      • People Counting with YOLOv4 and DeepSORT
      • Eye Blinking Detection Alarm
      • Helmet-Detection Using YOLO-V5
      • Mask Detection using YOLOv5
      • Parking Space Management
      • Vehicle, Pedestrian Detection with IR Image
      • Drum Playing Detection
      • Turtle neck measurement program using OpenPose
    • DLIP 2022 Projects
      • BakeryCashier
      • Virtual Mouse
      • Sudoku Program with Hand gesture
      • Exercise Posture Assistance System
      • People Counting Embedded System
      • Turtle neck measurement program using OpenPose
    • DLIP Past Projects
  • Installation Guide
    • Installation Guide for Pytorch
      • Installation Guide 2021
    • Anaconda
    • CUDA cuDNN
      • CUDA 10.2
    • OpenCV
      • OpenCV Install and Setup
        • OpenCV 3.4.13 with VS2019
        • OpenCV3.4.7 VS2017
        • MacOS OpenCV C++ in XCode
      • Python OpenCV
      • MATLAB-OpenCV
    • Framework
      • Keras
      • TensorFlow
        • Cheat Sheet
        • Tutorial
      • PyTorch
    • IDE
      • Visual Studio Community
      • Google Codelab
      • Visual Studio Code
        • Python with VS Code
        • Notebook with VS Code
        • C++ with VS Code
      • Jupyter Notebook
        • Install
        • How to use
    • Ubuntu
      • Ubuntu 18.04 Installation
      • Ubuntu Installation using Docker in Win10
      • Ubuntu Troubleshooting
    • ROS
  • Programming
    • Python_Numpy
      • Python Tutorial - Tips
      • Python Tutorial - For Loop
      • Python Tutorial - List Tuple, Dic, Set
    • Markdown
      • Example: API documentation
    • Github
      • Create account
      • Tutorial: Github basic
      • Tutorial: Github Desktop
    • Keras
      • Tutorial Keras
      • Cheat Sheet
    • PyTorch
      • Cheat Sheet
      • Autograd in PyTorch
      • Simple ConvNet
      • MNIST using LeNet
      • Train ConvNet using CIFAR10
  • Resources
    • Useful Resources
    • Github
Powered by GitBook
On this page
  • Background
  • Residual Block
  • Blocks
  • Variant of Residual Block
  • ResNeXt
  • Densely Connected CNN

Was this helpful?

  1. Must Read Papers

ResNet

PreviousVGGNextR-CNN, Fast-RCNN, Faster-RCNN

Last updated 3 years ago

Was this helpful?

Background

Since AlexNet, the state-of-the-art CNN architecture is going deeper and deeper. While AlexNet had only 5 convolutional layers, the VGG network [3] and GoogleNet (also codenamed Inception_v1) [4] had 19 and 22 layers respectively.

Deep networks are hard to train because of the notorious vanishing gradient problem — as the gradient is back-propagated to earlier layers, repeated multiplication may make the gradient infinitively small.

ResNets solve is the famous known vanishing gradient. With ResNets, the gradients can flow directly through the skip connections backwards from later layers to initial filters.

The core idea of ResNet is introducing a so-called “identity shortcut connection” that skips one or more layers

Residual Block

The stacked layers fit a residual mapping is easier than letting them directly fit the desired underlaying mapping. With ResNets, the gradients can flow directly through the skip connections backwards from later layers to initial filters.

Blocks

Down sampling of the volume though the network is achieved by increasing the stride instead of a pooling operation.

Since the volume got modified we need to apply one of our down sampling strategies. The 1x1 convolution approach is shown.

Variant of Residual Block

There are several new architectures based on ResNet over years

Pre-activation variant of residual block [7]

K. He, X. Zhang, S. Ren, and J. Sun. Identity Mappings in Deep Residual Networks. arXiv preprint arXiv:1603.05027v3,2016.

The gradients can flow through the shortcut connections to any other earlier layer unimpededly.

Performance: 1202-layer ResNet <110-layer Full Pre-activation ResNet

ResNeXt

S. Xie, R. Girshick, P. Dollar, Z. Tu and K. He. Aggregated Residual Transformations for Deep Neural Networks. arXiv preprint arXiv:1611.05431v1,2016.

Xie et al. [8] proposed a variant of ResNet that is codenamed ResNeXt with the following building block:

This may look familiar to you as it is very similar to the Inception module of [4], they both follow the split-transform-merge paradigm, except in this variant, the outputs of different paths are merged by adding them together, while in [4] they are depth-concatenated. Another difference is that in [4], each path is different (1x1, 3x3 and 5x5 convolution) from each other, while in this architecture, all paths share the same topology.

This novel building block has three equivalent form as follows:

In practice, the “split-transform-merge” is usually done by pointwise grouped convolutional layer, which divides its input into groups of feature maps and perform normal convolution respectively, their outputs are depth-concatenated and then fed to a 1x1 convolutional layer.

Densely Connected CNN

G. Huang, Z. Liu, K. Q. Weinberger and L. Maaten. Densely Connected Convolutional Networks. arXiv:1608.06993v3,2016

Huang et al. [9] proposed a novel architecture called DenseNet that further exploits the effects of shortcut connections — it connects all layers directly with each other. In this novel architecture, the input of each layer consists of the feature maps of all earlier layer, and its output is passed to each subsequent layer. The feature maps are aggregated with depth-concatenation.

Other than tackling the vanishing gradients problem, the authors of [8] argue that this architecture also encourages feature reuse, making the network highly parameter-efficient. One simple interpretation of this is that, in [2][7], the output of the identity mapping was added to the next block, which might impede information flow if the feature maps of two layers have very different distributions. Therefore, concatenating feature maps can preserve them all and increase the variance of the outputs, encouraging feature reuse

Layer 2, Block 1
Imaleft: a building block of [2], right: a building block of ResNeXt with cardinality = 32ge for post
Image for pthree equivalent formost
Image for post
Image for post
Layer 1, block 1
Layer 1
Layer 2
variants of residual blocks