Intro. 3D Object Detection


Last updated 3 years ago


Traditional Pipeline

Background (ground) removal → spatiotemporal clustering → classification
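The three stages can be sketched in plain Python on a single frame. This is only an illustration under simplifying assumptions: the ground is flat (a height threshold instead of plane fitting), clustering is purely spatial (no temporal tracking), and the classifier is a made-up size rule. The function names, thresholds, and the `scan` data are hypothetical.

```python
from collections import deque
from math import dist

def remove_ground(points, z_ground=0.2):
    """Keep points above an assumed flat ground height (metres)."""
    return [p for p in points if p[2] > z_ground]

def euclidean_clusters(points, radius=0.7, min_points=2):
    """Group points chained by neighbours within `radius` (naive O(n^2) BFS)."""
    unvisited = set(range(len(points)))
    clusters = []
    while unvisited:
        seed = unvisited.pop()
        queue, members = deque([seed]), [seed]
        while queue:
            i = queue.popleft()
            near = [j for j in unvisited if dist(points[i], points[j]) <= radius]
            for j in near:
                unvisited.remove(j)
                queue.append(j)
                members.append(j)
        if len(members) >= min_points:
            clusters.append([points[i] for i in sorted(members)])
    return clusters

def classify(cluster):
    """Toy rule on the horizontal extent; the 1.5 m threshold is made up."""
    xs = [p[0] for p in cluster]
    ys = [p[1] for p in cluster]
    footprint = max(max(xs) - min(xs), max(ys) - min(ys))
    return "vehicle" if footprint > 1.5 else "pedestrian"

scan = [
    (0.0, 0.0, 0.0), (5.0, 5.0, 0.1),                      # ground returns
    (2.0, 0.0, 0.5), (2.2, 0.1, 1.0), (2.1, 0.0, 1.5),     # narrow, tall object
    (10.0, 0.0, 0.5), (10.4, 0.0, 0.6), (10.8, 0.0, 0.7),  # long object
    (11.2, 0.0, 0.8), (11.6, 0.0, 0.6), (12.0, 0.0, 0.5),
]
objects = euclidean_clusters(remove_ground(scan))
labels = sorted(classify(c) for c in objects)
print(labels)
```

In practice each stage would be replaced by something more robust, for example plane fitting for the ground and a learned classifier on cluster features.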

Comparison to 2D object detection CNN

There are two key differences: 1) the point cloud is a sparse representation, while an image is dense and 2) the point cloud is 3D, while the image is 2D.

3D data is crucial for self-driving cars, autonomous robots, and virtual and augmented reality. Unlike 2D images, which are represented as pixel arrays, 3D data can be represented as a polygonal mesh, a volumetric pixel grid, a point cloud, etc.

Deep Learning based 3D Object Detection

Benchmark network

LiDAR only

  • PointNet (2017)

  • VoxelNet (2018)

  • SECOND (2018)

  • PointPillars (2019)

LiDAR+Vision

  • ContFuse (2018)

  • Frustum PointNet

  • MV3D (2017)

  • AVOD (2018)

  • PIXOR++ (2018)

Taxonomy for 3D Object Detection Solutions

Point-cloud based

  • Projection: project the point cloud onto an image plane (front view or bird’s eye view, BEV).

  • Volumetric: encode the point cloud into a volumetric voxel grid before processing it.

  • PointNet: use a PointNet-style architecture.

Fusion based

  • Combine two or more sensor inputs to improve the overall performance of 3DOD.

  • Early fusion, late fusion, deep fusion

BEV based

Recent methods tend to view the LiDAR point cloud from a bird’s eye view (BEV, 2D).

  • MV3D, AVOD, PIXOR, Complex-YOLO, PointPillars

  • BEV preserves the object scales

  • Convolutions in BEV preserve the local range information

However, the bird’s eye view tends to be extremely sparse, which makes direct application of convolutional neural networks impractical and inefficient.

These methods typically partition the ground plane into a grid of about 10 cm x 10 cm cells and encode features for each grid cell.

  • Should the features be encoded manually? How should feature extraction be done?
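To make the grid discretization and the sparsity concrete, here is a minimal sketch in plain Python. It bins points into 10 cm BEV cells, computes two hand-crafted features per occupied cell (point count and maximum height), and reports the fraction of occupied cells. The helper names, the sample points, and the 10 m x 10 m area are assumptions for illustration.

```python
from collections import defaultdict

def bev_cells(points, cell=0.1):
    """Map each point to an integer (row, col) BEV cell of side `cell` metres."""
    grid = defaultdict(list)
    for x, y, z in points:
        grid[(int(x // cell), int(y // cell))].append((x, y, z))
    return grid

def encode(grid):
    """Hand-crafted per-cell features: (point count, maximum height)."""
    return {key: (len(pts), max(p[2] for p in pts)) for key, pts in grid.items()}

points = [(0.02, 0.03, 0.5), (0.07, 0.08, 1.2), (1.23, 4.56, 0.3), (7.77, 2.22, 0.9)]
grid = bev_cells(points)
features = encode(grid)

total_cells = 100 * 100               # a 10 m x 10 m area at 10 cm resolution
occupancy = len(grid) / total_cells   # fraction of cells holding any point
print(features, occupancy)
```

Only the occupied cells are stored here. A dense convolution would still visit every cell, which is one way to see why sparsity hurts efficiency.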


3D Voxel based

  • VoxelNet, Vote3Deep, Point RCNN

VoxelNet

One of the first methods to truly do end-to-end learning in this domain.

VoxelNet divides the space into voxels, applies a PointNet to each voxel, followed by a 3D convolutional middle layer to consolidate the vertical axis, after which a 2D convolutional detection architecture is applied.

  • Slow (about 4 Hz) because of the 3D convolutions
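The voxel partition and per-voxel encoding step can be sketched as follows. This is a simplified stand-in, not VoxelNet's implementation: points are grouped by voxel index with a cap per voxel (the first points are kept rather than sampled randomly), each point is augmented with its offset to the voxel centroid, and a max over the points replaces the learned layers. The voxel size, the cap, and the sample points are illustrative assumptions.

```python
from collections import defaultdict

def voxelize(points, voxel_size=(0.2, 0.2, 0.4), max_points=35):
    """Group points by integer voxel index, keeping at most `max_points` each."""
    voxels = defaultdict(list)
    for p in points:
        key = tuple(int(c // s) for c, s in zip(p, voxel_size))
        if len(voxels[key]) < max_points:
            voxels[key].append(p)
    return voxels

def voxel_feature(pts):
    """Augment points with offsets to the voxel centroid, then max-pool."""
    n = len(pts)
    centroid = [sum(p[i] for p in pts) / n for i in range(3)]
    augmented = [list(p) + [p[i] - centroid[i] for i in range(3)] for p in pts]
    return [max(column) for column in zip(*augmented)]

points = [(0.05, 0.05, 0.1), (0.15, 0.05, 0.3), (0.1, 0.1, 0.2), (1.1, 1.1, 1.0)]
voxels = voxelize(points, max_points=2)       # small cap to show truncation
feature = voxel_feature(voxels[(0, 0, 0)])    # 6 values: xyz max + offset max
print(len(voxels), feature)
```

The resulting per-voxel features form a sparse 3D grid, and processing that grid with 3D convolutions is the expensive part.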

SECOND

Improves on VoxelNet and reduces inference time, but the 3D convolutions remain a bottleneck.

Fusion based

Frustum PointNet

Uses PointNets to segment and classify the point cloud in a frustum generated from projecting a detection on an image into 3D. It achieved high benchmark performance compared to other fusion methods, but its multi-stage design makes end-to-end learning impractical.
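The frustum step can be illustrated with a pinhole camera model. The sketch below assumes the points are already expressed in the camera frame (Z pointing forward) and that the intrinsics `fx`, `fy`, `cx`, `cy` are known. It keeps every point whose projection falls inside a 2D detection box. The function name, intrinsics, and box values are hypothetical.

```python
def points_in_frustum(points_cam, box2d, fx, fy, cx, cy):
    """Return points (camera frame, Z forward) that project inside `box2d`."""
    u_min, v_min, u_max, v_max = box2d
    selected = []
    for X, Y, Z in points_cam:
        if Z <= 0:                     # behind the camera, cannot be seen
            continue
        u = fx * X / Z + cx            # pinhole projection to pixel coordinates
        v = fy * Y / Z + cy
        if u_min <= u <= u_max and v_min <= v <= v_max:
            selected.append((X, Y, Z))
    return selected

points_cam = [
    (0.0, 0.0, 10.0),   # projects to the image centre
    (0.5, 0.0, 10.0),   # slightly right, still inside the box
    (5.0, 0.0, 10.0),   # far to the right, outside the box
    (0.0, 0.0, -5.0),   # behind the camera
    (0.0, 0.0, 40.0),   # distant point along the same viewing ray
]
frustum = points_in_frustum(points_cam, box2d=(600, 300, 700, 400),
                            fx=700.0, fy=700.0, cx=640.0, cy=360.0)
print(len(frustum))
```

Note that the frustum keeps near and far points along the same viewing rays, so a further segmentation step is still needed to separate the object from background clutter.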


Other notes

Early works used 3D convolutional networks for detection, but they are quite slow.

Recent works improve run-time by projecting the 3D point cloud onto either (1) the ground plane (BEV) or (2) the image plane.

Fixed Encoder

These methods commonly organize the point cloud into voxels and encode the set of voxels in each vertical column into a fixed-length, hand-crafted feature encoding. The result is a pseudo-image that a standard image detection architecture can process.

  • MV3D, AVOD: fuse LiDAR with vision in a two-stage pipeline

  • PIXOR, Complex-YOLO: single-stage pipeline
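A fixed encoder of this kind can be sketched as below. It is not the encoding of any specific paper. Each vertical column becomes one pixel whose channels are the maximum height in each z-slice plus a point count. The ranges, the coarse 0.5 m cell size, and the sample points are assumptions chosen to keep the example small.

```python
def pseudo_image(points, x_range=(0.0, 1.0), y_range=(0.0, 1.0),
                 z_range=(0.0, 2.0), cell=0.5, n_slices=2):
    """Encode each vertical column as [max height per z-slice..., point count]."""
    rows = int(round((x_range[1] - x_range[0]) / cell))
    cols = int(round((y_range[1] - y_range[0]) / cell))
    slice_h = (z_range[1] - z_range[0]) / n_slices
    image = [[[0.0] * (n_slices + 1) for _ in range(cols)] for _ in range(rows)]
    for x, y, z in points:
        r = int((x - x_range[0]) // cell)
        c = int((y - y_range[0]) // cell)
        s = int((z - z_range[0]) // slice_h)
        if not (0 <= r < rows and 0 <= c < cols and 0 <= s < n_slices):
            continue                            # outside the region of interest
        pixel = image[r][c]
        pixel[s] = max(pixel[s], z - z_range[0])   # max height in this slice
        pixel[-1] += 1.0                           # point count (density)
    return image

points = [(0.1, 0.1, 0.4), (0.2, 0.3, 1.5), (0.3, 0.2, 0.9),
          (0.7, 0.8, 1.2), (3.0, 0.1, 0.5)]
image = pseudo_image(points)   # shape: 2 rows x 2 cols x 3 channels
print(image[0][0], image[1][1])
```

Because the output has a fixed channel count per pixel, it can be fed to a 2D detection network like an ordinary image.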

Learned Encoder

PointNet: learns from unordered point sets, enabling full end-to-end learning.

VoxelNet: applies PointNet-based encoding to LiDAR points, followed by 3D and 2D convolutional layers.
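The key idea that lets PointNet consume unordered point sets is a shared per-point function followed by a symmetric pooling operation. The sketch below uses a single random linear layer with ReLU in place of a trained network, then max-pools over the points and checks that reordering the points leaves the global feature unchanged. The weights, layer size, and sample cloud are placeholders.

```python
import random

def shared_mlp(point, weights, bias):
    """One linear layer + ReLU, applied identically to every point."""
    return [max(0.0, sum(w * x for w, x in zip(row, point)) + b)
            for row, b in zip(weights, bias)]

def global_feature(points, weights, bias):
    """Max-pool per-point features over the set (a symmetric function)."""
    per_point = [shared_mlp(p, weights, bias) for p in points]
    return [max(channel) for channel in zip(*per_point)]

rng = random.Random(0)                          # untrained placeholder weights
weights = [[rng.uniform(-1, 1) for _ in range(3)] for _ in range(4)]
bias = [rng.uniform(-1, 1) for _ in range(4)]

cloud = [(0.1, 0.2, 0.3), (1.0, -0.5, 0.2), (-0.3, 0.8, 1.5)]
reordered = cloud[::-1]                         # same set, different order

original = global_feature(cloud, weights, bias)
same = original == global_feature(reordered, weights, bias)
print(same)
```

Max pooling is order-independent, so any permutation of the input points yields the same global feature.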

Code

Matlab

OpenPCDet

Useful GitHub repository for LiDAR 3D object detection: OpenMMLab

OpenPCDet is a clear, simple, self-contained open source project for LiDAR-based 3D object detection.

OpenPCDet is a general PyTorch-based codebase for 3D object detection from point clouds. It currently supports multiple state-of-the-art 3D object detection methods, with highly refactored code for both one-stage and two-stage 3D detection frameworks.

Reference

  • "[Stay-at-Home] Autonomous Driving AI Systems" ([집콕]자율주행 인공지능 시스템), K-MOOC course: http://www.kmooc.kr/courses/course-v1:NGV+NGV01+2020_A1/course/

  • Friederich, Jonas, and Patrick Zschech. "Review and systematization of solutions for 3D object detection." Proceedings of the 15th International Conference on Wirtschaftsinformatik (WI). 2020.

  • [Paper Review] VoxelNet: End-to-end Learning for Point Cloud Based 3D Object Detection

  • Mathworks LiDAR Processing Workshop (Mathworks_LiDAR Processing.pdf, 14MB)

  • OpenPCDet: https://github.com/open-mmlab/OpenPCDet

  • Image from: Create 3D model from a single 2D image in PyTorch