PointPillars

PointPillars: Fast Encoders for Object Detection From Point Clouds

Lang, Alex H., et al. "Pointpillars: Fast encoders for object detection from point clouds." Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2019.

Github: https://github.com/nutonomy/second.pytorch

PointPillars is a method for 3D object detection that enables end-to-end learning with only 2D convolutional layers. It utilizes PointNets and uses only LiDAR input.

The encoder learns features on pillars (vertical columns) of the point cloud, which are then used to predict 3D oriented boxes for objects.

Advantages

  • By learning features instead of relying on fixed encoders, PointPillars can leverage the full information represented by the point cloud.

  • Further, by operating on pillars instead of voxels, there is no need to hand-tune the binning of the vertical direction.

  • Finally, pillars are fast because all key operations can be formulated as 2D convolutions, which are extremely efficient to compute on a GPU. PointPillars runs at 62 to 105 Hz.

Contributions

  • A point cloud encoder and network that operates on the point cloud to enable end-to-end training of a 3D object detection network.

  • All computations on pillars can be posed as dense 2D convolutions, enabling inference at 62 Hz, a factor of 2 to 4 times faster than other methods.

  • Experiments on the KITTI dataset demonstrate state-of-the-art results on cars, pedestrians, and cyclists on both the BEV and 3D benchmarks.

Network

It consists of three main stages (Figure 2):

  1. A feature encoder network that converts the point cloud to a sparse pseudo-image.

  2. A 2D convolutional backbone that processes the pseudo-image into a high-level representation.

  3. A detection head that detects and regresses 3D boxes.
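
As a rough illustration of how these three stages compose, here is a minimal PyTorch-style sketch; the module names and tensor shapes are assumptions for illustration, not the authors' code:

```python
import torch.nn as nn

class PointPillarsSketch(nn.Module):
    """Illustrative composition of the three stages described above."""
    def __init__(self, pillar_feature_net, backbone_2d, detection_head):
        super().__init__()
        self.pfn = pillar_feature_net   # (1) point cloud -> sparse pseudo-image
        self.backbone = backbone_2d     # (2) pseudo-image -> high-level 2D feature map
        self.head = detection_head      # (3) feature map -> class scores + 3D box regression

    def forward(self, pillar_points, pillar_coords):
        pseudo_image = self.pfn(pillar_points, pillar_coords)  # (B, C, H, W)
        features = self.backbone(pseudo_image)                 # (B, C', H', W')
        cls_scores, box_preds = self.head(features)            # per-anchor outputs
        return cls_scores, box_preds
```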

Feature Encoder (Pillar feature net):

Used pillar instead of voxels to avoid using 3D Conv.

The feature encoder converts the point cloud into a sparse pseudo-image. First, the point cloud is divided into a grid in the x-y plane, creating a set of pillars. Each point in the cloud, which is a 4-dimensional vector (x, y, z, reflectance), is converted to a 9-dimensional vector containing the following additional information:

  • Xc, Yc, Zc = offsets of the point from the arithmetic mean of all points in the pillar it belongs to, in each dimension.

  • Xp, Yp = offsets of the point from the center of the pillar in the x-y plane.

Hence, a point now contains the information D = [x,y,z,r,Xc,Yc,Zc,Xp,Yp].

Feature Encoder creates pillars on the point cloud. Then each point is converted to a 9-dimensional vector encapsulating information about the pillar it belongs to.

For each pillar k, zero padding is applied when the number of points Nk is smaller than N; conversely, if a pillar holds more than N points, the points are randomly sampled down to N. The sketch below illustrates this decoration and tensorization step.
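
A minimal NumPy sketch of the decoration and tensorization described above; the grid resolution, max_points, and max_pillars values here are assumptions for illustration, not necessarily the paper's settings:

```python
import numpy as np

def build_pillar_tensor(points, voxel_size=(0.16, 0.16), max_points=100, max_pillars=12000):
    """Group points into x-y pillars, decorate each point to 9-D, and
    zero-pad pillars with fewer than max_points points."""
    # assign each point to a pillar index on the x-y grid
    ij = np.floor(points[:, :2] / np.array(voxel_size)).astype(np.int64)
    keys, inverse = np.unique(ij, axis=0, return_inverse=True)

    tensor = np.zeros((max_pillars, max_points, 9), dtype=np.float32)
    coords = np.zeros((max_pillars, 2), dtype=np.int64)
    for k in range(min(len(keys), max_pillars)):
        pts = points[inverse == k]                           # (Nk, 4): x, y, z, reflectance
        if len(pts) > max_points:                            # too many points: random sampling
            pts = pts[np.random.choice(len(pts), max_points, replace=False)]
        mean_xyz = pts[:, :3].mean(axis=0)                   # arithmetic mean of the pillar
        center_xy = (keys[k] + 0.5) * np.array(voxel_size)   # geometric center of the pillar
        decorated = np.hstack([
            pts,                                             # x, y, z, r
            pts[:, :3] - mean_xyz,                           # Xc, Yc, Zc
            pts[:, :2] - center_xy,                          # Xp, Yp
        ])
        tensor[k, :len(decorated)] = decorated               # remaining rows stay zero (padding)
        coords[k] = keys[k]
    return tensor, coords
```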

[Image from] https://becominghuman.ai/pointpillars-3d-point-clouds-bounding-box-detection-and-tracking-pointnet-pointnet-lasernet-67e26116de5a

A simplified PointNet (a linear layer applied per point, followed by BatchNorm and ReLU) maps the stacked pillar tensor from (D, P, N) to (C, P, N); max pooling over the N (points) dimension then yields (C, P). These per-pillar features are scattered back to their grid locations to form the (C, H, W) pseudo-image.

The paper used N = 8000, 12000, or 16000 and C = 64.
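
A minimal PyTorch sketch of this simplified PointNet and the scatter step; the channel sizes follow the numbers quoted above, but the implementation details are assumptions, not the authors' code:

```python
import torch
import torch.nn as nn

class PillarFeatureNet(nn.Module):
    """Linear layer (as a 1x1 conv) + BatchNorm + ReLU maps (B, D, P, N) -> (B, C, P, N),
    then max pooling over the N dimension gives (B, C, P)."""
    def __init__(self, in_channels=9, out_channels=64):
        super().__init__()
        self.conv = nn.Conv2d(in_channels, out_channels, kernel_size=1, bias=False)
        self.bn = nn.BatchNorm2d(out_channels)

    def forward(self, pillar_tensor):                        # (B, D, P, N)
        x = torch.relu(self.bn(self.conv(pillar_tensor)))    # (B, C, P, N)
        return x.max(dim=3).values                           # (B, C, P)

def scatter_to_pseudo_image(pillar_features, coords, H, W):
    """Place each pillar's C-dim feature at its grid cell; coords is (B, P, 2)
    holding (x_index, y_index) per pillar."""
    B, C, P = pillar_features.shape
    canvas = pillar_features.new_zeros(B, C, H, W)
    for b in range(B):
        canvas[b, :, coords[b, :, 1], coords[b, :, 0]] = pillar_features[b]
    return canvas                                            # (B, C, H, W) sparse pseudo-image
```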

Backbone

The backbone consists of sequential 2D convolutional layers that learn features from the transformed input at different scales. The input to the RPN (Region Proposal Network) is the pseudo-image feature map provided by the Feature Net.

The network has three blocks of fully convolutional layers. The first layer of each block downsamples the feature map by half via a convolution with stride 2, followed by a sequence of stride-1 convolutions (×q means q applications of the filter). BN and ReLU are applied after each convolution layer. The output of every block is then upsampled to a fixed size and the results are concatenated to construct the high-resolution feature map, as sketched below.
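
A minimal PyTorch sketch of this downsample-then-upsample-and-concatenate structure; the block depths and channel counts are assumptions for illustration:

```python
import torch
import torch.nn as nn

def conv_block(in_ch, out_ch, num_convs, stride):
    """One backbone block: a stride-2 downsampling conv followed by
    (num_convs - 1) stride-1 convs, each with BatchNorm and ReLU."""
    layers = [nn.Conv2d(in_ch, out_ch, 3, stride=stride, padding=1, bias=False),
              nn.BatchNorm2d(out_ch), nn.ReLU(inplace=True)]
    for _ in range(num_convs - 1):
        layers += [nn.Conv2d(out_ch, out_ch, 3, stride=1, padding=1, bias=False),
                   nn.BatchNorm2d(out_ch), nn.ReLU(inplace=True)]
    return nn.Sequential(*layers)

class BackboneSketch(nn.Module):
    """Three blocks; each block's output is upsampled (transposed conv) to a
    common resolution and concatenated into the high-resolution feature map."""
    def __init__(self, c_in=64):
        super().__init__()
        self.block1 = conv_block(c_in, 64, 4, stride=2)
        self.block2 = conv_block(64, 128, 6, stride=2)
        self.block3 = conv_block(128, 256, 6, stride=2)
        self.up1 = nn.ConvTranspose2d(64, 128, 1, stride=1)
        self.up2 = nn.ConvTranspose2d(128, 128, 2, stride=2)
        self.up3 = nn.ConvTranspose2d(256, 128, 4, stride=4)

    def forward(self, x):                          # x: (B, 64, H, W) pseudo-image
        x1 = self.block1(x)                        # (B, 64,  H/2, W/2)
        x2 = self.block2(x1)                       # (B, 128, H/4, W/4)
        x3 = self.block3(x2)                       # (B, 256, H/8, W/8)
        # upsample all blocks to H/2 x W/2 and concatenate channels
        return torch.cat([self.up1(x1), self.up2(x2), self.up3(x3)], dim=1)
```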

Loss Function

The loss function is optimized using Adam. The total loss is a weighted sum of the localization, classification, and direction losses, written out below.
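
For reference, the total loss follows the formulation in the paper (the weights below are the paper's reported defaults):

$$
\mathcal{L} = \frac{1}{N_{pos}}\left(\beta_{loc}\,\mathcal{L}_{loc} + \beta_{cls}\,\mathcal{L}_{cls} + \beta_{dir}\,\mathcal{L}_{dir}\right),
\qquad \beta_{loc}=2,\ \ \beta_{cls}=1,\ \ \beta_{dir}=0.2
$$

where $N_{pos}$ is the number of positive anchors and $\mathcal{L}_{dir}$ is a softmax loss on the discretized heading direction, as introduced in SECOND.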

Localization Regression

The same localization loss as in SECOND is used, regressing the residuals of the box parameters (x, y, z, w, l, h, θ) with a smooth L1 loss; the residual definitions are reproduced below.
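
With ground truth boxes denoted $gt$ and anchors denoted $a$, the residuals defined in SECOND (and adopted here) are:

$$
\Delta x = \frac{x^{gt}-x^{a}}{d^{a}},\quad
\Delta y = \frac{y^{gt}-y^{a}}{d^{a}},\quad
\Delta z = \frac{z^{gt}-z^{a}}{h^{a}},\quad
\Delta\theta = \sin\!\left(\theta^{gt}-\theta^{a}\right),
$$

$$
\Delta w = \log\frac{w^{gt}}{w^{a}},\quad
\Delta l = \log\frac{l^{gt}}{l^{a}},\quad
\Delta h = \log\frac{h^{gt}}{h^{a}},\qquad
d^{a} = \sqrt{(w^{a})^{2} + (l^{a})^{2}},
$$

$$
\mathcal{L}_{loc} = \sum_{b \in (x,y,z,w,l,h,\theta)} \mathrm{SmoothL1}(\Delta b).
$$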

Classification loss

Focal loss is used for anchor classification, as written out below.
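
The focal loss down-weights easy, well-classified examples; with the parameter values reported in the paper:

$$
\mathcal{L}_{cls} = -\alpha_{a}\,(1 - p^{a})^{\gamma}\,\log p^{a},
\qquad \alpha = 0.25,\ \ \gamma = 2,
$$

where $p^{a}$ is the class probability of an anchor.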

Performance

[Figure: performance results from the paper]

[Figure: an example of the backbone (RPN, Region Proposal Network) used in PointPillars, taken from the paper that originally proposed this network]
