Point RCNN


Shi, Shaoshuai, Xiaogang Wang, and Hongsheng Li. "Pointrcnn: 3d object proposal generation and detection from point cloud." Proceedings of the IEEE/CVF conference on computer vision and pattern recognition. 2019.


Introduction

PointRCNN performs 3D object detection directly on the raw point cloud. The whole framework is composed of two stages:

  1. Stage-1: bottom-up 3D proposal generation directly from points

    • Instead of using a bird's-eye-view projection or voxels as in previous models

  2. Stage-2: refining the proposals in canonical coordinates to obtain the final detection results

    • Transforms the pooled points of each proposal into canonical coordinates to learn better local spatial features, which are combined with the global semantic features of each point learned in stage-1 for accurate box refinement and confidence prediction

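The two-stage data flow above can be sketched in a few lines. This is a hypothetical outline, not the authors' API: the function names, feature dimensions, and the dummy segmentation rule are all illustrative placeholders.

```python
import numpy as np

def stage1(points):
    """Bottom-up proposal generation: per-point features, a foreground
    mask, and one 3D box proposal per foreground point."""
    n = points.shape[0]
    point_features = np.random.randn(n, 128)  # stand-in for PointNet++ features
    # Dummy foreground rule for illustration; the real mask is predicted.
    fg_mask = np.linalg.norm(points[:, :2], axis=1) < 30.0
    # One proposal per foreground point: (x, y, z, h, w, l, theta)
    proposals = np.zeros((int(fg_mask.sum()), 7))
    proposals[:, :3] = points[fg_mask, :3]
    return point_features, fg_mask, proposals

def stage2(points, point_features, proposals):
    """Canonical refinement: pool points per proposal and predict
    box residuals plus a confidence score (dummy values here)."""
    refined = proposals.copy()
    scores = np.full(len(proposals), 0.5)
    return refined, scores

points = np.random.randn(1024, 4) * 10.0  # (x, y, z, intensity)
feats, mask, props = stage1(points)
boxes, scores = stage2(points, feats, props)
```

The key structural point is that proposals are anchored on foreground points themselves, so no bird's-eye-view or voxel grid is needed in stage-1.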
Comparison with VoxelNet

VoxelNet

  • 1-stage network

  • 3D-voxel based; uses 3D convolution (inefficient)

Point RCNN

  • 2-stage network

  • Operates directly on the raw point cloud to generate region proposals

Comparison with AVOD and Frustum-PointNet

AVOD, F-PointNet: top-down. Create ROIs first, then use the points inside them.

PointRCNN: bottom-up. Use the points directly to generate ROIs.

Architecture

1. Bottom-Up 3D Proposal Generation

In 2D object detection, a two-stage network first generates proposals and then refines them in the second stage.

Direct extension of the two-stage methods from 2D to 3D is non-trivial due to the huge 3D search space and the irregular format of point clouds.

  • See Comparison with AVOD and F-PointNet above (top-down manner)

This paper generates 3D proposals in a bottom-up manner.

Specifically, we learn point-wise features to segment the raw point cloud and to generate 3D proposals from the segmented foreground points simultaneously.

Learning point cloud representations.

To learn discriminative point-wise features for describing the raw point cloud, PointNet++ is used as the backbone network.

Foreground point segmentation.

The foreground segmentation and 3D box proposal generation are performed simultaneously.

For point segmentation, the ground-truth segmentation mask is naturally provided by the 3D ground-truth boxes.

During training, the foreground points are obtained directly from the 3D ground-truth bounding boxes, which serve as the foreground segmentation masks.

At inference time, foreground point segmentation and bin-based 3D box generation are performed sequentially.

Focal loss is used for training, since there is a large imbalance between the number of foreground and background points.

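A minimal sketch of the binary focal loss used for the segmentation head. The default `alpha` and `gamma` values are the common ones from the original focal loss paper (Lin et al.); the exact settings used by PointRCNN are an assumption here.

```python
import numpy as np

def focal_loss(p, target, alpha=0.25, gamma=2.0, eps=1e-12):
    """Per-point binary focal loss on predicted foreground probabilities p.

    The (1 - p_t)**gamma factor down-weights easy examples, so the many
    easy background points do not dominate the gradient.
    """
    p_t = np.where(target == 1, p, 1.0 - p)            # prob. of the true class
    alpha_t = np.where(target == 1, alpha, 1.0 - alpha)
    return -alpha_t * (1.0 - p_t) ** gamma * np.log(p_t + eps)
```

A well-classified foreground point (p close to 1) contributes a much smaller loss than a misclassified one, which is the whole point of the focusing term.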
Bin-based 3D bounding box generation.

During training, we only require the box regression head to regress 3D bounding box locations from foreground points.

A 3D bounding box is represented as (x, y, z, h, w, l, θ) in the LiDAR coordinate system, where (x, y, z) is the object center location, (h, w, l) is the object size, and θ is the object orientation from the bird's-eye view.

The paper proposes bin-based regression losses for estimating the 3D bounding boxes of objects:

  • Split the surrounding area of each foreground point into a series of discrete bins along the X and Z axes.

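The bin-based target construction can be sketched as follows: the center offset along each of X and Z is classified into a bin, plus a small residual regressed within that bin. The `search_range` and `bin_size` values here are illustrative assumptions, not necessarily the paper's exact hyperparameters.

```python
import numpy as np

def bin_targets(fg_point, box_center, search_range=3.0, bin_size=0.5):
    """Bin index + in-bin residual targets for the X and Z center offsets
    of one foreground point (a simplified sketch)."""
    targets = {}
    for axis, name in ((0, "x"), (2, "z")):
        offset = box_center[axis] - fg_point[axis]  # center relative to point
        shifted = offset + search_range             # map [-S, S] onto [0, 2S]
        bin_idx = int(np.floor(shifted / bin_size))
        # Residual measured from the bin center.
        residual = shifted - (bin_idx * bin_size + bin_size / 2.0)
        targets[name] = (bin_idx, residual)
    return targets
```

Turning localization into classification over bins plus a small regression makes the search over the large 3D space more robust than direct regression of raw offsets.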
2. Point cloud region pooling

The bounding box is enlarged to encode additional context information. Each pooled point in the ROI carries 1) coordinates, 2) intensity, 3) a semantic mask, and 4) semantic features.

From each proposed 3D box, enlarge the size by η and keep the points (coordinates, reflection intensity, segmentation mask, semantic features) inside the enlarged box.

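The pooling step above amounts to a point-in-enlarged-box test. The sketch below uses an axis-aligned box for brevity (the actual proposals are oriented, so the real test rotates points into the box frame first); the enlargement margin `eta` is illustrative.

```python
import numpy as np

def pool_points_in_box(points, extra_feats, center, size, eta=0.2):
    """Keep points (and their per-point features) that fall inside a
    box enlarged by eta. Axis-aligned simplification of the real,
    oriented-box test."""
    half = np.asarray(size) * (1.0 + eta) / 2.0
    inside = np.all(np.abs(points[:, :3] - np.asarray(center)) <= half, axis=1)
    return points[inside], extra_feats[inside]
```

Enlarging the box keeps some surrounding context points, which helps the refinement stage judge whether the proposal boundary is correct.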
3. Canonical 3D bounding box refinement

Canonical coordinate system for one 3D box enables the box refinement stage to learn better local spatial features for each proposal.

(1) the origin is located at the center of the box proposal;

(2) the local X′ and Z′ axes are approximately parallel to the ground plane, with X′ pointing toward the head direction of the proposal and the Z′ axis perpendicular to X′;

(3) the Y′ axis remains the same as in the LiDAR coordinate system.

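The canonical transform is a translation to the proposal center followed by a rotation about the vertical axis so that X′ aligns with the box heading. The sketch below assumes Y is the vertical axis (consistent with Y′ being unchanged above); the function name and rotation convention are illustrative.

```python
import numpy as np

def to_canonical(points_xyz, box_center, heading):
    """Map points into a proposal's canonical frame: shift the origin to
    the box center, then rotate by -heading about the vertical (Y) axis
    so X' points along the box heading."""
    shifted = points_xyz - box_center
    a = -heading
    rot = np.array([[np.cos(a), 0.0, np.sin(a)],
                    [0.0,       1.0, 0.0      ],
                    [-np.sin(a), 0.0, np.cos(a)]])
    return shifted @ rot.T
```

In this frame every pooled point set looks alike regardless of where the proposal sat in the scene, which is why the refinement head can learn local spatial features more easily.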
Feature Learning for Box Proposal Refinement

Concatenation of

  • (1) canonically transformed local spatial points

  • (2) extra features: reflection intensity, segmentation mask, and the Euclidean distance of each point from the LiDAR origin (to retain depth information lost in the canonical transform)

  • (3) the global semantic features f(p) from stage-1

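The concatenation above is a simple channel-wise merge per pooled point. The feature dimensions below are hypothetical placeholders, chosen only to show the shapes involved.

```python
import numpy as np

n = 512  # pooled points in one proposal (illustrative)

canonical_xyz = np.random.randn(n, 3)  # (1) canonically transformed coordinates
extra = np.random.randn(n, 3)          # (2) reflection, seg. mask, distance
semantic = np.random.randn(n, 128)     # (3) stage-1 semantic features f(p)

# Per-point channel-wise concatenation fed to the refinement head.
merged = np.concatenate([canonical_xyz, extra, semantic], axis=1)
```

In the paper the local part is first lifted by a small network before merging with the semantic features; the plain concatenation here just illustrates the combination of the three sources.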
Paper: https://arxiv.org/abs/1812.04244

Github: https://github.com/sshaoshuai/PointRCNN