347. Point Pillar: 3D Object Detection from Point Clouds

Point Pillar (published as PointPillars) is an architecture for 3D object detection that takes point clouds as input.

The architecture consists of three main elements: a pillar feature encoder, a 2D convolutional backbone, and a detection head.

The first element, the pillar feature encoder, takes the following steps.

Divide the point cloud into a grid in the X-Y plane, which creates a set of pillars.
Most pillars will be empty because the point cloud is sparse, so the network creates a dense tensor that includes only the non-empty pillars.
Taking the dense tensor as input, a simplified version of PointNet outputs a [C, P, N] shaped tensor, which is then reduced by a max operation over the N points of each pillar to a [C, P] shaped tensor.
Scatter the encoded features back to their original pillar locations to form a 2D pseudo-image.
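
The steps above can be sketched in NumPy as follows. This is a minimal illustration, not the paper's implementation: the function names (`pillarize`, `encode_pillars`, `scatter`), the grid ranges, and the random weights are all assumptions, the "simplified PointNet" is reduced to one shared linear layer with ReLU, and pillars with too many points are truncated in arrival order rather than sampled.

```python
import numpy as np

def pillarize(points, x_range, y_range, pillar_size, max_points):
    """Group points (M, D) into non-empty pillars on an X-Y grid.

    Returns a zero-padded dense array (P, N, D), the (row, col) grid
    coordinates of each non-empty pillar (P, 2), and the grid shape (H, W).
    """
    W = int(round((x_range[1] - x_range[0]) / pillar_size))
    H = int(round((y_range[1] - y_range[0]) / pillar_size))
    col = np.floor((points[:, 0] - x_range[0]) / pillar_size).astype(int)
    row = np.floor((points[:, 1] - y_range[0]) / pillar_size).astype(int)
    inside = (col >= 0) & (col < W) & (row >= 0) & (row < H)
    points, row, col = points[inside], row[inside], col[inside]

    # Only pillars that actually contain points get an entry.
    ids, inverse = np.unique(row * W + col, return_inverse=True)
    dense = np.zeros((len(ids), max_points, points.shape[1]), dtype=points.dtype)
    counts = np.zeros(len(ids), dtype=int)
    for pt, p in zip(points, inverse):
        if counts[p] < max_points:  # simplification: drop overflow points
            dense[p, counts[p]] = pt
            counts[p] += 1
    coords = np.stack([ids // W, ids % W], axis=1)
    return dense, coords, (H, W)

def encode_pillars(dense, weight, bias):
    """Shared linear layer + ReLU per point, then max over the N points."""
    # (P, N, D) @ (D, C) -> (P, N, C), rearranged to the [C, P, N] layout.
    feats = np.maximum(dense @ weight + bias, 0.0).transpose(2, 0, 1)
    return feats.max(axis=2)  # [C, P]; padded slots are not masked here

def scatter(encoded, coords, grid_shape):
    """Place each pillar's feature vector back at its grid cell: [C, H, W]."""
    H, W = grid_shape
    canvas = np.zeros((encoded.shape[0], H, W), dtype=encoded.dtype)
    canvas[:, coords[:, 0], coords[:, 1]] = encoded
    return canvas

rng = np.random.default_rng(0)
points = rng.uniform([0, 0, -1], [4, 4, 1], size=(50, 3))
dense, coords, grid = pillarize(points, (0, 4), (0, 4), pillar_size=1.0, max_points=8)
encoded = encode_pillars(dense, rng.normal(size=(3, 16)), np.zeros(16))
pseudo_image = scatter(encoded, coords, grid)
print(dense.shape, encoded.shape, pseudo_image.shape)
```

Cells of the pseudo-image that correspond to empty pillars stay at zero, which is what lets the later stages treat the result as an ordinary 2D image.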

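The second element is a 2D convolutional backbone consisting of two sub-networks: a top-down network that produces features at progressively smaller spatial resolution, and a second network that upsamples those features and concatenates them.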
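
To make the data flow of such a backbone concrete, here is a shape-only sketch in NumPy. It is an assumption-laden stand-in: average pooling replaces the strided convolution blocks, nearest-neighbour repetition replaces the learned upsampling, there are no trainable weights, and the three scales and function names are chosen purely for illustration.

```python
import numpy as np

def downsample(x, stride=2):
    """Stand-in for a strided conv block: average-pool [C, H, W] by `stride`."""
    C, H, W = x.shape
    return x.reshape(C, H // stride, stride, W // stride, stride).mean(axis=(2, 4))

def upsample(x, factor):
    """Stand-in for learned upsampling: nearest-neighbour repetition."""
    return x.repeat(factor, axis=1).repeat(factor, axis=2)

def backbone(pseudo_image):
    # First sub-network: features at progressively smaller resolution.
    s1 = downsample(pseudo_image)  # H/2
    s2 = downsample(s1)            # H/4
    s3 = downsample(s2)            # H/8
    # Second sub-network: bring every scale to H/2 and concatenate channels.
    return np.concatenate([upsample(s1, 1), upsample(s2, 2), upsample(s3, 4)], axis=0)

x = np.random.default_rng(0).normal(size=(16, 32, 32))
print(backbone(x).shape)  # (48, 16, 16)
```

The input height and width must be divisible by 8 for this sketch to work; a real backbone would also change the channel count at each scale.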

The third element, the detection head, uses the Single Shot Detector (SSD) setup to perform the 3D object detection.
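
An SSD-style head predicts boxes relative to a fixed set of anchors, and training needs a rule for deciding which anchors count as positives. The sketch below shows one common form of that rule, matching by IoU in the bird's-eye view. The axis-aligned box format, the function names, and the two thresholds are illustrative assumptions rather than values taken from this summary.

```python
import numpy as np

def bev_iou(anchors, boxes):
    """IoU of axis-aligned boxes (x_min, y_min, x_max, y_max): (A, 4), (G, 4) -> (A, G)."""
    lo = np.maximum(anchors[:, None, :2], boxes[None, :, :2])
    hi = np.minimum(anchors[:, None, 2:], boxes[None, :, 2:])
    inter = np.clip(hi - lo, 0, None).prod(axis=2)
    area_a = (anchors[:, 2:] - anchors[:, :2]).prod(axis=1)
    area_b = (boxes[:, 2:] - boxes[:, :2]).prod(axis=1)
    return inter / (area_a[:, None] + area_b[None, :] - inter)

def match_anchors(anchors, boxes, pos_thresh=0.6, neg_thresh=0.45):
    """Per-anchor label: matched box index, -1 for negative, -2 for ignored."""
    iou = bev_iou(anchors, boxes)
    best, best_iou = iou.argmax(axis=1), iou.max(axis=1)
    labels = np.full(len(anchors), -2)      # default: ignored in the loss
    labels[best_iou < neg_thresh] = -1      # clear background
    positive = best_iou >= pos_thresh
    labels[positive] = best[positive]       # assigned to a ground-truth box
    return labels

anchors = np.array([[0.0, 0, 2, 4], [0.6, 0, 2.6, 4], [10, 10, 12, 14]])
ground_truth = np.array([[0.0, 0, 2, 4]])
print(match_anchors(anchors, ground_truth).tolist())  # [0, -2, -1]
```

Anchors whose best IoU falls between the two thresholds are left out of the loss, so the detector is not penalized on ambiguous cases.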