RandLA-Net


Limitations of existing methods:

  1. Most approaches are limited to extremely small 3D point clouds. Methods like:
    • PointNet
    • PointNet++
    • PointCNN
    • PCCN
    • ShellNet
  2. Few methods can directly process large-scale point clouds, but they rely on either time-consuming preprocessing or computationally expensive voxelization steps. Methods like:
    • SPG
    • FCPN
    • TangentConv
    • PCT

Goal

Develop a method that:

  • Processes large-scale point clouds directly
    • Without block partitioning and block merging
    • Takes the whole geometry into consideration
  • Is computationally and memory efficient
    • Without time-consuming preprocessing or voxelization steps
    • Infers a large-scale point cloud in a single pass
  • Is effective and accurate
    • Handles complex geometric structures
    • Captures and preserves the prominent features

Key Problems

  • Efficient point sampling to reduce the memory footprint and computational cost
  • Effective local feature aggregation to capture geometric patterns


Random point sampling is cheap enough for large-scale point clouds, but it comes with the problem of discarding useful features. To solve this problem, Local Feature Aggregation is proposed.
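For contrast with more expensive samplers such as farthest point sampling, here is a minimal NumPy sketch of the random sampling step, assuming the cloud is an (N, 3) array; the function name and shapes are illustrative, not the authors' code:

```python
import numpy as np

def random_sample(points: np.ndarray, num_samples: int) -> np.ndarray:
    """Uniformly sample `num_samples` points in O(num_samples) time.

    Unlike farthest point sampling, this ignores geometry entirely,
    which is why points carrying useful features can be dropped.
    """
    idx = np.random.choice(points.shape[0], num_samples, replace=False)
    return points[idx]

points = np.random.rand(1_000_000, 3)    # a large synthetic cloud
subset = random_sample(points, 100_000)  # one cheap decimation step
```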

Local Feature Aggregation

Local Spatial Encoding

  1. Find the neighboring points for each point using KNN.
  2. Explicitly encode the relative point positions of all neighboring points so that the corresponding point features are always aware of their relative spatial locations. This helps the network capture local geometric patterns.
  3. The encoded relative point position features are concatenated with their corresponding point features, as shown in the sketch after this list.
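A minimal PyTorch sketch of the local spatial encoding step, assuming the KNN indices are precomputed (e.g. with a KD-tree); the module name, tensor shapes, and MLP width are illustrative assumptions, not the paper's code:

```python
import torch
import torch.nn as nn

class LocalSpatialEncoding(nn.Module):
    def __init__(self, feat_dim: int):
        super().__init__()
        # 3 + 3 + 3 + 1 = 10 inputs: center, neighbor, relative position, distance
        self.mlp = nn.Sequential(nn.Linear(10, feat_dim), nn.ReLU())

    def forward(self, points, feats, knn_idx):
        # points: (N, 3), feats: (N, D), knn_idx: (N, K)
        neighbors = points[knn_idx]                        # (N, K, 3)
        center = points.unsqueeze(1).expand_as(neighbors)  # (N, K, 3)
        rel = center - neighbors                           # relative positions
        dist = rel.norm(dim=-1, keepdim=True)              # (N, K, 1)
        pos_enc = self.mlp(torch.cat([center, neighbors, rel, dist], dim=-1))
        # concatenate encoded positions with the neighboring point features
        return torch.cat([pos_enc, feats[knn_idx]], dim=-1)  # (N, K, 2D) when feat_dim == D
```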

Attentive Pooling

Goal: aggregate the neighboring feature set. Instead of using max or mean pooling to hard-integrate the neighboring features (where the majority of the information is lost), we turn to the powerful attention mechanism to automatically learn important local features from the neighboring feature set.

  1. Computing attention scores: a shared function \(g\) learns a unique attention score for each neighboring feature, \(s_i^k = g(\hat{f}_i^k, W)\), where \(W\) are the learnable weights of a shared MLP followed by a softmax.
  2. Weighted summation: the learned scores act as a soft mask that weights the features, yielding the aggregated feature \(\widetilde{f_i} = \sum_{k=1}^{K} \hat{f}_i^k \cdot s_i^k\) (see the sketch after this list).
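A minimal PyTorch sketch of attentive pooling over the (N, K, D) neighborhood features produced by the encoding above; the class and layer choices are assumptions for illustration:

```python
import torch
import torch.nn as nn

class AttentivePooling(nn.Module):
    def __init__(self, dim: int):
        super().__init__()
        self.score_fn = nn.Linear(dim, dim, bias=False)  # shared function g
        self.mlp = nn.Sequential(nn.Linear(dim, dim), nn.ReLU())

    def forward(self, neighbor_feats):
        # neighbor_feats: (N, K, D); softmax over the K neighbors
        scores = torch.softmax(self.score_fn(neighbor_feats), dim=1)
        pooled = (scores * neighbor_feats).sum(dim=1)    # weighted summation
        return self.mlp(pooled)                          # (N, D)
```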

Dilated Residual Block

Since large point clouds are going to be substantially down-sampled, it is desirable to significantly increase the receptive field of each point, so that geometric details are more likely to be preserved even if some points are dropped.

So multiple local spatial encoding and attentive pooling units are stacked together with a skip connection as a dilated residual block: each additional round of KNN aggregation lets features propagate from neighbors of neighbors, growing the receptive field from roughly \(K\) to \(K^2\) points. A sketch follows.
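A minimal sketch wiring the two modules sketched above into a residual block; the doubling of feature widths and the linear shortcut are assumptions based on the paper's description, not a faithful reimplementation:

```python
import torch
import torch.nn as nn

class DilatedResidualBlock(nn.Module):
    def __init__(self, dim: int):
        super().__init__()
        # two rounds of LocSE + attentive pooling enlarge the receptive field
        self.locse1 = LocalSpatialEncoding(dim)      # defined in the sketch above
        self.pool1 = AttentivePooling(2 * dim)
        self.locse2 = LocalSpatialEncoding(2 * dim)
        self.pool2 = AttentivePooling(4 * dim)
        self.shortcut = nn.Linear(dim, 4 * dim)      # skip connection

    def forward(self, points, feats, knn_idx):
        x = self.pool1(self.locse1(points, feats, knn_idx))  # (N, 2D)
        x = self.pool2(self.locse2(points, x, knn_idx))      # (N, 4D)
        return torch.relu(x + self.shortcut(feats))          # residual sum
```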

RandLA-Net Architecture