RandLA-Net


Limitations of existing methods:

  1. Most approaches are limited to extremely small 3D point clouds. Methods like:
    • PointNet
    • PointNet++
    • PointCNN
    • PCCN
    • ShellNet
  2. Few methods can directly process large-scale point clouds, but they rely on either time-consuming preprocessing or computationally expensive voxelization steps. Methods like:
    • SPG
    • FCPN
    • TangentConv
    • PCT

Goal

Develop a method that:

  • Processes large-scale point clouds directly
    • Without block partitioning and block merging
    • Takes the whole geometry into consideration
  • Is computationally and memory efficient
    • Without time-consuming preprocessing or voxelization steps
    • Infers a large-scale point cloud in a single pass
  • Is effective and accurate
    • Handles complex geometric structures
    • Captures and preserves the prominent features

Key Problems

  • Efficient point sampling to reduce the memory footprint and computational cost
  • Effective local feature aggregation to capture geometric patterns


Random point sampling is cheap enough for large-scale point clouds, but it comes with the problem of discarding useful features. To solve this problem, Local Feature Aggregation is proposed.
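For contrast with more expensive samplers such as farthest point sampling, here is a minimal NumPy sketch of the random sampling step, assuming the cloud is an (N, 3) array; the function name and shapes are illustrative, not the authors' code:

```python
import numpy as np

def random_sample(points: np.ndarray, num_samples: int) -> np.ndarray:
    """Uniformly sample `num_samples` points in O(num_samples) time.

    Unlike farthest point sampling, this ignores geometry entirely,
    which is why points carrying useful features can be dropped.
    """
    idx = np.random.choice(points.shape[0], num_samples, replace=False)
    return points[idx]

points = np.random.rand(1_000_000, 3)    # a large synthetic cloud
subset = random_sample(points, 100_000)  # one cheap decimation step
```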

Local Feature Aggregation

Local Spatial Encoding

  1. Find the neighboring points for each point using KNN.
  2. Explicitly encode the relative point positions of all neighboring points so that the corresponding point features are always aware of their relative spatial locations. This helps the network capture local geometric patterns.
  3. The encoded relative point position features are concatenated with their corresponding point features, as shown in the sketch after this list.
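A minimal PyTorch sketch of the local spatial encoding step, assuming the KNN indices are precomputed (e.g. with a KD-tree); the module name, tensor shapes, and MLP width are illustrative assumptions, not the paper's code:

```python
import torch
import torch.nn as nn

class LocalSpatialEncoding(nn.Module):
    def __init__(self, feat_dim: int):
        super().__init__()
        # 3 + 3 + 3 + 1 = 10 inputs: center, neighbor, relative position, distance
        self.mlp = nn.Sequential(nn.Linear(10, feat_dim), nn.ReLU())

    def forward(self, points, feats, knn_idx):
        # points: (N, 3), feats: (N, D), knn_idx: (N, K)
        neighbors = points[knn_idx]                        # (N, K, 3)
        center = points.unsqueeze(1).expand_as(neighbors)  # (N, K, 3)
        rel = center - neighbors                           # relative positions
        dist = rel.norm(dim=-1, keepdim=True)              # (N, K, 1)
        pos_enc = self.mlp(torch.cat([center, neighbors, rel, dist], dim=-1))
        # concatenate encoded positions with the neighboring point features
        return torch.cat([pos_enc, feats[knn_idx]], dim=-1)  # (N, K, 2D) when feat_dim == D
```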

Attentive Pooling

Goal: aggregate the neighboring feature set. Instead of using max or mean pooling to hard-integrate the neighboring features (where the majority of the information is lost), we turn to the powerful attention mechanism to automatically learn important local features from the neighboring feature set.

  1. Computing attention scores: a shared function \(g\) learns a unique attention score for each neighboring feature, \(s_i^k = g(\hat{f}_i^k, W)\), where \(W\) are the learnable weights of a shared MLP followed by a softmax.
  2. Weighted summation: the learned scores act as a soft mask that weights the features, yielding the aggregated feature \(\widetilde{f_i} = \sum_{k=1}^{K} \hat{f}_i^k \cdot s_i^k\) (see the sketch after this list).
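A minimal PyTorch sketch of attentive pooling over the (N, K, D) neighborhood features produced by the encoding above; the class and layer choices are assumptions for illustration:

```python
import torch
import torch.nn as nn

class AttentivePooling(nn.Module):
    def __init__(self, dim: int):
        super().__init__()
        self.score_fn = nn.Linear(dim, dim, bias=False)  # shared function g
        self.mlp = nn.Sequential(nn.Linear(dim, dim), nn.ReLU())

    def forward(self, neighbor_feats):
        # neighbor_feats: (N, K, D); softmax over the K neighbors
        scores = torch.softmax(self.score_fn(neighbor_feats), dim=1)
        pooled = (scores * neighbor_feats).sum(dim=1)    # weighted summation
        return self.mlp(pooled)                          # (N, D)
```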

Dilated Residual Block

Since large point clouds are going to be substantially down-sampled, it is desirable to significantly increase the receptive field of each point, so that geometric details are more likely to be preserved even if some points are dropped.

So multiple local spatial encoding and attentive pooling units are stacked together with a skip connection as a dilated residual block: each additional round of KNN aggregation lets features propagate from neighbors of neighbors, growing the receptive field from roughly \(K\) to \(K^2\) points. A sketch follows.
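A minimal sketch wiring the two modules sketched above into a residual block; the doubling of feature widths and the linear shortcut are assumptions based on the paper's description, not a faithful reimplementation:

```python
import torch
import torch.nn as nn

class DilatedResidualBlock(nn.Module):
    def __init__(self, dim: int):
        super().__init__()
        # two rounds of LocSE + attentive pooling enlarge the receptive field
        self.locse1 = LocalSpatialEncoding(dim)      # defined in the sketch above
        self.pool1 = AttentivePooling(2 * dim)
        self.locse2 = LocalSpatialEncoding(2 * dim)
        self.pool2 = AttentivePooling(4 * dim)
        self.shortcut = nn.Linear(dim, 4 * dim)      # skip connection

    def forward(self, points, feats, knn_idx):
        x = self.pool1(self.locse1(points, feats, knn_idx))  # (N, 2D)
        x = self.pool2(self.locse2(points, x, knn_idx))      # (N, 4D)
        return torch.relu(x + self.shortcut(feats))          # residual sum
```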

RandLA-Net Architecture