Point Cloud

Based on a lecture by Prof. Junseok Kwon, Chung-Ang University

3D Representation

Volume (Voxel): represents the shape with boxes (voxels)

Three-dimensional grid -> high memory usage
3D convolution -> many parameters
Requires surface smoothing

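Mesh: represents the shape with a triangle mesh (each triangle has 3 vertices and 3 edges)

Listing the vertices in a different order still describes the same triangle (permutation invariance), but a deep learning model has difficulty recognizing the differently ordered inputs as the same triangle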

Point Cloud: represents the shape with points

Close to raw sensor data; memory-efficient
Weak geometric relations: relations between points are hard to express
Requires surface smoothing
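The memory trade-off between a voxel grid and a point cloud can be made concrete with a small sketch. The `voxelize` helper, the 2048-point cloud, and the grid resolutions are assumptions chosen for illustration; the points are assumed to lie in the unit cube.

```python
import numpy as np

def voxelize(points, resolution):
    # Map points in [0, 1)^3 to integer cell indices of a dense occupancy grid.
    idx = np.clip((points * resolution).astype(int), 0, resolution - 1)
    grid = np.zeros((resolution,) * 3, dtype=bool)
    grid[idx[:, 0], idx[:, 1], idx[:, 2]] = True
    return grid

rng = np.random.default_rng(0)
cloud = rng.uniform(size=(2048, 3)).astype(np.float32)  # hypothetical point cloud

grid_64 = voxelize(cloud, 64)
grid_128 = voxelize(cloud, 128)

# The point cloud stores 2048 * 3 float32 values; the dense grid stores
# resolution^3 cells, so doubling the resolution multiplies memory by 8.
print(cloud.nbytes, grid_64.nbytes, grid_128.nbytes)
```

The grid cost grows cubically with resolution regardless of how many cells are occupied, while the point cloud cost grows only with the number of points.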

Point Cloud Classification

Conventional Method

Permutation Invariance

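The output must be the same regardless of the order of the points
Permutation-invariant model: CNNs and RNNs depend on input order -> use an MLP applied to each point
Permutation-invariant function: design a symmetric function (+, max, etc.)
Sorting: convert the points into an ordered set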
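The options listed above can be checked on a toy example. This is a minimal sketch: the function names are illustrative, and the lexicographic sort is just one possible canonical ordering.

```python
import numpy as np

rng = np.random.default_rng(0)
points = rng.normal(size=(5, 3))   # 5 points with xyz coordinates
shuffled = points[::-1]            # same set of points, reversed order

def sum_pool(x):
    return x.sum(axis=0)           # symmetric function: +

def max_pool(x):
    return x.max(axis=0)           # symmetric function: max

def flatten(x):
    return x.reshape(-1)           # order-dependent baseline

def sort_then_flatten(x):
    # Canonical ordering: sort by x, then y, then z (last key is primary in lexsort).
    order = np.lexsort((x[:, 2], x[:, 1], x[:, 0]))
    return x[order].reshape(-1)

sum_same = np.allclose(sum_pool(points), sum_pool(shuffled))
max_same = np.array_equal(max_pool(points), max_pool(shuffled))
sorted_same = np.array_equal(sort_then_flatten(points), sort_then_flatten(shuffled))
flat_same = np.array_equal(flatten(points), flatten(shuffled))
print(sum_same, max_same, sorted_same, flat_same)
```

The symmetric functions and the sorted representation agree for both orderings, while plain flattening does not.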

PointNet

End-to-end learning for scattered, unordered point data
Unified framework for various tasks(classification, part segmentation, semantic segmentation)
Handles point clouds directly, unlike conventional methods

Permutation Invariance

Uses an MLP shared across points and max pooling to achieve permutation invariance

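PointNet was reported to perform on par with or better than voxel-based methods on benchmark classification; this is an empirical result rather than an unconditional guarantee
It picks out the critical points that matter most for classification and is not strongly affected by added noise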
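A minimal sketch of the shared MLP + max pooling idea, with random (untrained) weights and hypothetical layer sizes. It also shows why only a small set of critical points determines the global feature: every feature dimension is the maximum over points, so points that never attain a maximum can be dropped without changing the result.

```python
import numpy as np

rng = np.random.default_rng(0)

def shared_mlp(points, w1, w2):
    # The same weights are applied to every point independently.
    hidden = np.maximum(points @ w1, 0.0)
    return np.maximum(hidden @ w2, 0.0)

def global_feature(points, w1, w2):
    per_point = shared_mlp(points, w1, w2)          # (N, F)
    feat = per_point.max(axis=0)                    # max pooling over points -> (F,)
    critical = np.unique(per_point.argmax(axis=0))  # points that attain some maximum
    return feat, critical

# Hypothetical sizes: 3 -> 16 -> 32 features, 128 points.
w1 = rng.normal(size=(3, 16))
w2 = rng.normal(size=(16, 32))
cloud = rng.normal(size=(128, 3))

feat, critical = global_feature(cloud, w1, w2)
feat_reordered, _ = global_feature(cloud[::-1], w1, w2)    # reordered input
feat_critical, _ = global_feature(cloud[critical], w1, w2)  # critical points only

print(np.allclose(feat, feat_reordered), np.allclose(feat, feat_critical), len(critical))
```

At most 32 of the 128 points (one per feature dimension) are needed to reproduce the global feature in this sketch.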

Geometric Invariance

Inputs are mapped to a canonical space so that the output is not affected by geometric transformations (translation, reflection, rotation, glide reflection)
Solved by predicting a 3x3 transform matrix with a T-Net
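The sketch below shows only how a 3x3 matrix maps a rotated cloud back to a canonical pose. The real T-Net is a small learned network that predicts the matrix from the input; here the "prediction" `t_pred` is hand-picked as the inverse rotation, which is an assumption for illustration. The orthogonality penalty follows the form ||I - T T^T||^2, which the PointNet paper applies to its feature-space transform.

```python
import numpy as np

def rotation_z(theta):
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])

def apply_transform(points, t):
    # points: (N, 3), t: (3, 3); each row p is mapped to t @ p.
    return points @ t.T

def orthogonality_penalty(t):
    # Squared Frobenius norm of (I - T T^T); zero when T is orthogonal.
    eye = np.eye(t.shape[0])
    return float(np.sum((eye - t @ t.T) ** 2))

rng = np.random.default_rng(0)
canonical = rng.normal(size=(64, 3))
rotation = rotation_z(np.pi / 3)
observed = apply_transform(canonical, rotation)   # rotated input cloud

t_pred = rotation.T                               # stand-in for the T-Net output
aligned = apply_transform(observed, t_pred)

print(np.allclose(aligned, canonical), orthogonality_penalty(t_pred))
```

A 3x3 matrix covers linear maps such as rotation and reflection; a pure translation cannot be undone by it and is typically removed by centering the cloud.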

PointNet Segmentation

Concatenates the global feature to each point's local embedding and classifies every point
Adding skip links and a 16-class one-hot vector indicating the object category can improve performance
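A shape-level sketch of the segmentation input, assuming 64-dimensional local embeddings, a 1024-dimensional global feature, and 4 part labels. All weights are random placeholders and the category index is hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
num_points, local_dim, global_dim = 100, 64, 1024
num_categories, num_parts = 16, 4

local_feats = rng.normal(size=(num_points, local_dim))      # per-point embedding
w_up = rng.normal(size=(local_dim, global_dim))
global_feat = np.maximum(local_feats @ w_up, 0.0).max(axis=0)  # (1024,)

category = 3                                 # hypothetical object category index
one_hot = np.zeros(num_categories)
one_hot[category] = 1.0

# Every point receives its own local feature plus the shared global context.
seg_input = np.concatenate(
    [
        local_feats,
        np.tile(global_feat, (num_points, 1)),
        np.tile(one_hot, (num_points, 1)),
    ],
    axis=1,
)                                            # (100, 64 + 1024 + 16)

w_seg = rng.normal(size=(seg_input.shape[1], num_parts))
labels = (seg_input @ w_seg).argmax(axis=1)  # one part label per point
print(seg_input.shape, labels.shape)
```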

※ Limitation: because it relies on MLPs rather than a CNN, it does not capture local context well

PointNet++

Hierarchical version of PointNet

Repeats the set abstraction layer: sampling layer + grouping layer + PointNet layer
Applies PointNet to local regions (an effect similar to convolution)
Density adaptive layer: processes regions differently depending on density, so the network is robust to non-uniform sampling density
MSG (Multi-scale grouping): groups at several scales and concatenates the features
MRG (Multi-resolution grouping): groups at several resolutions and concatenates the features
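One set abstraction step can be sketched as follows. Farthest point sampling and ball-query grouping are the choices used in the PointNet++ paper; the single-layer `mini_pointnet`, the radii, and the random weights are simplifying assumptions. Looping over two radii and concatenating gives an MSG-style feature.

```python
import numpy as np

def farthest_point_sampling(points, k):
    # Greedily pick k indices, each one farthest from the points chosen so far.
    chosen = [0]
    dist = np.linalg.norm(points - points[0], axis=1)
    for _ in range(k - 1):
        nxt = int(dist.argmax())
        chosen.append(nxt)
        dist = np.minimum(dist, np.linalg.norm(points - points[nxt], axis=1))
    return np.array(chosen)

def ball_query(points, centers, radius):
    # For each center, the indices of all points within the given radius.
    d = np.linalg.norm(points[None, :, :] - centers[:, None, :], axis=2)
    return [np.flatnonzero(row <= radius) for row in d]

def mini_pointnet(local_points, w):
    # Shared linear layer + ReLU, then max pooling over the group.
    return np.maximum(local_points @ w, 0.0).max(axis=0)

def set_abstraction(points, k, radii, weights):
    centers = points[farthest_point_sampling(points, k)]   # sampling layer
    feats = []
    for radius, w in zip(radii, weights):                  # one pass per scale
        groups = ball_query(points, centers, radius)       # grouping layer
        feats.append(np.stack([
            mini_pointnet(points[idx] - c, w)              # coordinates relative to center
            for idx, c in zip(groups, centers)
        ]))
    return centers, np.concatenate(feats, axis=1)          # concat across scales

rng = np.random.default_rng(0)
cloud = rng.uniform(size=(256, 3))
weights = [rng.normal(size=(3, 8)), rng.normal(size=(3, 8))]
centers, feats = set_abstraction(cloud, k=16, radii=(0.2, 0.4), weights=weights)
print(centers.shape, feats.shape)
```

Stacking several such steps, each operating on the centers and features of the previous one, gives the hierarchical structure.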

※ Better performance can be achieved by using a non-Euclidean space

3D Point Cloud Generation

The task of generating a point cloud from a noise vector

l-GAN

Generates shapes by using the latent vectors of an auto-encoder as the space in which the GAN operates
The latent space is modeled with a GMM (Gaussian Mixture Model)

r-GAN

Raw point cloud GAN (a vanilla GAN applied directly to point clouds)

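Advantages of the AE

The linearity of the latent space allows interpolation between two different objects (e.g., varying the leg length of a chair)
Adding or subtracting latent vectors (+/-) can add or remove a desired feature (e.g., with or without armrests)
An output can be obtained from a partial input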
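The interpolation and +/- operations amount to simple vector arithmetic on latent codes. In this sketch the codes are random placeholders; in practice they would come from a trained encoder, and each result would be passed through the decoder to obtain a point cloud.

```python
import numpy as np

rng = np.random.default_rng(0)
latent_dim = 8

# Hypothetical latent codes of two chairs.
z_short_legs = rng.normal(size=latent_dim)
z_long_legs = rng.normal(size=latent_dim)

def interpolate(z_a, z_b, steps):
    # Linear path from z_a to z_b in latent space.
    ts = np.linspace(0.0, 1.0, steps)[:, None]
    return (1.0 - ts) * z_a + ts * z_b

path = interpolate(z_short_legs, z_long_legs, steps=5)    # (5, latent_dim)

# Feature direction: mean code with the feature minus mean code without it.
codes_with_armrest = rng.normal(size=(10, latent_dim)) + 1.0
codes_without_armrest = rng.normal(size=(10, latent_dim))
armrest_dir = codes_with_armrest.mean(axis=0) - codes_without_armrest.mean(axis=0)

z_added = z_short_legs + armrest_dir      # "add armrests"
z_removed = z_added - armrest_dir         # "remove armrests" again
print(path.shape, np.allclose(z_removed, z_short_legs))
```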

Graph Neural Networks

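Adjacency Matrix (A): 1 if two nodes are connected, 0 otherwise
Degree Matrix (D): a diagonal matrix holding the number of neighbors of each node

Laplacian Matrix (L): a matrix that captures the differences between connected nodes (L = D - A)

Feature Matrix (H: N x F): N = number of nodes, F = number of features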
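These matrices can be built for a small undirected graph as follows. The edge list and feature values are made up for illustration.

```python
import numpy as np

edges = [(0, 1), (0, 2), (1, 2), (2, 3)]   # hypothetical undirected graph
n = 4

A = np.zeros((n, n))
for i, j in edges:
    A[i, j] = A[j, i] = 1.0                # adjacency matrix

D = np.diag(A.sum(axis=1))                 # degree matrix
L = D - A                                  # graph Laplacian

H = np.arange(n * 2, dtype=float).reshape(n, 2)   # feature matrix, N x F

# Row i of L @ H is the sum over neighbors j of (h_i - h_j),
# i.e. how much node i differs from its neighbors.
diff = L @ H
print(np.diag(D), diff[3])
```

Node 3 has a single neighbor (node 2), so its row of `L @ H` is simply `H[3] - H[2]`.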

GCN(Graph Convolutional Networks)

Adds self-loops (the identity matrix) to the adjacency matrix
Normalizes the adjacency matrix using the degree matrix
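A single GCN layer can be sketched as below. The notes do not state which normalization is used; the symmetric form D^(-1/2) (A + I) D^(-1/2) from the original GCN paper is assumed here, and the graph and weights are placeholders.

```python
import numpy as np

def normalized_adjacency(A):
    A_hat = A + np.eye(A.shape[0])                 # add self-loops
    d_inv_sqrt = 1.0 / np.sqrt(A_hat.sum(axis=1))  # degrees of A_hat, inverse sqrt
    return A_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

def gcn_layer(A_norm, H, W):
    # Aggregate neighbor features, apply the weight matrix, then ReLU.
    return np.maximum(A_norm @ H @ W, 0.0)

# Hypothetical path graph 0-1-2-3.
A = np.array([
    [0.0, 1.0, 0.0, 0.0],
    [1.0, 0.0, 1.0, 0.0],
    [0.0, 1.0, 0.0, 1.0],
    [0.0, 0.0, 1.0, 0.0],
])
rng = np.random.default_rng(0)
H = rng.normal(size=(4, 5))    # N x F input features
W = rng.normal(size=(5, 3))    # F x F_out weights

A_norm = normalized_adjacency(A)
out = gcn_layer(A_norm, H, W)
print(out.shape)
```

The normalization keeps the largest eigenvalue of the propagation matrix at 1, so repeated aggregation does not blow up feature magnitudes.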

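※ Increasing the depth can actually degrade performance
Over-smoothing: node features become increasingly similar and the information from the first layers is forgotten
-> Add Initial Residual + Identity Mapping (GCN2, also written GCNII)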
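Over-smoothing and the effect of the initial residual can be observed on a toy graph. This sketch assumes the GCNII-style update H' = ReLU(((1 - alpha) P H + alpha H0)((1 - beta) I + beta W)); the ring graph, depth, and identity weights are illustrative choices that isolate the propagation effect.

```python
import numpy as np

def normalized_adjacency(A):
    A_hat = A + np.eye(A.shape[0])
    d_inv_sqrt = 1.0 / np.sqrt(A_hat.sum(axis=1))
    return A_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

def gcn2_layer(P, H, H0, W, alpha, beta):
    support = (1.0 - alpha) * (P @ H) + alpha * H0       # initial residual
    mix = (1.0 - beta) * np.eye(W.shape[0]) + beta * W   # identity mapping
    return np.maximum(support @ mix, 0.0)

def spread(H):
    # How much node features differ from each other (0 means all nodes identical).
    return float(H.std(axis=0).sum())

n, f, depth = 6, 4, 64
A = np.zeros((n, n))
for i in range(n):                       # hypothetical ring graph
    A[i, (i + 1) % n] = A[(i + 1) % n, i] = 1.0
P = normalized_adjacency(A)

rng = np.random.default_rng(0)
H0 = rng.uniform(size=(n, f))            # non-negative input features
W = np.eye(f)                            # placeholder weights

H_plain, H_res = H0, H0
for _ in range(depth):
    H_plain = gcn2_layer(P, H_plain, H0, W, alpha=0.0, beta=0.0)  # plain propagation
    H_res = gcn2_layer(P, H_res, H0, W, alpha=0.1, beta=0.5)      # with initial residual

print(spread(H_plain), spread(H_res))
```

Without the residual, all node features collapse to nearly the same vector after 64 layers; with it, a fraction of the input signal is re-injected at every layer and the nodes stay distinguishable.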

GraphSAGE

stochastic sampling + flexible aggregation(mean, pool, LSTM)
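A minimal sketch of one GraphSAGE layer with neighbor sampling and the mean aggregator. The adjacency list, sample size, and weights are hypothetical; separate `W_self` and `W_neigh` matrices are equivalent to one matrix applied to the concatenation of the node's own feature and the aggregated neighbor feature.

```python
import numpy as np

def sample_neighbors(neighbors, node, num_samples, rng):
    nbrs = neighbors[node]
    # Sample with replacement only when the node has too few neighbors.
    replace = len(nbrs) < num_samples
    return rng.choice(nbrs, size=num_samples, replace=replace)

def sage_mean_layer(H, neighbors, W_self, W_neigh, num_samples, rng):
    out = np.empty((H.shape[0], W_self.shape[1]))
    for v in range(H.shape[0]):
        sampled = sample_neighbors(neighbors, v, num_samples, rng)
        agg = H[sampled].mean(axis=0)                     # mean aggregator
        h = np.maximum(H[v] @ W_self + agg @ W_neigh, 0.0)
        norm = np.linalg.norm(h)
        out[v] = h / norm if norm > 0 else h              # L2 normalization
    return out

# Hypothetical graph as an adjacency list.
neighbors = {0: [1, 2], 1: [0, 2, 3], 2: [0, 1, 4], 3: [1], 4: [2]}

rng = np.random.default_rng(0)
H = rng.normal(size=(5, 6))
W_self = rng.normal(size=(6, 4))
W_neigh = rng.normal(size=(6, 4))

out = sage_mean_layer(H, neighbors, W_self, W_neigh, num_samples=2, rng=rng)
print(out.shape)
```

Because each node looks at a fixed-size sample of neighbors, the cost per node does not depend on its degree; swapping `mean` for a pooling or LSTM aggregator changes only the `agg` line.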

PointNet + GCN

Adds GCN layers to the MLP

EdgeConv

Captures local geometric structure while maintaining permutation invariance
Dynamically updates the graph at every layer using KNN
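A sketch of the EdgeConv idea: each layer rebuilds the k-nearest-neighbor graph from its current input features, forms edge features from the center point and the offset to each neighbor, and takes a max over neighbors. The single linear layer, k = 4, and the random weights are simplifying assumptions.

```python
import numpy as np

def knn_indices(feats, k):
    d = np.linalg.norm(feats[:, None, :] - feats[None, :, :], axis=2)
    np.fill_diagonal(d, np.inf)                  # exclude the point itself
    return np.argsort(d, axis=1)[:, :k]

def edge_conv(feats, k, W):
    idx = knn_indices(feats, k)                          # graph built from current features
    center = np.repeat(feats[:, None, :], k, axis=1)     # (N, k, F)
    neighbor = feats[idx]                                # (N, k, F)
    edge = np.concatenate([center, neighbor - center], axis=2)  # (N, k, 2F)
    return np.maximum(edge @ W, 0.0).max(axis=1)         # max over neighbors -> (N, F_out)

rng = np.random.default_rng(0)
cloud = rng.normal(size=(32, 3))
W1 = rng.normal(size=(6, 8))     # 2 * 3 input channels -> 8
W2 = rng.normal(size=(16, 8))    # 2 * 8 input channels -> 8

h1 = edge_conv(cloud, k=4, W=W1)   # neighbors found in xyz space
h2 = edge_conv(h1, k=4, W=W2)      # neighbors recomputed in feature space

# Reordering the input points reorders the per-point outputs in the same way.
perm = rng.permutation(32)
h1_perm = edge_conv(cloud[perm], k=4, W=W1)
print(h2.shape, np.allclose(h1_perm, h1[perm]))
```

The per-point outputs follow the input order, so a global max pooling on top of them yields a permutation-invariant descriptor.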

LocalSpecGCN

Reduces the number of nodes through recursive clustering and pooling
(plain max pooling does not preserve information well)

knn-GAN

Unsupervised point cloud generation, encoding geometric relations
A conventional GCN keeps the adjacency matrix fixed, whereas knn-GAN updates it

Increases the number of nodes through upsampling

Tree-GAN

Tree-structured convolutions
Introduces the Fréchet point cloud distance (FPD) metric
Can also express hierarchical relationships (knn-GAN only carries flat information)
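The Fréchet distance underlying FPD compares two sets of feature vectors by fitting a Gaussian to each. The sketch below implements that Gaussian distance only; in Tree-GAN the features are extracted from real and generated point clouds by a pretrained network, whereas here they are random placeholders.

```python
import numpy as np

def frechet_distance(feats_a, feats_b):
    mu_a, mu_b = feats_a.mean(axis=0), feats_b.mean(axis=0)
    cov_a = np.cov(feats_a, rowvar=False)
    cov_b = np.cov(feats_b, rowvar=False)
    # Tr((cov_a cov_b)^(1/2)) equals the sum of square roots of the
    # eigenvalues of cov_a @ cov_b, which are real and non-negative
    # for covariance matrices (tiny negative values are clipped).
    eigvals = np.linalg.eigvals(cov_a @ cov_b)
    tr_sqrt = np.sqrt(np.clip(eigvals.real, 0.0, None)).sum()
    mean_term = np.sum((mu_a - mu_b) ** 2)
    return float(mean_term + np.trace(cov_a) + np.trace(cov_b) - 2.0 * tr_sqrt)

rng = np.random.default_rng(0)
real_feats = rng.normal(size=(500, 4))      # hypothetical features of real shapes
fake_feats = real_feats + 2.0               # same spread, shifted mean

same = frechet_distance(real_feats, real_feats)
shifted = frechet_distance(real_feats, fake_feats)
print(same, shifted)
```

Identical feature sets give a distance of (numerically) zero, and a pure mean shift contributes the squared length of the shift, here 4 dimensions times 2.0 squared.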