TensorFlow KNN Ops for point cloud based on CPU(KDTree) and GPU(CUDA) respectively.

boryun/TF-KNN

TF-KNN

This repository contains TensorFlow KNN ops for the CPU (KDTree) and the GPU (CUDA). The CUDA version is a modification of KNN-CUDA; the KDTree version is based on nanoflann.

There is also a pure TensorFlow implementation of KNN in demo.py. It easily runs out of memory (OOM) on large point clouds, because a distance matrix of shape [batch_size, num_points, num_queries] has to be stored in VRAM; this is basically why I created this repository. When it does fit in memory, however, it is still faster than my CUDA implementation 🥺.
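To make the memory problem concrete, here is a rough estimate of the size of that distance matrix for the two problem sizes used in the timing section. It is a back-of-the-envelope sketch that assumes float32 elements (4 bytes each) and counts only the matrix itself, not any intermediate tensors; `distance_matrix_bytes` is a helper made up for this illustration.

```python
def distance_matrix_bytes(batch_size, num_points, num_queries, bytes_per_element=4):
    """Bytes needed for a dense [batch_size, num_points, num_queries] matrix.

    bytes_per_element=4 assumes float32; this is an assumption of the sketch.
    """
    return batch_size * num_points * num_queries * bytes_per_element


GIB = 1024 ** 3

# The two problem sizes from the timing section of this README.
small = distance_matrix_bytes(8, 8192, 512)
large = distance_matrix_bytes(8, 32768, 2048)

print(f"B=8, N=8192,  M=512:  {small / GIB:.3f} GiB")
print(f"B=8, N=32768, M=2048: {large / GIB:.3f} GiB")
```

Under these assumptions the smaller case needs 0.125 GiB and the larger case 2 GiB for the distance matrix alone.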

Notes:

  • The GPU version is very sensitive to K: computation time grows dramatically as K increases.
  • The CPU version dominates the timing tests, so I strongly recommend it over the CUDA and pure TensorFlow versions.
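The sensitivity to K can be read off the numbers in the Time Consumption section of this README. The short script below copies the K=16 and K=64 timings from there and computes how much each implementation slows down; the `timings` and `growth` names are just for this illustration.

```python
# Timings in seconds, copied from the Time Consumption section: (K=16, K=64).
timings = {
    "N=8192, M=512": {
        "Pure TF": (0.026, 0.053),
        "CUDA": (0.085, 0.559),
        "KDTree": (0.018, 0.037),
        "KDTree(OpenMP)": (0.005, 0.009),
    },
    "N=32768, M=2048": {
        # Pure TF is omitted here because it ran out of memory.
        "CUDA": (0.371, 1.085),
        "KDTree": (0.084, 0.168),
        "KDTree(OpenMP)": (0.017, 0.035),
    },
}

# Growth factor = time at K=64 divided by time at K=16.
growth = {
    config: {impl: t64 / t16 for impl, (t16, t64) in impls.items()}
    for config, impls in timings.items()
}

for config, impls in growth.items():
    for impl, factor in impls.items():
        print(f"{config:<16} {impl:<15} x{factor:.2f}")
```

On the smaller problem the CUDA time grows by roughly 6.6x from K=16 to K=64 (about 2.9x on the larger one), while the other implementations grow by about 2x.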

Usage

Both versions are built the same way (for the GPU version, you may first need to change CUDA_HOME in compile.sh):

  1. Activate your Anaconda env with TensorFlow installed.
  2. Run compile.sh in the CUDA or KDTree folder.
  3. Import the knn_grouping function from knn_grouping.py.
  4. Use tf.gather with `batch_dims=1` to gather the nearest neighbours via the returned indices.
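The gathering in the last step might look like the sketch below. The exact signature of knn_grouping is not shown in this README, so the example does not call it; instead it builds stand-in indices with a brute-force top-k in plain TensorFlow. The index shape [B, M, K], the 3-D point coordinates, and the tiny sizes are all assumptions made for illustration.

```python
import tensorflow as tf

B, N, M, K = 2, 128, 16, 4  # small illustrative sizes

points = tf.random.normal([B, N, 3])   # reference points
queries = tf.random.normal([B, M, 3])  # query points

# Stand-in for the indices returned by knn_grouping (signature not shown in
# this README): brute-force squared distances followed by top-k.
diff = queries[:, :, None, :] - points[:, None, :, :]  # [B, M, N, 3]
sq_dist = tf.reduce_sum(diff * diff, axis=-1)          # [B, M, N]
_, indices = tf.math.top_k(-sq_dist, k=K)              # [B, M, K], int32

# batch_dims=1 makes indices[b] index into points[b] only.
neighbours = tf.gather(points, indices, batch_dims=1)  # [B, M, K, 3]
print(neighbours.shape)
```

With `batch_dims=1`, each batch element is gathered independently, so the result holds the K neighbour coordinates for every query in every batch.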

Time Consumption (in Python)

Sys & Env Info

  • SYS: Ubuntu 20.04.2 LTS
  • CPU: Intel i7-6700
  • GPU: Nvidia GTX980
  • CUDA: 10.1
  • Python: 3.7.9
  • TensorFlow: 2.3.1

Let B, N, M, and K denote batch_size, reference_points, query_points, and num_neighbours respectively. The computation times in Python (averaged over 20 runs) are as follows:

| B | N     | M    | K  | Pure TF (s) | CUDA (s) | KDTree (s) | KDTree(OpenMP) (s) |
|---|-------|------|----|-------------|----------|------------|--------------------|
| 8 | 8192  | 512  | 16 | 0.026       | 0.085    | 0.018      | 0.005              |
| 8 | 8192  | 512  | 32 | 0.035       | 0.202    | 0.025      | 0.006              |
| 8 | 8192  | 512  | 64 | 0.053       | 0.559    | 0.037      | 0.009              |
| 8 | 32768 | 2048 | 16 | OOM         | 0.371    | 0.084      | 0.017              |
| 8 | 32768 | 2048 | 32 | OOM         | 0.506    | 0.112      | 0.024              |
| 8 | 32768 | 2048 | 64 | OOM         | 1.085    | 0.168      | 0.035              |
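
The figures above are averages over 20 runs. The benchmark script itself is not shown here, so the following is only a minimal sketch of how such an average might be measured; `average_runtime`, the warm-up call, and the placeholder workload are assumptions, not the author's code.

```python
import time


def average_runtime(fn, runs=20, warmup=1):
    """Return the mean wall-clock time of fn() in seconds over `runs` calls."""
    for _ in range(warmup):
        fn()  # warm-up calls are excluded from the average
    start = time.perf_counter()
    for _ in range(runs):
        fn()
    return (time.perf_counter() - start) / runs


# Placeholder workload; replace it with a call to the op being measured.
elapsed = average_runtime(lambda: sum(i * i for i in range(10_000)))
print(f"{elapsed:.6f} s per call")
```

When timing a GPU op, the measured function should probably also fetch the result (for example by calling `.numpy()` on the returned tensor), so that the timer covers the completed computation.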
