
Inference example with pretrained model, CUDA kernel failed: no kernel image is available for execution on the device #155

Open
Kitsch123456 opened this issue Aug 3, 2024 · 3 comments

Comments

@Kitsch123456

When I set up the required environment and run the command python tools/inference.py cfgs/PCN_models/AdaPoinTr.yaml ckpts/AdaPoinTr_PCN.pth --pc_root demo/ --save_vis_img --out_pc_root inference_result/

it returns the following error:
2024-08-03 13:13:59,386 - MODEL - INFO - Transformer with config {'NAME': 'AdaPoinTr', 'num_query': 512, 'num_points': 16384, 'center_num': [512, 256], 'global_feature_dim': 1024, 'encoder_type': 'graph', 'decoder_type': 'fc', 'encoder_config': {'embed_dim': 384, 'depth': 6, 'num_heads': 6, 'k': 8, 'n_group': 2, 'mlp_ratio': 2.0, 'block_style_list': ['attn-graph', 'attn', 'attn', 'attn', 'attn', 'attn'], 'combine_style': 'concat'}, 'decoder_config': {'embed_dim': 384, 'depth': 8, 'num_heads': 6, 'k': 8, 'n_group': 2, 'mlp_ratio': 2.0, 'self_attn_block_style_list': ['attn-graph', 'attn', 'attn', 'attn', 'attn', 'attn', 'attn', 'attn'], 'self_attn_combine_style': 'concat', 'cross_attn_block_style_list': ['attn-graph', 'attn', 'attn', 'attn', 'attn', 'attn', 'attn', 'attn'], 'cross_attn_combine_style': 'concat'}}
using group version 2
Loading weights from ckpts/AdaPoinTr_PCN.pth...
ckpts @ 353 epoch( performance = {'F-Score': 0.8446799506656607, 'CDL1': 6.527985830404605, 'CDL2': 0.19307194130320907, 'EMDistance': 0.0})
CUDA kernel failed : no kernel image is available for execution on the device
void furthest_point_sampling_kernel_wrapper(int, int, int, const float*, float*, int*) at L:228 in /home/mi/Pointnet2_PyTorch/pointnet2_ops_lib/pointnet2_ops/_ext-src/src/sampling_gpu.cu

@Kitsch123456 (Author)

I ran the code on an RTX 3060 Laptop GPU, with gcc 9, torch-1.7.1+cu110, torchaudio-0.7.2, torchvision-0.13.1+cu113.

@Nineyoyoyo commented Aug 17, 2024

I used the same command as you and got the same error. Here’s what I did.

Best Solution (reference here):

  1. go to the file /Pointnet2_PyTorch/pointnet2_ops_lib/pointnet2_ops/_ext-src/src/sampling_gpu.cu
  2. comment out every line that calls CUDA_CHECK_ERRORS(); (there are three occurrences)
  3. run python3 setup.py install again in the pointnet2_ops_lib folder (a quick sanity check is sketched below)
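
After rebuilding, a quick way to confirm that the furthest-point-sampling kernel actually runs on your GPU (a minimal sketch; it assumes pointnet2_ops was installed from pointnet2_ops_lib as above):

```python
# Sanity check after rebuilding pointnet2_ops: if the rebuilt kernel image
# matches your GPU architecture, this runs without the
# "no kernel image is available" error.
import torch
from pointnet2_ops import pointnet2_utils

xyz = torch.rand(1, 2048, 3, device="cuda")            # (B, N, 3) dummy point cloud
idx = pointnet2_utils.furthest_point_sample(xyz, 512)  # indices of 512 sampled points
print(idx.shape)  # expected: torch.Size([1, 512])
```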

Second Solution (reference here):
(I tried this one first, but it did not work for me.)

  1. go to the file /Pointnet2_PyTorch/pointnet2_ops_lib/setup.py
  2. change the line os.environ["TORCH_CUDA_ARCH_LIST"] = "3.7+PTX;5.0;6.0;6.1;6.2;7.0;7.5" to os.environ["TORCH_CUDA_ARCH_LIST"] = "5.0;6.0;6.1;6.2;7.0;7.5;8.0;8.6;8.7;8.9;9.0", or just add your specific CUDA arch code (see this list); in my case I use an A100, so it's 8.0 (see the sketch after this list)
  3. run python3 setup.py install again in the pointnet2_ops_lib folder
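
For reference, the relevant part of setup.py looks roughly like this after the change (a sketch of the excerpt around the TORCH_CUDA_ARCH_LIST line; the rest of the file stays as-is):

```python
# pointnet2_ops_lib/setup.py (excerpt)
import os

# Original value only covers architectures up to sm_75 (Turing):
# os.environ["TORCH_CUDA_ARCH_LIST"] = "3.7+PTX;5.0;6.0;6.1;6.2;7.0;7.5"

# Extended value that also covers Ampere/Ada/Hopper GPUs; alternatively, list
# only your own GPU's compute capability (e.g. "8.6" for an RTX 3060,
# "8.0" for an A100).
os.environ["TORCH_CUDA_ARCH_LIST"] = "5.0;6.0;6.1;6.2;7.0;7.5;8.0;8.6;8.7;8.9;9.0"
```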

The problem comes from the pointnet2_ops library, as shown in your output here:

void furthest_point_sampling_kernel_wrapper(int, int, int, const float*, float*, int*) at L:228 in /home/mi/Pointnet2_PyTorch/pointnet2_ops_lib/pointnet2_ops/_ext-src/src/sampling_gpu.cu

The Pointnet2_PyTorch library has not been maintained since July 31, 2021; the contributor mentioned it here.

@Yiju1213

Hello friend, I also encountered this issue while creating a Dockerfile containing AdaPoinTr. In the end, I found the problem is exactly what @Nineyoyoyo described in the "Second Solution": TORCH_CUDA_ARCH_LIST tells the compiler which NVIDIA GPU architectures the CUDA kernel image is built for.

Pointnet2_PyTorch has not been maintained since July 31, 2021, and in Pointnet2_PyTorch/pointnet2_ops_lib/setup.py TORCH_CUDA_ARCH_LIST is set to "3.7+PTX;5.0;6.0;6.1;6.2;7.0;7.5", which does not include your RTX 3060 Laptop GPU's compute capability 8.6. So even though the kernel image builds, you cannot run it on your 3060!
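
If you are not sure which compute capability your GPU has, you can query it with PyTorch before rebuilding (a minimal sketch; run it in the same environment you use to build pointnet2_ops):

```python
# Print the compute capability of the current GPU; the matching entry
# (e.g. "8.6" for an RTX 3060) must appear in TORCH_CUDA_ARCH_LIST when
# building pointnet2_ops.
import torch

major, minor = torch.cuda.get_device_capability(0)
print(f"GPU: {torch.cuda.get_device_name(0)}")
print(f"Compute capability: {major}.{minor}")
print(f"CUDA version used by PyTorch: {torch.version.cuda}")
```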

So the solution is what @Nineyoyoyo said: change the value to "5.0;6.0;6.1;6.2;7.0;7.5;8.0;8.6;8.7;8.9;9.0", or even just "8.0;8.6;8.7;8.9;9.0".

My platform is an RTX 4070 Ti Super, and it has been running fine since this change.
