
Does it support CUDA 11.1? #182

Open
MachineVision123 opened this issue Dec 3, 2020 · 2 comments

Comments

@MachineVision123

After running from warp_ctc import CTCLoss, I get the following error:
ImportError: ....... warp-ctc-pytorch_bindings/pytorch_binding/warpctc_pytorch/_warp_ctc.cpython-36m-x86_64-linux-gnu.so: undefined symbol: _ZN3c1011CPUTensorIdEv
This used to work for me. nvidia-smi reports CUDA version 11.1. Is CUDA 11.1 not supported? Any help would be greatly appreciated!
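
(The missing symbol _ZN3c1011CPUTensorIdEv demangles to c10::CPUTensorId(), a symbol that existed in older PyTorch releases, so an ImportError like this usually means the _warp_ctc extension was compiled against a different PyTorch version than the one currently installed, rather than being a CUDA 11.1 problem as such; the binding generally needs to be rebuilt against the current install. A minimal sketch for checking what the current environment actually provides before rebuilding:)

# Minimal sanity check (sketch): confirm which PyTorch/CUDA the current
# environment provides; the extension must be rebuilt against these.
import torch

print(torch.__version__)         # PyTorch version the binding must be built against
print(torch.version.cuda)        # CUDA version PyTorch itself was compiled with
print(torch.cuda.is_available()) # whether a GPU is visible at runtime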

@Rhythmblue

Rhythmblue commented Dec 16, 2020

I added some lines to CMakeLists.txt for RTX 3090 + CUDA 11.1 + PyTorch 1.7.1.

warp-ctc/CMakeLists.txt

Lines 50 to 57 in e2609d8 (original block):

IF (CUDA_VERSION GREATER 8.9)
    set(CUDA_NVCC_FLAGS "${CUDA_NVCC_FLAGS} -gencode arch=compute_70,code=sm_70")
ENDIF()

if (NOT APPLE)
    set(CUDA_NVCC_FLAGS "${CUDA_NVCC_FLAGS} --std=c++14")
    set(CUDA_NVCC_FLAGS "${CUDA_NVCC_FLAGS}")
ENDIF()

After the change, the block becomes:

IF (CUDA_VERSION GREATER 8.9)   
    set(CUDA_NVCC_FLAGS "${CUDA_NVCC_FLAGS} -gencode arch=compute_70,code=sm_70")
ENDIF()

IF (CUDA_VERSION GREATER 9.9)
    set(CUDA_NVCC_FLAGS "${CUDA_NVCC_FLAGS} -gencode arch=compute_75,code=sm_75")
ENDIF() 

IF (CUDA_VERSION GREATER 10.9) 
    set(CUDA_NVCC_FLAGS "${CUDA_NVCC_FLAGS} -gencode arch=compute_80,code=sm_80")
ENDIF()                                                                          

if (NOT APPLE)    
    set(CUDA_NVCC_FLAGS "${CUDA_NVCC_FLAGS} --std=c++14") 
    set(CUDA_NVCC_FLAGS "${CUDA_NVCC_FLAGS}")
ENDIF() 
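
The added -gencode entries just extend the original compute_70 line to newer GPU generations: compute_75/sm_75 covers Turing cards and compute_80/sm_80 covers Ampere cards such as the RTX 3090, which is why the extra block is needed for CUDA 11.1 on that GPU. A small sketch (not part of the build) for confirming which compute capability the local card reports, so the matching -gencode flag can be verified:

# Sketch: print the compute capability of GPU 0, e.g. (8, 0) -> use sm_80.
import torch

major, minor = torch.cuda.get_device_capability(0)
print(torch.cuda.get_device_name(0), "compute capability %d.%d" % (major, minor))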

It works with the following test code:

import torch
import warpctc_pytorch as warp_ctc

def test_simple():
    # probs: (seq_len, batch, num_classes) activations; label 0 is the blank
    probs = torch.FloatTensor([[[0.1, 0.6, 0.1, 0.1, 0.1], [0.1, 0.1, 0.6, 0.1, 0.1]]]).transpose(0, 1).contiguous()
    grads = torch.zeros(probs.size())
    labels = torch.IntTensor([1, 2])
    label_sizes = torch.IntTensor([2])
    sizes = torch.IntTensor(probs.size(1)).fill_(probs.size(0))
    minibatch_size = probs.size(1)
    costs = torch.zeros(minibatch_size)
    # CPU path
    warp_ctc.cpu_ctc(probs,
                     grads,
                     labels,
                     label_sizes,
                     sizes,
                     minibatch_size,
                     costs,
                     0)
    print('CPU_cost: %f' % costs.sum())
    # GPU path: activations and gradients live on the GPU, label/size tensors stay on the CPU
    probs = probs.clone().cuda()
    grads = torch.zeros(probs.size()).cuda()
    costs = torch.zeros(minibatch_size)
    warp_ctc.gpu_ctc(probs,
                     grads,
                     labels,
                     label_sizes,
                     sizes,
                     minibatch_size,
                     costs,
                     0)
    print('GPU_cost: %f' % costs.sum())
    print(grads.view(grads.size(0) * grads.size(1), grads.size(2)))

if __name__ == '__main__':
    test_simple()

Output:

CPU_cost: 2.462858        
GPU_cost: 2.462858    
tensor([[ 0.1770, -0.7081,  0.1770,  0.1770,  0.1770],    
        [ 0.1770,  0.1770, -0.7081,  0.1770,  0.1770]], device='cuda:0')

@Luna-yu

Luna-yu commented Mar 22, 2021

(Quotes @Rhythmblue's comment above.)

(yu_tf_py) root@gao-PowerEdge-T640:/yu_ex/warp-ctc3333/pytorch_binding# python setup.py install
Traceback (most recent call last):
  File "setup.py", line 6, in <module>
    from torch.utils.cpp_extension import BuildExtension, CppExtension
ModuleNotFoundError: No module named 'torch.utils'
(yu_tf_py) root@gao-PowerEdge-T640:/yu_ex/warp-ctc3333/pytorch_binding# pip install torch.utils
Requirement already satisfied: torch.utils in /root/ana3/envs/yu_tf_py/lib/python3.6/site-packages (0.1.2)
Requirement already satisfied: torch in /root/ana3/envs/yu_tf_py/lib/python3.6/site-packages (from torch.utils) (1.7.1+cu110)
Requirement already satisfied: numpy in /root/ana3/envs/yu_tf_py/lib/python3.6/site-packages (from torch->torch.utils) (1.19.2)
Requirement already satisfied: dataclasses in /root/ana3/envs/yu_tf_py/lib/python3.6/site-packages (from torch->torch.utils) (0.8)
Requirement already satisfied: typing-extensions in /root/ana3/envs/yu_tf_py/lib/python3.6/site-packages (from torch->torch.utils) (3.7.4.3)
I ran into this problem when running the install (CUDA 11.1, torch 1.7.1). Has anyone encountered it?
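
(Note: the torch.utils package that pip reports here appears to be a separate PyPI project, not part of PyTorch itself; torch.utils.cpp_extension ships inside the torch wheel. The ModuleNotFoundError therefore usually suggests that setup.py is being run by a Python interpreter that does not see the PyTorch 1.7.1 install. A small sanity check, assuming it is run with the same interpreter that runs setup.py:)

# Sketch: confirm this interpreter sees the real PyTorch and its build helpers.
import torch
from torch.utils.cpp_extension import BuildExtension, CppExtension  # provided by torch itself

print(torch.__version__)  # expected 1.7.1+cu110 in this environment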
