VALI/src/TC/src/CudaUtils.cpp:32 CUDA error: CUDA_ERROR_NOT_INITIALIZED initialization error #75

tianyan01 · 2024-07-25T08:07:11Z

Hi,
I am using VALI in a torch training task.
Here is my test code (the video path needs to be changed):

import PyNvCodec as nvc
import numpy as np
import torch
import os
import random
from torch.utils.data.distributed import DistributedSampler
from torch.utils.data import DataLoader

def pynv_read_video(pyDec, frame_indice, gpu_id):
    # GPU-accelerated converter
    pyCvt = nvc.PySurfaceConverter(
        pyDec.Format(),
        nvc.PixelFormat.RGB,
        gpu_id=gpu_id
    )

    # Colorspace conversion context
    cc_ctx = nvc.ColorspaceConversionContext(
        pyDec.ColorSpace(),
        pyDec.ColorRange()
    )

    # GPU-accelerated Surface downloader
    pyDwn = nvc.PySurfaceDownloader(
        gpu_id=gpu_id
    )

    # Raw decoded Surface
    surf_src = nvc.Surface.Make(
        format=pyDec.Format(),
        width=pyDec.Width(),
        height=pyDec.Height(),
        gpu_id=gpu_id
    )

    # Raw Surface, converted to RGB
    surf_dst = nvc.Surface.Make(
        format=nvc.PixelFormat.RGB,
        width=pyDec.Width(),
        height=pyDec.Height(),
        gpu_id=gpu_id
    )

    # Numpy array which contains decoded RGB Surface
    frame = np.ndarray(
        dtype=np.uint8,
        shape=surf_dst.HostSize())

    video = []
    for idx in frame_indice:
        seek_ctx = nvc.SeekContext(seek_frame=idx)
        success, details = pyDec.DecodeSingleSurface(surf_src, seek_ctx=seek_ctx)

        # Convert to RGB
        success, details = pyCvt.Run(surf_src, surf_dst, cc_ctx)

        # Copy pixels to numpy ndarray
        pyDwn.Run(surf_dst, frame)

        res_frame = np.reshape(
            frame,
            (pyDec.Height(), pyDec.Width(), 3))
        
        t = torch.Tensor(res_frame)
        video.append(t)
    video = torch.stack(video)
    return video


class MyDataset(torch.utils.data.Dataset):
    def __init__(self, path):
        """ init """
        self.rank = int(os.environ["LOCAL_RANK"])
        self.samples_num = 10
        self.path = path

    def read_video(self, num_frames):
        """ read video """
        # GPU-accelerated decoder
        pyDec = nvc.PyDecoder(
            self.path,
            {},
            self.rank,
        )
        total_frames = pyDec.NumFrames()
        # video_fps = pyDec.AvgFramerate()

        frame_indice = np.linspace(0, total_frames - 1, num_frames, dtype=int)
        video = pynv_read_video(pyDec, frame_indice, self.rank)
        return video

    def __getitem__(self, index):
        """ get item """
        try:
            video = self.read_video(10)
            return video
        except Exception as e:
            print(f"Error {e}")

    def __len__(self):
        """len"""
        return self.samples_num

def seed_worker(worker_id):
    """ seed worker """
    worker_seed = 1024
    np.random.seed(worker_seed)
    torch.manual_seed(worker_seed)
    random.seed(worker_seed)

url = "/path/to/test.mp4"
my_dataset = MyDataset(url)
sampler = DistributedSampler(
    my_dataset, 
    num_replicas=8, 
    rank=my_dataset.rank, 
    shuffle=False
)
my_dataloader = DataLoader(
    my_dataset,
    batch_size=1,
    sampler=sampler,
    worker_init_fn=seed_worker,
    drop_last=False,
    pin_memory=True,
    num_workers=8,
)
dataloader_iter = iter(my_dataloader)
video = next(dataloader_iter)
print(my_dataset.rank, video.shape)

It gives me this error: VALI/src/TC/src/CudaUtils.cpp:32 CUDA error: CUDA_ERROR_NOT_INITIALIZED initialization error.
How can I fix it? Thanks!

RomanArzumanyan · 2024-07-25T08:21:06Z

Hi @tianyan01

Can you run the samples or unit tests?

tianyan01 · 2024-07-25T08:48:55Z

I can run the unit tests. But when I wrap the decoder in torch's Dataset and DataLoader, it fails.

RomanArzumanyan · 2024-07-25T08:56:10Z

@tianyan01

Most probably the problem is in the script, not in VALI itself. E.g., torch may be doing something to the CUDA runtime.
The only thing I can recommend is to simplify your app step by step until it runs, and that way isolate the culprit.

In order to find and fix a VALI bug I need an MVP, not the whole user script.
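
The step-by-step isolation suggested above can be sketched as a tiny harness. This is only an illustration, not part of VALI or torch: `first_failing_step` is a hypothetical helper, and the steps passed to it would be user-written callables that each exercise one more layer of the script.

```python
from typing import Callable, List, Optional, Tuple

Step = Tuple[str, Callable[[], object]]


def first_failing_step(steps: List[Step]) -> Optional[Tuple[str, Exception]]:
    """Run the steps in order and report the first one that raises.

    Returns (step_name, exception) for the first failure,
    or None if every step completes.
    """
    for name, step in steps:
        try:
            step()
        except Exception as exc:
            return name, exc
    return None


if __name__ == "__main__":
    # Stand-in steps; real ones would call into the actual script.
    def simulated_failure() -> None:
        raise RuntimeError("simulated initialization error")

    result = first_failing_step([
        ("step 1", lambda: None),
        ("step 2", simulated_failure),
        ("step 3", lambda: None),
    ])
    print(result)
```

For the script in this issue, plausible steps (an assumption, not something confirmed in the thread) would be: create the decoder and read one frame in the main process, then iterate the DataLoader with `num_workers=0`, then with `num_workers=8`. The first step that fails is a candidate for the minimal reproducer.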

P.S.
Please don't use the PyDecoder.Seek method to shuffle the frames.
Seek is a costly operation. Just decode Surfaces one by one and shuffle your video list container.
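
A minimal sketch of the "decode one by one, then shuffle the container" pattern follows. It does not call VALI: `decode_next` is a hypothetical stand-in for whatever produces the next decoded frame (already copied out of any reused buffer), and both function names are illustrative only.

```python
import random
from typing import Callable, Iterable, List, TypeVar

Frame = TypeVar("Frame")


def decode_selected_in_order(
    decode_next: Callable[[], Frame],
    total_frames: int,
    wanted: Iterable[int],
) -> List[Frame]:
    """Decode sequentially from frame 0 and keep only the wanted indices."""
    wanted_set = set(wanted)
    if not wanted_set:
        return []
    last = max(wanted_set)
    if min(wanted_set) < 0 or last >= total_frames:
        raise IndexError("frame index out of range")

    kept: List[Frame] = []
    for idx in range(last + 1):
        frame = decode_next()  # one sequential decode per call, no seek
        if idx in wanted_set:
            kept.append(frame)
    return kept


def shuffle_frames(frames: List[Frame], seed: int) -> List[Frame]:
    """Return a shuffled copy of the list; the input list is left untouched."""
    shuffled = list(frames)
    random.Random(seed).shuffle(shuffled)
    return shuffled


if __name__ == "__main__":
    # Stand-in decoder that just yields increasing frame numbers.
    counter = iter(range(1000))
    frames = decode_selected_in_order(lambda: next(counter), 100, [0, 33, 66, 99])
    print(frames, shuffle_frames(frames, seed=1024))
```

The trade-off is that frames between the wanted indices are still decoded and then discarded, so this pays off when sequential decoding is cheaper than repeated seeks, as the comment above indicates.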

tianyan01 · 2024-07-25T09:06:42Z

@RomanArzumanyan
Thanks!
