VALI/src/TC/src/CudaUtils.cpp:32 CUDA error: CUDA_ERROR_NOT_INITIALIZED initialization error #75

Open
tianyan01 opened this issue Jul 25, 2024 · 4 comments

@tianyan01

Hi,
I'm using VALI in a torch training task.
Here's my test code (you need to change the video path):

import PyNvCodec as nvc
import numpy as np
import torch
import os
import random
from torch.utils.data.distributed import DistributedSampler
from torch.utils.data import DataLoader

def pynv_read_video(pyDec, frame_indice, gpu_id):
    # GPU-accelerated converter
    pyCvt = nvc.PySurfaceConverter(
        pyDec.Format(),
        nvc.PixelFormat.RGB,
        gpu_id=gpu_id
    )

    # Colorspace conversion context
    cc_ctx = nvc.ColorspaceConversionContext(
        pyDec.ColorSpace(),
        pyDec.ColorRange()
    )

    # GPU-accelerated Surface downloader
    pyDwn = nvc.PySurfaceDownloader(
        gpu_id=gpu_id
    )

    # Raw decoded Surface
    surf_src = nvc.Surface.Make(
        format=pyDec.Format(),
        width=pyDec.Width(),
        height=pyDec.Height(),
        gpu_id=gpu_id
    )

    # Raw Surface, converted to RGB
    surf_dst = nvc.Surface.Make(
        format=nvc.PixelFormat.RGB,
        width=pyDec.Width(),
        height=pyDec.Height(),
        gpu_id=gpu_id
    )

    # Numpy array which contains decoded RGB Surface
    frame = np.ndarray(
        dtype=np.uint8,
        shape=surf_dst.HostSize())

    video = []
    for idx in frame_indice:
        seek_ctx = nvc.SeekContext(seek_frame=idx)
        success, details = pyDec.DecodeSingleSurface(surf_src, seek_ctx=seek_ctx)

        # Convert to RGB
        success, details = pyCvt.Run(surf_src, surf_dst, cc_ctx)

        # Copy pixels to numpy ndarray
        pyDwn.Run(surf_dst, frame)

        res_frame = np.reshape(
            frame,
            (pyDec.Height(), pyDec.Width(), 3))
        
        t = torch.Tensor(res_frame)
        video.append(t)
    video = torch.stack(video)
    return video


class MyDataset(torch.utils.data.Dataset):
    def __init__(self, path):
        """ init """
        self.rank = int(os.environ["LOCAL_RANK"])
        self.samples_num = 10
        self.path = path

    def read_video(self, num_frames):
        """ read video """
        # GPU-accelerated decoder
        pyDec = nvc.PyDecoder(
            self.path,
            {},
            self.rank,
        )
        total_frames = pyDec.NumFrames()
        # video_fps = pyDec.AvgFramerate()

        frame_indice = np.linspace(0, total_frames - 1, num_frames, dtype=int)
        video = pynv_read_video(pyDec, frame_indice, self.rank)
        return video

    def __getitem__(self, index):
        """ get item """
        try:
            video = self.read_video(10)
            return video
        except Exception as e:
            print(f"Error {e}")

    def __len__(self):
        """len"""
        return self.samples_num

def seed_worker(worker_id):
    """ seed worker """
    worker_seed = 1024
    np.random.seed(worker_seed)
    torch.manual_seed(worker_seed)
    random.seed(worker_seed)

url = "/path/to/test.mp4"
my_dataset = MyDataset(url)
sampler = DistributedSampler(
    my_dataset, 
    num_replicas=8, 
    rank=my_dataset.rank, 
    shuffle=False
)
my_dataloader = DataLoader(
    my_dataset,
    batch_size=1,
    sampler=sampler,
    worker_init_fn=seed_worker,
    drop_last=False,
    pin_memory=True,
    num_workers=8,
)
dataloader_iter = iter(my_dataloader)
video = next(dataloader_iter)
print(my_dataset.rank, video.shape)

It gives me this error: VALI/src/TC/src/CudaUtils.cpp:32 CUDA error: CUDA_ERROR_NOT_INITIALIZED initialization error.
How can I fix it? Thanks!

@RomanArzumanyan
Owner

Hi @tianyan01

Can you run the samples or unit tests?

@tianyan01
Author

> Hi @tianyan01
>
> Can you run the samples or unit tests?

I can run the unit tests. But when I wrap the code in torch's Dataset and DataLoader, it fails to run.

@RomanArzumanyan
Owner

@tianyan01

Most probably the problem is in the script, not in VALI itself. E.g. torch may be doing something to the CUDA runtime.
The only thing I can recommend is to simplify your app step by step until it runs, so you can isolate the culprit.

In order to find and fix a VALI bug I need an MVP, not the whole user script.
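
One thing that could be checked while narrowing this down (an assumption, not something confirmed in this thread): PyTorch's documentation notes that the CUDA runtime does not support the `fork` start method in subprocesses, and `DataLoader` workers are forked by default on Linux. The sketch below assembles `DataLoader` keyword arguments that request the `spawn` start method through the `multiprocessing_context` parameter. `loader_kwargs` is a hypothetical helper name, and whether this removes the error in the script above is untested.

```python
import multiprocessing as mp


def loader_kwargs(num_workers, start_method="spawn"):
    """Build DataLoader keyword arguments that avoid forked workers.

    Hypothetical helper: it only assembles a dict, so it can be
    inspected without torch or a GPU.
    """
    if start_method not in mp.get_all_start_methods():
        raise ValueError(f"start method {start_method!r} is not available")
    kwargs = {"batch_size": 1, "num_workers": num_workers, "drop_last": False}
    if num_workers > 0:
        # DataLoader only accepts multiprocessing_context when it
        # actually uses worker processes (num_workers > 0).
        kwargs["multiprocessing_context"] = start_method
    return kwargs


# Intended use (assumes torch and the MyDataset class from the report):
#   my_dataloader = DataLoader(my_dataset, sampler=sampler, **loader_kwargs(8))
```

With `spawn`, the script's top-level code would also need to sit under an `if __name__ == "__main__":` guard, and the dataset object must be picklable, since each worker starts from a fresh interpreter.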

P. S.
Please don't use the PyDecoder.Seek method to shuffle the frames.
Seek is a costly operation. Just decode Surfaces one by one and shuffle your video list container.
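
A minimal sketch of that advice, using a stand-in generator instead of the real decoder so it runs without a GPU: walk the frames once in decode order, keep only the wanted indices, then shuffle the resulting list. `pick_frames` and `decoded_frames` are hypothetical names; in the script above the generator would presumably wrap repeated `DecodeSingleSurface` calls made without a `seek_ctx`, which is an assumption about how that loop would be written.

```python
import random


def pick_frames(frames, wanted):
    """Walk `frames` once, in order, and keep the items at the `wanted` indices."""
    wanted = set(wanted)
    if not wanted:
        return []
    last = max(wanted)
    picked = []
    for idx, frame in enumerate(frames):
        if idx in wanted:
            picked.append(frame)
        if idx >= last:
            break  # nothing further is needed, so stop decoding
    return picked


def decoded_frames(total):
    """Stand-in for a sequential decode loop; yields placeholder frames."""
    for i in range(total):
        yield f"frame-{i}"


video = pick_frames(decoded_frames(100), [0, 33, 66, 99])
random.Random(1024).shuffle(video)  # shuffle the container, not the decoder
```

If the real loop reuses one output buffer for every frame, as the script above does with `frame`, each kept frame would need to be copied before the next decode overwrites it.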

@tianyan01
Author

@RomanArzumanyan
Thanks!
