PySurfaceDownloader (minor) issue: too fast! #71

Yves33 · 2024-07-23T06:12:01Z

Hi again!
On my machine (Core i7-12700 / GeForce RTX 3060), I find that running

PyDecoder.DecodeSingleSurface(nv12_surface)
PySurfaceDownloader.Run(nv12_surface, nv12_cpu_buffer)

completes successfully but may produce an altered NV12 buffer: the top of the image is cropped (see the image reconstituted and resized from the buffer, below).

The problem:

only appears with high-resolution videos (here 5760 x 2880)
does not appear when the same code is run in a Jupyter notebook (or at least appears less frequently)
does not appear if pycuda is used to copy to the CPU (requires the GpuMem branch)
does not appear if the frame is converted to RGB on the GPU and then downloaded
can be worked around by introducing a 2 ms delay between the two operations (time.sleep(0.002))
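
The delay workaround from the last item can be sketched as follows. This is only an illustration, not part of VALI: run_with_pause is a hypothetical helper, and the decode and download steps are passed in as plain callables so that no VALI signature or return type is assumed. A fixed sleep only makes the apparent race less likely; it does not guarantee that the decoded frame is ready.

```python
import time
from typing import Any, Callable, Tuple


def run_with_pause(
    decode: Callable[[], Any],
    download: Callable[[], Any],
    delay_s: float = 0.002,
) -> Tuple[Any, Any]:
    """Call decode(), sleep for delay_s seconds, then call download().

    Hypothetical helper: both steps are opaque callables, and their
    results are returned unchanged as (decode_result, download_result).
    """
    decode_result = decode()
    time.sleep(delay_s)  # 2 ms by default, as in the workaround above
    download_result = download()
    return decode_result, download_result


# Possible usage with the calls from the post (objects assumed to exist):
# run_with_pause(
#     lambda: PyDecoder.DecodeSingleSurface(nv12_surface),
#     lambda: PySurfaceDownloader.Run(nv12_surface, nv12_cpu_buffer),
# )
```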

minimal code to reproduce:
https://github.com/Yves33/Vali_luma_chroma_shift/blob/main/Vali_nv12_download.py
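
When inspecting a downloaded buffer, it can help to look at the luma and chroma parts separately. A minimal sketch, assuming the CPU buffer is tightly packed NV12 (a full-resolution Y plane followed by an interleaved UV plane of half that size, with no row padding); split_nv12 is a hypothetical helper and not part of VALI, and the actual buffer layout should be checked against the downloader in use.

```python
from typing import Tuple


def split_nv12(buf, width: int, height: int) -> Tuple[memoryview, memoryview]:
    """Split a tightly packed NV12 buffer into (luma, interleaved chroma).

    Assumption: no row padding, so the Y plane is width * height bytes
    and the interleaved UV plane is width * height // 2 bytes.
    """
    if width % 2 or height % 2:
        raise ValueError("NV12 needs even width and height")
    luma_size = width * height
    chroma_size = luma_size // 2
    view = memoryview(buf).cast("B")  # flat view of bytes, no copy
    if len(view) != luma_size + chroma_size:
        raise ValueError(
            f"expected {luma_size + chroma_size} bytes, got {len(view)}"
        )
    return view[:luma_size], view[luma_size:]
```

For a 5760 x 2880 frame this gives 16,588,800 luma bytes and 8,294,400 chroma bytes, so a top-of-image defect can be checked in each plane independently.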

It seems to me that the download starts before the decoded frame is ready.


RomanArzumanyan · 2024-07-23T07:50:49Z

Hi @Yves33

Thank you for the detailed analysis.
I suspect the actual source of the problem is here:

VALI/src/TC/src/TaskDecodeFrame.cpp

Lines 518 to 534 in 3236822

    static void CopyToSurface(AVFrame& src, Surface& dst) {
      CUDA_MEMCPY2D m = {0};
      m.srcMemoryType = CU_MEMORYTYPE_DEVICE;
      m.dstMemoryType = CU_MEMORYTYPE_DEVICE;
      for (auto i = 0U; src.data[i]; i++) {
        m.srcDevice = (CUdeviceptr)src.data[i];
        m.srcPitch = src.linesize[i];
        m.dstDevice = dst.PixelPtr(i);
        m.dstPitch = dst.Pitch(i);
        m.WidthInBytes = dst.Width(i) * dst.ElemSize();
        m.Height = dst.Height(i);
        CudaCtxPush push_ctx(GetContextByDptr(m.dstDevice));
        ThrowOnCudaError(cuMemcpy2D(&m), __LINE__);
      }
    }

After I switched to the default CUDA stream and started pushing the context created by FFmpeg, the reproduction rate fell, but some decoder unit tests still fail from time to time.

I'll continue investigating this. Maybe making FFmpeg use the same CUDA context as VALI will solve it. Anyway, as soon as I find something I'll get back to you.

RomanArzumanyan · 2024-07-23T11:00:43Z

Hi @Yves33

Please check out the latest version, 3.2.10. It looks like it solves the issue: at least 2 previously unstable hardware decoder unit tests are now passing.

Yves33 · 2024-07-23T17:22:19Z

Version 3.2.10 solves the issue (at least with my test videos...)
