Apparent locking issues when running across multiple GPUs #283
Comments
I've been using Claude Opus for AI Python coding on complex tasks, and I frequently just throw the entire codebase (as one file) in as context and ask the AI to find and fix problems. Are you using the GPU context because it's faster? Are you on Intel? I'm looking to do some processing on 35,000 videos, and currently it's taking 1-5 minutes per video. Claude Opus may have found your locking problem here.
Oh, I've been meaning to do a proper writeup on video decoding for a few months now and just haven't had the time. Quick notes for now on what we learned from processing video at reasonably large scale (millions of files / billions of frames / a few hundred TB of data):
Throwing the whole repo into an LLM is a technique I hadn't thought of, so it's interesting to see what it came back with! Realistically I'm not likely to get the time to dive into the decord codebase and see how accurate the suggestions are, but if it were possible to get the threading and context locking sorted out, that would be a big win for the library.
I've noticed an interesting issue when running on multi-GPU machines: although selecting `gpu(N)` as the decoding context initially works as expected, overall throughput when running multiple processes drops off very rapidly until only one process shows activity on a single GPU, sometimes with occasional very short bursts of processing from the others. This happens even when the processes are totally independent (started separately from different `screen` sessions, operating on entirely different files, and using separate GPUs), which leads me to think a hardware- or system-level locking mechanism is being applied globally rather than per-process, since it occurs even between separate Python instances.

My working theory is that it could be falling through to a global lock of some kind due to setting `decoder_info_.vidLock = nullptr;`, but so far that hasn't brought us closer to a fix. It would be very helpful to hear whether anyone else has (or hasn't!) run into similar issues. Possibly related to #187 and/or #159?
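Not from the issue itself, but a common way to rule out cross-device context sharing when debugging this kind of symptom is to pin each independent process to a single GPU with `CUDA_VISIBLE_DEVICES`, so every worker sees exactly one device as `gpu(0)`. A minimal sketch, in which the `worker.py` script name and its arguments are hypothetical:

```python
import os
import subprocess
import sys

def worker_env(gpu_id):
    """Build an environment in which a child process sees only one GPU.

    With CUDA_VISIBLE_DEVICES restricted to a single device, the
    worker's gpu(0) context maps to that physical GPU, so any
    remaining stall would have to come from a truly global
    (driver- or library-level) lock rather than shared per-device state.
    """
    env = dict(os.environ)
    env["CUDA_VISIBLE_DEVICES"] = str(gpu_id)
    return env

def launch_workers(num_gpus, file_lists):
    """Spawn one isolated decoding process per GPU.

    worker.py is a hypothetical script that would open its files
    with a gpu(0) decoding context.
    """
    procs = []
    for gpu_id in range(num_gpus):
        procs.append(subprocess.Popen(
            [sys.executable, "worker.py", *file_lists[gpu_id]],
            env=worker_env(gpu_id),
        ))
    return procs
```

If throughput still collapses to a single active process even under this isolation, that would point more strongly at a global lock of the kind suspected around `decoder_info_.vidLock`.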