Multi-threaded dataloading with tch-rs #1
Comments
Hello, thanks for your comment! I'm currently working on the single-threaded version; it should be available soon.
Mono-threaded
Hello, thank you for your excellent work. Any idea when we can expect a dataloader with multi-threaded parallelism support? Thank you.
I'm currently benchmarking the mono-threaded version against the PyTorch dataloader; multi-threading is definitely the next step for me. As I'm doing this in my spare time, I can't promise when it will be released, though.
Hello :)
Hello ;) I've also observed a 2x speedup against PyTorch in my benchmarks, which are available here. I'm also confident that multi-threading will improve the speedup further, as Rust doesn't have the limitation of the Python GIL.
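To illustrate the point about the GIL, here is a minimal sketch of loading one batch across several OS threads. It uses only `std::thread::scope` from the standard library, not rayon and not any ai-dataloader API; `load_sample` and `load_batch_parallel` are hypothetical names made up for this example, and the squaring is just a stand-in for real decoding work.

```rust
use std::thread;

/// Hypothetical per-sample work standing in for decoding or preprocessing.
fn load_sample(index: usize) -> u64 {
    (index as u64) * (index as u64)
}

/// Loads one batch by splitting its indices across up to `num_threads` threads.
/// The output order matches the order of `indices`.
fn load_batch_parallel(indices: &[usize], num_threads: usize) -> Vec<u64> {
    let threads = num_threads.max(1);
    // Ceiling division so every index lands in exactly one chunk;
    // `.max(1)` avoids a zero chunk length when `indices` is empty.
    let chunk_len = ((indices.len() + threads - 1) / threads).max(1);

    thread::scope(|scope| {
        // Spawn all workers first so they run concurrently.
        let handles: Vec<_> = indices
            .chunks(chunk_len)
            .map(|chunk| {
                scope.spawn(move || {
                    chunk.iter().map(|&i| load_sample(i)).collect::<Vec<u64>>()
                })
            })
            .collect();

        // Join in spawn order to preserve the sample order.
        handles
            .into_iter()
            .flat_map(|handle| handle.join().expect("worker thread panicked"))
            .collect()
    })
}
```

Calling `load_batch_parallel(&indices, 4)` on a slice of sample indices returns the loaded samples in the same order as the indices. Because the threads are scoped, they can borrow `indices` directly without cloning or reference counting.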
Totally agree. I haven't found much documentation on the subject, other than this tutorial, which seems pretty good.
Another approach could be to take inspiration from the burn parallel dataloader.
Nice, I'll look into that.
Nice find! A 4x speedup could justify adding this solution as an MVP.
Hello, FYI my implementation is slightly different, and I'm not sure your implementation actually uses multiple threads. In fact using … Also, what I'm trying now is to add prefetching. This way each time … IMO there are 3 ways to achieve multithreading.
Hello @AzHicham, thanks for your comments.
Good catch! I think I will keep the install call so the number of threads can be configured, but I will use rayon primitives inside it to make sure the work actually runs in parallel.
I think it's a great idea! Prefetching is definitely something I wanted to add, and any work on this is welcome. Using a fixed-size queue seems fine; I will need to take a closer look at the PyTorch implementation to give better insight.
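For reference, the fixed-size queue idea can be sketched with a bounded channel from the standard library. This is only an illustration under assumptions, not how ai-dataloader or PyTorch implement prefetching; `build_batch` and `prefetch_batches` are hypothetical names, and a batch is just a `Vec<usize>` of sample indices here.

```rust
use std::sync::mpsc::{sync_channel, Receiver};
use std::thread;

/// Hypothetical batch builder standing in for real sample loading and collation.
fn build_batch(batch_index: usize, batch_size: usize) -> Vec<usize> {
    let start = batch_index * batch_size;
    (start..start + batch_size).collect()
}

/// Spawns a producer thread that prepares batches ahead of the consumer.
/// `queue_len` bounds how many finished batches may wait in the channel.
fn prefetch_batches(
    num_batches: usize,
    batch_size: usize,
    queue_len: usize,
) -> Receiver<Vec<usize>> {
    let (sender, receiver) = sync_channel(queue_len);

    thread::spawn(move || {
        for batch_index in 0..num_batches {
            // `send` blocks while the queue is full, which caps memory use.
            // It returns an error once the receiver is dropped, so stop then.
            if sender.send(build_batch(batch_index, batch_size)).is_err() {
                break;
            }
        }
        // `sender` is dropped here, which closes the channel and ends
        // the consumer's iteration.
    });

    receiver
}
```

The consumer can simply iterate over the returned receiver, for example `for batch in prefetch_batches(100, 32, 4) { ... }`, while the producer thread keeps up to `queue_len` batches ready in the background. A bounded channel gives backpressure for free: if the training loop is slower than loading, the producer waits instead of filling memory.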
The multithreaded version should be fixed by b7035e2, thanks again for spotting the issue.
Hello,
Thank you for your awesome work!
As you may know, there is no dataloader feature in tch-rs. It would be really cool to have one through ai-dataloader, and maybe multi-threading support in a later step.
Thank you :)