Status of libdnn as of April 2018 #25
Comments
Thanks a lot for such a quick response!
Nice! Any hint as to when that is likely to happen? E.g., this summer, or this year?
Maybe you should introduce a higher-level device abstraction (an abstract base class?) that does not depend on anything directly. To integrate libdnn with a specific platform/package, one would then implement concrete code mapping the libdnn abstraction onto that platform's/package's abstractions.
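For illustration only, a backend-neutral interface along those lines might look roughly like the sketch below. The class and method names are invented for the example and are not libdnn's actual API.

```cpp
#include <cstddef>
#include <string>

// Hypothetical backend-neutral device interface (illustrative, not libdnn code).
class Device {
 public:
  virtual ~Device() = default;
  // Raw buffer management on the target device.
  virtual void* Allocate(std::size_t bytes) = 0;
  virtual void Free(void* ptr) = 0;
  // Compile and launch a generated kernel source string.
  virtual void CompileKernel(const std::string& name,
                             const std::string& source) = 0;
  virtual void LaunchKernel(const std::string& name,
                            const std::size_t global[3],
                            const std::size_t local[3],
                            void* const* args, std::size_t num_args) = 0;
};

// A concrete integration (e.g. an OpenCL or CUDA backend) would subclass
// Device and map these calls onto its own context/queue/stream objects.
class OpenCLDevice : public Device { /* ... */ };
```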
Thanks for explaining the differences.
I look forward to reading your paper.
OK.
Sounds very interesting. IMHO, caching cries out to be an abstraction, just like a device: depending on how libdnn is used, SQLite may or may not be a good fit. Auto-tuner improvements (and usage examples) would be very nice. I got the current version of the tuner to run, but I'm not sure how to persist the results or how to reuse the tuned parameters across the different convolution kernels generated for the same HW platform. Is it even possible to tune for multiple convolution kernels at once? It doesn't look like libDNN provides much machinery for this, does it?
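To make the caching point concrete, a pluggable cache interface could look something like the following sketch. The type names, fields, and keying scheme are invented for the example and are not taken from libdnn.

```cpp
#include <map>
#include <string>

// Example tuning knobs; a real entry would carry whatever the tuner searches over.
struct TunedParams {
  int tile_m = 0, tile_n = 0, tile_k = 0;
};

// Abstract cache so the storage backend (in-memory, flat file, SQLite, ...)
// is a pluggable choice rather than a hard dependency.
class KernelCache {
 public:
  virtual ~KernelCache() = default;
  // The key would encode device identity + kernel configuration (shapes, strides, ...).
  virtual bool Lookup(const std::string& key, TunedParams* out) const = 0;
  virtual void Store(const std::string& key, const TunedParams& params) = 0;
};

// Simplest backend: process-local, lost on exit.
class InMemoryCache : public KernelCache {
 public:
  bool Lookup(const std::string& key, TunedParams* out) const override {
    auto it = entries_.find(key);
    if (it == entries_.end()) return false;
    *out = it->second;
    return true;
  }
  void Store(const std::string& key, const TunedParams& params) override {
    entries_[key] = params;
  }

 private:
  std::map<std::string, TunedParams> entries_;
};
```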
Yes, exactly, the autotuner has been more of a proof of concept thus far. I'm also working with lower-end devices now (Raspberry Pi VC4CL and Mali T740) to check how the autotuner can be made economical and reliable; again, results will be stored in SQLite. The time frame for pushing a standalone LibDNN update would be ~August. Contributions to LibDNN within Caffe (the interface can already be used from there if you are interested in developing apps with LibDNN support) are welcome, as are suggestions for improvement.
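For readers unfamiliar with the idea, persisting tuned parameters in SQLite could look roughly like the sketch below (using the stock SQLite C API). The schema, column names, and helper function are invented for illustration and are not libdnn's actual storage format.

```cpp
// Build with: g++ tuner_cache.cc -lsqlite3
#include <sqlite3.h>
#include <string>

// Upsert a tuned-parameter record keyed by a kernel/device configuration string.
bool SaveTuning(sqlite3* db, const std::string& kernel_key,
                const std::string& params_json) {
  const char* create =
      "CREATE TABLE IF NOT EXISTS tunings ("
      "  kernel_key TEXT PRIMARY KEY,"
      "  params     TEXT);";
  if (sqlite3_exec(db, create, nullptr, nullptr, nullptr) != SQLITE_OK)
    return false;

  sqlite3_stmt* stmt = nullptr;
  const char* insert =
      "INSERT OR REPLACE INTO tunings (kernel_key, params) VALUES (?1, ?2);";
  if (sqlite3_prepare_v2(db, insert, -1, &stmt, nullptr) != SQLITE_OK)
    return false;
  sqlite3_bind_text(stmt, 1, kernel_key.c_str(), -1, SQLITE_TRANSIENT);
  sqlite3_bind_text(stmt, 2, params_json.c_str(), -1, SQLITE_TRANSIENT);
  const bool ok = (sqlite3_step(stmt) == SQLITE_DONE);
  sqlite3_finalize(stmt);
  return ok;
}
```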
Your library is pretty cool, but it looks like it has not been updated in a long time.
At the same time, the version of libdnn in your Caffe fork seems to be more actively maintained and has even gained some new features, such as BLAS routine generators.
Could you provide some insight about your plans regarding the standalone libdnn or libdnn in general?
Specifically, it would be nice if you could answer some of the following questions:
Do you plan to update the standalone libdnn, e.g. from the version in your Caffe fork?
What is the status of the BLAS support in the Caffe version of libdnn? How does it compare to something like clBLAS, CLBlast, or their CUDA counterparts?
Could you provide a brief description of the algorithms you use to produce optimized fused convolution (and other) kernels, and how/why they are better/faster than, e.g., im2col-based approaches or other well-known convolution implementations, in terms of performance or memory consumption? (A plain im2col reference sketch is included after this list for context.) The documentation is pretty sparse at the moment. If the work is based on specific papers or well-known approaches, references would be appreciated.
How does libdnn compare, in terms of convolution performance, to current versions of cuDNN and other well-known implementations? In the past you reported it was very fast, often faster than competitors. Is that still the case, or have recent advances made other implementations faster?
Do you plan to add any new interesting features or improvements? If so, could you describe them?
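For context on the im2col comparison mentioned above, here is a plain reference im2col lowering (not libdnn code): each output pixel copies a K×K patch per input channel into a matrix, so the intermediate buffer grows by roughly a factor of K*K relative to the input, which is exactly the memory overhead that fused/implicit-GEMM convolution kernels are meant to avoid.

```cpp
#include <cstddef>
#include <vector>

// Input:  data laid out as [C][H][W]; valid convolution, stride 1, K x K kernel.
// Output: cols laid out as [C*K*K][out_h*out_w]; the convolution then becomes
//         a single GEMM with the (num_filters x C*K*K) weight matrix.
std::vector<float> Im2Col(const std::vector<float>& data,
                          int C, int H, int W, int K) {
  const int out_h = H - K + 1;
  const int out_w = W - K + 1;
  std::vector<float> cols(static_cast<std::size_t>(C) * K * K * out_h * out_w);
  for (int c = 0; c < C; ++c)
    for (int kh = 0; kh < K; ++kh)
      for (int kw = 0; kw < K; ++kw) {
        const int row = (c * K + kh) * K + kw;  // one row per (channel, kh, kw)
        for (int y = 0; y < out_h; ++y)
          for (int x = 0; x < out_w; ++x)
            cols[static_cast<std::size_t>(row) * out_h * out_w +
                 static_cast<std::size_t>(y) * out_w + x] =
                data[static_cast<std::size_t>(c) * H * W +
                     static_cast<std::size_t>(y + kh) * W + (x + kw)];
      }
  return cols;
}
```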
Thanks!