-
Hello Mauro,
First of all, if you are a Linux user you may be lucky in a few months: ROCm promises to finally support RDNA(2) by the end of 2021. ROCm/ROCm#1180 (comment) That way you'll be able to use it for ML. Also, as a stop-gap, I suggest checking Caffe-OpenCL: although it isn't as efficient in terms of memory management as Keras/PlaidML, it has quite good performance - not as fast as cuDNN/MIOpen solutions, but it works (I recommend building it with clBLAS as the BLAS library). Of course, if you are on Windows the options are much more limited - as you mentioned, PlaidML (which has quite poor performance). Regarding dlprimitives - it is a very young project, but what exists seems to be working.
I'd be glad! First of all, start using it and try a few things (they are under examples) - try to train some nets. Second, of course, there are code contributions; there are several areas, and each requires slightly different knowledge. In general, this project needs one of: knowledge of GPU programming, knowledge of C++, or Python (of course, more is better). In terms of things to do, I have a lot:
I agree the documentation is lacking. Considering the project has existed for only a few months, and only recently I proved to myself that it actually works well, I think it isn't in that bad a shape - but you do need to dig in to get it.
Yes. I want to implement an OpenCL backend for PyTorch, since that one seems to have a much better code base and is very fast, but it is a very complex project, as I need to learn lots of PT internals. I did start working on adding DLPrimitives support to Caffe-OpenCL, since it is very simple and its code is really, really readable (and I'm familiar with it). I do want to extend PlaidML with dlprimitives plaidml/plaidml#1857, but this is probably my last priority unless I hit big problems with PT. In any case it can be a good option too, as it may allow having a framework with relatively good performance in a short time, but I don't see a huge community there that is even willing to answer my questions. PlaidML is working on being integrated into TF as a backend (since multi-backend Keras is RIP). If they succeed, it would be much easier to replace a few performance-critical ops with dlprimitives, at least for the channel-first format I support. Regards, and remember the 1st thing that helps - start using it and report issues! (I have written a lot here - probably need to add it to the blog post.)
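On the "channel-first format" mentioned above: a minimal illustration (assuming NumPy, which is not part of dlprimitives itself) of the difference between TensorFlow's default channels-last (NHWC) layout and the channels-first (NCHW) layout:

```python
import numpy as np

# NHWC: batch, height, width, channels ("channels-last", TF's default)
x_nhwc = np.arange(8 * 32 * 32 * 3, dtype=np.float32).reshape(8, 32, 32, 3)

# NCHW: batch, channels, height, width ("channels-first", the layout
# dlprimitives supports per the comment above)
x_nchw = np.transpose(x_nhwc, (0, 3, 1, 2))

print(x_nchw.shape)  # (8, 3, 32, 32)
```

The transpose only permutes axes; every element keeps its value, so `x_nchw[n, c, h, w]` equals `x_nhwc[n, h, w, c]`.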
-
Appreciate your effort, man - hope other people will notice this repo!
-
Hello @artyom-beilis, my name is Mauro, an AI student from Italy.
First things first, I love the idea behind this project. I am an (un)lucky owner of a 6600 XT, a powerful card yet unusable for ML tasks due to the lack of compatibility in the current frameworks. (I tried PlaidML and DirectML, but c'mon...)
I have thoroughly read your blog post; the motivations are strong, but as you might imagine, building a community is one of the most important things in the open-source world.
What if I wanted to contribute to the project? You have shown all the benchmarks and stuff (and look, the results seem great), but:

- Do you plan to include some documentation of the code and some build-related tutorials?
- Are you planning to make dlprimitives a TensorFlow/PyTorch backend?
Thanks for your attention,
Mauro.