We may start from a single-threaded implementation. However, for the near future:
How many threads will the fully connected kernel for block quantization use?
I don't clearly remember how this works in other kernels (e.g. ruy, eigen, ...).
As far as I remember, the number of threads is not controlled by the core; each kernel uses threads as it likes.
🌳 Config.lst
CONFIG(RUY_THREADS , int , "-1")
CONFIG(XNNPACK_THREADS , int , "-1")
Also, I don't remember how these settings relate to the previous environment variable THREAD from about 5 years ago.
What
Port the ggml computation library's Q4_0 FullyConnected (mulmat) into compute/cker or a better place.
Why
Prepare to support block quantized FullyConnected layer in the CPU backend.
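For reference, a minimal single-threaded sketch of what a Q4_0 FullyConnected kernel could look like. It assumes the usual ggml Q4_0 layout: blocks of 32 weights sharing one scale, two 4-bit quants per byte, low nibbles holding the first 16 elements and high nibbles the last 16, each decoded as `(q - 8) * d`. The names `BlockQ4_0`, `kBlockSize` and `FullyConnectedQ4_0` are hypothetical and are not the actual ggml or cker API. The scale is kept as `float` here for simplicity (ggml stores it as fp16), and activations stay in float, whereas ggml's own mulmat path quantizes activations as well.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical simplified Q4_0 block: 32 weights share one scale.
constexpr int kBlockSize = 32;

struct BlockQ4_0
{
  float d;                    // per-block scale (fp16 in ggml, float in this sketch)
  uint8_t qs[kBlockSize / 2]; // 4-bit quants, two per byte
};

// Single-threaded FullyConnected: output[o] = bias[o] + sum_i W[o][i] * input[i].
// Weights are row-major, each row stored as in_size / kBlockSize blocks.
// bias may be nullptr.
void FullyConnectedQ4_0(const std::vector<BlockQ4_0> &weights, const float *input,
                        const float *bias, float *output, int out_size, int in_size)
{
  assert(in_size % kBlockSize == 0);
  const int blocks_per_row = in_size / kBlockSize;
  assert(static_cast<int>(weights.size()) == out_size * blocks_per_row);

  for (int o = 0; o < out_size; ++o)
  {
    float acc = bias ? bias[o] : 0.0f;
    for (int b = 0; b < blocks_per_row; ++b)
    {
      const BlockQ4_0 &blk = weights[o * blocks_per_row + b];
      const float *in = input + b * kBlockSize;
      float sum = 0.0f;
      for (int j = 0; j < kBlockSize / 2; ++j)
      {
        // Low nibble -> element j, high nibble -> element j + 16.
        const int lo = (blk.qs[j] & 0x0F) - 8;
        const int hi = (blk.qs[j] >> 4) - 8;
        sum += lo * in[j] + hi * in[j + kBlockSize / 2];
      }
      // Apply the scale once per block instead of once per weight.
      acc += blk.d * sum;
    }
    output[o] = acc;
  }
}
```

Each output row is independent, so a multithreaded version could later split the loop over `o` across threads without changing the inner block loop.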