[compute] Porting Q4_0 and Q8_0 weight FullyConnected compute library #13909

Closed
hseok-oh opened this issue Sep 3, 2024 · 2 comments
Labels
area/onert ONE runtime

Comments

@hseok-oh
Contributor

hseok-oh commented Sep 3, 2024

What

Port the ggml computation library's Q4_0 FullyConnected (mulmat) kernel into compute/cker, or a better place.

Why

Prepare to support block-quantized FullyConnected layers in the CPU backend.
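
For readers unfamiliar with block quantization, the idea behind a Q4_0 weight by Q8_0 activation FullyConnected can be sketched as below. This is an illustrative sketch only, not ggml or onert code. It assumes 32 values per block with one scale per block, keeps the scale as a plain Python float, and stores codes as unpacked integers, whereas a real kernel would pack two 4-bit codes per byte and use a compact scale type. All function names here are hypothetical.

```python
# Illustrative sketch only: not ggml or onert code.
QK = 32  # values per block (assumed block size for both Q4_0 and Q8_0)


def quantize_q4_0(row):
    """Quantize floats into blocks of (scale, 32 four-bit codes in 0..15)."""
    assert len(row) % QK == 0
    blocks = []
    for i in range(0, len(row), QK):
        chunk = row[i:i + QK]
        # Take the value with the largest magnitude, keeping its sign,
        # and map it to code 0 (that is, to -8 after the offset).
        vmax = max(chunk, key=abs)
        d = vmax / -8.0
        inv = 1.0 / d if d != 0.0 else 0.0
        codes = [min(15, int(x * inv + 8.5)) for x in chunk]
        blocks.append((d, codes))
    return blocks


def quantize_q8_0(row):
    """Quantize floats into blocks of (scale, 32 signed codes in -127..127)."""
    assert len(row) % QK == 0
    blocks = []
    for i in range(0, len(row), QK):
        chunk = row[i:i + QK]
        amax = max(abs(x) for x in chunk)
        d = amax / 127.0
        inv = 1.0 / d if d != 0.0 else 0.0
        blocks.append((d, [round(x * inv) for x in chunk]))
    return blocks


def dot_q4_0_q8_0(w_blocks, a_blocks):
    """Dot product of one Q4_0 weight row with one Q8_0 activation vector."""
    total = 0.0
    for (dw, qw), (da, qa) in zip(w_blocks, a_blocks):
        # Integer accumulation inside the block, one float multiply per block.
        isum = sum((w - 8) * a for w, a in zip(qw, qa))
        total += dw * da * isum
    return total


def fully_connected(weight_rows_q4, activation, bias=None):
    """Single-threaded FullyConnected: one output per quantized weight row."""
    act_q8 = quantize_q8_0(activation)
    out = [dot_q4_0_q8_0(row, act_q8) for row in weight_rows_q4]
    if bias is not None:
        out = [o + b for o, b in zip(out, bias)]
    return out
```

The weights would be quantized once ahead of time, while the activation is quantized on each call, so the inner loop works on small integers and touches the float scales only once per block.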

@hseok-oh hseok-oh converted this from a draft issue Sep 3, 2024
@hseok-oh hseok-oh added this to the ONERT LLM Milestone 1 milestone Sep 3, 2024
@hseok-oh hseok-oh added the area/onert ONE runtime label Sep 3, 2024
@glistening
Contributor

glistening commented Sep 4, 2024

We may start from a single-threaded implementation. However, for the near future:

How many threads will the fully connected kernel for block quantization use?
I don't remember clearly how it works in other kernels (e.g. ruy, eigen, ...).
As I recall, the number of threads is not controlled by the core; each kernel uses threads as it likes.

🌳 Config.lst

CONFIG(RUY_THREADS             , int          , "-1")
CONFIG(XNNPACK_THREADS         , int          , "-1")

Also, I don't remember how these relate to the previous environment variable THREAD from about 5 years ago.

@hseok-oh hseok-oh moved this from Ready to Start to In Progress in [ONE] onert - LLM support Sep 4, 2024
@hseok-oh
Contributor Author

hseok-oh commented Sep 4, 2024

I think we can merge these two configs into ONERT_THREADS.

Usage example:

cmd += [f"BACKENDS={';'.join(backend_list)}"]
cmd += [f"RUY_THREADS={self.num_threads}"]
cmd += [f"XNNPACK_THREADS={self.num_threads}"]

I'll make a PR to merge the configs.
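
As a rough sketch of what the usage example above might look like once the two settings are merged: the helper below is hypothetical (`build_env_args` and its `use_merged` flag are not from the repository), and `ONERT_THREADS` is only the name proposed in this comment, so whether the runtime accepts it depends on the actual PR.

```python
def build_env_args(backend_list, num_threads, use_merged=True):
    """Build the environment-style arguments for a test command line.

    Hypothetical helper: ONERT_THREADS is the merged name proposed in
    this thread, not a confirmed runtime option.
    """
    cmd = [f"BACKENDS={';'.join(backend_list)}"]
    if use_merged:
        # One setting intended to cover every kernel library.
        cmd += [f"ONERT_THREADS={num_threads}"]
    else:
        # Current form: one setting per kernel library.
        cmd += [f"RUY_THREADS={num_threads}"]
        cmd += [f"XNNPACK_THREADS={num_threads}"]
    return cmd
```

Keeping the old per-library branch behind a flag would let a script run against runtimes built before and after the merge.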

Projects
Status: Done