Draft: [onert] block quantization #13693

Closed
hseok-oh wants to merge 8 commits

hseok-oh
Contributor

@hseok-oh hseok-oh commented Aug 19, 2024

Support block quantization in the runtime: uint4 and int8 values with a chunk (block) size of 32, each chunk carrying an fp16 delta (scale).

ONE-DCO-1.0-Signed-off-by: Hyeongseok Oh [email protected]
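
The scheme named in the description above (uint4 / int8 codes, chunks of 32, one fp16 delta per chunk) can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: the function names are hypothetical, the symmetric int8 mapping and the offset-by-8 uint4 mapping are assumptions modeled loosely on GGML-style block formats, and inputs whose delta exceeds the fp16 range are not handled.

```python
import struct

BLOCK_SIZE = 32  # chunk size stated in the PR description


def to_fp16(x: float) -> float:
    """Round a Python float to the nearest IEEE 754 half-precision value."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


def quantize_block_int8(values):
    """Quantize one chunk of 32 floats to (fp16 delta, 32 signed 8-bit codes).

    Assumption: symmetric mapping, code = round(value / delta).
    """
    assert len(values) == BLOCK_SIZE
    amax = max(abs(v) for v in values)
    delta = to_fp16(amax / 127.0)
    if delta == 0.0:  # all-zero chunk, or delta underflowed in fp16
        return 0.0, [0] * BLOCK_SIZE
    # Clamp because rounding delta to fp16 can push value / delta past 127.
    codes = [max(-127, min(127, round(v / delta))) for v in values]
    return delta, codes


def dequantize_block_int8(delta, codes):
    return [c * delta for c in codes]


def quantize_block_uint4(values):
    """Quantize one chunk of 32 floats to (fp16 delta, 32 codes in 0..15).

    Assumption: codes are stored with an offset of 8, so code 8 means zero
    and the largest-magnitude element maps to code 0.
    """
    assert len(values) == BLOCK_SIZE
    extreme = max(values, key=abs)  # signed element with largest magnitude
    delta = to_fp16(extreme / -8.0)
    if delta == 0.0:
        return 0.0, [8] * BLOCK_SIZE
    codes = [max(0, min(15, int(v / delta + 8.5))) for v in values]
    return delta, codes


def dequantize_block_uint4(delta, codes):
    return [(c - 8) * delta for c in codes]
```

Under these assumptions a chunk stores 2 bytes of delta plus 32 bytes of int8 codes (34 bytes), or 2 bytes of delta plus 16 bytes of uint4 codes when two codes are packed per byte (18 bytes), compared with 128 bytes for 32 fp32 values. Packing the nibbles is left out of the sketch.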

@hseok-oh hseok-oh changed the title Draft: [onert] blockwise quantization Draft: [onert] block quantization Aug 19, 2024
@hseok-oh hseok-oh changed the title Draft: [onert] block quantization Draft: [onert] chunk quantization Aug 20, 2024
// Represents a specific quantization technique's parameters.
union QuantizationDetails {
CustomQuantization,
CircleBlockQuantization
Contributor

@glistening glistening Aug 27, 2024

Is the Circle prefix necessary to avoid name conflicts with the flatbuffers-generated files? I think GGMLBlockQuantization may be better, as @jinevening suggested offline. The GGML name makes it clear what CircleBlockQuantization means.

@hseok-oh hseok-oh changed the title Draft: [onert] chunk quantization Draft: [onert] block quantization Aug 28, 2024
@hseok-oh
Contributor Author

All merged

@hseok-oh hseok-oh closed this Sep 12, 2024
@hseok-oh hseok-oh deleted the draft/block_quant branch September 12, 2024 04:03
Labels
PR/NO TEST Tell CI to not run test
2 participants