Int4 Support #1104
Hello, thank you for sharing this paper! At this time I don't plan on integrating INT4, which would require using CUTLASS to define custom kernels. We currently use cuBLAS for matrix multiplication.
Would it be reasonable to implement this as a CPU-only optimization? GGML supports this on CPU, but I'm not sure whether that approach makes sense here.
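For context, the CPU-side approach GGML takes is block-wise 4-bit quantization: each block of weights shares one float scale, and two 4-bit values are packed per byte. Below is a minimal sketch in that spirit; the function names, the block size, and the symmetric [-8, 7] mapping are illustrative assumptions, not the CTranslate2 or GGML API.

```python
# Hedged sketch of block-wise 4-bit quantization, in the spirit of
# GGML's Q4_0 format. Names and layout are illustrative assumptions.

def quantize_q4(block):
    """Quantize a block of floats to packed 4-bit ints plus one scale."""
    assert len(block) % 2 == 0
    amax = max(abs(x) for x in block) or 1.0
    scale = amax / 7.0  # map [-amax, amax] onto the signed nibble range
    q = [max(-8, min(7, round(x / scale))) for x in block]
    # pack two signed nibbles (offset by +8 into 0..15) per byte
    packed = bytes(((q[i] + 8) << 4) | (q[i + 1] + 8)
                   for i in range(0, len(q), 2))
    return scale, packed

def dequantize_q4(scale, packed):
    """Reconstruct approximate floats from the packed nibbles."""
    out = []
    for b in packed:
        out.append(((b >> 4) - 8) * scale)
        out.append(((b & 0xF) - 8) * scale)
    return out
```

At inference time the packed weights would be unpacked and multiplied against activations in a custom kernel, which is exactly the part that cuBLAS does not provide and CUTLASS (on GPU) or hand-written SIMD (on CPU) would have to supply.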
Hi, it would be great to have the possibility to integrate int4 quantization, given the very interesting results in terms of performance and inference!
I see that the last few versions of OpenNMT have added support for 4-bit and other quantization methods: https://forum.opennmt.net/t/opennmt-py-v3-3-released-following-3-2-with-plenty-of-new-features/5366 Might any of that be integrated into CTranslate2?
@guillaumekln Yes, 4-bit quantization (on CPU) is a much-needed feature. Any plans to take this up?
Or maybe @ebraraktas can go one step further and implement 2-bit and 3-bit quantization by taking cues from intel/neural-speed#178
Hello Authors,
I apologise for asking a question unrelated to an issue with the repo; however, would you consider supporting a newer paradigm I came across while reading a recent paper?
It looks incredibly promising and is rather well written, I must say, especially considering the performance at such a low precision.
Is anyone on the team able to give this a shot?