attributes: remove obsolete quantization internals #2112
@oneapi-src/onednn-cpu-aarch64 @oneapi-src/onednn-gpu-nvidia Kind ping to check on the changes and confirm you are fine with them.
@theComputeKid @t4c1
SYCL/AMD/Nvidia changes look fine, but they are relatively small. The main question I have is: do we actually want to remove a feature without having a replacement ready?
We actually do have the replacement, see this RFC. The output_scales mechanism was replaced with per-argument scales. In a nutshell, it allows:
Internally, you have two options:
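To illustrate the difference between a single output scale and per-argument scales, here is a minimal self-contained sketch. It does not use oneDNN itself; the helper names (dot_s8, apply_output_scale, apply_arg_scales) are hypothetical, and the convention that source and weights scales multiply the accumulator while the destination scale divides it is an assumption made for this example.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

// All names below are hypothetical and exist only for this illustration;
// none of them are oneDNN functions.

// int8 dot product with int32 accumulation, as in a quantized inner product.
inline int32_t dot_s8(const std::vector<int8_t>& src,
                      const std::vector<int8_t>& wei) {
    assert(src.size() == wei.size());
    int32_t acc = 0;
    for (std::size_t i = 0; i < src.size(); ++i)
        acc += static_cast<int32_t>(src[i]) * static_cast<int32_t>(wei[i]);
    return acc;
}

// Single-scale style: the caller pre-folds every factor into one number.
inline float apply_output_scale(int32_t acc, float output_scale) {
    return static_cast<float>(acc) * output_scale;
}

// Per-argument style: each tensor carries its own scale.
// Assumed convention: src and wei scales multiply, dst scale divides.
inline float apply_arg_scales(int32_t acc, float src_scale,
                              float wei_scale, float dst_scale) {
    return static_cast<float>(acc) * src_scale * wei_scale / dst_scale;
}
```

Under that assumed convention, calling apply_output_scale with the folded value src_scale * wei_scale / dst_scale gives the same result as apply_arg_scales with the three scales passed separately. The per-argument form keeps each scale attached to the tensor it describes instead of requiring the caller to fold them by hand.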
@mgouicem we are currently in the process of making some changes in aarch64 for quantization and will check how it affects us. Will get back to you shortly. cc: @renato-arantes
Just checked and all good from my side. From an aesthetic perspective, could you please trigger the new macOS pipelines with Werror enabled to make sure it passes build? (You might need to rebase to run them).
2d1f560 to 7cdb506
I'd like to start the process of improving the attribute internals responsible for quantization.
Two major changes affected that piece of functionality:
Both changes left a pile of technical debt on the way to a better solution, and it has started to bite in attempts to extend group and data type support further.
For the refactor to succeed, some obsolete entries need to be removed and later replaced with a modern analogue of that functionality in the affected implementations.
This PR removes the oscale abstraction and everything related to it, as there's no way to invoke it through the public API. It also removes the defined() method as unneeded. The change touches all teams, as there are old pieces of code lying around. Please confirm you are fine with the change and will update quantized support in the corresponding backend as and if needed.