Describe the feature request
Generalized static quantization infrastructure that can handle different model architectures.
Some models maintain states and take the previous inference's output as input (conformer-based models, for example). The current CalibrationDataReader does not support such models well.
The current static quantization function, quantize_static(), takes an fp32 model and a CalibrationDataReader
and performs the quantization, but this implementation does not support models like Zipformer, where the inputs and outputs depend on states.
I worked with Zipformer2 (conformer-based) models from icefall. The model takes its input states from the previous inference's output (except for the first inference, where all states are zeros).
Zipformer is a Transformer-based ASR model whose major compute ops are MatMul and Conv (depthwise and pointwise).
Model: Zipformer2 https://arxiv.org/abs/2310.11230
Inputs: X (feature vector) and states, a list of 96 states (initialized to 0).
Outputs: Y and a list of 96 states (these output states are maintained separately and fed as input to the next inference).
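To make the request concrete, here is a minimal sketch of a state-carrying reader, under several assumptions. The class name StatefulCalibrationReader, the step_fn callable, the input name "x", and the state names are all hypothetical and not part of ONNX Runtime. The sketch only mirrors the get_next() contract of CalibrationDataReader (return a dict of input name to value, or None when the data is exhausted). In real use, step_fn would presumably wrap a run of the fp32 model and the values would be NumPy arrays; a toy step function stands in here so the state hand-off is easy to follow.

```python
from typing import Any, Callable, Dict, Iterable, List, Optional, Sequence, Tuple

Feed = Dict[str, Any]
# Hypothetical callable: takes one input feed, returns (Y, new_states).
StepFn = Callable[[Feed], Tuple[Any, Sequence[Any]]]


class StatefulCalibrationReader:
    """Sketch of a reader that feeds output states back in as the next inputs.

    It duck-types the get_next() contract of a calibration data reader; it is
    an illustration, not an existing ONNX Runtime class.
    """

    def __init__(
        self,
        chunks: Iterable[Any],
        state_names: Sequence[str],
        init_states: Sequence[Any],
        step_fn: StepFn,
        feature_name: str = "x",  # assumed input name, not taken from the model
    ) -> None:
        if len(state_names) != len(init_states):
            raise ValueError("state_names and init_states must have equal length")
        self._chunks = iter(chunks)
        self._state_names = list(state_names)
        self._states = list(init_states)  # all zeros before the first inference
        self._step_fn = step_fn
        self._feature_name = feature_name

    def get_next(self) -> Optional[Feed]:
        chunk = next(self._chunks, None)
        if chunk is None:
            return None  # calibration data exhausted
        # Build the feed from the current chunk plus the states carried over.
        feed: Feed = {self._feature_name: chunk}
        feed.update(zip(self._state_names, self._states))
        # Run the (fp32) model on this feed to get the states for the next chunk.
        _, new_states = self._step_fn(feed)
        if len(new_states) != len(self._state_names):
            raise ValueError("step_fn returned an unexpected number of states")
        self._states = list(new_states)
        return feed


def toy_step(feed: Feed) -> Tuple[Any, List[Any]]:
    """Stand-in for the fp32 model: each new state is the old state plus x."""
    x = feed["x"]
    new_states = [value + x for name, value in feed.items() if name != "x"]
    return x, new_states
```

With this approach the fp32 model runs inside the reader as well as inside the calibrator, so each chunk is executed twice. For multi-utterance calibration sets, the states would also need to be reset to zeros at the start of each utterance, which this sketch does not do.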
Discussion started in #19538
Describe scenario use case
As the number of new and evolving model architectures grows, a generalized static quantization infrastructure would be very helpful for handling future model requirements.