[Performance] fp16 model performance decreases when the "inter op threads" setting is greater than 1. #18822
Labels
quantization: issues related to quantization
stale: issues that have not been addressed in a while; categorized by a bot
Describe the issue
I converted an fp32 model to fp16 with `convert_float_to_float16` and measured inference time on the same data. For the fp32 model, the cost time meets expectations at intra_op_threads = 1, 4, and 8. For the fp16 model, the cost time meets expectations at intra_op_threads = 1, but not at 4 or 8.
To reproduce
Run inference with both the fp32 and fp16 models at different values of intra_op_threads and compare the measured latencies.
Urgency
No response
Platform
Linux
OS Version
Ubuntu 20.04
ONNX Runtime Installation
Released Package
ONNX Runtime Version or Commit ID
1.16.3
ONNX Runtime API
C++
Architecture
X64
Execution Provider
Default CPU
Execution Provider Library Version
No response
Model File
No response
Is this a quantized model?
Yes