T5-Small different output for decoder inference with CPU and DirectML EPs #22896

Closed
r4ghu opened this issue Nov 19, 2024 · 2 comments
Labels
ep:DML: issues related to the DirectML execution provider
stale: issues that have not been addressed in a while; categorized by a bot


r4ghu commented Nov 19, 2024

Describe the issue

Hi team,
I am running T5-Small inference with ONNX Runtime, using the ONNX models from https://huggingface.co/Xenova/t5-small/tree/main/onnx

I tested the same model on CPU and DirectML execution providers and observed different outputs for the same input during the decoding stage.

  • encoder_model.onnx - This model works as expected on both the CPU and DirectML EPs.
  • decoder_model_merged.onnx - The outputs of this model differ between the CPU and DirectML EPs by more than my acceptable threshold (1e-05). It would be really helpful if someone from the ORT team could investigate.

I am attaching the CPU and DirectML comparison results for reference:

=== Comparing Encoder Outputs ===

Comparing Encoder outputs:
Shapes: (1, 12, 512) vs (1, 12, 512)

Statistics for first array:
  mean: -0.002746098442003131
  std: 0.12785771489143372
  min: -0.5774061679840088
  max: 0.5452761054039001
  abs_max: 0.5774061679840088
  has_nan: False
  has_inf: False

Statistics for second array:
  mean: -0.00274610030464828
  std: 0.1278577446937561
  min: -0.5774062871932983
  max: 0.5452762246131897
  abs_max: 0.5774062871932983
  has_nan: False
  has_inf: False

Difference analysis:
  Maximum absolute difference: 5.736947059631348e-07
  Mean absolute difference: 5.666575475515856e-08
  Maximum relative difference: 0.07109003514051437
  Position of max difference: (np.int64(0), np.int64(1), np.int64(401))
✅ Differences within acceptable threshold (1e-05)

=== Comparing Decoder Outputs ===

Comparing Decoder logits:
Shapes: (1, 1, 32128) vs (1, 1, 32128)

Statistics for first array:
  mean: -19.10366439819336
  std: 4.460851669311523
  min: -43.21986389160156
  max: -1.202622890472412
  abs_max: 43.21986389160156
  has_nan: False
  has_inf: False

Statistics for second array:
  mean: -19.10366439819336
  std: 4.460851669311523
  min: -43.21989059448242
  max: -1.2026221752166748
  abs_max: 43.21989059448242
  has_nan: False
  has_inf: False

Difference analysis:
  Maximum absolute difference: 5.7220458984375e-05
  Mean absolute difference: 7.175476639531553e-06
  Maximum relative difference: 2.00232352653984e-06
  Position of max difference: (np.int64(0), np.int64(0), np.int64(32113))
❌ Large difference detected! (> 1e-05)

Values at maximum difference point:
  Array1: -43.13878631591797
  Array2: -43.13884353637695

Surrounding values (if available):
  Array1 at [np.int64(0), np.int64(0), np.int64(32112)]: -43.058406829833984
  Array2 at [np.int64(0), np.int64(0), np.int64(32112)]: -43.058406829833984
  Array1 at [np.int64(0), np.int64(0), np.int64(32114)]: -43.1171760559082
  Array2 at [np.int64(0), np.int64(0), np.int64(32114)]: -43.11715316772461
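
For context on the failing check above, here is a minimal sketch (not part of the original comparison script) that relates the reported maximum absolute difference to float32 spacing and to a combined relative/absolute tolerance. It assumes the logits are float32; the helper name `float32_ulp` and the variable names are hypothetical. The two values are the ones reported at the maximum difference point.

```python
import math


def float32_ulp(x: float) -> float:
    """Spacing between adjacent float32 values at the magnitude of x (normal range only)."""
    _, exponent = math.frexp(abs(x))  # abs(x) = m * 2**exponent with 0.5 <= m < 1
    return 2.0 ** (exponent - 24)     # float32 carries a 24-bit significand


# Values copied from the comparison output above.
array1_value = -43.13878631591797
array2_value = -43.13884353637695
max_abs_diff = 5.7220458984375e-05

ulp = float32_ulp(array1_value)
diff_in_ulps = max_abs_diff / ulp

# Absolute-only check, mirroring the 1e-05 threshold used in the report.
passes_abs_only = abs(array1_value - array2_value) <= 1e-05

# Combined check: the relative term scales the allowed error with magnitude.
# The rel_tol and abs_tol values here are illustrative assumptions.
passes_combined = math.isclose(array1_value, array2_value, rel_tol=1e-05, abs_tol=1e-05)

print(f"float32 spacing near {abs(array1_value):.2f}: {ulp:.3e}")
print(f"max abs diff in float32 ULPs: {diff_in_ulps:.1f}")
print(f"passes absolute-only check: {passes_abs_only}")
print(f"passes combined check: {passes_combined}")
```

Under the float32 assumption, adjacent representable values near 43 are 2**-18 (about 3.8e-06) apart, so a fixed absolute threshold of 1e-05 allows fewer than three rounding steps at that magnitude. A tolerance with a relative component is usually a better fit for logits of this size.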

To reproduce

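Run the model linked above through its encode and decode stages (encoder_model.onnx, then decoder_model_merged.onnx) on both the CPU and DirectML EPs, and compare the outputs.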

Urgency

I would like to get this resolved by end of Dec 2024.

Platform

Windows

OS Version

Windows 11 Enterprise 22631.4169

ONNX Runtime Installation

Released Package

ONNX Runtime Version or Commit ID

1.20.0

ONNX Runtime API

Python

Architecture

X64

Execution Provider

DirectML

Execution Provider Library Version

DirectML 1.15.4

github-actions bot added the ep:DML label (issues related to the DirectML execution provider) on Nov 19, 2024
tianleiwu (Contributor) commented

@r4ghu,

5.7220458984375e-05 does not seem like a large difference for a model.
Could you use end-to-end metrics (such as precision or recall) to check whether it makes any practical difference between CPU and DirectML?
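
One way to act on this suggestion is to compare the tokens the decoder would actually emit rather than the raw logits. The following is a minimal sketch, assuming greedy decoding and that each provider's per-step logits are available as plain lists of floats. The helper names `argmax` and `token_agreement` and the tiny example lists are hypothetical and are not taken from the model.

```python
from typing import Sequence


def argmax(logits: Sequence[float]) -> int:
    """Index of the largest logit, i.e. the token greedy decoding would pick."""
    return max(range(len(logits)), key=logits.__getitem__)


def token_agreement(
    cpu_steps: Sequence[Sequence[float]],
    dml_steps: Sequence[Sequence[float]],
) -> float:
    """Fraction of decoding steps where both providers pick the same token."""
    if len(cpu_steps) != len(dml_steps):
        raise ValueError("both runs must cover the same number of decoding steps")
    if not cpu_steps:
        raise ValueError("at least one decoding step is required")
    matches = sum(argmax(c) == argmax(d) for c, d in zip(cpu_steps, dml_steps))
    return matches / len(cpu_steps)


# Hypothetical logits for two decoding steps over a three-token vocabulary.
# The second run differs from the first by perturbations of roughly 5e-05.
cpu_steps = [[-1.20, -19.10, -43.20], [-5.00, -0.30, -7.10]]
dml_steps = [[-1.20005, -19.10004, -43.19997], [-5.00003, -0.29998, -7.10005]]

agreement = token_agreement(cpu_steps, dml_steps)
print(f"greedy token agreement: {agreement:.2%}")
```

If the two providers select the same tokens on a representative set of inputs, logit differences of this size are unlikely to matter in practice. A task-level metric on real data would be a stronger check than this per-step comparison.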


This issue has been automatically marked as stale due to inactivity and will be closed in 30 days if no further activity occurs. If further support is needed, please provide an update and/or more details.

github-actions bot added the stale label (issues that have not been addressed in a while; categorized by a bot) on Dec 21, 2024