[DONT REVIEW] Debug Info for tensor dump & move MemcpyToHost,MemcpyFromHost to a different stream #19628

Draft: wants to merge 17 commits into base: main

dump Graph topology in allocation_planner.cc

85414e4

Azure Pipelines / orttraining-linux-gpu-ci-pipeline failed Sep 19, 2024 in 1h 39m 27s

Build #20240918.35 had test failures

Details

Tests

  • Failed: 8 (0.04%)
  • Passed: 17,842 (97.93%)
  • Other: 370 (2.03%)
  • Total: 18,220

Annotations

Check failure on line 5959557 in Build log

@azure-pipelines azure-pipelines / orttraining-linux-gpu-ci-pipeline

Build log #L5959557

Bash exited with code '1'.

Check failure on line 101 in Build log

@azure-pipelines azure-pipelines / orttraining-linux-gpu-ci-pipeline

Build log #L101

Bash exited with code '1'.

Check failure on line 30 in Build log

@azure-pipelines azure-pipelines / orttraining-linux-gpu-ci-pipeline

Build log #L30

Bash exited with code '1'.

Check failure on line 1 in ParaPlanCreation

@azure-pipelines azure-pipelines / orttraining-linux-gpu-ci-pipeline

ParaPlanCreation

/onnxruntime_src/onnxruntime/test/framework/allocation_planner_test.cc:1912
Value of: reuse_pairs.empty()
  Actual: false
Expected: true
Google Test trace:
/onnxruntime_src/onnxruntime/test/common/random_generator.h:50: ORT test random seed: 8910
/onnxruntime_src/onnxruntime/test/common/random_generator.h:50: ORT test random seed: 8910
/onnxruntime_src/onnxruntime/test/common/random_generator.h:50: ORT test random seed: 2345
/onnxruntime_src/onnxruntime/test/common/random_generator.h:50: ORT test random seed: 5678
/onnxruntime_src/onnxruntime/test/common/random_generator.h:50: ORT test random seed: 1234

Check failure on line 1 in LoadOptimState_MixedPrecision_FP32Moments_Adam

@azure-pipelines azure-pipelines / orttraining-linux-gpu-ci-pipeline

LoadOptimState_MixedPrecision_FP32Moments_Adam

/onnxruntime_src/orttraining/orttraining/test/session/training_session_test_utils.cc:123
Expected equality of these values:
  expected[0] + 1
    Which is: 5
  actual[0]
    Which is: 4
Google Test trace:
/onnxruntime_src/onnxruntime/test/common/random_generator.h:50: ORT test random seed: 8910
/onnxruntime_src/onnxruntime/test/common/random_generator.h:50: ORT test random seed: 8910
/onnxruntime_src/onnxruntime/test/common/random_generator.h:50: ORT test random seed: 2345
/onnxruntime_src/onnxruntime/test/common/random_generator.h:50: ORT test random seed: 5678
/onnxruntime_src/onnxruntime/test/common/random_generator.h:50: ORT test random seed: 1234

Check failure on line 1 in LoadOptimState_MixedPrecision_FP16Moments_Adam

@azure-pipelines azure-pipelines / orttraining-linux-gpu-ci-pipeline

LoadOptimState_MixedPrecision_FP16Moments_Adam

/onnxruntime_src/orttraining/orttraining/test/session/training_session_test_utils.cc:123
Expected equality of these values:
  expected[0] + 1
    Which is: 5
  actual[0]
    Which is: 4
Google Test trace:
/onnxruntime_src/onnxruntime/test/common/random_generator.h:50: ORT test random seed: 8910
/onnxruntime_src/onnxruntime/test/common/random_generator.h:50: ORT test random seed: 8910
/onnxruntime_src/onnxruntime/test/common/random_generator.h:50: ORT test random seed: 2345
/onnxruntime_src/onnxruntime/test/common/random_generator.h:50: ORT test random seed: 5678
/onnxruntime_src/onnxruntime/test/common/random_generator.h:50: ORT test random seed: 1234

Check failure on line 1 in LoadOptimState_MixedPrecision_FP32Moments_Lamb

@azure-pipelines azure-pipelines / orttraining-linux-gpu-ci-pipeline

LoadOptimState_MixedPrecision_FP32Moments_Lamb

/onnxruntime_src/orttraining/orttraining/test/session/training_session_test_utils.cc:123
Expected equality of these values:
  expected[0] + 1
    Which is: 5
  actual[0]
    Which is: 4
Google Test trace:
/onnxruntime_src/onnxruntime/test/common/random_generator.h:50: ORT test random seed: 8910
/onnxruntime_src/onnxruntime/test/common/random_generator.h:50: ORT test random seed: 8910
/onnxruntime_src/onnxruntime/test/common/random_generator.h:50: ORT test random seed: 2345
/onnxruntime_src/onnxruntime/test/common/random_generator.h:50: ORT test random seed: 5678
/onnxruntime_src/onnxruntime/test/common/random_generator.h:50: ORT test random seed: 1234