DEBUG {2023.06}[foss/2023a] TensorFlow v2.15.1 w/ CUDA 12.1.1 #808
base: 2023.06-software.eessi.io
Just build for a single CPU architecture...
bot: build repo:eessi.io-2023.06-software arch:x86_64/amd/zen2 accel:nvidia/cc80
Rebuilding after arg typo got fixed...
bot: build repo:eessi.io-2023.06-software arch:x86_64/amd/zen2 accel:nvidia/cc80
…-layer into 2023.06-software.eessi.io-TensorFlow-2.15.1-2023a-CUDA-12.1.1-debug
Rebuilding after EESSI-extend module got updated...
bot: build repo:eessi.io-2023.06-software arch:x86_64/amd/zen2 accel:nvidia/cc80
Try building with different Bazel version (6.3.1 instead of 6.1.0)...
bot: build repo:eessi.io-2023.06-software arch:x86_64/amd/zen2 accel:nvidia/cc80
PR to debug issues building TensorFlow v2.15.1 with CUDA v12.1.1

- `tensorflow.py` easyblock that solves an `ImportError` issue with `libnccl.so.2`. See "tweak libpaths in TensorFlow easyblock by adding directory containing libnccl.so.2" (easybuilders/easybuild-easyblocks#3497)

Notes:

- `Bazel`, `ml_dtypes` and `tensorboard` first and install them in the directory for CPU-only software (double-check if and why they are not there yet)
  - `Bazel/6.3.1` is installed but not `Bazel/6.1.0`, which is a dependency for this PR
  - `ml_dtypes` is not installed ... not sure if it should be (see comment/question for `tensorboard` below) ... OR it's a new dependency for TensorFlow (check easyconfig for CPU-only version)
  - `tensorboard/2.13.0` is available as an extension of the CPU-only installation of `TensorFlow/2.13.0-foss-2023a` ... we might want to install the extension under the GPU directory?
- `cuDNN` is installed again (in directory for CPU-only software) ... maybe related to switching to `EESSI-extend/2023.06-easybuild` and the installation path not being configured correctly
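
When debugging the `ImportError` for `libnccl.so.2` mentioned above, it can help to check by hand which directories actually contain the library and whether the dynamic loader can resolve it in the current environment. The following is only a minimal sketch of such a check, not what the easyblock does: the helper names `find_library_dirs` and `can_load` are hypothetical, and the directories to inspect are assumed to be passed on the command line.

```python
import ctypes
import os
import sys


def find_library_dirs(lib_name, search_dirs):
    """Return the directories from search_dirs that contain a file named lib_name."""
    return [d for d in search_dirs if os.path.isfile(os.path.join(d, lib_name))]


def can_load(lib_name):
    """Return True if the dynamic loader can resolve lib_name in this environment."""
    try:
        ctypes.CDLL(lib_name)
    except OSError:
        return False
    return True


if __name__ == "__main__":
    lib = "libnccl.so.2"
    # Assumption: candidate directories are given as command-line arguments,
    # e.g. the lib directory of an NCCL installation.
    dirs = sys.argv[1:]
    print(f"{lib} found in: {find_library_dirs(lib, dirs) or 'none of the given directories'}")
    print(f"{lib} loadable by name: {can_load(lib)}")
```

If the library is found in one of the given directories but is not loadable by name, that would be consistent with the directory missing from the library search paths, which is the kind of problem the easyblock change referenced above is described as addressing.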