Google TensorFlow 2.14 on NVIDIA RTX A6000 48G on 14900K OK #34

Open
obriensystems opened this issue Dec 1, 2024 · 1 comment

See #33.

FROM tensorflow/tensorflow:2.14.0-gpu
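
For reference, the full Dockerfile implied by the BuildKit steps in the logs below is roughly the following. This is a minimal sketch: only the FROM line is quoted above, the WORKDIR and COPY steps are taken from the build output, and the CMD line is an assumption (the actual Dockerfile lives in obrienlabsdev/machine-learning).

```dockerfile
# Minimal sketch; FROM/WORKDIR/COPY match the build steps in the logs, CMD is assumed.
FROM tensorflow/tensorflow:2.14.0-gpu
WORKDIR /src
COPY /src/tflow.py .
CMD ["python", "tflow.py"]
```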

Batch sizes tested: 512, 256, 512, 1024, 2048, 4096, 6144, 8192, 5120
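
The script under test is /src/tflow.py from obrienlabsdev/machine-learning; its source is not reproduced in this issue. Below is a hypothetical minimal sketch of an equivalent benchmark, assuming only what the logs suggest: CIFAR-100 as the dataset, 25 epochs, and a batch-size sweep inferred from the varying steps per epoch. The model is a placeholder CNN, not the actual network.

```python
# Hypothetical sketch of a tflow.py-style benchmark; dataset, epoch count and the
# batch-size sweep come from the logs, the model architecture is a placeholder.
import tensorflow as tf

BATCH_SIZE = 2048  # swept across the values listed above
EPOCHS = 25

(x_train, y_train), _ = tf.keras.datasets.cifar100.load_data()
x_train = x_train.astype("float32") / 255.0

model = tf.keras.Sequential([
    tf.keras.layers.Conv2D(64, 3, activation="relu", input_shape=(32, 32, 3)),
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Conv2D(128, 3, activation="relu"),
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dense(100, activation="softmax"),
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.fit(x_train, y_train, batch_size=BATCH_SIZE, epochs=EPOCHS)
```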

#6 DONE 87.2s

#7 [2/3] WORKDIR /src
#7 DONE 1.2s

#8 [3/3] COPY /src/tflow.py .
#8 DONE 0.1s

#9 exporting to image
#9 exporting layers 0.1s done
#9 writing image sha256:8b966edcc9730e67e965c6510db36c3025ce4a1515621d6bc1f75bada555e015
#9 writing image sha256:8b966edcc9730e67e965c6510db36c3025ce4a1515621d6bc1f75bada555e015 done
#9 naming to docker.io/library/ml-tensorflow-win done
#9 DONE 0.1s
2024-12-01 02:08:36.768486: I tensorflow/core/util/port.cc:111] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable `TF_ENABLE_ONEDNN_OPTS=0`.
2024-12-01 02:08:36.790429: E tensorflow/compiler/xla/stream_executor/cuda/cuda_dnn.cc:9342] Unable to register cuDNN factory: Attempting to register factory for plugin cuDNN when one has already been registered
2024-12-01 02:08:36.790462: E tensorflow/compiler/xla/stream_executor/cuda/cuda_fft.cc:609] Unable to register cuFFT factory: Attempting to register factory for plugin cuFFT when one has already been registered
2024-12-01 02:08:36.790475: E tensorflow/compiler/xla/stream_executor/cuda/cuda_blas.cc:1518] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered
2024-12-01 02:08:36.794271: I tensorflow/core/platform/cpu_feature_guard.cc:182] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 AVX_VNNI FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
2024-12-01 02:08:37.621775: I tensorflow/compiler/xla/stream_executor/cuda/cuda_gpu_executor.cc:880] could not open file to read NUMA node: /sys/bus/pci/devices/0000:01:00.0/numa_node
Your kernel may have been built without NUMA support.
2024-12-01 02:08:37.624085: I tensorflow/compiler/xla/stream_executor/cuda/cuda_gpu_executor.cc:880] could not open file to read NUMA node: /sys/bus/pci/devices/0000:01:00.0/numa_node
Your kernel may have been built without NUMA support.
2024-12-01 02:08:37.624118: I tensorflow/compiler/xla/stream_executor/cuda/cuda_gpu_executor.cc:880] could not open file to read NUMA node: /sys/bus/pci/devices/0000:01:00.0/numa_node
Your kernel may have been built without NUMA support.
2024-12-01 02:08:37.624670: I tensorflow/compiler/xla/stream_executor/cuda/cuda_gpu_executor.cc:880] could not open file to read NUMA node: /sys/bus/pci/devices/0000:01:00.0/numa_node
Your kernel may have been built without NUMA support.
2024-12-01 02:08:37.624690: I tensorflow/compiler/xla/stream_executor/cuda/cuda_gpu_executor.cc:880] could not open file to read NUMA node: /sys/bus/pci/devices/0000:01:00.0/numa_node
Your kernel may have been built without NUMA support.
2024-12-01 02:08:37.624696: I tensorflow/compiler/xla/stream_executor/cuda/cuda_gpu_executor.cc:880] could not open file to read NUMA node: /sys/bus/pci/devices/0000:01:00.0/numa_node
Your kernel may have been built without NUMA support.
2024-12-01 02:08:37.720861: I tensorflow/compiler/xla/stream_executor/cuda/cuda_gpu_executor.cc:880] could not open file to read NUMA node: /sys/bus/pci/devices/0000:01:00.0/numa_node
Your kernel may have been built without NUMA support.
2024-12-01 02:08:37.720898: I tensorflow/compiler/xla/stream_executor/cuda/cuda_gpu_executor.cc:880] could not open file to read NUMA node: /sys/bus/pci/devices/0000:01:00.0/numa_node
Your kernel may have been built without NUMA support.
2024-12-01 02:08:37.720903: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1977] Could not identify NUMA node of platform GPU id 0, defaulting to 0.  Your kernel may not have been built with NUMA support.
2024-12-01 02:08:37.720919: I tensorflow/compiler/xla/stream_executor/cuda/cuda_gpu_executor.cc:880] could not open file to read NUMA node: /sys/bus/pci/devices/0000:01:00.0/numa_node
Your kernel may have been built without NUMA support.
2024-12-01 02:08:37.720944: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1886] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 46009 MB memory:  -> device: 0, name: NVIDIA RTX A6000, pci bus id: 0000:01:00.0, compute capability: 8.6
Downloading data from https://www.cs.toronto.edu/~kriz/cifar-100-python.tar.gz
169001437/169001437 [==============================] - 3s 0us/step
Epoch 1/25
2024-12-01 02:08:49.168762: I tensorflow/compiler/xla/stream_executor/cuda/cuda_dnn.cc:442] Loaded cuDNN version 8600
2024-12-01 02:08:50.183429: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x7f66dcacff80 initialized for platform CUDA (this does not guarantee that XLA will be used). Devices:
2024-12-01 02:08:50.183447: I tensorflow/compiler/xla/service/service.cc:176]   StreamExecutor device (0): NVIDIA RTX A6000, Compute Capability 8.6
2024-12-01 02:08:50.186364: I tensorflow/compiler/mlir/tensorflow/utils/dump_mlir_util.cc:269] disabling MLIR crash reproducer, set env var `MLIR_CRASH_REPRODUCER_DIRECTORY` to enable.
2024-12-01 02:08:50.235324: I ./tensorflow/compiler/jit/device_compiler.h:186] Compiled cluster using XLA!  This line is logged at most once for the lifetime of the process.
49/49 [==============================] - 21s 126ms/step - loss: 4.6119 - accuracy: 0.0572
Epoch 2/25
49/49 [==============================] - 3s 65ms/step - loss: 3.6056 - accuracy: 0.1604
Epoch 3/25
49/49 [==============================] - 3s 65ms/step - loss: 3.2608 - accuracy: 0.2271
Epoch 4/25
49/49 [==============================] - 3s 65ms/step - loss: 3.0407 - accuracy: 0.2720
Epoch 5/25
49/49 [==============================] - 3s 65ms/step - loss: 2.8468 - accuracy: 0.3044
Epoch 6/25
49/49 [==============================] - 3s 65ms/step - loss: 2.7389 - accuracy: 0.3277
Epoch 7/25
49/49 [==============================] - 3s 65ms/step - loss: 2.3097 - accuracy: 0.4097
Epoch 8/25
49/49 [==============================] - 3s 65ms/step - loss: 1.9889 - accuracy: 0.4859
Epoch 9/25
49/49 [==============================] - 3s 66ms/step - loss: 1.7394 - accuracy: 0.5371
Epoch 10/25
49/49 [==============================] - 3s 65ms/step - loss: 1.4602 - accuracy: 0.5983
Epoch 11/25
49/49 [==============================] - 3s 65ms/step - loss: 1.2628 - accuracy: 0.6555
Epoch 12/25
49/49 [==============================] - 3s 66ms/step - loss: 2.0973 - accuracy: 0.4642
Epoch 13/25
49/49 [==============================] - 3s 66ms/step - loss: 1.4020 - accuracy: 0.6110
Epoch 14/25
49/49 [==============================] - 3s 66ms/step - loss: 2.3063 - accuracy: 0.4661
Epoch 15/25
49/49 [==============================] - 3s 66ms/step - loss: 2.1843 - accuracy: 0.4439
Epoch 16/25
49/49 [==============================] - 3s 66ms/step - loss: 1.3761 - accuracy: 0.6398
Epoch 17/25
49/49 [==============================] - 3s 66ms/step - loss: 0.8350 - accuracy: 0.7709
Epoch 18/25
49/49 [==============================] - 3s 66ms/step - loss: 0.6167 - accuracy: 0.8280
Epoch 19/25
49/49 [==============================] - 3s 66ms/step - loss: 0.4185 - accuracy: 0.8904
Epoch 20/25
49/49 [==============================] - 3s 66ms/step - loss: 0.4127 - accuracy: 0.8891
Epoch 21/25
49/49 [==============================] - 3s 66ms/step - loss: 0.7419 - accuracy: 0.8033
Epoch 22/25
49/49 [==============================] - 3s 66ms/step - loss: 0.9758 - accuracy: 0.7173
Epoch 23/25
49/49 [==============================] - 3s 67ms/step - loss: 0.3882 - accuracy: 0.8952
Epoch 24/25
49/49 [==============================] - 3s 67ms/step - loss: 0.8176 - accuracy: 0.7854
Epoch 25/25
49/49 [==============================] - 3s 66ms/step - loss: 0.6328 - accuracy: 0.8190
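
Each run above and below is launched with ./build.sh from the repo root. A hypothetical sketch of that script is shown here, assuming it builds the ml-tensorflow-win image (the name that appears in the BuildKit output) and then runs it with the GPU exposed; the run flags are an assumption.

```bash
#!/usr/bin/env bash
# Hypothetical sketch of build.sh: build the image, then run it with GPU access.
# The image name comes from the logs; --rm and --gpus all are assumptions.
docker build -t ml-tensorflow-win .
docker run --rm --gpus all ml-tensorflow-win
```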

michael@14900c MINGW64 /c/wse_github/obrienlabsdev/machine-learning (main)
$ ./build.sh
#0 building with "desktop-linux" instance using docker driver

#1 [internal] load build definition from Dockerfile
#1 transferring dockerfile: 486B done
#1 DONE 0.0s

#2 [internal] load metadata for docker.io/tensorflow/tensorflow:2.14.0-gpu
#2 DONE 0.3s

#3 [internal] load .dockerignore
#3 transferring context: 2B done
#3 DONE 0.0s

#4 [1/3] FROM docker.io/tensorflow/tensorflow:2.14.0-gpu@sha256:64602abcd8cc4f4bdd6268ca0abc39e6d37113d700886afd15f6dd151210b206
#4 DONE 0.0s

#5 [internal] load build context
#5 transferring context: 4.04kB done
#5 DONE 0.0s

#6 [2/3] WORKDIR /src
#6 CACHED

#7 [3/3] COPY /src/tflow.py .
#7 DONE 0.1s

#8 exporting to image
#8 exporting layers 0.1s done
#8 writing image sha256:6e60036543c59aefd9339304cff9fac234f083ef97fd9990ac261c24600b9a4e done
#8 naming to docker.io/library/ml-tensorflow-win done
#8 DONE 0.1s
2024-12-01 02:11:27.545000: I tensorflow/core/util/port.cc:111] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable `TF_ENABLE_ONEDNN_OPTS=0`.
2024-12-01 02:11:27.562541: E tensorflow/compiler/xla/stream_executor/cuda/cuda_dnn.cc:9342] Unable to register cuDNN factory: Attempting to register factory for plugin cuDNN when one has already been registered
2024-12-01 02:11:27.562574: E tensorflow/compiler/xla/stream_executor/cuda/cuda_fft.cc:609] Unable to register cuFFT factory: Attempting to register factory for plugin cuFFT when one has already been registered
2024-12-01 02:11:27.562585: E tensorflow/compiler/xla/stream_executor/cuda/cuda_blas.cc:1518] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered
2024-12-01 02:11:27.565927: I tensorflow/core/platform/cpu_feature_guard.cc:182] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 AVX_VNNI FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
2024-12-01 02:11:28.313601: I tensorflow/compiler/xla/stream_executor/cuda/cuda_gpu_executor.cc:880] could not open file to read NUMA node: /sys/bus/pci/devices/0000:01:00.0/numa_node
Your kernel may have been built without NUMA support.
2024-12-01 02:11:28.316118: I tensorflow/compiler/xla/stream_executor/cuda/cuda_gpu_executor.cc:880] could not open file to read NUMA node: /sys/bus/pci/devices/0000:01:00.0/numa_node
Your kernel may have been built without NUMA support.
2024-12-01 02:11:28.316152: I tensorflow/compiler/xla/stream_executor/cuda/cuda_gpu_executor.cc:880] could not open file to read NUMA node: /sys/bus/pci/devices/0000:01:00.0/numa_node
Your kernel may have been built without NUMA support.
2024-12-01 02:11:28.316827: I tensorflow/compiler/xla/stream_executor/cuda/cuda_gpu_executor.cc:880] could not open file to read NUMA node: /sys/bus/pci/devices/0000:01:00.0/numa_node
Your kernel may have been built without NUMA support.
2024-12-01 02:11:28.316855: I tensorflow/compiler/xla/stream_executor/cuda/cuda_gpu_executor.cc:880] could not open file to read NUMA node: /sys/bus/pci/devices/0000:01:00.0/numa_node
Your kernel may have been built without NUMA support.
2024-12-01 02:11:28.316863: I tensorflow/compiler/xla/stream_executor/cuda/cuda_gpu_executor.cc:880] could not open file to read NUMA node: /sys/bus/pci/devices/0000:01:00.0/numa_node
Your kernel may have been built without NUMA support.
2024-12-01 02:11:28.438175: I tensorflow/compiler/xla/stream_executor/cuda/cuda_gpu_executor.cc:880] could not open file to read NUMA node: /sys/bus/pci/devices/0000:01:00.0/numa_node
Your kernel may have been built without NUMA support.
2024-12-01 02:11:28.438213: I tensorflow/compiler/xla/stream_executor/cuda/cuda_gpu_executor.cc:880] could not open file to read NUMA node: /sys/bus/pci/devices/0000:01:00.0/numa_node
Your kernel may have been built without NUMA support.
2024-12-01 02:11:28.438219: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1977] Could not identify NUMA node of platform GPU id 0, defaulting to 0.  Your kernel may not have been built with NUMA support.
2024-12-01 02:11:28.438234: I tensorflow/compiler/xla/stream_executor/cuda/cuda_gpu_executor.cc:880] could not open file to read NUMA node: /sys/bus/pci/devices/0000:01:00.0/numa_node
Your kernel may have been built without NUMA support.
2024-12-01 02:11:28.438258: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1886] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 46009 MB memory:  -> device: 0, name: NVIDIA RTX A6000, pci bus id: 0000:01:00.0, compute capability: 8.6
Downloading data from https://www.cs.toronto.edu/~kriz/cifar-100-python.tar.gz
169001437/169001437 [==============================] - 4s 0us/step
Epoch 1/25
2024-12-01 02:11:41.326968: I tensorflow/compiler/xla/stream_executor/cuda/cuda_dnn.cc:442] Loaded cuDNN version 8600
2024-12-01 02:11:42.157640: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x7f93cc0059f0 initialized for platform CUDA (this does not guarantee that XLA will be used). Devices:
2024-12-01 02:11:42.157660: I tensorflow/compiler/xla/service/service.cc:176]   StreamExecutor device (0): NVIDIA RTX A6000, Compute Capability 8.6
2024-12-01 02:11:42.160494: I tensorflow/compiler/mlir/tensorflow/utils/dump_mlir_util.cc:269] disabling MLIR crash reproducer, set env var `MLIR_CRASH_REPRODUCER_DIRECTORY` to enable.
2024-12-01 02:11:42.208280: I ./tensorflow/compiler/jit/device_compiler.h:186] Compiled cluster using XLA!  This line is logged at most once for the lifetime of the process.
196/196 [==============================] - 20s 33ms/step - loss: 4.2993 - accuracy: 0.0849
Epoch 2/25
196/196 [==============================] - 5s 26ms/step - loss: 3.6894 - accuracy: 0.1592
Epoch 3/25
196/196 [==============================] - 5s 26ms/step - loss: 3.9031 - accuracy: 0.1769
Epoch 4/25
196/196 [==============================] - 5s 26ms/step - loss: 3.3814 - accuracy: 0.2194
Epoch 5/25
196/196 [==============================] - 5s 26ms/step - loss: 3.3194 - accuracy: 0.2341
Epoch 6/25
196/196 [==============================] - 5s 26ms/step - loss: 3.0407 - accuracy: 0.2687
Epoch 7/25
196/196 [==============================] - 5s 26ms/step - loss: 2.7637 - accuracy: 0.3188
Epoch 8/25
196/196 [==============================] - 5s 26ms/step - loss: 2.5698 - accuracy: 0.3613
Epoch 9/25
196/196 [==============================] - 5s 26ms/step - loss: 2.5460 - accuracy: 0.3648
Epoch 10/25
196/196 [==============================] - 5s 26ms/step - loss: 2.3306 - accuracy: 0.4091
Epoch 11/25
196/196 [==============================] - 5s 26ms/step - loss: 2.0972 - accuracy: 0.4559
Epoch 12/25
196/196 [==============================] - 5s 26ms/step - loss: 1.9400 - accuracy: 0.4915
Epoch 13/25
196/196 [==============================] - 5s 26ms/step - loss: 1.7816 - accuracy: 0.5330
Epoch 14/25
196/196 [==============================] - 5s 26ms/step - loss: 1.6454 - accuracy: 0.5596
Epoch 15/25
196/196 [==============================] - 5s 27ms/step - loss: 1.4928 - accuracy: 0.5853
Epoch 16/25
196/196 [==============================] - 5s 27ms/step - loss: 1.2559 - accuracy: 0.6541
Epoch 17/25
196/196 [==============================] - 5s 26ms/step - loss: 1.5856 - accuracy: 0.5890
Epoch 18/25
196/196 [==============================] - 5s 27ms/step - loss: 1.1293 - accuracy: 0.6909
Epoch 19/25
196/196 [==============================] - 5s 27ms/step - loss: 0.7513 - accuracy: 0.7918
Epoch 20/25
196/196 [==============================] - 5s 27ms/step - loss: 0.7256 - accuracy: 0.8065
Epoch 21/25
196/196 [==============================] - 5s 27ms/step - loss: 0.7463 - accuracy: 0.8019
Epoch 22/25
196/196 [==============================] - 5s 27ms/step - loss: 0.7830 - accuracy: 0.7820
Epoch 23/25
196/196 [==============================] - 5s 27ms/step - loss: 0.5410 - accuracy: 0.8621
Epoch 24/25
196/196 [==============================] - 5s 27ms/step - loss: 0.4418 - accuracy: 0.8811
Epoch 25/25
196/196 [==============================] - 5s 27ms/step - loss: 0.2974 - accuracy: 0.9276

michael@14900c MINGW64 /c/wse_github/obrienlabsdev/machine-learning (main)
$ ./build.sh
#0 building with "desktop-linux" instance using docker driver

#1 [internal] load build definition from Dockerfile
#1 transferring dockerfile: 486B done
#1 DONE 0.0s

#2 [auth] tensorflow/tensorflow:pull token for registry-1.docker.io
#2 DONE 0.0s

#3 [internal] load metadata for docker.io/tensorflow/tensorflow:2.14.0-gpu
#3 DONE 0.5s

#4 [internal] load .dockerignore
#4 transferring context: 2B done
#4 DONE 0.0s

#5 [1/3] FROM docker.io/tensorflow/tensorflow:2.14.0-gpu@sha256:64602abcd8cc4f4bdd6268ca0abc39e6d37113d700886afd15f6dd151210b206
#5 DONE 0.0s

#6 [internal] load build context
#6 transferring context: 4.04kB 0.0s done
#6 DONE 0.0s

#7 [2/3] WORKDIR /src
#7 CACHED

#8 [3/3] COPY /src/tflow.py .
#8 DONE 0.0s

#9 exporting to image
#9 exporting layers 0.0s done
#9 writing image sha256:b5ddfa9121dda4d112c676f45ea39262a2d0c6fdd82bb1e315288dee6a7f79f5 done
#9 naming to docker.io/library/ml-tensorflow-win done
#9 DONE 0.1s
2024-12-01 02:14:15.810167: I tensorflow/core/util/port.cc:111] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable `TF_ENABLE_ONEDNN_OPTS=0`.
2024-12-01 02:14:15.828840: E tensorflow/compiler/xla/stream_executor/cuda/cuda_dnn.cc:9342] Unable to register cuDNN factory: Attempting to register factory for plugin cuDNN when one has already been registered
2024-12-01 02:14:15.828874: E tensorflow/compiler/xla/stream_executor/cuda/cuda_fft.cc:609] Unable to register cuFFT factory: Attempting to register factory for plugin cuFFT when one has already been registered
2024-12-01 02:14:15.828883: E tensorflow/compiler/xla/stream_executor/cuda/cuda_blas.cc:1518] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered
2024-12-01 02:14:15.832934: I tensorflow/core/platform/cpu_feature_guard.cc:182] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 AVX_VNNI FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
2024-12-01 02:14:16.582776: I tensorflow/compiler/xla/stream_executor/cuda/cuda_gpu_executor.cc:880] could not open file to read NUMA node: /sys/bus/pci/devices/0000:01:00.0/numa_node
Your kernel may have been built without NUMA support.
2024-12-01 02:14:16.585048: I tensorflow/compiler/xla/stream_executor/cuda/cuda_gpu_executor.cc:880] could not open file to read NUMA node: /sys/bus/pci/devices/0000:01:00.0/numa_node
Your kernel may have been built without NUMA support.
2024-12-01 02:14:16.585086: I tensorflow/compiler/xla/stream_executor/cuda/cuda_gpu_executor.cc:880] could not open file to read NUMA node: /sys/bus/pci/devices/0000:01:00.0/numa_node
Your kernel may have been built without NUMA support.
2024-12-01 02:14:16.585769: I tensorflow/compiler/xla/stream_executor/cuda/cuda_gpu_executor.cc:880] could not open file to read NUMA node: /sys/bus/pci/devices/0000:01:00.0/numa_node
Your kernel may have been built without NUMA support.
2024-12-01 02:14:16.585789: I tensorflow/compiler/xla/stream_executor/cuda/cuda_gpu_executor.cc:880] could not open file to read NUMA node: /sys/bus/pci/devices/0000:01:00.0/numa_node
Your kernel may have been built without NUMA support.
2024-12-01 02:14:16.585796: I tensorflow/compiler/xla/stream_executor/cuda/cuda_gpu_executor.cc:880] could not open file to read NUMA node: /sys/bus/pci/devices/0000:01:00.0/numa_node
Your kernel may have been built without NUMA support.
2024-12-01 02:14:16.682906: I tensorflow/compiler/xla/stream_executor/cuda/cuda_gpu_executor.cc:880] could not open file to read NUMA node: /sys/bus/pci/devices/0000:01:00.0/numa_node
Your kernel may have been built without NUMA support.
2024-12-01 02:14:16.682943: I tensorflow/compiler/xla/stream_executor/cuda/cuda_gpu_executor.cc:880] could not open file to read NUMA node: /sys/bus/pci/devices/0000:01:00.0/numa_node
Your kernel may have been built without NUMA support.
2024-12-01 02:14:16.682948: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1977] Could not identify NUMA node of platform GPU id 0, defaulting to 0.  Your kernel may not have been built with NUMA support.
2024-12-01 02:14:16.682975: I tensorflow/compiler/xla/stream_executor/cuda/cuda_gpu_executor.cc:880] could not open file to read NUMA node: /sys/bus/pci/devices/0000:01:00.0/numa_node
Your kernel may have been built without NUMA support.
2024-12-01 02:14:16.682991: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1886] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 46009 MB memory:  -> device: 0, name: NVIDIA RTX A6000, pci bus id: 0000:01:00.0, compute capability: 8.6
Downloading data from https://www.cs.toronto.edu/~kriz/cifar-100-python.tar.gz
169001437/169001437 [==============================] - 3s 0us/step
Epoch 1/25
2024-12-01 02:14:28.717906: I tensorflow/compiler/xla/stream_executor/cuda/cuda_dnn.cc:442] Loaded cuDNN version 8600
2024-12-01 02:14:29.567662: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x7f82bc3ce910 initialized for platform CUDA (this does not guarantee that XLA will be used). Devices:
2024-12-01 02:14:29.567681: I tensorflow/compiler/xla/service/service.cc:176]   StreamExecutor device (0): NVIDIA RTX A6000, Compute Capability 8.6
2024-12-01 02:14:29.570853: I tensorflow/compiler/mlir/tensorflow/utils/dump_mlir_util.cc:269] disabling MLIR crash reproducer, set env var `MLIR_CRASH_REPRODUCER_DIRECTORY` to enable.
2024-12-01 02:14:29.618299: I ./tensorflow/compiler/jit/device_compiler.h:186] Compiled cluster using XLA!  This line is logged at most once for the lifetime of the process.
98/98 [==============================] - 20s 58ms/step - loss: 4.3944 - accuracy: 0.0774
Epoch 2/25
98/98 [==============================] - 4s 40ms/step - loss: 3.5035 - accuracy: 0.1834
Epoch 3/25
98/98 [==============================] - 4s 40ms/step - loss: 3.0958 - accuracy: 0.2569
Epoch 4/25
98/98 [==============================] - 4s 40ms/step - loss: 3.0363 - accuracy: 0.2691
Epoch 5/25
98/98 [==============================] - 4s 40ms/step - loss: 3.3067 - accuracy: 0.2468
Epoch 6/25
98/98 [==============================] - 4s 40ms/step - loss: 3.0993 - accuracy: 0.2654
Epoch 7/25
98/98 [==============================] - 4s 40ms/step - loss: 2.8687 - accuracy: 0.3160
Epoch 8/25
98/98 [==============================] - 4s 40ms/step - loss: 2.6246 - accuracy: 0.3543
Epoch 9/25
98/98 [==============================] - 4s 41ms/step - loss: 2.3360 - accuracy: 0.4167
Epoch 10/25
98/98 [==============================] - 4s 41ms/step - loss: 2.1165 - accuracy: 0.4604
Epoch 11/25
98/98 [==============================] - 4s 41ms/step - loss: 2.2452 - accuracy: 0.4401
Epoch 12/25
98/98 [==============================] - 4s 41ms/step - loss: 1.8855 - accuracy: 0.5193
Epoch 13/25
98/98 [==============================] - 4s 41ms/step - loss: 2.3692 - accuracy: 0.4137
Epoch 14/25
98/98 [==============================] - 4s 41ms/step - loss: 1.8424 - accuracy: 0.5165
Epoch 15/25
98/98 [==============================] - 4s 41ms/step - loss: 1.5206 - accuracy: 0.5962
Epoch 16/25
98/98 [==============================] - 4s 41ms/step - loss: 1.1749 - accuracy: 0.6809
Epoch 17/25
98/98 [==============================] - 4s 41ms/step - loss: 0.8922 - accuracy: 0.7548
Epoch 18/25
98/98 [==============================] - 4s 41ms/step - loss: 0.7526 - accuracy: 0.8069
Epoch 19/25
98/98 [==============================] - 4s 41ms/step - loss: 0.9976 - accuracy: 0.7220
Epoch 20/25
98/98 [==============================] - 4s 41ms/step - loss: 0.5701 - accuracy: 0.8443
Epoch 21/25
98/98 [==============================] - 4s 41ms/step - loss: 0.4559 - accuracy: 0.8825
Epoch 22/25
98/98 [==============================] - 4s 41ms/step - loss: 0.3698 - accuracy: 0.9056
Epoch 23/25
98/98 [==============================] - 4s 41ms/step - loss: 0.4516 - accuracy: 0.8940
Epoch 24/25
98/98 [==============================] - 4s 41ms/step - loss: 0.3864 - accuracy: 0.9092
Epoch 25/25
98/98 [==============================] - 4s 41ms/step - loss: 0.3889 - accuracy: 0.9150

michael@14900c MINGW64 /c/wse_github/obrienlabsdev/machine-learning (main)
$ ./build.sh
#0 building with "desktop-linux" instance using docker driver

#1 [internal] load build definition from Dockerfile
#1 transferring dockerfile: 486B done
#1 DONE 0.0s

#2 [internal] load metadata for docker.io/tensorflow/tensorflow:2.14.0-gpu
#2 DONE 0.3s

#3 [internal] load .dockerignore
#3 transferring context: 2B done
#3 DONE 0.0s

#4 [1/3] FROM docker.io/tensorflow/tensorflow:2.14.0-gpu@sha256:64602abcd8cc4f4bdd6268ca0abc39e6d37113d700886afd15f6dd151210b206
#4 DONE 0.0s

#5 [internal] load build context
#5 transferring context: 4.04kB done
#5 DONE 0.0s

#6 [2/3] WORKDIR /src
#6 CACHED

#7 [3/3] COPY /src/tflow.py .
#7 CACHED

#8 exporting to image
#8 exporting layers done
#8 writing image sha256:8b966edcc9730e67e965c6510db36c3025ce4a1515621d6bc1f75bada555e015 done
#8 naming to docker.io/library/ml-tensorflow-win done
#8 DONE 0.0s
2024-12-01 02:16:38.446559: I tensorflow/core/util/port.cc:111] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable `TF_ENABLE_ONEDNN_OPTS=0`.
2024-12-01 02:16:38.464566: E tensorflow/compiler/xla/stream_executor/cuda/cuda_dnn.cc:9342] Unable to register cuDNN factory: Attempting to register factory for plugin cuDNN when one has already been registered
2024-12-01 02:16:38.464597: E tensorflow/compiler/xla/stream_executor/cuda/cuda_fft.cc:609] Unable to register cuFFT factory: Attempting to register factory for plugin cuFFT when one has already been registered
2024-12-01 02:16:38.464608: E tensorflow/compiler/xla/stream_executor/cuda/cuda_blas.cc:1518] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered
2024-12-01 02:16:38.468013: I tensorflow/core/platform/cpu_feature_guard.cc:182] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 AVX_VNNI FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
2024-12-01 02:16:39.253232: I tensorflow/compiler/xla/stream_executor/cuda/cuda_gpu_executor.cc:880] could not open file to read NUMA node: /sys/bus/pci/devices/0000:01:00.0/numa_node
Your kernel may have been built without NUMA support.
2024-12-01 02:16:39.255511: I tensorflow/compiler/xla/stream_executor/cuda/cuda_gpu_executor.cc:880] could not open file to read NUMA node: /sys/bus/pci/devices/0000:01:00.0/numa_node
Your kernel may have been built without NUMA support.
2024-12-01 02:16:39.255543: I tensorflow/compiler/xla/stream_executor/cuda/cuda_gpu_executor.cc:880] could not open file to read NUMA node: /sys/bus/pci/devices/0000:01:00.0/numa_node
Your kernel may have been built without NUMA support.
2024-12-01 02:16:39.255858: I tensorflow/compiler/xla/stream_executor/cuda/cuda_gpu_executor.cc:880] could not open file to read NUMA node: /sys/bus/pci/devices/0000:01:00.0/numa_node
Your kernel may have been built without NUMA support.
2024-12-01 02:16:39.255870: I tensorflow/compiler/xla/stream_executor/cuda/cuda_gpu_executor.cc:880] could not open file to read NUMA node: /sys/bus/pci/devices/0000:01:00.0/numa_node
Your kernel may have been built without NUMA support.
2024-12-01 02:16:39.255875: I tensorflow/compiler/xla/stream_executor/cuda/cuda_gpu_executor.cc:880] could not open file to read NUMA node: /sys/bus/pci/devices/0000:01:00.0/numa_node
Your kernel may have been built without NUMA support.
2024-12-01 02:16:39.358115: I tensorflow/compiler/xla/stream_executor/cuda/cuda_gpu_executor.cc:880] could not open file to read NUMA node: /sys/bus/pci/devices/0000:01:00.0/numa_node
Your kernel may have been built without NUMA support.
2024-12-01 02:16:39.358148: I tensorflow/compiler/xla/stream_executor/cuda/cuda_gpu_executor.cc:880] could not open file to read NUMA node: /sys/bus/pci/devices/0000:01:00.0/numa_node
Your kernel may have been built without NUMA support.
2024-12-01 02:16:39.358153: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1977] Could not identify NUMA node of platform GPU id 0, defaulting to 0.  Your kernel may not have been built with NUMA support.
2024-12-01 02:16:39.358166: I tensorflow/compiler/xla/stream_executor/cuda/cuda_gpu_executor.cc:880] could not open file to read NUMA node: /sys/bus/pci/devices/0000:01:00.0/numa_node
Your kernel may have been built without NUMA support.
2024-12-01 02:16:39.358180: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1886] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 46009 MB memory:  -> device: 0, name: NVIDIA RTX A6000, pci bus id: 0000:01:00.0, compute capability: 8.6
Downloading data from https://www.cs.toronto.edu/~kriz/cifar-100-python.tar.gz
169001437/169001437 [==============================] - 6s 0us/step
Epoch 1/25
2024-12-01 02:16:53.393099: I tensorflow/compiler/xla/stream_executor/cuda/cuda_dnn.cc:442] Loaded cuDNN version 8600
2024-12-01 02:16:54.429662: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x7f3c44586480 initialized for platform CUDA (this does not guarantee that XLA will be used). Devices:
2024-12-01 02:16:54.429688: I tensorflow/compiler/xla/service/service.cc:176]   StreamExecutor device (0): NVIDIA RTX A6000, Compute Capability 8.6
2024-12-01 02:16:54.432516: I tensorflow/compiler/mlir/tensorflow/utils/dump_mlir_util.cc:269] disabling MLIR crash reproducer, set env var `MLIR_CRASH_REPRODUCER_DIRECTORY` to enable.
2024-12-01 02:16:54.479116: I ./tensorflow/compiler/jit/device_compiler.h:186] Compiled cluster using XLA!  This line is logged at most once for the lifetime of the process.
49/49 [==============================] - 21s 128ms/step - loss: 4.6086 - accuracy: 0.0625
Epoch 2/25
49/49 [==============================] - 3s 66ms/step - loss: 3.5727 - accuracy: 0.1668
Epoch 3/25
49/49 [==============================] - 3s 66ms/step - loss: 3.1827 - accuracy: 0.2383
Epoch 4/25
49/49 [==============================] - 3s 66ms/step - loss: 2.8822 - accuracy: 0.2932
Epoch 5/25
49/49 [==============================] - 3s 66ms/step - loss: 2.6969 - accuracy: 0.3283
Epoch 6/25
49/49 [==============================] - 3s 67ms/step - loss: 2.3838 - accuracy: 0.3900
Epoch 7/25
49/49 [==============================] - 3s 67ms/step - loss: 2.0301 - accuracy: 0.4698
Epoch 8/25
49/49 [==============================] - 3s 67ms/step - loss: 1.8291 - accuracy: 0.5123
Epoch 9/25
49/49 [==============================] - 3s 67ms/step - loss: 2.1018 - accuracy: 0.4589
Epoch 10/25
49/49 [==============================] - 3s 67ms/step - loss: 1.9945 - accuracy: 0.4770
Epoch 11/25
49/49 [==============================] - 3s 67ms/step - loss: 1.3959 - accuracy: 0.6179
Epoch 12/25
49/49 [==============================] - 3s 67ms/step - loss: 1.0794 - accuracy: 0.7015
Epoch 13/25
49/49 [==============================] - 3s 67ms/step - loss: 0.8917 - accuracy: 0.7511
Epoch 14/25
49/49 [==============================] - 3s 67ms/step - loss: 0.6968 - accuracy: 0.8015
Epoch 15/25
49/49 [==============================] - 3s 68ms/step - loss: 0.5796 - accuracy: 0.8326
Epoch 16/25
49/49 [==============================] - 3s 68ms/step - loss: 0.4675 - accuracy: 0.8645
Epoch 17/25
49/49 [==============================] - 3s 68ms/step - loss: 0.4478 - accuracy: 0.8722
Epoch 18/25
49/49 [==============================] - 3s 68ms/step - loss: 0.4528 - accuracy: 0.8734
Epoch 19/25
49/49 [==============================] - 3s 68ms/step - loss: 0.3850 - accuracy: 0.8913
Epoch 20/25
49/49 [==============================] - 3s 68ms/step - loss: 0.3349 - accuracy: 0.9073
Epoch 21/25
49/49 [==============================] - 3s 68ms/step - loss: 0.3287 - accuracy: 0.9082
Epoch 22/25
49/49 [==============================] - 3s 68ms/step - loss: 3.8923 - accuracy: 0.1823
Epoch 23/25
49/49 [==============================] - 3s 68ms/step - loss: 3.5242 - accuracy: 0.1906
Epoch 24/25
49/49 [==============================] - 3s 68ms/step - loss: 3.1717 - accuracy: 0.2347
Epoch 25/25
49/49 [==============================] - 3s 68ms/step - loss: 2.8021 - accuracy: 0.3053

michael@14900c MINGW64 /c/wse_github/obrienlabsdev/machine-learning (main)
$ ./build.sh
#0 building with "desktop-linux" instance using docker driver

#1 [internal] load build definition from Dockerfile
#1 transferring dockerfile: 486B done
#1 DONE 0.0s

#2 [internal] load metadata for docker.io/tensorflow/tensorflow:2.14.0-gpu
#2 DONE 0.3s

#3 [internal] load .dockerignore
#3 transferring context: 2B done
#3 DONE 0.0s

#4 [1/3] FROM docker.io/tensorflow/tensorflow:2.14.0-gpu@sha256:64602abcd8cc4f4bdd6268ca0abc39e6d37113d700886afd15f6dd151210b206
#4 DONE 0.0s

#5 [internal] load build context
#5 transferring context: 4.04kB done
#5 DONE 0.0s

#6 [2/3] WORKDIR /src
#6 CACHED

#7 [3/3] COPY /src/tflow.py .
#7 DONE 0.0s

#8 exporting to image
#8 exporting layers 0.0s done
#8 writing image sha256:0c182c8490dbb66723ca8bdc5464d6f3cd1d3c7be5841ca588d978d102e92196 done
#8 naming to docker.io/library/ml-tensorflow-win done
#8 DONE 0.1s
2024-12-01 02:18:42.501568: I tensorflow/core/util/port.cc:111] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable `TF_ENABLE_ONEDNN_OPTS=0`.
2024-12-01 02:18:42.520114: E tensorflow/compiler/xla/stream_executor/cuda/cuda_dnn.cc:9342] Unable to register cuDNN factory: Attempting to register factory for plugin cuDNN when one has already been registered
2024-12-01 02:18:42.520140: E tensorflow/compiler/xla/stream_executor/cuda/cuda_fft.cc:609] Unable to register cuFFT factory: Attempting to register factory for plugin cuFFT when one has already been registered
2024-12-01 02:18:42.520150: E tensorflow/compiler/xla/stream_executor/cuda/cuda_blas.cc:1518] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered
2024-12-01 02:18:42.523991: I tensorflow/core/platform/cpu_feature_guard.cc:182] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 AVX_VNNI FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
2024-12-01 02:18:43.275354: I tensorflow/compiler/xla/stream_executor/cuda/cuda_gpu_executor.cc:880] could not open file to read NUMA node: /sys/bus/pci/devices/0000:01:00.0/numa_node
Your kernel may have been built without NUMA support.
2024-12-01 02:18:43.277639: I tensorflow/compiler/xla/stream_executor/cuda/cuda_gpu_executor.cc:880] could not open file to read NUMA node: /sys/bus/pci/devices/0000:01:00.0/numa_node
Your kernel may have been built without NUMA support.
2024-12-01 02:18:43.277677: I tensorflow/compiler/xla/stream_executor/cuda/cuda_gpu_executor.cc:880] could not open file to read NUMA node: /sys/bus/pci/devices/0000:01:00.0/numa_node
Your kernel may have been built without NUMA support.
2024-12-01 02:18:43.278475: I tensorflow/compiler/xla/stream_executor/cuda/cuda_gpu_executor.cc:880] could not open file to read NUMA node: /sys/bus/pci/devices/0000:01:00.0/numa_node
Your kernel may have been built without NUMA support.
2024-12-01 02:18:43.278502: I tensorflow/compiler/xla/stream_executor/cuda/cuda_gpu_executor.cc:880] could not open file to read NUMA node: /sys/bus/pci/devices/0000:01:00.0/numa_node
Your kernel may have been built without NUMA support.
2024-12-01 02:18:43.278510: I tensorflow/compiler/xla/stream_executor/cuda/cuda_gpu_executor.cc:880] could not open file to read NUMA node: /sys/bus/pci/devices/0000:01:00.0/numa_node
Your kernel may have been built without NUMA support.
2024-12-01 02:18:43.378899: I tensorflow/compiler/xla/stream_executor/cuda/cuda_gpu_executor.cc:880] could not open file to read NUMA node: /sys/bus/pci/devices/0000:01:00.0/numa_node
Your kernel may have been built without NUMA support.
2024-12-01 02:18:43.378937: I tensorflow/compiler/xla/stream_executor/cuda/cuda_gpu_executor.cc:880] could not open file to read NUMA node: /sys/bus/pci/devices/0000:01:00.0/numa_node
Your kernel may have been built without NUMA support.
2024-12-01 02:18:43.378942: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1977] Could not identify NUMA node of platform GPU id 0, defaulting to 0.  Your kernel may not have been built with NUMA support.
2024-12-01 02:18:43.378959: I tensorflow/compiler/xla/stream_executor/cuda/cuda_gpu_executor.cc:880] could not open file to read NUMA node: /sys/bus/pci/devices/0000:01:00.0/numa_node
Your kernel may have been built without NUMA support.
2024-12-01 02:18:43.378984: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1886] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 46009 MB memory:  -> device: 0, name: NVIDIA RTX A6000, pci bus id: 0000:01:00.0, compute capability: 8.6
Downloading data from https://www.cs.toronto.edu/~kriz/cifar-100-python.tar.gz
169001437/169001437 [==============================] - 5s 0us/step
Epoch 1/25
2024-12-01 02:18:56.729768: I tensorflow/compiler/xla/stream_executor/cuda/cuda_dnn.cc:442] Loaded cuDNN version 8600
2024-12-01 02:18:58.002254: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x7eff40909a50 initialized for platform CUDA (this does not guarantee that XLA will be used). Devices:
2024-12-01 02:18:58.002295: I tensorflow/compiler/xla/service/service.cc:176]   StreamExecutor device (0): NVIDIA RTX A6000, Compute Capability 8.6
2024-12-01 02:18:58.005224: I tensorflow/compiler/mlir/tensorflow/utils/dump_mlir_util.cc:269] disabling MLIR crash reproducer, set env var `MLIR_CRASH_REPRODUCER_DIRECTORY` to enable.
2024-12-01 02:18:58.053014: I ./tensorflow/compiler/jit/device_compiler.h:186] Compiled cluster using XLA!  This line is logged at most once for the lifetime of the process.
25/25 [==============================] - 22s 244ms/step - loss: 5.0263 - accuracy: 0.0390
Epoch 2/25
25/25 [==============================] - 3s 119ms/step - loss: 3.7941 - accuracy: 0.1250
Epoch 3/25
25/25 [==============================] - 3s 120ms/step - loss: 3.3165 - accuracy: 0.2059
Epoch 4/25
25/25 [==============================] - 3s 120ms/step - loss: 2.9931 - accuracy: 0.2666
Epoch 5/25
25/25 [==============================] - 3s 120ms/step - loss: 2.5748 - accuracy: 0.3486
Epoch 6/25
25/25 [==============================] - 3s 121ms/step - loss: 2.1712 - accuracy: 0.4324
Epoch 7/25
25/25 [==============================] - 3s 121ms/step - loss: 1.8050 - accuracy: 0.5168
Epoch 8/25
25/25 [==============================] - 3s 121ms/step - loss: 1.5300 - accuracy: 0.5794
Epoch 9/25
25/25 [==============================] - 3s 121ms/step - loss: 1.2819 - accuracy: 0.6376
Epoch 10/25
25/25 [==============================] - 3s 121ms/step - loss: 1.1327 - accuracy: 0.6783
Epoch 11/25
25/25 [==============================] - 3s 122ms/step - loss: 0.9470 - accuracy: 0.7240
Epoch 12/25
25/25 [==============================] - 3s 122ms/step - loss: 0.6863 - accuracy: 0.7953
Epoch 13/25
25/25 [==============================] - 3s 122ms/step - loss: 0.5211 - accuracy: 0.8437
Epoch 14/25
25/25 [==============================] - 3s 123ms/step - loss: 0.4399 - accuracy: 0.8653
Epoch 15/25
25/25 [==============================] - 3s 122ms/step - loss: 0.3753 - accuracy: 0.8845
Epoch 16/25
25/25 [==============================] - 3s 122ms/step - loss: 0.9662 - accuracy: 0.7294
Epoch 17/25
25/25 [==============================] - 3s 122ms/step - loss: 0.7160 - accuracy: 0.7854
Epoch 18/25
25/25 [==============================] - 3s 123ms/step - loss: 0.3608 - accuracy: 0.8888
Epoch 19/25
25/25 [==============================] - 3s 122ms/step - loss: 0.2300 - accuracy: 0.9299
Epoch 20/25
25/25 [==============================] - 3s 123ms/step - loss: 0.1514 - accuracy: 0.9552
Epoch 21/25
25/25 [==============================] - 3s 123ms/step - loss: 0.1078 - accuracy: 0.9701
Epoch 22/25
25/25 [==============================] - 3s 122ms/step - loss: 0.0756 - accuracy: 0.9791
Epoch 23/25
25/25 [==============================] - 3s 123ms/step - loss: 0.0699 - accuracy: 0.9805
Epoch 24/25
25/25 [==============================] - 3s 123ms/step - loss: 0.0634 - accuracy: 0.9828
Epoch 25/25
25/25 [==============================] - 3s 123ms/step - loss: 0.0692 - accuracy: 0.9810

michael@14900c MINGW64 /c/wse_github/obrienlabsdev/machine-learning (main)
$ ./build.sh
#0 building with "desktop-linux" instance using docker driver

#1 [internal] load build definition from Dockerfile
#1 transferring dockerfile: 486B done
#1 DONE 0.0s

#2 [auth] tensorflow/tensorflow:pull token for registry-1.docker.io
#2 DONE 0.0s

#3 [internal] load metadata for docker.io/tensorflow/tensorflow:2.14.0-gpu
#3 DONE 0.4s

#4 [internal] load .dockerignore
#4 transferring context: 2B done
#4 DONE 0.0s

#5 [1/3] FROM docker.io/tensorflow/tensorflow:2.14.0-gpu@sha256:64602abcd8cc4f4bdd6268ca0abc39e6d37113d700886afd15f6dd151210b206
#5 DONE 0.0s

#6 [internal] load build context
#6 transferring context: 4.04kB done
#6 DONE 0.0s

#7 [2/3] WORKDIR /src
#7 CACHED

#8 [3/3] COPY /src/tflow.py .
#8 DONE 0.0s

#9 exporting to image
#9 exporting layers 0.1s done
#9 writing image sha256:1c6b890b0fb52ca6bab30d0589fdb78d7af368959620f4c38947486dc0cae98d done
#9 naming to docker.io/library/ml-tensorflow-win done
#9 DONE 0.1s
2024-12-01 02:21:04.877044: I tensorflow/core/util/port.cc:111] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable `TF_ENABLE_ONEDNN_OPTS=0`.
2024-12-01 02:21:04.895115: E tensorflow/compiler/xla/stream_executor/cuda/cuda_dnn.cc:9342] Unable to register cuDNN factory: Attempting to register factory for plugin cuDNN when one has already been registered
2024-12-01 02:21:04.895147: E tensorflow/compiler/xla/stream_executor/cuda/cuda_fft.cc:609] Unable to register cuFFT factory: Attempting to register factory for plugin cuFFT when one has already been registered
2024-12-01 02:21:04.895158: E tensorflow/compiler/xla/stream_executor/cuda/cuda_blas.cc:1518] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered
2024-12-01 02:21:04.898832: I tensorflow/core/platform/cpu_feature_guard.cc:182] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 AVX_VNNI FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
2024-12-01 02:21:05.651012: I tensorflow/compiler/xla/stream_executor/cuda/cuda_gpu_executor.cc:880] could not open file to read NUMA node: /sys/bus/pci/devices/0000:01:00.0/numa_node
Your kernel may have been built without NUMA support.
2024-12-01 02:21:05.653172: I tensorflow/compiler/xla/stream_executor/cuda/cuda_gpu_executor.cc:880] could not open file to read NUMA node: /sys/bus/pci/devices/0000:01:00.0/numa_node
Your kernel may have been built without NUMA support.
2024-12-01 02:21:05.653235: I tensorflow/compiler/xla/stream_executor/cuda/cuda_gpu_executor.cc:880] could not open file to read NUMA node: /sys/bus/pci/devices/0000:01:00.0/numa_node
Your kernel may have been built without NUMA support.
2024-12-01 02:21:05.654180: I tensorflow/compiler/xla/stream_executor/cuda/cuda_gpu_executor.cc:880] could not open file to read NUMA node: /sys/bus/pci/devices/0000:01:00.0/numa_node
Your kernel may have been built without NUMA support.
2024-12-01 02:21:05.654205: I tensorflow/compiler/xla/stream_executor/cuda/cuda_gpu_executor.cc:880] could not open file to read NUMA node: /sys/bus/pci/devices/0000:01:00.0/numa_node
Your kernel may have been built without NUMA support.
2024-12-01 02:21:05.654212: I tensorflow/compiler/xla/stream_executor/cuda/cuda_gpu_executor.cc:880] could not open file to read NUMA node: /sys/bus/pci/devices/0000:01:00.0/numa_node
Your kernel may have been built without NUMA support.
2024-12-01 02:21:05.772419: I tensorflow/compiler/xla/stream_executor/cuda/cuda_gpu_executor.cc:880] could not open file to read NUMA node: /sys/bus/pci/devices/0000:01:00.0/numa_node
Your kernel may have been built without NUMA support.
2024-12-01 02:21:05.772457: I tensorflow/compiler/xla/stream_executor/cuda/cuda_gpu_executor.cc:880] could not open file to read NUMA node: /sys/bus/pci/devices/0000:01:00.0/numa_node
Your kernel may have been built without NUMA support.
2024-12-01 02:21:05.772462: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1977] Could not identify NUMA node of platform GPU id 0, defaulting to 0.  Your kernel may not have been built with NUMA support.
2024-12-01 02:21:05.772478: I tensorflow/compiler/xla/stream_executor/cuda/cuda_gpu_executor.cc:880] could not open file to read NUMA node: /sys/bus/pci/devices/0000:01:00.0/numa_node
Your kernel may have been built without NUMA support.
2024-12-01 02:21:05.772490: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1886] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 46009 MB memory:  -> device: 0, name: NVIDIA RTX A6000, pci bus id: 0000:01:00.0, compute capability: 8.6
Downloading data from https://www.cs.toronto.edu/~kriz/cifar-100-python.tar.gz
169001437/169001437 [==============================] - 2s 0us/step
Epoch 1/25
2024-12-01 02:21:16.745270: I tensorflow/compiler/xla/stream_executor/cuda/cuda_dnn.cc:442] Loaded cuDNN version 8600
2024-12-01 02:21:18.594612: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x7f7bca4ec090 initialized for platform CUDA (this does not guarantee that XLA will be used). Devices:
2024-12-01 02:21:18.594635: I tensorflow/compiler/xla/service/service.cc:176]   StreamExecutor device (0): NVIDIA RTX A6000, Compute Capability 8.6
2024-12-01 02:21:18.597623: I tensorflow/compiler/mlir/tensorflow/utils/dump_mlir_util.cc:269] disabling MLIR crash reproducer, set env var `MLIR_CRASH_REPRODUCER_DIRECTORY` to enable.
2024-12-01 02:21:18.646527: I ./tensorflow/compiler/jit/device_compiler.h:186] Compiled cluster using XLA!  This line is logged at most once for the lifetime of the process.
13/13 [==============================] - 26s 475ms/step - loss: 5.4693 - accuracy: 0.0266
Epoch 2/25
13/13 [==============================] - 3s 227ms/step - loss: 4.2336 - accuracy: 0.0680
Epoch 3/25
13/13 [==============================] - 3s 229ms/step - loss: 3.8065 - accuracy: 0.1216
Epoch 4/25
13/13 [==============================] - 3s 229ms/step - loss: 3.4664 - accuracy: 0.1779
Epoch 5/25
13/13 [==============================] - 3s 229ms/step - loss: 3.1102 - accuracy: 0.2426
Epoch 6/25
13/13 [==============================] - 3s 229ms/step - loss: 2.7871 - accuracy: 0.3073
Epoch 7/25
13/13 [==============================] - 3s 230ms/step - loss: 2.4539 - accuracy: 0.3735
Epoch 8/25
13/13 [==============================] - 3s 229ms/step - loss: 2.3678 - accuracy: 0.3922
Epoch 9/25
13/13 [==============================] - 3s 230ms/step - loss: 2.0209 - accuracy: 0.4674
Epoch 10/25
13/13 [==============================] - 3s 231ms/step - loss: 1.6048 - accuracy: 0.5703
Epoch 11/25
13/13 [==============================] - 3s 230ms/step - loss: 2.1097 - accuracy: 0.4559
Epoch 12/25
13/13 [==============================] - 3s 231ms/step - loss: 1.9024 - accuracy: 0.5015
Epoch 13/25
13/13 [==============================] - 3s 232ms/step - loss: 1.3100 - accuracy: 0.6388
Epoch 14/25
13/13 [==============================] - 3s 232ms/step - loss: 0.9129 - accuracy: 0.7414
Epoch 15/25
13/13 [==============================] - 3s 233ms/step - loss: 0.6478 - accuracy: 0.8147
Epoch 16/25
13/13 [==============================] - 3s 234ms/step - loss: 0.4697 - accuracy: 0.8644
Epoch 17/25
13/13 [==============================] - 3s 234ms/step - loss: 0.3913 - accuracy: 0.8861
Epoch 18/25
13/13 [==============================] - 3s 233ms/step - loss: 0.3083 - accuracy: 0.9081
Epoch 19/25
13/13 [==============================] - 3s 234ms/step - loss: 0.2538 - accuracy: 0.9260
Epoch 20/25
13/13 [==============================] - 3s 233ms/step - loss: 0.2069 - accuracy: 0.9392
Epoch 21/25
13/13 [==============================] - 3s 234ms/step - loss: 0.2036 - accuracy: 0.9413
Epoch 22/25
13/13 [==============================] - 3s 234ms/step - loss: 0.1980 - accuracy: 0.9408
Epoch 23/25
13/13 [==============================] - 3s 231ms/step - loss: 0.1990 - accuracy: 0.9394
Epoch 24/25
13/13 [==============================] - 3s 233ms/step - loss: 0.1906 - accuracy: 0.9409
Epoch 25/25
13/13 [==============================] - 3s 233ms/step - loss: 0.1857 - accuracy: 0.9437


obriensystems self-assigned this Dec 1, 2024

obriensystems (Member, Author) commented:

michael@14900c MINGW64 /c/wse_github/obrienlabsdev/machine-learning (main)
$ ./build.sh
#0 building with "desktop-linux" instance using docker driver

#1 [internal] load build definition from Dockerfile
#1 transferring dockerfile: 486B done
#1 DONE 0.0s

#2 [internal] load metadata for docker.io/tensorflow/tensorflow:2.14.0-gpu
#2 DONE 0.3s

#3 [internal] load .dockerignore
#3 transferring context: 2B done
#3 DONE 0.0s

#4 [1/3] FROM docker.io/tensorflow/tensorflow:2.14.0-gpu@sha256:64602abcd8cc4f4bdd6268ca0abc39e6d37113d700886afd15f6dd151210b206
#4 DONE 0.0s

#5 [internal] load build context
#5 transferring context: 4.04kB done
#5 DONE 0.0s

#6 [2/3] WORKDIR /src
#6 CACHED

#7 [3/3] COPY /src/tflow.py .
#7 DONE 0.0s

#8 exporting to image
#8 exporting layers
#8 exporting layers 0.0s done
#8 writing image sha256:c0831f4a03856d372fedde7d563a2ab6b3983a1f88dd93e34ca348d8b981d586 done
#8 naming to docker.io/library/ml-tensorflow-win done
#8 DONE 0.1s
2024-12-01 02:23:10.019137: I tensorflow/core/util/port.cc:111] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable `TF_ENABLE_ONEDNN_OPTS=0`.
2024-12-01 02:23:10.035434: E tensorflow/compiler/xla/stream_executor/cuda/cuda_dnn.cc:9342] Unable to register cuDNN factory: Attempting to register factory for plugin cuDNN when one has already been registered
2024-12-01 02:23:10.035467: E tensorflow/compiler/xla/stream_executor/cuda/cuda_fft.cc:609] Unable to register cuFFT factory: Attempting to register factory for plugin cuFFT when one has already been registered
2024-12-01 02:23:10.035478: E tensorflow/compiler/xla/stream_executor/cuda/cuda_blas.cc:1518] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered
2024-12-01 02:23:10.039315: I tensorflow/core/platform/cpu_feature_guard.cc:182] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 AVX_VNNI FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
2024-12-01 02:23:10.827037: I tensorflow/compiler/xla/stream_executor/cuda/cuda_gpu_executor.cc:880] could not open file to read NUMA node: /sys/bus/pci/devices/0000:01:00.0/numa_node
Your kernel may have been built without NUMA support.
2024-12-01 02:23:10.830004: I tensorflow/compiler/xla/stream_executor/cuda/cuda_gpu_executor.cc:880] could not open file to read NUMA node: /sys/bus/pci/devices/0000:01:00.0/numa_node
Your kernel may have been built without NUMA support.
2024-12-01 02:23:10.830034: I tensorflow/compiler/xla/stream_executor/cuda/cuda_gpu_executor.cc:880] could not open file to read NUMA node: /sys/bus/pci/devices/0000:01:00.0/numa_node
Your kernel may have been built without NUMA support.
2024-12-01 02:23:10.830916: I tensorflow/compiler/xla/stream_executor/cuda/cuda_gpu_executor.cc:880] could not open file to read NUMA node: /sys/bus/pci/devices/0000:01:00.0/numa_node
Your kernel may have been built without NUMA support.
2024-12-01 02:23:10.830939: I tensorflow/compiler/xla/stream_executor/cuda/cuda_gpu_executor.cc:880] could not open file to read NUMA node: /sys/bus/pci/devices/0000:01:00.0/numa_node
Your kernel may have been built without NUMA support.
2024-12-01 02:23:10.830946: I tensorflow/compiler/xla/stream_executor/cuda/cuda_gpu_executor.cc:880] could not open file to read NUMA node: /sys/bus/pci/devices/0000:01:00.0/numa_node
Your kernel may have been built without NUMA support.
2024-12-01 02:23:10.930081: I tensorflow/compiler/xla/stream_executor/cuda/cuda_gpu_executor.cc:880] could not open file to read NUMA node: /sys/bus/pci/devices/0000:01:00.0/numa_node
Your kernel may have been built without NUMA support.
2024-12-01 02:23:10.930118: I tensorflow/compiler/xla/stream_executor/cuda/cuda_gpu_executor.cc:880] could not open file to read NUMA node: /sys/bus/pci/devices/0000:01:00.0/numa_node
Your kernel may have been built without NUMA support.
2024-12-01 02:23:10.930123: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1977] Could not identify NUMA node of platform GPU id 0, defaulting to 0.  Your kernel may not have been built with NUMA support.
2024-12-01 02:23:10.930139: I tensorflow/compiler/xla/stream_executor/cuda/cuda_gpu_executor.cc:880] could not open file to read NUMA node: /sys/bus/pci/devices/0000:01:00.0/numa_node
Your kernel may have been built without NUMA support.
2024-12-01 02:23:10.930163: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1886] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 46009 MB memory:  -> device: 0, name: NVIDIA RTX A6000, pci bus id: 0000:01:00.0, compute capability: 8.6
Downloading data from https://www.cs.toronto.edu/~kriz/cifar-100-python.tar.gz
169001437/169001437 [==============================] - 3s 0us/step
Epoch 1/25
2024-12-01 02:23:22.458429: I tensorflow/compiler/xla/stream_executor/cuda/cuda_dnn.cc:442] Loaded cuDNN version 8600
2024-12-01 02:23:24.912260: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x9f36010 initialized for platform CUDA (this does not guarantee that XLA will be used). Devices:
2024-12-01 02:23:24.912284: I tensorflow/compiler/xla/service/service.cc:176]   StreamExecutor device (0): NVIDIA RTX A6000, Compute Capability 8.6
2024-12-01 02:23:24.915227: I tensorflow/compiler/mlir/tensorflow/utils/dump_mlir_util.cc:269] disabling MLIR crash reproducer, set env var `MLIR_CRASH_REPRODUCER_DIRECTORY` to enable.
2024-12-01 02:23:24.966475: I ./tensorflow/compiler/jit/device_compiler.h:186] Compiled cluster using XLA!  This line is logged at most once for the lifetime of the process.
9/9 [==============================] - 30s 694ms/step - loss: 5.8745 - accuracy: 0.0215
Epoch 2/25
9/9 [==============================] - 3s 329ms/step - loss: 4.4032 - accuracy: 0.0592
Epoch 3/25
9/9 [==============================] - 3s 329ms/step - loss: 3.9991 - accuracy: 0.0968
Epoch 4/25
9/9 [==============================] - 3s 330ms/step - loss: 3.7315 - accuracy: 0.1350
Epoch 5/25
9/9 [==============================] - 3s 330ms/step - loss: 3.4930 - accuracy: 0.1771
Epoch 6/25
9/9 [==============================] - 3s 330ms/step - loss: 3.2226 - accuracy: 0.2223
Epoch 7/25
9/9 [==============================] - 3s 333ms/step - loss: 3.0065 - accuracy: 0.2659
Epoch 8/25
9/9 [==============================] - 3s 331ms/step - loss: 2.7064 - accuracy: 0.3253
Epoch 9/25
9/9 [==============================] - 3s 332ms/step - loss: 2.4498 - accuracy: 0.3777
Epoch 10/25
9/9 [==============================] - 3s 331ms/step - loss: 2.1219 - accuracy: 0.4489
Epoch 11/25
9/9 [==============================] - 3s 334ms/step - loss: 1.9020 - accuracy: 0.5028
Epoch 12/25
9/9 [==============================] - 3s 334ms/step - loss: 1.6502 - accuracy: 0.5581
Epoch 13/25
9/9 [==============================] - 3s 334ms/step - loss: 1.5115 - accuracy: 0.5869
Epoch 14/25
9/9 [==============================] - 3s 334ms/step - loss: 1.3274 - accuracy: 0.6336
Epoch 15/25
9/9 [==============================] - 3s 334ms/step - loss: 1.1831 - accuracy: 0.6668
Epoch 16/25
9/9 [==============================] - 3s 334ms/step - loss: 0.9772 - accuracy: 0.7211
Epoch 17/25
9/9 [==============================] - 3s 335ms/step - loss: 1.0308 - accuracy: 0.7053
Epoch 18/25
9/9 [==============================] - 3s 337ms/step - loss: 0.8926 - accuracy: 0.7370
Epoch 19/25
9/9 [==============================] - 3s 335ms/step - loss: 0.6769 - accuracy: 0.8030
Epoch 20/25
9/9 [==============================] - 3s 337ms/step - loss: 1.0053 - accuracy: 0.7136
Epoch 21/25
9/9 [==============================] - 3s 336ms/step - loss: 0.8281 - accuracy: 0.7538
Epoch 22/25
9/9 [==============================] - 3s 335ms/step - loss: 0.5696 - accuracy: 0.8357
Epoch 23/25
9/9 [==============================] - 3s 336ms/step - loss: 0.4530 - accuracy: 0.8671
Epoch 24/25
9/9 [==============================] - 3s 335ms/step - loss: 0.3883 - accuracy: 0.8855
Epoch 25/25
9/9 [==============================] - 3s 334ms/step - loss: 0.3016 - accuracy: 0.9138
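
Note: the training script baked into the image as /src/tflow.py is not reproduced in this issue, only its output. A minimal sketch that would produce output of this shape (CIFAR-100, 25 epochs, a batch size large enough that one pass over the 50,000 training images is only ~9 steps) is below; the model choice (ResNet50 trained from scratch), the optimizer and the exact batch size are assumptions, not the actual script.

```python
# Hypothetical reconstruction of src/tflow.py -- the real script is not shown in this issue.
# Trains a Keras model on CIFAR-100 for 25 epochs with a large batch, so one epoch over the
# 50,000 training images is only a handful of steps (ceil(50000 / 6144) = 9, as in this run).
import tensorflow as tf

BATCH_SIZE = 6144   # assumed; 9 steps/epoch implies a batch somewhere around 5,600-6,250
EPOCHS = 25

(x_train, y_train), _ = tf.keras.datasets.cifar100.load_data()
x_train = x_train.astype("float32") / 255.0

# Any sufficiently large model exercises the GPU; ResNet50 from scratch is one plausible choice.
model = tf.keras.applications.ResNet50(
    weights=None, input_shape=(32, 32, 3), classes=100)

model.compile(
    optimizer="adam",
    loss="sparse_categorical_crossentropy",
    metrics=["accuracy"])

model.fit(x_train, y_train, batch_size=BATCH_SIZE, epochs=EPOCHS)
```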

michael@14900c MINGW64 /c/wse_github/obrienlabsdev/machine-learning (main)
$ ./build.sh
#0 building with "desktop-linux" instance using docker driver

#1 [internal] load build definition from Dockerfile
#1 transferring dockerfile: 486B done
#1 DONE 0.0s

#2 [internal] load metadata for docker.io/tensorflow/tensorflow:2.14.0-gpu
#2 DONE 0.3s

#3 [internal] load .dockerignore
#3 transferring context: 2B done
#3 DONE 0.0s

#4 [1/3] FROM docker.io/tensorflow/tensorflow:2.14.0-gpu@sha256:64602abcd8cc4f4bdd6268ca0abc39e6d37113d700886afd15f6dd151210b206
#4 DONE 0.0s

#5 [internal] load build context
#5 transferring context: 4.04kB 0.0s done
#5 DONE 0.0s

#6 [2/3] WORKDIR /src
#6 CACHED

#7 [3/3] COPY /src/tflow.py .
#7 DONE 0.0s

#8 exporting to image
#8 exporting layers
#8 exporting layers 0.0s done
#8 writing image sha256:922ed6d49d300e184ec9b3d4819e9b759f6b3af17d36ed47735cd73093b85738 done
#8 naming to docker.io/library/ml-tensorflow-win done
#8 DONE 0.1s
2024-12-01 02:25:12.541506: I tensorflow/core/util/port.cc:111] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable `TF_ENABLE_ONEDNN_OPTS=0`.
2024-12-01 02:25:12.559026: E tensorflow/compiler/xla/stream_executor/cuda/cuda_dnn.cc:9342] Unable to register cuDNN factory: Attempting to register factory for plugin cuDNN when one has already been registered
2024-12-01 02:25:12.559053: E tensorflow/compiler/xla/stream_executor/cuda/cuda_fft.cc:609] Unable to register cuFFT factory: Attempting to register factory for plugin cuFFT when one has already been registered
2024-12-01 02:25:12.559063: E tensorflow/compiler/xla/stream_executor/cuda/cuda_blas.cc:1518] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered
2024-12-01 02:25:12.562916: I tensorflow/core/platform/cpu_feature_guard.cc:182] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 AVX_VNNI FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
2024-12-01 02:25:13.305767: I tensorflow/compiler/xla/stream_executor/cuda/cuda_gpu_executor.cc:880] could not open file to read NUMA node: /sys/bus/pci/devices/0000:01:00.0/numa_node
Your kernel may have been built without NUMA support.
2024-12-01 02:25:13.308785: I tensorflow/compiler/xla/stream_executor/cuda/cuda_gpu_executor.cc:880] could not open file to read NUMA node: /sys/bus/pci/devices/0000:01:00.0/numa_node
Your kernel may have been built without NUMA support.
2024-12-01 02:25:13.308836: I tensorflow/compiler/xla/stream_executor/cuda/cuda_gpu_executor.cc:880] could not open file to read NUMA node: /sys/bus/pci/devices/0000:01:00.0/numa_node
Your kernel may have been built without NUMA support.
2024-12-01 02:25:13.310159: I tensorflow/compiler/xla/stream_executor/cuda/cuda_gpu_executor.cc:880] could not open file to read NUMA node: /sys/bus/pci/devices/0000:01:00.0/numa_node
Your kernel may have been built without NUMA support.
2024-12-01 02:25:13.310242: I tensorflow/compiler/xla/stream_executor/cuda/cuda_gpu_executor.cc:880] could not open file to read NUMA node: /sys/bus/pci/devices/0000:01:00.0/numa_node
Your kernel may have been built without NUMA support.
2024-12-01 02:25:13.310393: I tensorflow/compiler/xla/stream_executor/cuda/cuda_gpu_executor.cc:880] could not open file to read NUMA node: /sys/bus/pci/devices/0000:01:00.0/numa_node
Your kernel may have been built without NUMA support.
2024-12-01 02:25:13.400624: I tensorflow/compiler/xla/stream_executor/cuda/cuda_gpu_executor.cc:880] could not open file to read NUMA node: /sys/bus/pci/devices/0000:01:00.0/numa_node
Your kernel may have been built without NUMA support.
2024-12-01 02:25:13.400661: I tensorflow/compiler/xla/stream_executor/cuda/cuda_gpu_executor.cc:880] could not open file to read NUMA node: /sys/bus/pci/devices/0000:01:00.0/numa_node
Your kernel may have been built without NUMA support.
2024-12-01 02:25:13.400666: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1977] Could not identify NUMA node of platform GPU id 0, defaulting to 0.  Your kernel may not have been built with NUMA support.
2024-12-01 02:25:13.400682: I tensorflow/compiler/xla/stream_executor/cuda/cuda_gpu_executor.cc:880] could not open file to read NUMA node: /sys/bus/pci/devices/0000:01:00.0/numa_node
Your kernel may have been built without NUMA support.
2024-12-01 02:25:13.400695: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1886] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 46009 MB memory:  -> device: 0, name: NVIDIA RTX A6000, pci bus id: 0000:01:00.0, compute capability: 8.6
Downloading data from https://www.cs.toronto.edu/~kriz/cifar-100-python.tar.gz
169001437/169001437 [==============================] - 3s 0us/step
Epoch 1/25
2024-12-01 02:25:24.924752: I tensorflow/compiler/xla/stream_executor/cuda/cuda_dnn.cc:442] Loaded cuDNN version 8600
2024-12-01 02:25:27.706536: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x7f7e98df02e0 initialized for platform CUDA (this does not guarantee that XLA will be used). Devices:
2024-12-01 02:25:27.706561: I tensorflow/compiler/xla/service/service.cc:176]   StreamExecutor device (0): NVIDIA RTX A6000, Compute Capability 8.6
2024-12-01 02:25:27.709657: I tensorflow/compiler/mlir/tensorflow/utils/dump_mlir_util.cc:269] disabling MLIR crash reproducer, set env var `MLIR_CRASH_REPRODUCER_DIRECTORY` to enable.
2024-12-01 02:25:27.757555: I ./tensorflow/compiler/jit/device_compiler.h:186] Compiled cluster using XLA!  This line is logged at most once for the lifetime of the process.
7/7 [==============================] - 34s 910ms/step - loss: 6.1072 - accuracy: 0.0174
Epoch 2/25
7/7 [==============================] - 3s 420ms/step - loss: 4.5421 - accuracy: 0.0470
Epoch 3/25
7/7 [==============================] - 3s 421ms/step - loss: 4.1170 - accuracy: 0.0783
Epoch 4/25
7/7 [==============================] - 3s 421ms/step - loss: 3.8726 - accuracy: 0.1141
Epoch 5/25
7/7 [==============================] - 3s 422ms/step - loss: 3.6838 - accuracy: 0.1509
Epoch 6/25
7/7 [==============================] - 3s 421ms/step - loss: 3.4169 - accuracy: 0.1907
Epoch 7/25
7/7 [==============================] - 3s 423ms/step - loss: 3.2222 - accuracy: 0.2297
Epoch 8/25
7/7 [==============================] - 3s 422ms/step - loss: 3.6691 - accuracy: 0.1675
Epoch 9/25
7/7 [==============================] - 3s 424ms/step - loss: 3.7561 - accuracy: 0.1492
Epoch 10/25
7/7 [==============================] - 3s 425ms/step - loss: 3.5491 - accuracy: 0.1760
Epoch 11/25
7/7 [==============================] - 3s 426ms/step - loss: 3.3228 - accuracy: 0.2094
Epoch 12/25
7/7 [==============================] - 3s 425ms/step - loss: 3.1399 - accuracy: 0.2394
Epoch 13/25
7/7 [==============================] - 3s 427ms/step - loss: 2.9667 - accuracy: 0.2725
Epoch 14/25
7/7 [==============================] - 3s 426ms/step - loss: 2.8074 - accuracy: 0.3065
Epoch 15/25
7/7 [==============================] - 3s 426ms/step - loss: 2.6076 - accuracy: 0.3464
Epoch 16/25
7/7 [==============================] - 3s 428ms/step - loss: 2.7522 - accuracy: 0.3272
Epoch 17/25
7/7 [==============================] - 3s 426ms/step - loss: 2.5534 - accuracy: 0.3621
Epoch 18/25
7/7 [==============================] - 3s 427ms/step - loss: 2.3425 - accuracy: 0.4111
Epoch 19/25
7/7 [==============================] - 3s 427ms/step - loss: 2.0997 - accuracy: 0.4621
Epoch 20/25
7/7 [==============================] - 3s 426ms/step - loss: 1.9775 - accuracy: 0.4959
Epoch 21/25
7/7 [==============================] - 3s 428ms/step - loss: 1.7427 - accuracy: 0.5398
Epoch 22/25
7/7 [==============================] - 3s 428ms/step - loss: 1.4871 - accuracy: 0.6050
Epoch 23/25
7/7 [==============================] - 3s 426ms/step - loss: 1.2739 - accuracy: 0.6586
Epoch 24/25
7/7 [==============================] - 3s 426ms/step - loss: 1.0641 - accuracy: 0.7079
Epoch 25/25
7/7 [==============================] - 3s 426ms/step - loss: 0.8945 - accuracy: 0.7526
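
The repeated ./build.sh invocations appear to be the same job rerun at different batch sizes (9 steps per epoch in the first run, 7 here, which over 50,000 training images points at batches roughly in the 5,500-8,200 range). Each invocation rebuilds the image and launches the container; from the three build stages in the log the Dockerfile reduces to the lines below (the actual file is ~486 B, so it likely contains more), and the wrapper script is presumably a docker build followed by a GPU-enabled docker run. The CMD line and the docker run flags are assumptions.

```dockerfile
# Reconstructed from the [1/3]..[3/3] build steps above; the CMD line is an assumption.
FROM tensorflow/tensorflow:2.14.0-gpu
WORKDIR /src
COPY /src/tflow.py .
CMD ["python", "tflow.py"]
```

```bash
#!/bin/bash
# Plausible build.sh, not the actual script: build the image (tag taken from the log),
# then run it with all GPUs exposed so TensorFlow sees the RTX A6000.
docker build -t ml-tensorflow-win .
docker run --rm --gpus all ml-tensorflow-win
```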

michael@14900c MINGW64 /c/wse_github/obrienlabsdev/machine-learning (main)
$ ./build.sh
#0 building with "desktop-linux" instance using docker driver

#1 [internal] load build definition from Dockerfile
#1 transferring dockerfile: 486B done
#1 DONE 0.0s

#2 [auth] tensorflow/tensorflow:pull token for registry-1.docker.io
#2 DONE 0.0s

#3 [internal] load metadata for docker.io/tensorflow/tensorflow:2.14.0-gpu
#3 DONE 0.5s

#4 [internal] load .dockerignore
#4 transferring context: 2B done
#4 DONE 0.0s

#5 [1/3] FROM docker.io/tensorflow/tensorflow:2.14.0-gpu@sha256:64602abcd8cc4f4bdd6268ca0abc39e6d37113d700886afd15f6dd151210b206
#5 DONE 0.0s

#6 [internal] load build context
#6 transferring context: 4.04kB 0.0s done
#6 DONE 0.0s

#7 [2/3] WORKDIR /src
#7 CACHED

#8 [3/3] COPY /src/tflow.py .
#8 DONE 0.0s

#9 exporting to image
#9 exporting layers 0.0s done
#9 writing image sha256:6bfe99d6a6992e6d844ab0646e564767d4d37426b0c1550fe97f5944ccc559c7 done
#9 naming to docker.io/library/ml-tensorflow-win done
#9 DONE 0.1s
2024-12-01 02:27:48.937126: I tensorflow/core/util/port.cc:111] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable `TF_ENABLE_ONEDNN_OPTS=0`.
2024-12-01 02:27:48.954591: E tensorflow/compiler/xla/stream_executor/cuda/cuda_dnn.cc:9342] Unable to register cuDNN factory: Attempting to register factory for plugin cuDNN when one has already been registered
2024-12-01 02:27:48.954624: E tensorflow/compiler/xla/stream_executor/cuda/cuda_fft.cc:609] Unable to register cuFFT factory: Attempting to register factory for plugin cuFFT when one has already been registered
2024-12-01 02:27:48.954664: E tensorflow/compiler/xla/stream_executor/cuda/cuda_blas.cc:1518] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered
2024-12-01 02:27:48.958926: I tensorflow/core/platform/cpu_feature_guard.cc:182] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 AVX_VNNI FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
2024-12-01 02:27:49.701098: I tensorflow/compiler/xla/stream_executor/cuda/cuda_gpu_executor.cc:880] could not open file to read NUMA node: /sys/bus/pci/devices/0000:01:00.0/numa_node
Your kernel may have been built without NUMA support.
2024-12-01 02:27:49.703305: I tensorflow/compiler/xla/stream_executor/cuda/cuda_gpu_executor.cc:880] could not open file to read NUMA node: /sys/bus/pci/devices/0000:01:00.0/numa_node
Your kernel may have been built without NUMA support.
2024-12-01 02:27:49.703345: I tensorflow/compiler/xla/stream_executor/cuda/cuda_gpu_executor.cc:880] could not open file to read NUMA node: /sys/bus/pci/devices/0000:01:00.0/numa_node
Your kernel may have been built without NUMA support.
2024-12-01 02:27:49.704237: I tensorflow/compiler/xla/stream_executor/cuda/cuda_gpu_executor.cc:880] could not open file to read NUMA node: /sys/bus/pci/devices/0000:01:00.0/numa_node
Your kernel may have been built without NUMA support.
2024-12-01 02:27:49.704267: I tensorflow/compiler/xla/stream_executor/cuda/cuda_gpu_executor.cc:880] could not open file to read NUMA node: /sys/bus/pci/devices/0000:01:00.0/numa_node
Your kernel may have been built without NUMA support.
2024-12-01 02:27:49.704276: I tensorflow/compiler/xla/stream_executor/cuda/cuda_gpu_executor.cc:880] could not open file to read NUMA node: /sys/bus/pci/devices/0000:01:00.0/numa_node
Your kernel may have been built without NUMA support.
2024-12-01 02:27:49.801786: I tensorflow/compiler/xla/stream_executor/cuda/cuda_gpu_executor.cc:880] could not open file to read NUMA node: /sys/bus/pci/devices/0000:01:00.0/numa_node
Your kernel may have been built without NUMA support.
2024-12-01 02:27:49.801822: I tensorflow/compiler/xla/stream_executor/cuda/cuda_gpu_executor.cc:880] could not open file to read NUMA node: /sys/bus/pci/devices/0000:01:00.0/numa_node
Your kernel may have been built without NUMA support.
2024-12-01 02:27:49.801828: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1977] Could not identify NUMA node of platform GPU id 0, defaulting to 0.  Your kernel may not have been built with NUMA support.
2024-12-01 02:27:49.801844: I tensorflow/compiler/xla/stream_executor/cuda/cuda_gpu_executor.cc:880] could not open file to read NUMA node: /sys/bus/pci/devices/0000:01:00.0/numa_node
Your kernel may have been built without NUMA support.
2024-12-01 02:27:49.801867: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1886] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 46009 MB memory:  -> device: 0, name: NVIDIA RTX A6000, pci bus id: 0000:01:00.0, compute capability: 8.6
Downloading data from https://www.cs.toronto.edu/~kriz/cifar-100-python.tar.gz
169001437/169001437 [==============================] - 4s 0us/step
Epoch 1/25
2024-12-01 02:28:02.185247: I tensorflow/compiler/xla/stream_executor/cuda/cuda_dnn.cc:442] Loaded cuDNN version 8600
2024-12-01 02:28:04.332728: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x7fb8e8661c10 initialized for platform CUDA (this does not guarantee that XLA will be used). Devices:
2024-12-01 02:28:04.332749: I tensorflow/compiler/xla/service/service.cc:176]   StreamExecutor device (0): NVIDIA RTX A6000, Compute Capability 8.6
2024-12-01 02:28:04.336152: I tensorflow/compiler/mlir/tensorflow/utils/dump_mlir_util.cc:269] disabling MLIR crash reproducer, set env var `MLIR_CRASH_REPRODUCER_DIRECTORY` to enable.
2024-12-01 02:28:04.384715: I ./tensorflow/compiler/jit/device_compiler.h:186] Compiled cluster using XLA!  This line is logged at most once for the lifetime of the process.
10/10 [==============================] - 34s 1s/step - loss: 5.5942 - accuracy: 0.0251
Epoch 2/25
10/10 [==============================] - 3s 293ms/step - loss: 4.2972 - accuracy: 0.0699
Epoch 3/25
10/10 [==============================] - 3s 295ms/step - loss: 3.8669 - accuracy: 0.1154
Epoch 4/25
10/10 [==============================] - 3s 295ms/step - loss: 3.5305 - accuracy: 0.1677
Epoch 5/25
10/10 [==============================] - 3s 295ms/step - loss: 3.1965 - accuracy: 0.2263
Epoch 6/25
10/10 [==============================] - 3s 296ms/step - loss: 2.8172 - accuracy: 0.3008
Epoch 7/25
10/10 [==============================] - 3s 296ms/step - loss: 2.3786 - accuracy: 0.3944
Epoch 8/25
10/10 [==============================] - 3s 295ms/step - loss: 1.9369 - accuracy: 0.4957
Epoch 9/25
10/10 [==============================] - 3s 296ms/step - loss: 1.5214 - accuracy: 0.5956
Epoch 10/25
10/10 [==============================] - 3s 297ms/step - loss: 1.1885 - accuracy: 0.6777
Epoch 11/25
10/10 [==============================] - 3s 298ms/step - loss: 0.8498 - accuracy: 0.7654
Epoch 12/25
10/10 [==============================] - 3s 297ms/step - loss: 0.6862 - accuracy: 0.8079
Epoch 13/25
10/10 [==============================] - 3s 298ms/step - loss: 0.4986 - accuracy: 0.8589
Epoch 14/25
10/10 [==============================] - 3s 300ms/step - loss: 0.3895 - accuracy: 0.8877
Epoch 15/25
10/10 [==============================] - 3s 298ms/step - loss: 0.3158 - accuracy: 0.9085
Epoch 16/25
10/10 [==============================] - 3s 301ms/step - loss: 0.2679 - accuracy: 0.9245
Epoch 17/25
10/10 [==============================] - 3s 300ms/step - loss: 0.2197 - accuracy: 0.9366
Epoch 18/25
10/10 [==============================] - 3s 299ms/step - loss: 0.1929 - accuracy: 0.9446
Epoch 19/25
10/10 [==============================] - 3s 300ms/step - loss: 0.1641 - accuracy: 0.9532
Epoch 20/25
10/10 [==============================] - 3s 300ms/step - loss: 0.1469 - accuracy: 0.9581
Epoch 21/25
10/10 [==============================] - 3s 300ms/step - loss: 0.1393 - accuracy: 0.9599
Epoch 22/25
10/10 [==============================] - 3s 301ms/step - loss: 0.1278 - accuracy: 0.9631
Epoch 23/25
10/10 [==============================] - 3s 302ms/step - loss: 0.1229 - accuracy: 0.9633
Epoch 24/25
10/10 [==============================] - 3s 300ms/step - loss: 0.1223 - accuracy: 0.9648
Epoch 25/25
10/10 [==============================] - 3s 300ms/step - loss: 0.1282 - accuracy: 0.9627

michael@14900c MINGW64 /c/wse_github/obrienlabs/benchmark (master)
$ nvidia-smi
Sat Nov 30 21:32:22 2024
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 551.86                 Driver Version: 551.86         CUDA Version: 12.4     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                     TCC/WDDM  | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA RTX A6000             WDDM  |   00000000:01:00.0 Off |                  Off |
| 65%   87C    P2            284W /  300W |   46306MiB /  49140MiB |     97%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+

+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI        PID   Type   Process name                              GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|
+-----------------------------------------------------------------------------------------+
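
The nvidia-smi capture above was taken mid-run: the A6000 sits at 97% utilization with ~46 GiB of the 48 GiB frame buffer allocated. That allocation reflects TensorFlow's default behaviour of reserving nearly all GPU memory up front (the "Created device ... with 46009 MB memory" lines earlier), not the working-set size of the model itself. As a rough throughput figure, the last run's ~300 ms/step at 10 steps per epoch means one pass over the 50,000 CIFAR-100 images every ~3 s, i.e. on the order of 17,000 images/s sustained. If sharing the GPU with other workloads matters, memory growth can be enabled before any ops run; a minimal sketch:

```python
# Optional: ask TensorFlow to grow GPU memory on demand instead of reserving ~46 GiB up front.
# Must run before any tensors are placed on the GPU (e.g. at the top of tflow.py).
import tensorflow as tf

for gpu in tf.config.list_physical_devices("GPU"):
    tf.config.experimental.set_memory_growth(gpu, True)
```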