
TensorFlow 2.14 on P1 Gen 6 (Ada AD104 / NVIDIA RTX 3500 Ada GPU) - OK using tensorflow/tensorflow:2.14.0-gpu on CUDA 12.6 #33

Open
obriensystems opened this issue Nov 30, 2024 · 2 comments
obriensystems opened this issue Nov 30, 2024 · 2 comments

@obriensystems (Member):


(venv-cuda)
micha@p1gen6 MINGW64 /c/wse_github/ObrienlabsDev/machine-learning (main)
$ nvidia-smi
Sat Nov 30 18:54:26 2024
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 560.94                 Driver Version: 560.94         CUDA Version: 12.6     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                  Driver-Model | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA RTX 3500 Ada Gene...  WDDM  |   00000000:01:00.0 Off |                  Off |
| N/A   58C    P3             21W /   98W |       0MiB /  12282MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+

+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI        PID   Type   Process name                              GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|
|  No running processes found                                                             |
+-----------------------------------------------------------------------------------------+
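A quick sanity check that TensorFlow inside the container actually sees the card (a minimal sketch; on this machine it should report a single device, the RTX 3500 Ada, as /physical_device:GPU:0):

import tensorflow as tf

# List the physical GPUs visible to TensorFlow; expect one entry here.
print(tf.config.list_physical_devices('GPU'))
# True for the tensorflow/tensorflow:*-gpu images, which are built with CUDA.
print(tf.test.is_built_with_cuda())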



code (src/tflow.py)
import tensorflow as tf

# References:
# https://www.tensorflow.org/guide/distributed_training
# https://www.tensorflow.org/tutorials/distribute/keras
# https://keras.io/guides/distributed_training/
# https://github.com/microsoft/tensorflow-directml/issues/352
# https://learn.microsoft.com/en-us/windows/ai/directml/gpu-faq

# On a dual RTX-4090 box an explicit device list works:
#   strategy = tf.distribute.MirroredStrategy(devices=["/gpu:0", "/gpu:1"])
# On the Lenovo P1 Gen 6 there is a single RTX 3500 Ada GPU (the embedded
# Intel Iris GPU of the 13900 is not visible to TensorFlow), so the default
# MirroredStrategy resolves to one replica.
strategy = tf.distribute.MirroredStrategy()
#print('Number of devices: {}'.format(strategy.num_replicas_in_sync))

# Alternatives tried earlier, kept for reference:
#   strategy = tf.distribute.OneDeviceStrategy(device="/gpu:0")
#   strategy = tf.distribute.experimental.CentralStorageStrategy()
#   strategy = tf.distribute.MultiWorkerMirroredStrategy()
#   strategy = tf.distribute.MirroredStrategy(
#       devices=["/gpu:0", "/gpu:1"],
#       cross_device_ops=tf.distribute.HierarchicalCopyAllReduce())
# The TF 1.x Session/GPUOptions and keras multi_gpu_model variants (and the
# DirectML "DML:0 mapped to multiple DML devices" error they hit) are dropped
# here; see https://github.com/tensorflow/tensorflow/issues/19083 for that issue.

# Under git-bash on Windows: set PYTHONIOENCODING=utf-8

cifar = tf.keras.datasets.cifar100
(x_train, y_train), (x_test, y_test) = cifar.load_data()

with strategy.scope():
    # https://www.tensorflow.org/api_docs/python/tf/keras/applications/resnet50/ResNet50
    parallel_model = tf.keras.applications.ResNet50(
        include_top=True,
        weights=None,
        input_shape=(32, 32, 3),
        classes=100)
    # ResNet50 with include_top=True ends in a softmax, so from_logits=False.
    loss_fn = tf.keras.losses.SparseCategoricalCrossentropy(from_logits=False)
    # https://keras.io/api/models/model_training_apis/
    parallel_model.compile(optimizer="adam", loss=loss_fn, metrics=["accuracy"])

# batch_size was swept across 256, 512, 1024, 2048 and 4096 in the runs below
parallel_model.fit(x_train, y_train, epochs=25, batch_size=512)
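The script loads the CIFAR-100 test split but never evaluates on it; a one-line addition after fit() would report held-out accuracy (a sketch reusing the objects above):

# Held-out evaluation - x_test/y_test are already loaded above.
test_loss, test_acc = parallel_model.evaluate(x_test, y_test, batch_size=512)
print("test accuracy: %.4f" % test_acc)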


Dockerfile

# https://hub.docker.com/r/nvidia/cuda/tags
#FROM nvidia/cuda:12.2.0-runtime-ubi9
#FROM tensorflow/tensorflow:latest-gpu
# https://hub.docker.com/layers/tensorflow/tensorflow/2.14.0-gpu/images/sha256-64602abcd8cc4f4bdd6268ca0abc39e6d37113d700886afd15f6dd151210b206?context=explore
FROM tensorflow/tensorflow:2.14.0-gpu
WORKDIR /src
COPY /src/tflow.py .
CMD ["python", "tflow.py"]

#RUN pip install -U jupyterlab pandas matplotlib
#EXPOSE 8888
#ENTRYPOINT ["jupyter", "lab","--ip=0.0.0.0","--allow-root","--no-browser"]
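build.sh itself is not shown in this issue; a plausible minimal sketch, assuming plain docker with GPU passthrough and matching the "naming to docker.io/library/ml-tensorflow-win" line in the build logs below, would be:

#!/bin/bash
# Hypothetical reconstruction of build.sh (the real script is not in the issue).
docker build -t ml-tensorflow-win .
docker run --rm --gpus all ml-tensorflow-win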



micha@p1gen6 MINGW64 /c/wse_github/ObrienlabsDev/machine-learning (main)
$ ./build.sh
#0 building with "desktop-linux" instance using docker driver

#1 [internal] load build definition from Dockerfile
#1 transferring dockerfile: 414B 0.0s done
#1 DONE 0.0s

#2 [auth] tensorflow/tensorflow:pull token for registry-1.docker.io
#2 DONE 0.0s

#3 [internal] load metadata for docker.io/tensorflow/tensorflow:2.14.0-gpu
#3 DONE 0.9s

#4 [internal] load .dockerignore
#4 transferring context: 2B done
#4 DONE 0.0s

#5 [internal] load build context
#5 transferring context: 57B done
#5 DONE 0.0s

#6 [1/3] FROM docker.io/tensorflow/tensorflow:2.14.0-gpu@sha256:64602abcd8cc4f4bdd6268ca0abc39e6d37113d700886afd15f6dd151210b206
#6 resolve docker.io/tensorflow/tensorflow:2.14.0-gpu@sha256:64602abcd8cc4f4bdd6268ca0abc39e6d37113d700886afd15f6dd151210b206 done
#6 sha256:68d96d40670e06f0c7b926edf49528a881cf2fca37714de065eacfc75e831192 11.70kB / 11.70kB done
...
#6 extracting sha256:91c6784a9bf6ec3f4bd6f1562e09ae227761f1d4a3b44f8a2d9d27a436cebb4e done
#6 extracting sha256:2ef3e831f8658e728821077ea54594b5bc43898fd70b3b55851d43c3ebaba8f6 done
#6 DONE 165.9s

#7 [2/3] WORKDIR /src
#7 DONE 0.9s

#8 [3/3] COPY /src/tflow.py .
#8 DONE 0.0s

#9 exporting to image
#9 exporting layers 0.0s done
#9 writing image sha256:1d6810df1690137c7ea77998e44557269fbcdfbfcf20ba14748865748bdbeb1a done
#9 naming to docker.io/library/ml-tensorflow-win done
#9 DONE 0.0s
2024-11-30 23:48:43.537015: I tensorflow/core/util/port.cc:111] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable `TF_ENABLE_ONEDNN_OPTS=0`.
2024-11-30 23:48:43.558784: E tensorflow/compiler/xla/stream_executor/cuda/cuda_dnn.cc:9342] Unable to register cuDNN factory: Attempting to register factory for plugin cuDNN when one has already been registered
2024-11-30 23:48:43.558819: E tensorflow/compiler/xla/stream_executor/cuda/cuda_fft.cc:609] Unable to register cuFFT factory: Attempting to register factory for plugin cuFFT when one has already been registered
2024-11-30 23:48:43.558832: E tensorflow/compiler/xla/stream_executor/cuda/cuda_blas.cc:1518] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered
2024-11-30 23:48:43.562966: I tensorflow/core/platform/cpu_feature_guard.cc:182] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 AVX_VNNI FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
2024-11-30 23:48:45.152510: I tensorflow/compiler/xla/stream_executor/cuda/cuda_gpu_executor.cc:880] could not open file to read NUMA node: /sys/bus/pci/devices/0000:01:00.0/numa_node
Your kernel may have been built without NUMA support.
...
Your kernel may have been built without NUMA support.
2024-11-30 23:48:45.352833: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1977] Could not identify NUMA node of platform GPU id 0, defaulting to 0.  Your kernel may not have been built with NUMA support.
2024-11-30 23:48:45.352849: I tensorflow/compiler/xla/stream_executor/cuda/cuda_gpu_executor.cc:880] could not open file to read NUMA node: /sys/bus/pci/devices/0000:01:00.0/numa_node
Your kernel may have been built without NUMA support.
2024-11-30 23:48:45.352865: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1886] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 9581 MB memory:  -> device: 0, name: NVIDIA RTX 3500 Ada Generation Laptop GPU, pci bus id: 0000:01:00.0, compute capability: 8.9
Downloading data from https://www.cs.toronto.edu/~kriz/cifar-100-python.tar.gz
169001437/169001437 [==============================] - 15s 0us/step
Epoch 1/25
2024-11-30 23:49:12.649878: I tensorflow/compiler/xla/stream_executor/cuda/cuda_dnn.cc:442] Loaded cuDNN version 8600
2024-11-30 23:49:14.510726: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x7f322800cb40 initialized for platform CUDA (this does not guarantee that XLA will be used). Devices:
2024-11-30 23:49:14.510749: I tensorflow/compiler/xla/service/service.cc:176]   StreamExecutor device (0): NVIDIA RTX 3500 Ada Generation Laptop GPU, Compute Capability 8.9
2024-11-30 23:49:14.513676: I tensorflow/compiler/mlir/tensorflow/utils/dump_mlir_util.cc:269] disabling MLIR crash reproducer, set env var `MLIR_CRASH_REPRODUCER_DIRECTORY` to enable.
2024-11-30 23:49:14.570970: I ./tensorflow/compiler/jit/device_compiler.h:186] Compiled cluster using XLA!  This line is logged at most once for the lifetime of the process.
98/98 [==============================] - 29s 84ms/step - loss: 4.3393 - accuracy: 0.0846
Epoch 2/25
98/98 [==============================] - 5s 54ms/step - loss: 3.5924 - accuracy: 0.1715
Epoch 3/25
98/98 [==============================] - 5s 54ms/step - loss: 3.1520 - accuracy: 0.2434
Epoch 4/25
98/98 [==============================] - 5s 54ms/step - loss: 2.9126 - accuracy: 0.2939
Epoch 5/25
98/98 [==============================] - 5s 54ms/step - loss: 3.7606 - accuracy: 0.2035
Epoch 6/25
98/98 [==============================] - 5s 55ms/step - loss: 3.3880 - accuracy: 0.2272
Epoch 7/25
98/98 [==============================] - 5s 55ms/step - loss: 2.8392 - accuracy: 0.3091
Epoch 8/25
98/98 [==============================] - 5s 55ms/step - loss: 2.7753 - accuracy: 0.3222
Epoch 9/25
98/98 [==============================] - 5s 55ms/step - loss: 2.5441 - accuracy: 0.3679
Epoch 10/25
98/98 [==============================] - 5s 55ms/step - loss: 2.2937 - accuracy: 0.4159
Epoch 11/25
98/98 [==============================] - 5s 55ms/step - loss: 2.0584 - accuracy: 0.4679
Epoch 12/25
98/98 [==============================] - 5s 55ms/step - loss: 1.8501 - accuracy: 0.5134
Epoch 13/25
98/98 [==============================] - 5s 55ms/step - loss: 1.6035 - accuracy: 0.5678
Epoch 14/25
98/98 [==============================] - 5s 56ms/step - loss: 1.3341 - accuracy: 0.6309
Epoch 15/25
98/98 [==============================] - 6s 57ms/step - loss: 1.1129 - accuracy: 0.6875
Epoch 16/25
98/98 [==============================] - 6s 58ms/step - loss: 0.9613 - accuracy: 0.7251
Epoch 17/25
98/98 [==============================] - 6s 57ms/step - loss: 0.7963 - accuracy: 0.7765
Epoch 18/25
98/98 [==============================] - 5s 56ms/step - loss: 0.7567 - accuracy: 0.7839
Epoch 19/25
98/98 [==============================] - 6s 57ms/step - loss: 0.5892 - accuracy: 0.8342
Epoch 20/25
98/98 [==============================] - 6s 57ms/step - loss: 0.5843 - accuracy: 0.8339
Epoch 21/25
98/98 [==============================] - 6s 57ms/step - loss: 1.3317 - accuracy: 0.6774
Epoch 22/25
98/98 [==============================] - 6s 58ms/step - loss: 0.6691 - accuracy: 0.8199
Epoch 23/25
98/98 [==============================] - 6s 58ms/step - loss: 0.3243 - accuracy: 0.9192
Epoch 24/25
98/98 [==============================] - 6s 59ms/step - loss: 1.0674 - accuracy: 0.7385
Epoch 25/25
98/98 [==============================] - 6s 59ms/step - loss: 1.0427 - accuracy: 0.7503

@obriensystems (Member, Author):

[image]

@obriensystems (Member, Author):

Batch size sweep: 256, 512, 1024, 2048, 4096 (steps per epoch shrink accordingly; see the note below).
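With 50,000 CIFAR-100 training images, steps per epoch is ceil(50000 / batch_size), which matches the progress bars in the runs below (196, 98, 49, 25 and 13 steps):

import math

# Steps per epoch for each batch size in the sweep.
for bs in (256, 512, 1024, 2048, 4096):
    print(bs, math.ceil(50000 / bs))
# -> 196, 98, 49, 25, 13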

micha@p1gen6 MINGW64 /c/wse_github/ObrienlabsDev/machine-learning (main)
$ ./build.sh
#0 building with "desktop-linux" instance using docker driver

#1 [internal] load build definition from Dockerfile
#1 transferring dockerfile: 575B done
#1 DONE 0.0s

#2 [internal] load metadata for docker.io/tensorflow/tensorflow:2.14.0-gpu
#2 DONE 0.5s

#3 [internal] load .dockerignore
#3 transferring context: 2B done
#3 DONE 0.0s

#4 [1/3] FROM docker.io/tensorflow/tensorflow:2.14.0-gpu@sha256:64602abcd8cc4f4bdd6268ca0abc39e6d37113d700886afd15f6dd151210b206
#4 DONE 0.0s

#5 [internal] load build context
#5 transferring context: 3.67kB 0.0s done
#5 DONE 0.0s

#6 [2/3] WORKDIR /src
#6 CACHED

#7 [3/3] COPY /src/tflow.py .
#7 DONE 0.0s

#8 exporting to image
#8 exporting layers 0.0s done
#8 writing image sha256:d820728b9b315c1efdca05a5df834ebee7a4367b7752f51e93e3fb581bccc798 done
#8 naming to docker.io/library/ml-tensorflow-win done
#8 DONE 0.0s
2024-11-30 23:57:26.685484: I tensorflow/core/util/port.cc:111] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable `TF_ENABLE_ONEDNN_OPTS=0`.
2024-11-30 23:57:26.706273: E tensorflow/compiler/xla/stream_executor/cuda/cuda_dnn.cc:9342] Unable to register cuDNN factory: Attempting to register factory for plugin cuDNN when one has already been registered
2024-11-30 23:57:26.706309: E tensorflow/compiler/xla/stream_executor/cuda/cuda_fft.cc:609] Unable to register cuFFT factory: Attempting to register factory for plugin cuFFT when one has already been registered
2024-11-30 23:57:26.706321: E tensorflow/compiler/xla/stream_executor/cuda/cuda_blas.cc:1518] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered
2024-11-30 23:57:26.710656: I tensorflow/core/platform/cpu_feature_guard.cc:182] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 AVX_VNNI FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
2024-11-30 23:57:27.906699: I tensorflow/compiler/xla/stream_executor/cuda/cuda_gpu_executor.cc:880] could not open file to read NUMA node: /sys/bus/pci/devices/0000:01:00.0/numa_node
Your kernel may have been built without NUMA support.
...
2024-11-30 23:57:28.039281: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1977] Could not identify NUMA node of platform GPU id 0, defaulting to 0.  Your kernel may not have been built with NUMA support.
2024-11-30 23:57:28.039301: I tensorflow/compiler/xla/stream_executor/cuda/cuda_gpu_executor.cc:880] could not open file to read NUMA node: /sys/bus/pci/devices/0000:01:00.0/numa_node
Your kernel may have been built without NUMA support.
2024-11-30 23:57:28.039338: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1886] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 9581 MB memory:  -> device: 0, name: NVIDIA RTX 3500 Ada Generation Laptop GPU, pci bus id: 0000:01:00.0, compute capability: 8.9
Downloading data from https://www.cs.toronto.edu/~kriz/cifar-100-python.tar.gz
169001437/169001437 [==============================] - 9s 0us/step
Epoch 1/25
2024-11-30 23:57:48.472889: I tensorflow/compiler/xla/stream_executor/cuda/cuda_dnn.cc:442] Loaded cuDNN version 8600
2024-11-30 23:57:51.465069: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x7f54603fc950 initialized for platform CUDA (this does not guarantee that XLA will be used). Devices:
2024-11-30 23:57:51.465098: I tensorflow/compiler/xla/service/service.cc:176]   StreamExecutor device (0): NVIDIA RTX 3500 Ada Generation Laptop GPU, Compute Capability 8.9
2024-11-30 23:57:51.468427: I tensorflow/compiler/mlir/tensorflow/utils/dump_mlir_util.cc:269] disabling MLIR crash reproducer, set env var `MLIR_CRASH_REPRODUCER_DIRECTORY` to enable.
2024-11-30 23:57:51.525029: I ./tensorflow/compiler/jit/device_compiler.h:186] Compiled cluster using XLA!  This line is logged at most once for the lifetime of the process.
196/196 [==============================] - 34s 57ms/step - loss: 4.3636 - accuracy: 0.0868
Epoch 2/25
196/196 [==============================] - 8s 40ms/step - loss: 3.9713 - accuracy: 0.1480
Epoch 3/25
196/196 [==============================] - 8s 40ms/step - loss: 3.7218 - accuracy: 0.1683
Epoch 4/25
196/196 [==============================] - 8s 41ms/step - loss: 3.2891 - accuracy: 0.2331
Epoch 5/25
196/196 [==============================] - 8s 41ms/step - loss: 3.6898 - accuracy: 0.1783
Epoch 6/25
196/196 [==============================] - 8s 42ms/step - loss: 3.3448 - accuracy: 0.2245
Epoch 7/25
196/196 [==============================] - 8s 42ms/step - loss: 3.1134 - accuracy: 0.2653
Epoch 8/25
196/196 [==============================] - 8s 43ms/step - loss: 3.0021 - accuracy: 0.2784
Epoch 9/25
196/196 [==============================] - 8s 43ms/step - loss: 2.7545 - accuracy: 0.3221
Epoch 10/25
196/196 [==============================] - 9s 44ms/step - loss: 2.7476 - accuracy: 0.3218
Epoch 11/25
196/196 [==============================] - 8s 43ms/step - loss: 2.6039 - accuracy: 0.3515
Epoch 12/25
196/196 [==============================] - 9s 44ms/step - loss: 2.3861 - accuracy: 0.3956
Epoch 13/25
196/196 [==============================] - 9s 44ms/step - loss: 2.3027 - accuracy: 0.4242
Epoch 14/25
196/196 [==============================] - 9s 44ms/step - loss: 2.1197 - accuracy: 0.4542
Epoch 15/25
196/196 [==============================] - 9s 44ms/step - loss: 1.8711 - accuracy: 0.5049
Epoch 16/25
196/196 [==============================] - 9s 45ms/step - loss: 1.6557 - accuracy: 0.5570
Epoch 17/25
196/196 [==============================] - 9s 44ms/step - loss: 1.4631 - accuracy: 0.6041
Epoch 18/25
196/196 [==============================] - 9s 45ms/step - loss: 1.3836 - accuracy: 0.6176
Epoch 19/25
196/196 [==============================] - 9s 45ms/step - loss: 1.9901 - accuracy: 0.5074
Epoch 20/25
196/196 [==============================] - 9s 44ms/step - loss: 1.7544 - accuracy: 0.5436
Epoch 21/25
196/196 [==============================] - 9s 45ms/step - loss: 1.8138 - accuracy: 0.5295
Epoch 22/25
196/196 [==============================] - 9s 44ms/step - loss: 1.1861 - accuracy: 0.6827
Epoch 23/25
196/196 [==============================] - 9s 45ms/step - loss: 0.8640 - accuracy: 0.7623
Epoch 24/25
196/196 [==============================] - 9s 43ms/step - loss: 0.6471 - accuracy: 0.8293
Epoch 25/25
196/196 [==============================] - 8s 43ms/step - loss: 0.5260 - accuracy: 0.8597
(venv-cuda)
micha@p1gen6 MINGW64 /c/wse_github/ObrienlabsDev/machine-learning (main)
$ ./build.sh
#0 building with "desktop-linux" instance using docker driver

#1 [internal] load build definition from Dockerfile
#1 transferring dockerfile: 575B done
#1 DONE 0.0s

#2 [auth] tensorflow/tensorflow:pull token for registry-1.docker.io
#2 DONE 0.0s

#3 [internal] load metadata for docker.io/tensorflow/tensorflow:2.14.0-gpu
#3 DONE 0.6s

#4 [internal] load .dockerignore
#4 transferring context: 2B done
#4 DONE 0.0s

#5 [1/3] FROM docker.io/tensorflow/tensorflow:2.14.0-gpu@sha256:64602abcd8cc4f4bdd6268ca0abc39e6d37113d700886afd15f6dd151210b206
#5 DONE 0.0s

#6 [internal] load build context
#6 transferring context: 3.67kB done
#6 DONE 0.0s

#7 [2/3] WORKDIR /src
#7 CACHED

#8 [3/3] COPY /src/tflow.py .
#8 CACHED

#9 exporting to image
#9 exporting layers done
#9 writing image sha256:1d6810df1690137c7ea77998e44557269fbcdfbfcf20ba14748865748bdbeb1a done
#9 naming to docker.io/library/ml-tensorflow-win done
#9 DONE 0.0s
2024-12-01 00:04:25.038485: I tensorflow/core/util/port.cc:111] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable `TF_ENABLE_ONEDNN_OPTS=0`.
2024-12-01 00:04:25.058796: E tensorflow/compiler/xla/stream_executor/cuda/cuda_dnn.cc:9342] Unable to register cuDNN factory: Attempting to register factory for plugin cuDNN when one has already been registered
2024-12-01 00:04:25.058830: E tensorflow/compiler/xla/stream_executor/cuda/cuda_fft.cc:609] Unable to register cuFFT factory: Attempting to register factory for plugin cuFFT when one has already been registered
2024-12-01 00:04:25.058841: E tensorflow/compiler/xla/stream_executor/cuda/cuda_blas.cc:1518] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered
2024-12-01 00:04:25.063577: I tensorflow/core/platform/cpu_feature_guard.cc:182] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 AVX_VNNI FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
2024-12-01 00:04:26.074299: I tensorflow/compiler/xla/stream_executor/cuda/cuda_gpu_executor.cc:880] could not open file to read NUMA node: /sys/bus/pci/devices/0000:01:00.0/numa_node
Your kernel may have been built without NUMA support.
...
2024-12-01 00:04:26.226492: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1977] Could not identify NUMA node of platform GPU id 0, defaulting to 0.  Your kernel may not have been built with NUMA support.
2024-12-01 00:04:26.226564: I tensorflow/compiler/xla/stream_executor/cuda/cuda_gpu_executor.cc:880] could not open file to read NUMA node: /sys/bus/pci/devices/0000:01:00.0/numa_node
Your kernel may have been built without NUMA support.
2024-12-01 00:04:26.226595: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1886] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 9581 MB memory:  -> device: 0, name: NVIDIA RTX 3500 Ada Generation Laptop GPU, pci bus id: 0000:01:00.0, compute capability: 8.9
Downloading data from https://www.cs.toronto.edu/~kriz/cifar-100-python.tar.gz
169001437/169001437 [==============================] - 6s 0us/step
Epoch 1/25
2024-12-01 00:04:43.941642: I tensorflow/compiler/xla/stream_executor/cuda/cuda_dnn.cc:442] Loaded cuDNN version 8600
2024-12-01 00:04:46.425421: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x7f61bc946980 initialized for platform CUDA (this does not guarantee that XLA will be used). Devices:
2024-12-01 00:04:46.425451: I tensorflow/compiler/xla/service/service.cc:176]   StreamExecutor device (0): NVIDIA RTX 3500 Ada Generation Laptop GPU, Compute Capability 8.9
2024-12-01 00:04:46.428552: I tensorflow/compiler/mlir/tensorflow/utils/dump_mlir_util.cc:269] disabling MLIR crash reproducer, set env var `MLIR_CRASH_REPRODUCER_DIRECTORY` to enable.
2024-12-01 00:04:46.484245: I ./tensorflow/compiler/jit/device_compiler.h:186] Compiled cluster using XLA!  This line is logged at most once for the lifetime of the process.
98/98 [==============================] - 30s 83ms/step - loss: 4.4019 - accuracy: 0.0753
Epoch 2/25
98/98 [==============================] - 5s 54ms/step - loss: 3.5497 - accuracy: 0.1767
Epoch 3/25
98/98 [==============================] - 5s 54ms/step - loss: 3.1803 - accuracy: 0.2434
Epoch 4/25
98/98 [==============================] - 5s 54ms/step - loss: 3.3044 - accuracy: 0.2308
Epoch 5/25
98/98 [==============================] - 5s 54ms/step - loss: 3.0593 - accuracy: 0.2603
Epoch 6/25
98/98 [==============================] - 5s 54ms/step - loss: 2.6995 - accuracy: 0.3353
Epoch 7/25
98/98 [==============================] - 5s 54ms/step - loss: 2.3841 - accuracy: 0.3945
Epoch 8/25
98/98 [==============================] - 5s 55ms/step - loss: 2.3127 - accuracy: 0.4143
Epoch 9/25
98/98 [==============================] - 5s 54ms/step - loss: 2.0369 - accuracy: 0.4691
Epoch 10/25
98/98 [==============================] - 5s 54ms/step - loss: 1.7624 - accuracy: 0.5369
Epoch 11/25
98/98 [==============================] - 5s 55ms/step - loss: 1.6354 - accuracy: 0.5610
Epoch 12/25
98/98 [==============================] - 5s 55ms/step - loss: 1.3999 - accuracy: 0.6258
Epoch 13/25
98/98 [==============================] - 5s 56ms/step - loss: 3.3840 - accuracy: 0.2818
Epoch 14/25
98/98 [==============================] - 5s 55ms/step - loss: 3.4657 - accuracy: 0.2331
Epoch 15/25
98/98 [==============================] - 5s 56ms/step - loss: 3.0036 - accuracy: 0.3019
Epoch 16/25
98/98 [==============================] - 5s 56ms/step - loss: 2.4746 - accuracy: 0.3717
Epoch 17/25
98/98 [==============================] - 5s 56ms/step - loss: 2.1085 - accuracy: 0.4415
Epoch 18/25
98/98 [==============================] - 5s 56ms/step - loss: 1.8274 - accuracy: 0.5083
Epoch 19/25
98/98 [==============================] - 6s 56ms/step - loss: 1.5464 - accuracy: 0.5782
Epoch 20/25
98/98 [==============================] - 6s 57ms/step - loss: 1.2816 - accuracy: 0.6471
Epoch 21/25
98/98 [==============================] - 6s 57ms/step - loss: 1.0437 - accuracy: 0.7191
Epoch 22/25
98/98 [==============================] - 6s 57ms/step - loss: 0.7747 - accuracy: 0.7823
Epoch 23/25
98/98 [==============================] - 6s 58ms/step - loss: 0.8367 - accuracy: 0.7624
Epoch 24/25
98/98 [==============================] - 6s 58ms/step - loss: 0.6280 - accuracy: 0.8256
Epoch 25/25
98/98 [==============================] - 6s 58ms/step - loss: 0.4482 - accuracy: 0.8822
(venv-cuda)
micha@p1gen6 MINGW64 /c/wse_github/ObrienlabsDev/machine-learning (main)
$ java --version
java 21.0.5 2024-10-15 LTS
Java(TM) SE Runtime Environment (build 21.0.5+9-LTS-239)
Java HotSpot(TM) 64-Bit Server VM (build 21.0.5+9-LTS-239, mixed mode, sharing)
(venv-cuda)
micha@p1gen6 MINGW64 /c/wse_github/ObrienlabsDev/machine-learning (main)
$ ./build.sh
ERROR: listing workers for Build: failed to list workers: Canceled: context canceled
(venv-cuda)
micha@p1gen6 MINGW64 /c/wse_github/ObrienlabsDev/machine-learning (main)
$ ./build.sh
#0 building with "desktop-linux" instance using docker driver

#1 [internal] load build definition from Dockerfile
#1 transferring dockerfile: 575B done
#1 DONE 0.0s

#2 [internal] load metadata for docker.io/tensorflow/tensorflow:2.14.0-gpu
#2 ...

#3 [auth] tensorflow/tensorflow:pull token for registry-1.docker.io
#3 DONE 0.0s

#2 [internal] load metadata for docker.io/tensorflow/tensorflow:2.14.0-gpu
#2 DONE 0.8s

#4 [internal] load .dockerignore
#4 transferring context: 2B done
#4 DONE 0.0s

#5 [1/3] FROM docker.io/tensorflow/tensorflow:2.14.0-gpu@sha256:64602abcd8cc4f4bdd6268ca0abc39e6d37113d700886afd15f6dd151210b206
#5 DONE 0.0s

#6 [internal] load build context
#6 transferring context: 3.68kB 0.0s done
#6 DONE 0.0s

#7 [2/3] WORKDIR /src
#7 CACHED

#8 [3/3] COPY /src/tflow.py .
#8 DONE 0.0s

#9 exporting to image
#9 exporting layers 0.0s done
#9 writing image sha256:bbcea2a57cf72678d126f28fe2eb940774d4bd9acf9b5ecba9f66bbb589e6153 done
#9 naming to docker.io/library/ml-tensorflow-win done
#9 DONE 0.0s
2024-12-01 00:25:31.999107: I tensorflow/core/util/port.cc:111] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable `TF_ENABLE_ONEDNN_OPTS=0`.
2024-12-01 00:25:32.229234: E tensorflow/compiler/xla/stream_executor/cuda/cuda_dnn.cc:9342] Unable to register cuDNN factory: Attempting to register factory for plugin cuDNN when one has already been registered
2024-12-01 00:25:32.229266: E tensorflow/compiler/xla/stream_executor/cuda/cuda_fft.cc:609] Unable to register cuFFT factory: Attempting to register factory for plugin cuFFT when one has already been registered
2024-12-01 00:25:32.230800: E tensorflow/compiler/xla/stream_executor/cuda/cuda_blas.cc:1518] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered
2024-12-01 00:25:32.340545: I tensorflow/core/platform/cpu_feature_guard.cc:182] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 AVX_VNNI FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
2024-12-01 00:25:34.385224: I tensorflow/compiler/xla/stream_executor/cuda/cuda_gpu_executor.cc:880] could not open file to read NUMA node: /sys/bus/pci/devices/0000:01:00.0/numa_node
Your kernel may have been built without NUMA support.
...
2024-12-01 00:25:34.595767: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1977] Could not identify NUMA node of platform GPU id 0, defaulting to 0.  Your kernel may not have been built with NUMA support.
2024-12-01 00:25:34.595799: I tensorflow/compiler/xla/stream_executor/cuda/cuda_gpu_executor.cc:880] could not open file to read NUMA node: /sys/bus/pci/devices/0000:01:00.0/numa_node
Your kernel may have been built without NUMA support.
2024-12-01 00:25:34.595810: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1886] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 9581 MB memory:  -> device: 0, name: NVIDIA RTX 3500 Ada Generation Laptop GPU, pci bus id: 0000:01:00.0, compute capability: 8.9
Downloading data from https://www.cs.toronto.edu/~kriz/cifar-100-python.tar.gz
169001437/169001437 [==============================] - 18s 0us/step
Epoch 1/25
2024-12-01 00:26:04.454519: I tensorflow/compiler/xla/stream_executor/cuda/cuda_dnn.cc:442] Loaded cuDNN version 8600
2024-12-01 00:26:06.652346: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x7f9a1359f940 initialized for platform CUDA (this does not guarantee that XLA will be used). Devices:
2024-12-01 00:26:06.652378: I tensorflow/compiler/xla/service/service.cc:176]   StreamExecutor device (0): NVIDIA RTX 3500 Ada Generation Laptop GPU, Compute Capability 8.9
2024-12-01 00:26:06.659005: I tensorflow/compiler/mlir/tensorflow/utils/dump_mlir_util.cc:269] disabling MLIR crash reproducer, set env var `MLIR_CRASH_REPRODUCER_DIRECTORY` to enable.
2024-12-01 00:26:06.733169: I ./tensorflow/compiler/jit/device_compiler.h:186] Compiled cluster using XLA!  This line is logged at most once for the lifetime of the process.
49/49 [==============================] - 32s 201ms/step - loss: 4.6684 - accuracy: 0.0623
Epoch 2/25
49/49 [==============================] - 5s 103ms/step - loss: 3.6396 - accuracy: 0.1590
Epoch 3/25
49/49 [==============================] - 5s 105ms/step - loss: 3.1207 - accuracy: 0.2439
Epoch 4/25
49/49 [==============================] - 5s 105ms/step - loss: 2.8306 - accuracy: 0.3066
Epoch 5/25
49/49 [==============================] - 5s 106ms/step - loss: 2.7187 - accuracy: 0.3275
Epoch 6/25
49/49 [==============================] - 5s 106ms/step - loss: 2.6684 - accuracy: 0.3385
Epoch 7/25
49/49 [==============================] - 5s 106ms/step - loss: 2.4572 - accuracy: 0.3773
Epoch 8/25
49/49 [==============================] - 5s 106ms/step - loss: 2.9407 - accuracy: 0.2756
Epoch 9/25
49/49 [==============================] - 5s 105ms/step - loss: 2.4240 - accuracy: 0.3785
Epoch 10/25
49/49 [==============================] - 5s 106ms/step - loss: 2.0339 - accuracy: 0.4637
Epoch 11/25
49/49 [==============================] - 5s 106ms/step - loss: 1.6721 - accuracy: 0.5452
Epoch 12/25
49/49 [==============================] - 5s 105ms/step - loss: 1.3453 - accuracy: 0.6221
Epoch 13/25
49/49 [==============================] - 5s 106ms/step - loss: 1.1025 - accuracy: 0.6861
Epoch 14/25
49/49 [==============================] - 5s 106ms/step - loss: 0.8672 - accuracy: 0.7436
Epoch 15/25
49/49 [==============================] - 5s 106ms/step - loss: 0.6552 - accuracy: 0.8053
Epoch 16/25
49/49 [==============================] - 5s 105ms/step - loss: 0.5098 - accuracy: 0.8456
Epoch 17/25
49/49 [==============================] - 5s 105ms/step - loss: 0.8505 - accuracy: 0.7490
Epoch 18/25
49/49 [==============================] - 5s 106ms/step - loss: 1.0267 - accuracy: 0.7125
Epoch 19/25
49/49 [==============================] - 5s 105ms/step - loss: 0.9959 - accuracy: 0.7142
Epoch 20/25
49/49 [==============================] - 5s 106ms/step - loss: 0.4886 - accuracy: 0.8532
Epoch 21/25
49/49 [==============================] - 5s 105ms/step - loss: 0.2797 - accuracy: 0.9231
Epoch 22/25
49/49 [==============================] - 5s 105ms/step - loss: 0.1975 - accuracy: 0.9440
Epoch 23/25
49/49 [==============================] - 5s 105ms/step - loss: 0.1371 - accuracy: 0.9600
Epoch 24/25
49/49 [==============================] - 5s 107ms/step - loss: 0.0981 - accuracy: 0.9735
Epoch 25/25
49/49 [==============================] - 5s 106ms/step - loss: 0.0758 - accuracy: 0.9776
(venv-cuda)
micha@p1gen6 MINGW64 /c/wse_github/ObrienlabsDev/machine-learning (main)
$ ./build.sh
#0 building with "desktop-linux" instance using docker driver

#1 [internal] load build definition from Dockerfile
#1 transferring dockerfile: 575B done
#1 DONE 0.0s

#2 [auth] tensorflow/tensorflow:pull token for registry-1.docker.io
#2 DONE 0.0s

#3 [internal] load metadata for docker.io/tensorflow/tensorflow:2.14.0-gpu
#3 DONE 1.0s

#4 [internal] load .dockerignore
#4 transferring context: 2B done
#4 DONE 0.0s

#5 [1/3] FROM docker.io/tensorflow/tensorflow:2.14.0-gpu@sha256:64602abcd8cc4f4bdd6268ca0abc39e6d37113d700886afd15f6dd151210b206
#5 DONE 0.0s

#6 [internal] load build context
#6 transferring context: 3.68kB 0.0s done
#6 DONE 0.0s

#7 [2/3] WORKDIR /src
#7 CACHED

#8 [3/3] COPY /src/tflow.py .
#8 DONE 0.0s

#9 exporting to image
#9 exporting layers 0.0s done
#9 writing image sha256:91662b98c48f03fb3acd6fbc253841ce7aac8e445d0cdb8380442f5e29a648f1 done
#9 naming to docker.io/library/ml-tensorflow-win done
#9 DONE 0.0s
2024-12-01 00:30:21.013693: I tensorflow/core/util/port.cc:111] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable `TF_ENABLE_ONEDNN_OPTS=0`.
2024-12-01 00:30:21.034515: E tensorflow/compiler/xla/stream_executor/cuda/cuda_dnn.cc:9342] Unable to register cuDNN factory: Attempting to register factory for plugin cuDNN when one has already been registered
2024-12-01 00:30:21.034574: E tensorflow/compiler/xla/stream_executor/cuda/cuda_fft.cc:609] Unable to register cuFFT factory: Attempting to register factory for plugin cuFFT when one has already been registered
2024-12-01 00:30:21.034588: E tensorflow/compiler/xla/stream_executor/cuda/cuda_blas.cc:1518] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered
2024-12-01 00:30:21.039309: I tensorflow/core/platform/cpu_feature_guard.cc:182] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 AVX_VNNI FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
2024-12-01 00:30:22.066704: I tensorflow/compiler/xla/stream_executor/cuda/cuda_gpu_executor.cc:880] could not open file to read NUMA node: /sys/bus/pci/devices/0000:01:00.0/numa_node
Your kernel may have been built without NUMA support.
...
2024-12-01 00:30:22.239044: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1977] Could not identify NUMA node of platform GPU id 0, defaulting to 0.  Your kernel may not have been built with NUMA support.
2024-12-01 00:30:22.239062: I tensorflow/compiler/xla/stream_executor/cuda/cuda_gpu_executor.cc:880] could not open file to read NUMA node: /sys/bus/pci/devices/0000:01:00.0/numa_node
Your kernel may have been built without NUMA support.
2024-12-01 00:30:22.239076: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1886] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 9581 MB memory:  -> device: 0, name: NVIDIA RTX 3500 Ada Generation Laptop GPU, pci bus id: 0000:01:00.0, compute capability: 8.9
Downloading data from https://www.cs.toronto.edu/~kriz/cifar-100-python.tar.gz
169001437/169001437 [==============================] - 6s 0us/step
Epoch 1/25
2024-12-01 00:30:39.622591: I tensorflow/compiler/xla/stream_executor/cuda/cuda_dnn.cc:442] Loaded cuDNN version 8600
2024-12-01 00:30:41.973276: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x7f0dc02065d0 initialized for platform CUDA (this does not guarantee that XLA will be used). Devices:
2024-12-01 00:30:41.973300: I tensorflow/compiler/xla/service/service.cc:176]   StreamExecutor device (0): NVIDIA RTX 3500 Ada Generation Laptop GPU, Compute Capability 8.9
2024-12-01 00:30:41.976530: I tensorflow/compiler/mlir/tensorflow/utils/dump_mlir_util.cc:269] disabling MLIR crash reproducer, set env var `MLIR_CRASH_REPRODUCER_DIRECTORY` to enable.
2024-12-01 00:30:42.035687: I ./tensorflow/compiler/jit/device_compiler.h:186] Compiled cluster using XLA!  This line is logged at most once for the lifetime of the process.

25/25 [==============================] - 34s 400ms/step - loss: 4.9395 - accuracy: 0.0354
Epoch 2/25
25/25 [==============================] - 5s 207ms/step - loss: 3.9236 - accuracy: 0.1054
Epoch 3/25
25/25 [==============================] - 5s 207ms/step - loss: 3.4189 - accuracy: 0.1878
Epoch 4/25
25/25 [==============================] - 5s 209ms/step - loss: 3.0047 - accuracy: 0.2666
Epoch 5/25
25/25 [==============================] - 5s 208ms/step - loss: 2.6428 - accuracy: 0.3382
Epoch 6/25
25/25 [==============================] - 5s 208ms/step - loss: 2.3087 - accuracy: 0.4076
Epoch 7/25
25/25 [==============================] - 5s 208ms/step - loss: 2.1035 - accuracy: 0.4594
Epoch 8/25
25/25 [==============================] - 5s 209ms/step - loss: 1.8614 - accuracy: 0.5035
Epoch 9/25
25/25 [==============================] - 5s 208ms/step - loss: 1.5357 - accuracy: 0.5820
Epoch 10/25
25/25 [==============================] - 5s 209ms/step - loss: 1.4527 - accuracy: 0.6061
Epoch 11/25
25/25 [==============================] - 5s 209ms/step - loss: 1.0684 - accuracy: 0.6985
Epoch 12/25
25/25 [==============================] - 5s 209ms/step - loss: 0.7071 - accuracy: 0.7972
Epoch 13/25
25/25 [==============================] - 5s 209ms/step - loss: 0.5945 - accuracy: 0.8280
Epoch 14/25
25/25 [==============================] - 5s 208ms/step - loss: 0.5797 - accuracy: 0.8359
Epoch 15/25
25/25 [==============================] - 5s 208ms/step - loss: 0.5358 - accuracy: 0.8441
Epoch 16/25
25/25 [==============================] - 5s 208ms/step - loss: 0.4175 - accuracy: 0.8808
Epoch 17/25
25/25 [==============================] - 5s 209ms/step - loss: 0.3493 - accuracy: 0.8983
Epoch 18/25
25/25 [==============================] - 5s 209ms/step - loss: 0.4485 - accuracy: 0.8755
Epoch 19/25
25/25 [==============================] - 5s 209ms/step - loss: 0.7039 - accuracy: 0.8063
Epoch 20/25
25/25 [==============================] - 5s 209ms/step - loss: 1.0323 - accuracy: 0.7179
Epoch 21/25
25/25 [==============================] - 5s 209ms/step - loss: 0.5772 - accuracy: 0.8379
Epoch 22/25
25/25 [==============================] - 5s 209ms/step - loss: 0.2776 - accuracy: 0.9209
Epoch 23/25
25/25 [==============================] - 5s 208ms/step - loss: 0.2011 - accuracy: 0.9501
Epoch 24/25
25/25 [==============================] - 5s 209ms/step - loss: 0.2049 - accuracy: 0.9618
Epoch 25/25
25/25 [==============================] - 5s 209ms/step - loss: 0.1222 - accuracy: 0.9830
(venv-cuda)
micha@p1gen6 MINGW64 /c/wse_github/ObrienlabsDev/machine-learning (main)
$ ./build.sh
#0 building with "desktop-linux" instance using docker driver

#1 [internal] load build definition from Dockerfile
#1 transferring dockerfile: 575B done
#1 DONE 0.0s

#2 [auth] tensorflow/tensorflow:pull token for registry-1.docker.io
#2 DONE 0.0s

#3 [internal] load metadata for docker.io/tensorflow/tensorflow:2.14.0-gpu
#3 DONE 0.8s

#4 [internal] load .dockerignore
#4 transferring context: 2B done
#4 DONE 0.0s

#5 [1/3] FROM docker.io/tensorflow/tensorflow:2.14.0-gpu@sha256:64602abcd8cc4f4bdd6268ca0abc39e6d37113d700886afd15f6dd151210b206
#5 DONE 0.0s

#6 [internal] load build context
#6 transferring context: 3.67kB 0.0s done
#6 DONE 0.0s

#7 [2/3] WORKDIR /src
#7 CACHED

#8 [3/3] COPY /src/tflow.py .
#8 DONE 0.0s

#9 exporting to image
#9 exporting layers 0.0s done
#9 writing image sha256:a5ee876f43f024c7a064c48b3d61ae7d854230bbec86ef4729f987e0e96aba63 done
#9 naming to docker.io/library/ml-tensorflow-win done
#9 DONE 0.0s
2024-12-01 00:35:19.583397: I tensorflow/core/util/port.cc:111] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable `TF_ENABLE_ONEDNN_OPTS=0`.
2024-12-01 00:35:19.606388: E tensorflow/compiler/xla/stream_executor/cuda/cuda_dnn.cc:9342] Unable to register cuDNN factory: Attempting to register factory for plugin cuDNN when one has already been registered
2024-12-01 00:35:19.606418: E tensorflow/compiler/xla/stream_executor/cuda/cuda_fft.cc:609] Unable to register cuFFT factory: Attempting to register factory for plugin cuFFT when one has already been registered
2024-12-01 00:35:19.606429: E tensorflow/compiler/xla/stream_executor/cuda/cuda_blas.cc:1518] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered
2024-12-01 00:35:19.610639: I tensorflow/core/platform/cpu_feature_guard.cc:182] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 AVX_VNNI FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
2024-12-01 00:35:21.166299: I tensorflow/compiler/xla/stream_executor/cuda/cuda_gpu_executor.cc:880] could not open file to read NUMA node: /sys/bus/pci/devices/0000:01:00.0/numa_node
Your kernel may have been built without NUMA support.
...
2024-12-01 00:35:21.315620: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1977] Could not identify NUMA node of platform GPU id 0, defaulting to 0.  Your kernel may not have been built with NUMA support.
2024-12-01 00:35:21.315645: I tensorflow/compiler/xla/stream_executor/cuda/cuda_gpu_executor.cc:880] could not open file to read NUMA node: /sys/bus/pci/devices/0000:01:00.0/numa_node
Your kernel may have been built without NUMA support.
2024-12-01 00:35:21.315679: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1886] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 9581 MB memory:  -> device: 0, name: NVIDIA RTX 3500 Ada Generation Laptop GPU, pci bus id: 0000:01:00.0, compute capability: 8.9
Downloading data from https://www.cs.toronto.edu/~kriz/cifar-100-python.tar.gz
169001437/169001437 [==============================] - 8s 0us/step
Epoch 1/25
2024-12-01 00:35:40.830076: I tensorflow/compiler/xla/stream_executor/cuda/cuda_dnn.cc:442] Loaded cuDNN version 8600
2024-12-01 00:35:44.185017: W tensorflow/tsl/framework/bfc_allocator.cc:296] Allocator (GPU_0_bfc) ran out of memory trying to allocate 5.09GiB with freed_by_count=0. The caller indicates that this is not a failure, but this may mean that there could be performance gains if more memory were available.
2024-12-01 00:35:44.422820: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x7f40b45739e0 initialized for platform CUDA (this does not guarantee that XLA will be used). Devices:
2024-12-01 00:35:44.422842: I tensorflow/compiler/xla/service/service.cc:176]   StreamExecutor device (0): NVIDIA RTX 3500 Ada Generation Laptop GPU, Compute Capability 8.9
2024-12-01 00:35:44.425879: I tensorflow/compiler/mlir/tensorflow/utils/dump_mlir_util.cc:269] disabling MLIR crash reproducer, set env var `MLIR_CRASH_REPRODUCER_DIRECTORY` to enable.
2024-12-01 00:35:44.487522: I ./tensorflow/compiler/jit/device_compiler.h:186] Compiled cluster using XLA!  This line is logged at most once for the lifetime of the process.
2024-12-01 00:35:51.967788: W tensorflow/tsl/framework/bfc_allocator.cc:296] Allocator (GPU_0_bfc) ran out of memory trying to allocate 4.02GiB with freed_by_count=0. The caller indicates that this is not a failure, but this may mean that there could be performance gains if more memory were available.
2024-12-01 00:35:51.967857: W tensorflow/tsl/framework/bfc_allocator.cc:296] Allocator (GPU_0_bfc) ran out of memory trying to allocate 4.02GiB with freed_by_count=0. The caller indicates that this is not a failure, but this may mean that there could be performance gains if more memory were available.
2024-12-01 00:35:51.967866: W tensorflow/tsl/framework/bfc_allocator.cc:296] Allocator (GPU_0_bfc) ran out of memory trying to allocate 4.14GiB with freed_by_count=0. The caller indicates that this is not a failure, but this may mean that there could be performance gains if more memory were available.
2024-12-01 00:35:52.727086: W tensorflow/tsl/framework/bfc_allocator.cc:296] Allocator (GPU_0_bfc) ran out of memory trying to allocate 2.41GiB with freed_by_count=0. The caller indicates that this is not a failure, but this may mean that there could be performance gains if more memory were available.
2024-12-01 00:35:57.297448: W tensorflow/tsl/framework/bfc_allocator.cc:296] Allocator (GPU_0_bfc) ran out of memory trying to allocate 4.33GiB with freed_by_count=0. The caller indicates that this is not a failure, but this may mean that there could be performance gains if more memory were available.
2024-12-01 00:35:57.297528: W tensorflow/tsl/framework/bfc_allocator.cc:296] Allocator (GPU_0_bfc) ran out of memory trying to allocate 4.33GiB with freed_by_count=0. The caller indicates that this is not a failure, but this may mean that there could be performance gains if more memory were available.
12/13 [==========================>...] - ETA: 0s - loss: 5.3903 - accuracy: 0.0243
2024-12-01 00:36:09.574078: W tensorflow/tsl/framework/bfc_allocator.cc:296] Allocator (GPU_0_bfc) ran out of memory trying to allocate 4.60GiB with freed_by_count=0. The caller indicates that this is not a failure, but this may mean that there could be performance gains if more memory were available.
13/13 [==============================] - 39s 763ms/step - loss: 5.3775 - accuracy: 0.0247
Epoch 2/25
13/13 [==============================] - 5s 385ms/step - loss: 4.2209 - accuracy: 0.0707
Epoch 3/25
13/13 [==============================] - 5s 370ms/step - loss: 3.8432 - accuracy: 0.1215
Epoch 4/25
13/13 [==============================] - 5s 368ms/step - loss: 3.5030 - accuracy: 0.1739
Epoch 5/25
13/13 [==============================] - 5s 376ms/step - loss: 3.3365 - accuracy: 0.2107
Epoch 6/25
13/13 [==============================] - 5s 371ms/step - loss: 3.0246 - accuracy: 0.2688
Epoch 7/25
13/13 [==============================] - 5s 373ms/step - loss: 2.6487 - accuracy: 0.3396
Epoch 8/25
13/13 [==============================] - 5s 374ms/step - loss: 2.2471 - accuracy: 0.4246
Epoch 9/25
13/13 [==============================] - 5s 375ms/step - loss: 1.9108 - accuracy: 0.4988
Epoch 10/25
13/13 [==============================] - 5s 374ms/step - loss: 1.6222 - accuracy: 0.5708
Epoch 11/25
13/13 [==============================] - 5s 376ms/step - loss: 1.4085 - accuracy: 0.6119
Epoch 12/25
13/13 [==============================] - 5s 385ms/step - loss: 1.1184 - accuracy: 0.6859
Epoch 13/25
13/13 [==============================] - 5s 396ms/step - loss: 1.0108 - accuracy: 0.7160
Epoch 14/25
13/13 [==============================] - 5s 397ms/step - loss: 0.8975 - accuracy: 0.7460
Epoch 15/25
13/13 [==============================] - 5s 417ms/step - loss: 0.6930 - accuracy: 0.8035
Epoch 16/25
13/13 [==============================] - 5s 415ms/step - loss: 0.5735 - accuracy: 0.8375
Epoch 17/25
13/13 [==============================] - 5s 414ms/step - loss: 0.4665 - accuracy: 0.8621
Epoch 18/25
13/13 [==============================] - 5s 389ms/step - loss: 0.3995 - accuracy: 0.8819
Epoch 19/25
13/13 [==============================] - 5s 375ms/step - loss: 0.3647 - accuracy: 0.8939
Epoch 20/25
13/13 [==============================] - 5s 384ms/step - loss: 0.3451 - accuracy: 0.8980
Epoch 21/25
13/13 [==============================] - 5s 381ms/step - loss: 0.3194 - accuracy: 0.9040
Epoch 22/25
13/13 [==============================] - 5s 411ms/step - loss: 0.2781 - accuracy: 0.9201
Epoch 23/25
13/13 [==============================] - 5s 401ms/step - loss: 0.4304 - accuracy: 0.8811
Epoch 24/25
13/13 [==============================] - 5s 382ms/step - loss: 0.2810 - accuracy: 0.9201
Epoch 25/25
13/13 [==============================] - 5s 382ms/step - loss: 0.1870 - accuracy: 0.9457
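The bfc_allocator "ran out of memory" warnings in the batch-4096 run above are non-fatal (the log itself says the caller does not treat them as a failure), but they show ResNet50 at that batch size is at the edge of the card's ~9.5 GB usable memory. If they ever turn into hard OOM failures, one standard option (regular TensorFlow 2.x API, not specific to this issue) is to enable on-demand memory growth before any GPU work:

import tensorflow as tf

# Allocate GPU memory incrementally instead of reserving nearly all of it
# up front; must run before the first GPU operation.
for gpu in tf.config.list_physical_devices('GPU'):
    tf.config.experimental.set_memory_growth(gpu, True)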
