
No float16 support for _FusedConv2D #347

Open
leandro-gracia-gil opened this issue Jan 27, 2023 · 5 comments
Comments

@leandro-gracia-gil

When trying to use tensorflow-directml-plugin with a model that works fine under tensorflow-gpu, I get the following error:

2 root error(s) found.
  (0) NOT_FOUND:  No registered '_FusedConv2D' OpKernel for 'GPU' devices compatible with node {{node model/conv2d/Relu}}
         (OpKernel was found, but attributes didn't match) Requested Attributes: T=DT_HALF, _XlaHasReferenceVars=false, data_format="NCHW", dilations=[1, 1, 1, 1], epsilon=0, explicit_paddings=[], fused_ops=["BiasAdd", "Relu"], leakyrelu_alpha=0.2, num_args=1, padding="SAME", strides=[1, 1, 1, 2], use_cudnn_on_gpu=true, _device="/job:localhost/replica:0/task:0/device:GPU:0"
        .  Registered:  device='GPU'; T in [DT_FLOAT]
  device='CPU'; T in [DT_FLOAT]
  device='CPU'; T in [DT_DOUBLE]
  device='CPU'; T in [DT_BFLOAT16]

         [[StatefulPartitionedCall/StatefulPartitionedCall_1/model/conv2d/Relu]]

It looks like fused 2D convolutions support float32, float64, and bfloat16, but not float16. Unfortunately, bfloat16 is not an option in my case: the computers where inference will run have GPUs that support float16 but not yet bfloat16.

Would it be possible to add support for float16 too?

I'm using tensorflow-directml-plugin version 0.3.0.dev221212 and tensorflow-cpu version 2.10.0 on a Windows 10 machine (no WSL).
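For what it's worth, the "Registered:" list in that error can be read mechanically to see which dtypes each device actually has kernels for. A small illustrative sketch in plain Python (the helper name is made up; this is not a TensorFlow API):

```python
# Illustrative only: parse the "Registered:" lines from a NOT_FOUND kernel
# error to map each device to the dtypes it has registered kernels for.
import re

def parse_registrations(lines):
    """Return {device: set of supported dtypes} from kernel-registration lines."""
    supported = {}
    for line in lines:
        m = re.search(r"device='(\w+)';\s*T in \[([^\]]+)\]", line)
        if m:
            device, dtypes = m.group(1), m.group(2)
            supported.setdefault(device, set()).update(
                d.strip() for d in dtypes.split(","))
    return supported

# The registration lines from the error message above.
registrations = [
    "device='GPU'; T in [DT_FLOAT]",
    "device='CPU'; T in [DT_FLOAT]",
    "device='CPU'; T in [DT_DOUBLE]",
    "device='CPU'; T in [DT_BFLOAT16]",
]

table = parse_registrations(registrations)
print("DT_HALF" in table["GPU"])  # -> False: the requested dtype has no GPU kernel
```

Reading the table this way makes the mismatch explicit: the request asks for T=DT_HALF on GPU, but the only GPU registration for _FusedConv2D is DT_FLOAT.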

@leandro-gracia-gil
Author

Actually, looking in more detail, it seems that only float32 is supported on GPU, while float32, float64, and bfloat16 are supported on CPU.

In any case, the original request stands: would it be possible to add GPU support for float16 to this operation?

@maggie1059
Collaborator

Hi @leandro-gracia-gil, we just published a new version of the plugin here with float16 support added for _FusedConv2D: https://pypi.org/project/tensorflow-directml-plugin/

Would you mind trying it out and letting us know if this fixes the error?

@leandro-gracia-gil
Author

Hi @maggie1059, thanks for the heads up.

It does look like this particular error is indeed fixed. Thanks!

Now I'm seeing the warning below repeated dozens of times, and my model appears to hang: I get no results no matter how long I wait. That said, this could very well be a different, unrelated problem I'm hitting afterwards. I'm pasting the warning here just in case it rings a bell.

2023-02-03 13:08:30.098584: I tensorflow/core/common_runtime/pluggable_device/pluggable_device_factory.cc:306] Could not identify NUMA node of platform GPU ID 0, defaulting to 0. Your kernel may not have been built with NUMA support.
2023-02-03 13:08:30.098904: I tensorflow/core/common_runtime/pluggable_device/pluggable_device_factory.cc:272] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 82902 MB memory) -> physical PluggableDevice (device: 0, name: DML, pci bus id: <undefined>)

The amount of device memory reported in the log is also wrong: this GPU only has 24 GB.
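For scale, a quick check of the reported figure (plain Python; assuming the log's "MB" means MiB, as is usual for TensorFlow memory logs):

```python
# Sanity check on the logged figure: 82902 MB is roughly 81 GiB,
# more than three times the card's 24 GB of dedicated memory.
reported_mb = 82902
reported_gib = reported_mb / 1024
print(round(reported_gib, 1))  # -> 81.0
```

One hedged guess is that the plugin is counting shared system memory in addition to dedicated VRAM, which DirectML device budgets can include, but the log alone doesn't confirm that.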

In any case, if this looks unrelated I think we can probably close this issue. Thanks again.

@lllck

lllck commented Apr 17, 2024

Hello @leandro-gracia-gil, have you resolved this problem? I am also facing the same problem now.

@leandro-gracia-gil
Author

> Hello @leandro-gracia-gil, have you resolved this problem? I am also facing the same problem now.

I'm afraid I didn't. I eventually stopped trying to use the DirectML plugin.
