Release TensorFlow-DirectML 1.15.3.dev200911 · microsoft/tensorflow-directml

Preview build of tensorflow-directml built on September 11, 2020.

The Python packages are available as a PyPI release. To download the appropriate Python package automatically, run pip install tensorflow-directml.

Changes in dev200911:

64 new kernels registered for the DML device (Block-RNN/LSTM/GRU ops, matrix diag ops, roll, and others).
New BFC-based allocator for DML resources that greatly improves utilization of available memory.
TF will only attempt to use DirectX devices with support for 16- and 8-bit datatypes, such as FLOAT16, since there is no way to disable certain kernel registrations at runtime.
Add support for DML_VISIBLE_DEVICES environment variable. This behaves identically to CUDA_VISIBLE_DEVICES. When this environment variable is set, it filters (or re-orders) adapter indices in a process-wide fashion. When adapters are filtered in this way, they don't appear to TF at all and don't show up during device enumeration.
Add support for the TF_DIRECTML_KERNEL_CACHE_SIZE environment variable, which can be used to potentially reuse kernel instances more frequently (defaults to 1024 kernels).
Deliberately leak per-process DML/D3D12 resources and state (see DmlDeviceCache::Instance) to avoid order-of-destruction issues during process exit (matches CUDA device behavior).
Various bug fixes in kernels and out-of-memory handling.
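
The two environment variables introduced above can be set from Python before TensorFlow is loaded. The following is a minimal sketch, not an official API: configure_directml is a hypothetical helper, the comma-separated index format is assumed from the stated equivalence with CUDA_VISIBLE_DEVICES, and it is assumed that both variables have to be set before TensorFlow initializes its devices.

```python
import os


def configure_directml(visible_devices=None, kernel_cache_size=None, env=os.environ):
    """Hypothetical helper: write the DirectML variables into `env`.

    visible_devices   -- adapter indices to expose, in the desired order
                         (assumed comma-separated, as with CUDA_VISIBLE_DEVICES)
    kernel_cache_size -- value for TF_DIRECTML_KERNEL_CACHE_SIZE
                         (the release notes give a default of 1024 kernels)
    """
    if visible_devices is not None:
        env["DML_VISIBLE_DEVICES"] = ",".join(str(i) for i in visible_devices)
    if kernel_cache_size is not None:
        env["TF_DIRECTML_KERNEL_CACHE_SIZE"] = str(kernel_cache_size)
    return env


# Expose adapters 1 and 0 (re-ordered) and request a larger kernel cache.
configure_directml(visible_devices=[1, 0], kernel_cache_size=2048)

# Assumption: import TensorFlow only after the variables are set.
# import tensorflow as tf
# from tensorflow.python.client import device_lib
# print(device_lib.list_local_devices())  # filtered adapters should not be listed
```

Because the filter is process-wide, setting the variables in the shell before launching Python should have the same effect and avoids any dependence on import order.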
