Releases: microsoft/tensorflow-directml
TensorFlow-DirectML 1.15.8
Build of tensorflow-directml built on October 3, 2022
The Python packages are available as a PyPI release. To download the latest python package automatically, simply pip install tensorflow-directml.
Changes in 1.15.8
- Prevent unbounded growth of command allocator memory.
- Add unsorted segment ops for DML (emulated on the CPU).
- Add emulated support for int64.
- Add CPU-emulated versions of UnsortedSegmentSum, UnsortedSegmentMax, UnsortedSegmentMin, and UnsortedSegmentProd.
- Pin protobuf version between 3.6.1 (inclusive) and 4.0.0 (exclusive), since versions >= 4.0.0 are not compatible with TensorFlow 1.15.
- Optimize output allocation for inputs that can be executed in-place and directly forwarded to the output.
- Add a DirectML kernel for InTopKV2.
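For readers unfamiliar with the unsorted segment ops mentioned above, the following is a minimal pure-Python sketch of what UnsortedSegmentSum computes. It is only an illustration of the op's semantics for 1-D inputs, not the CPU emulation shipped in this release; the function name and the sample values are made up for the example.

```python
def unsorted_segment_sum(data, segment_ids, num_segments):
    """Sum the entries of `data` that share a segment id.

    Ids may appear in any order; segments that receive no entries stay 0.
    """
    if len(data) != len(segment_ids):
        raise ValueError("data and segment_ids must have the same length")
    totals = [0] * num_segments
    for value, seg in zip(data, segment_ids):
        # TensorFlow documents that values with a negative id are dropped.
        if seg < 0:
            continue
        totals[seg] += value
    return totals


result = unsorted_segment_sum([1, 2, 3, 4], [2, 0, 2, 1], num_segments=4)
print(result)  # [2, 4, 4, 0]
```

The Max, Min, and Prod variants follow the same pattern with a different reduction and a different initial value per segment.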
TensorFlow-DirectML 1.15.7
Build of tensorflow-directml built on May 14, 2022
The Python packages are available as a PyPI release. To download the latest python package automatically, simply pip install tensorflow-directml.
Changes in 1.15.7
- Fix python API that was returning the wrong version of the package.
TensorFlow-DirectML 1.15.6
Build of tensorflow-directml built on May 6, 2022
The Python packages are available as a PyPI release. To download the latest python package automatically, simply pip install tensorflow-directml.
Changes in 1.15.6
- Enables forwarding of inputs to avoid copies in some scenarios.
- Removes the use of CopyTileMappings for PIX captures.
- Increases the default cache size for the kernel cache to 1536.
- Upgrades to DirectML 1.8.2, which includes a fix for bad tensor alignment on Linux and a fix for grouped Convolution for certain combinations of parameters.
TensorFlow-DirectML 1.15.5
Build of tensorflow-directml built on September 3, 2021.
This marks the first generally available (non-preview) release of TensorFlow-DirectML, which is based on the 1.15.5 version of TensorFlow. See our roadmap for more details on our plans for releases based on TensorFlow 1.x and 2.x.
The Python packages are available as a PyPI release. To download the latest python package automatically, simply pip install tensorflow-directml.
Changes in 1.15.5
- Registers 7 new DirectML kernels. See the new Operator Roadmap for an overview of the operations DirectML supports.
- Upgrades to DirectML 1.7.0, which includes several shader performance improvements for training-related operators.
- Significantly improves the DirectML memory allocator, which now supports tiled allocations to improve the utilization of dedicated video memory.
- Improves handling of INT64 tensors in several kernels.
- Removes dependency on WSL2 interoperability for Linux builds.
- D3D12, DXGI, and DXCore are linked at run time instead of load time; the CPU device should be usable even if running in a pure Linux environment.
- Modifies the CI/release pipelines to produce manylinux2010-compliant wheels that run on even more glibc-based WSL2 distros.
- Fixes several DirectML kernel bugs.
TensorFlow-DirectML 1.15.5.dev210429
Preview build of tensorflow-directml built on April 29, 2021.
The Python packages are available as a PyPI release. To download the latest python package automatically, simply pip install tensorflow-directml.
Changes in dev210429
- Updates to DirectML 1.5.1 to prevent a driver crash affecting some Intel devices.
- Reimplements ScatterNdAdd and ScatterNdSub kernels.
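As a reminder of what the ScatterNdAdd and ScatterNdSub ops compute, here is a simplified pure-Python sketch restricted to a 1-D target with scalar updates. It is not the reimplemented DirectML kernel; the helper name is hypothetical, and unlike the real ops (which modify a variable in place) it returns a new list.

```python
def scatter_nd_accumulate_1d(ref, indices, updates, sign=1):
    """Return a copy of `ref` with each update added (sign=1) or
    subtracted (sign=-1) at its index. Repeated indices accumulate."""
    out = list(ref)
    for (i,), update in zip(indices, updates):
        out[i] += sign * update
    return out


ref = [1, 2, 3, 4, 5, 6, 7, 8]
indices = [[4], [3], [1], [7]]
updates = [9, 10, 11, 12]

print(scatter_nd_accumulate_1d(ref, indices, updates))           # [1, 13, 3, 14, 14, 6, 7, 20]
print(scatter_nd_accumulate_1d(ref, indices, updates, sign=-1))  # [1, -9, 3, -6, -4, 6, 7, -4]
```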
TensorFlow-DirectML 1.15.5.dev210422
Preview build of tensorflow-directml built on April 22, 2021.
The Python packages are available as a PyPI release. To download the latest python package automatically, simply pip install tensorflow-directml.
Changes in dev210422
- Registers 32 new DirectML kernels. See the new Operator Roadmap for an overview of the operations DirectML supports.
- Updates to DirectML 1.5.0, which includes several shader performance improvements for training-related operators.
- Greatly improves threading efficiency of the DML execution backend, which improves performance in models that comprise large numbers of inexpensive operators.
- Resolves fatal errors when attempting to run TensorFlow-DirectML on older versions of Windows that do not support DirectML (falls back to CPU device).
- Forces integrated GPUs (UMA adapters) to grow their memory requirements dynamically instead of upfront like discrete adapters. This should resolve immediate device removal on some hardware.
- Merges upstream TensorFlow 1.15.5 fixes.
- Fixes several DirectML kernel bugs.
TensorFlow-DirectML 1.15.4.dev201216
Preview build of tensorflow-directml built on December 16, 2020.
The Python packages are available as a PyPI release. To download the latest python package automatically, simply pip install tensorflow-directml.
Changes in dev201216:
- Registers 55 new DirectML kernels. See the new Operator Roadmap for an overview of the operations DirectML supports and is planning to support.
- Updates to DirectML 1.4.0 using the redistributable NuGet package, Microsoft.AI.DirectML.
- Resolves an issue that prevented the DirectML library DLL file from loading correctly on Windows builds. This fix makes tensorflow-directml compatible with UWP apps like Python from the Microsoft Store.
- Improves DirectX device removal reporting and troubleshooting. See Troubleshooting-Timeouts.md.
- Fixes spurious out-of-memory issues affecting integrated GPUs, which don't have large pools of dedicated memory.
- Exposes additional DirectX adapter information through device_lib.list_local_devices(). This makes it easier for consumers of tensorflow-directml to filter out adapters that don't meet minimum hardware requirements.
- Removes AVX2 requirement from Linux builds, which improves CPU compatibility.
- Switches to numpy 1.18.5 (works around breaking changes).
- Merges upstream TensorFlow 1.15.4 fixes.
- Fixes numerous DirectML kernel bugs.
- Adds a few minor performance improvements.
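The adapter information exposed through device_lib.list_local_devices() can be inspected along these lines. This is a sketch under assumptions: the helper names are hypothetical, the "DML" device type string is assumed from the device names the package reports, and the exact contents of physical_device_desc are not specified in these notes.

```python
def directml_devices(devices, device_type="DML"):
    """Keep only the devices whose type matches (assumed to be "DML")."""
    return [d for d in devices if d.device_type == device_type]


def describe(devices):
    """Format each device as 'name: description' for quick inspection."""
    return ["{}: {}".format(d.name, d.physical_device_desc) for d in devices]


if __name__ == "__main__":
    try:
        from tensorflow.python.client import device_lib
    except ImportError:
        print("tensorflow-directml is not installed")
    else:
        for line in describe(directml_devices(device_lib.list_local_devices())):
            print(line)
```

A consumer could parse the description string to reject adapters below its hardware requirements before placing any ops on them.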
TensorFlow-DirectML 1.15.3.dev200911
Preview build of tensorflow-directml built on September 11, 2020.
The Python packages are available as a PyPI release. To download the appropriate python package automatically, simply pip install tensorflow-directml.
Changes in dev200911:
- 64 new kernels registered for the DML device (Block-RNN/LSTM/GRU ops, matrix diag ops, roll, and others).
- New BFC-based allocator for DML resources that greatly improves utilization of available memory.
- TF will only attempt to use DirectX devices with support for 16- and 8-bit datatypes, such as FLOAT16, since there is no way to disable certain kernel registrations at runtime.
- Add support for DML_VISIBLE_DEVICES environment variable. This behaves identically to CUDA_VISIBLE_DEVICES. When this environment variable is set, it filters (or re-orders) adapter indices in a process-wide fashion. When adapters are filtered in this way, they don't appear to TF at all and don't show up during device enumeration.
- Add support for TF_DIRECTML_KERNEL_CACHE_SIZE environment variable, which can be used to potentially reuse kernel instances more frequently (defaults to 1024 kernels).
- Deliberately leak per-process DML/D3D12 resources and state (see DmlDeviceCache::Instance) to avoid order-of-destruction issues during process exit (matches CUDA device behavior).
- Various bug fixes in kernels and out-of-memory handling.
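The two environment variables introduced above can be set from Python before TensorFlow is loaded. The sketch below assumes that, as with CUDA_VISIBLE_DEVICES, the value of DML_VISIBLE_DEVICES is a comma-separated list of adapter indices and that both variables are read when the devices are first initialized; the helper name and the example values are hypothetical.

```python
import os


def configure_directml(visible_devices=None, kernel_cache_size=None, env=os.environ):
    """Write the DirectML-related settings into `env` and return it."""
    if visible_devices is not None:
        # Order matters: listing "1,0" is assumed to re-order the adapters.
        env["DML_VISIBLE_DEVICES"] = ",".join(str(i) for i in visible_devices)
    if kernel_cache_size is not None:
        env["TF_DIRECTML_KERNEL_CACHE_SIZE"] = str(kernel_cache_size)
    return env


# Expose adapters 1 and 0 (in that order) and enlarge the kernel cache.
configure_directml(visible_devices=[1, 0], kernel_cache_size=2048)

# Import TensorFlow only after the environment is configured:
# import tensorflow as tf
```

Setting the variables in the shell that launches the process achieves the same effect without touching the script.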