Intel® Extension for TensorFlow* 1.0.0
Major Features
Intel® Extension for TensorFlow* is an Intel-optimized Python package that extends the official TensorFlow to run TensorFlow workloads on Intel GPUs, and brings the first Intel GPU product, Intel® Data Center GPU Flex Series 170, into the TensorFlow open source community for AI workload acceleration. It is based on the TensorFlow PluggableDevice interface and provides full support starting from TensorFlow 2.10.
This release contains the following major features:
- **AOT (Ahead-Of-Time) Compilation**

  AOT compilation is a performance feature that removes just-in-time (JIT) compilation overhead at application launch. It can be enabled when you configure a build from source code. The Intel® Extension for TensorFlow* package in the PyPI channel is built with AOT enabled.
- **Graph Optimization**

  - Advanced Automatic Mixed Precision

    Advanced Automatic Mixed Precision applies low-precision data types (`float16` or `bfloat16`) for further boosted performance and lower memory consumption. Please get started from how to enable.

  - Graph fusion

    Intel® Extension for TensorFlow* provides graph optimization that fuses specified operator patterns into a new single operator for better performance, such as `Conv2D+ReLU` and `Linear+ReLU`. Refer to the supported fusion list in Graph fusion.
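Advanced Automatic Mixed Precision is typically switched on through environment variables before TensorFlow is imported. A minimal sketch, assuming the `ITEX_AUTO_MIXED_PRECISION` and `ITEX_AUTO_MIXED_PRECISION_DATA_TYPE` variable names; verify both against the how-to-enable guide:

```python
import os

# Assumed variable names -- confirm them in the ITEX "how to enable" guide.
# They must be set before TensorFlow / the extension is imported.
os.environ["ITEX_AUTO_MIXED_PRECISION"] = "1"
os.environ["ITEX_AUTO_MIXED_PRECISION_DATA_TYPE"] = "BFLOAT16"  # or "FLOAT16"

# import tensorflow as tf  # import only after the variables are set
```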
- **Python API**

  Public APIs that extend XPU operators are provided for better performance in the `itex.ops` namespace, including `AdamWithWeightDecayOptimizer`, `gelu`, `LayerNormalization`, and `ItexLSTM`. Please find more details in Intel® Extension for TensorFlow* ops.
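A hedged sketch of exercising one of these ops, using the `itex.ops.gelu` name listed above; the input tensor and the import guard are illustrative assumptions, not part of the release notes:

```python
# Illustrative sketch: call itex.ops.gelu when the extension is present.
# The package and op names come from the release notes; availability
# depends on your environment, so the import is guarded.
try:
    import tensorflow as tf
    import intel_extension_for_tensorflow as itex

    x = tf.constant([-1.0, 0.0, 1.0])
    y = itex.ops.gelu(x)  # fused GELU activation from the itex.ops namespace
    itex_available = True
except ImportError:
    itex_available = False  # plain TensorFlow's tf.nn.gelu is the fallback
```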
- **Intel® Extension for TensorFlow* Profiler**

  Intel® Extension for TensorFlow* supports the TensorFlow* Profiler for tracing the performance of TensorFlow* models on Intel GPUs. Please refer to how to enable profiler for more details.
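Since the extension plugs into the standard TensorFlow* Profiler, collecting a trace follows the stock `tf.profiler.experimental` pattern; any ITEX-specific enabling steps are described in the how-to-enable-profiler guide and are not shown here. A guarded sketch:

```python
# Sketch of the stock TensorFlow Profiler API that ITEX hooks into.
# ITEX-specific enabling steps live in its profiler guide.
try:
    import tensorflow as tf

    tf.profiler.experimental.start("./logdir")         # begin collecting a trace
    _ = tf.constant([1.0, 2.0]) + tf.constant([3.0, 4.0])  # workload to profile
    tf.profiler.experimental.stop()                    # write trace for TensorBoard
    profiled = True
except Exception:   # TensorFlow absent or profiler unavailable in this env
    profiled = False
```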
- **Docker Container Support**

  The Intel® Extension for TensorFlow* Docker container includes the Intel® oneAPI Base Toolkit and the rest of the software stack, except the Intel GPU drivers. Users only need to install the GPU driver on the host machine before pulling and launching the Docker container. Please get started from the Docker Container Guide.
- **FP32 Math Mode**

  Reducing float32 precision to TensorFloat-32 execution is controlled by the `ITEX_FP32_MATH_MODE` setting. Users can enable this feature by setting `ITEX_FP32_MATH_MODE` (default `FP32`) to either value (GPU: `TF32`/`FP32`). More details in ITEX_FP32_MATH_MODE.
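As with other ITEX environment switches, the variable is set before the extension is loaded; a minimal sketch using the `ITEX_FP32_MATH_MODE` name and values from the notes above:

```python
import os

# Set before TensorFlow / the extension is imported.
# "TF32" lets FP32 ops execute in TensorFloat-32 on GPU;
# "FP32" (the default) keeps full float32 precision.
os.environ["ITEX_FP32_MATH_MODE"] = "TF32"

# import tensorflow as tf  # import only after the variable is set
```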
- **Intel® Extension for TensorFlow* Verbose**

  `ITEX_VERBOSE` is designed to help users get more Intel® Extension for TensorFlow* log messages at different log levels. More details in the ITEX_VERBOSE level introduction.
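The variable takes a numeric log level; the meaning of each level is defined in the ITEX_VERBOSE level introduction, so the value below is only an example:

```python
import os

# Raise the ITEX log level before importing TensorFlow / the extension.
# Consult the ITEX_VERBOSE level introduction for what each level prints.
os.environ["ITEX_VERBOSE"] = "1"  # example value; higher levels log more
```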
- **INT8 Quantization**

  Intel® Extension for TensorFlow* works with Intel® Neural Compressor (>= 1.14.1) to provide a compatible TensorFlow INT8 quantization solution with the same user experience.
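A rough sketch of driving quantization through Intel® Neural Compressor's 1.x-era experimental API; the `conf.yaml` path and model handle are hypothetical, and the exact entry points should be checked against the Neural Compressor documentation:

```python
# Rough sketch of the Intel Neural Compressor 1.x quantization flow.
# The config file and model names below are placeholders, not real files.
try:
    from neural_compressor.experimental import Quantization
    inc_available = True
    # quantizer = Quantization("conf.yaml")   # hypothetical YAML config path
    # quantizer.model = your_tf_model         # hypothetical TensorFlow model
    # int8_model = quantizer.fit()            # returns the quantized model
except ImportError:
    inc_available = False
```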
- **Experimental Support**

  This release provides experimental support for Intel® Arc™ A-Series GPUs on Windows Subsystem for Linux 2 (with Ubuntu Linux installed) and on native Ubuntu Linux, as well as for second-generation Intel® Xeon® Scalable Processors and newer, such as Cascade Lake, Cooper Lake, Ice Lake, and Sapphire Rapids.
Known Issues
- FP64 is not natively supported by the Intel® Data Center GPU Flex Series platform. If you run an AI workload on that platform and receive an error message such as "[CRITICAL ERROR] Kernel 'XXX' removed due to usage of FP64 instructions unsupported by the targeted hardware", it means that a kernel requiring FP64 instructions was removed and not executed; hence, the accuracy of the whole workload is wrong.