Releases: TexasInstruments/edgeai-tidl-tools
10_01_00_02
New in this Release
Description | Notes |
---|---|
Support for ONNXRUNTIME 1.15.0 | |
Support for several new operators: TopK, Sqrt, Sin, Pow, Mish, Log, Instance Normalization, HSWISH, Floor, Exp, ERF, AsinH, Asin & Abs | |
Improved support for networks with a large number of operators (>2K) | |
Support for improved latency & weight sparsity | Specific to J722S/AM67A/TDA4AEN platforms |
Fixed in this Release
ID | Description | Affected Platforms |
---|---|---|
TIDL-6871 | Softmax (with output type float) gives incorrect results when axis is set to width and width < 16 | All except AM62 |
TIDL-6865 | Elementwise layers with dimension N1xC1xH1xW1 and N2xC2xH2xW2, gives functionally incorrect output on target, if H1 or H2 is 1 and H1 != H2 and C1 == C2 > 1 | All except AM62 |
TIDL-6485 | Models compiled with option "advanced_options:inference_mode" = 2 and containing a Constant Data layer H > 1 will result in functionally incorrect output | All except AM62 |
TIDL-6473 | Models compiled with option "advanced_options:inference_mode" = 2 and containing a layer running in TIDL_NOT_MULTI_CORE mode followed by Slice layer running in TIDL_MULTI_CORE mode may result in functionally incorrect output in host emulation/target | All except AM62 |
TIDL-6461 | Using "advanced_options:inference_mode" = 2 and "debug_level" >=3 may result in error for debug stitching script for some networks | All except AM62 |
TIDL-6418 | Models compiled with "advanced_options:inference_mode" = 2 may produce functionally incorrect outputs if the model has Slice/Reshape layers | All except AM62 |
TIDL-5169 | Dataconvert layer with layout conversion from NCHW->NHWC at the output of network returns TIDLRT_create time error if number of output channels for this layer is equal to one | All except AM62 |
TIDL-5167 | Layers with multiple inputs may result in a functional issue if the inputs have different padding in the buffer | All except AM62 |
TIDL-5166 | Matmul layer with A matrix broadcast in channel axis results in crash on target/EVM | All except AM62 |
TIDL-5162 | Memory planning fails for models having batches with broadcast | All except AM62 |
TIDL-4868 | Reshape layer accidentally gets denied with message : "Input volume should be equal to output volume" | All except AM62 |
TIDL-4855 | ONNX Runtime does not report correct copy cycles from get_TI_benchmark_data | All except AM62 |
TIDL-4833 | Networks erroring out with message "tidlReadPerChannelMeanStatistics : Unable to read Per Channel Mean statistics" | All except AM62 |
TIDL-4832 | Networks with GEMM are not correctly getting denied, with the following error towards the end "Gemm layer is not supported in TIDL when bias size != output width" | All except AM62 |
TIDL-4714 | Networks with >1536 operators in a single graph fail to compile | All except AM62 |
TIDL-4460 | Model compilation fails for networks with Transpose layers with following error message : "Failed to Allocate memory record 7 @ space = 17 and size = xxxxxx !!!" | |
TIDL-4367 | Networks with multiple branch where first layer in any one of the branch is a reshape layer gives functionally wrong output | All except AM62 |
TIDL-3928 | Sub operator with variable input gets incorrectly offloaded to C7x and results in an init failure during inference | All except AM62 |
TIDL-3902 | Model compiled with enableHighResOptimization=1 option, with any convolution layer's weights volume plus 192 * number of input channels greater than 224KB(for AM62A/J722S) or 448KB (for all other devices), may result into hang on target | All except AM62 |
TIDL-2947 | Convolution with pad greater than the input width results in incorrect outputs | All except AM62 |
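The TIDL-3902 condition in the table above is plain arithmetic, so a model can be screened for it before compiling with an older release. The following is a minimal sketch, assuming the weights volume is measured in bytes and that KB means 1024 bytes (the notes state neither). The function and parameter names are hypothetical and are not part of the TIDL tools.

```python
# Hypothetical helper, not a TIDL API: flags convolutions that match the
# TIDL-3902 condition (listed as fixed in 10_01_00_02).
# Assumptions: weights volume is in bytes and 1 KB = 1024 bytes.
KB = 1024

# Devices the release note pairs with the 224KB limit; all others use 448KB.
LIMIT_224KB_DEVICES = {"AM62A", "J722S"}


def hits_tidl_3902(weights_bytes: int, in_channels: int, device: str) -> bool:
    """Return True if weights volume + 192 * input channels exceeds the limit."""
    limit = 224 * KB if device in LIMIT_224KB_DEVICES else 448 * KB
    return weights_bytes + 192 * in_channels > limit
```

The check only matters for models compiled with enableHighResOptimization=1, which is the option the note names.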
Known Issues
ID | Description | Affected Platforms | Occurrence | Workaround in this release |
---|---|---|---|---|
TIDL-7073 | Running inference on a network with option "advanced_options:inference_mode" = 2 sequentially followed by a network with "advanced_options:inference_mode" = 0 on c7x_2 or greater results in hang on target | All except AM62 | Rare | None |
TIDL-6866 | Using option "advanced_options:output_feature_16bit_names_list" along with "high_resolution_optimization" = 1 and "tensor_bits = 8" results in functionally incorrect output on host emulation/target | All except AM62 | Rare | None |
TIDL-6856 | 3x1 convolution with single input and output channel fails in model compilation | All except AM62 | Rare | None |
TIDL-6469 | partial_init_during_compile fails in host emulation mode | All except AM62 | Frequent | None |
TIDL-6465 | Convolution with Fr=Fc=3 and dilation>8 (for AM62A/J722S) dilation>16 (for other devices) gives wrong output on Host Emulation | All except AM62 | Rare | None |
TIDL-4731 | Fusion of batch norm layer into convolution layer when batchnorm is before convolution can give incorrect results when convolution input has pad | All except AM62 | Rare | None |
TIDL-3865 | Elementwise layers with broadcast along width or height or both and number of channels > 1 produces incorrect outputs on device | All except AM62 | Rare | None |
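Because none of the known issues above has a workaround, it can help to detect an affected option combination before compiling. The sketch below checks only the TIDL-6866 combination. The option names are copied from that table row; the dict layout and the function name are assumptions for illustration, not part of the TIDL tools.

```python
# Hypothetical pre-flight check, not part of the TIDL tools.
# Option names come from the TIDL-6866 row; the dict layout is assumed.
def known_issue_warnings(options: dict) -> list:
    """Return the IDs of known issues that the given options would trigger."""
    warnings = []
    uses_16bit_list = bool(
        options.get("advanced_options:output_feature_16bit_names_list")
    )
    if (
        uses_16bit_list
        and options.get("high_resolution_optimization") == 1
        and options.get("tensor_bits") == 8
    ):
        warnings.append("TIDL-6866")
    return warnings
```

A non-empty result suggests changing one of the three options before compiling.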
10_00_08_00
New in this Release
Description | Notes |
---|---|
Improved performance for 16-bit leakyRELU | This requires an updated version of C7x/MMA firmware (10_00_08_00) and needs to have advanced_options:c7x_firmware set to 10_00_08_00 |
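The note above asks for advanced_options:c7x_firmware to be set at compile time. A minimal sketch of doing so follows, assuming compile options are held in a Python dict and that the firmware version is passed as a string. Both are assumptions, and the helper name is hypothetical.

```python
import re


# Hypothetical helper: returns a copy of the compile options with the
# firmware version set. Only the key "advanced_options:c7x_firmware" comes
# from the release note; the dict layout and string value are assumptions.
def with_c7x_firmware(options: dict, version: str) -> dict:
    # Release versions in these notes follow the NN_NN_NN_NN pattern.
    if not re.fullmatch(r"\d{2}_\d{2}_\d{2}_\d{2}", version):
        raise ValueError(f"unexpected firmware version format: {version!r}")
    updated = dict(options)
    updated["advanced_options:c7x_firmware"] = version
    return updated


compile_options = with_c7x_firmware({"tensor_bits": 16}, "10_00_08_00")
```

Returning a copy leaves the original options untouched, so the same base options can be reused for targets that still run older firmware.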
Fixed in this Release
ID | Description | Affected Platforms |
---|---|---|
TIDL-5158 | Models compiled with add_data_convert_ops=0 and with unsigned input results in accuracy loss | All except AM62 |
TIDL-5139 | Layer with multiple inputs and having one input in 16-bit and at least one input in 8-bit may result in accuracy degradation | All except AM62 |
TIDL-5045 | Matmul layer with both inputs with different signedness returns error during compilation (Note: This will require a firmware update) | Only impacts AM68PA (TDA4VM) |
TIDL-5041 | TIDL model compilation fails when the model contains a convolution with kernel size (1x1) and input of height = 1 and width = 1 | All except AM62 |
TIDL-4964 | Model compilation may return following error message "Error : Error Code = <ERR_UNKNOWN>" for networks having concat layer | All except AM62 |
TIDL-4858 | Clip layer without a max initializer results in poor accuracy | All except AM62 |
TIDL-4854 | Model compilation results in a segmentation fault in certain cases when the same input is being fed twice to an elementwise Mul operator | All except AM62 |
TIDL-4737 | Model compilation exits with error message "Mixed Precision is not supported with matmul layer, Aborting" when network has MatMul layer and running with mixed precision | All except AM62 |
TIDL-4679 | Custom unsupported layer removed from subgraph instead of delegated to ARM | All except AM62 |
TIDL-4631 | Model compilation fails with following error message : "Output Transpose is not supported on this device" | Only impacts AM68PA (TDA4VM) |
10_00_07_00
New in this Release
Description | Notes |
---|---|
Improved performance for 16-bit leakyRELU | This requires an updated version of C7x/MMA firmware (10_00_07_00) and needs to have advanced_options:c7x_firmware set to 10_00_07_00. Additionally refer to backward compatibility for steps to update your EVM on an older SDK |
Fixed in this Release
ID | Description | Affected Platforms |
---|---|---|
TIDL-5158 | Models compiled with add_data_convert_ops=0 and with unsigned input results in accuracy loss | All except AM62 |
TIDL-5139 | Layer with multiple inputs and having one input in 16-bit and at least one input in 8-bit may result in accuracy degradation | All except AM62 |
TIDL-5045 | Matmul layer with both inputs with different signedness returns error during compilation (Note: This will require a firmware update) | Only impacts AM68PA (TDA4VM) |
TIDL-5041 | TIDL model compilation fails when the model contains a convolution with kernel size (1x1) and input of height = 1 and width = 1 | All except AM62 |
TIDL-4964 | Model compilation may return following error message "Error : Error Code = <ERR_UNKNOWN>" for networks having concat layer | All except AM62 |
TIDL-4858 | Clip layer without a max initializer results in poor accuracy | All except AM62 |
TIDL-4854 | Model compilation results in a segmentation fault in certain cases when the same input is being fed twice to an elementwise Mul operator | All except AM62 |
TIDL-4737 | Model compilation exits with error message "Mixed Precision is not supported with matmul layer, Aborting" when network has MatMul layer and running with mixed precision | All except AM62 |
TIDL-4679 | Custom unsupported layer removed from subgraph instead of delegated to ARM | All except AM62 |
TIDL-4631 | Model compilation fails with following error message : "Output Transpose is not supported on this device" | Only impacts AM68PA (TDA4VM) |
10_00_06_00
New in this Release
Description | Notes |
---|---|
Support for higher precision 16-bit sigmoid on EVM | This requires a firmware update to be utilized (10_00_05_00) |
Support for custom element type for output of the input data convert layer | This requires a firmware update to be utilized (10_00_05_00) |
Fixed in this Release
ID | Description | Affected Platforms |
---|---|---|
TIDL-4839 | Incorrect output datatype to TIDL_DataConvertLayer when "add_data_convert_ops" flag is enabled and the ONNX/TFLITE graph has multiple outputs with different output datatypes | All except AM62 |
TIDL-4671 | Fused combination for layernorm might not get correctly identified resulting in multiple layers getting denied | All except AM62 |
TIDL-4829 | 16-bit Sigmoid kernel results in poor accuracy | All except AM62 |
TIDL-4841 | Gather Layer input may come in DDR, resulting in performance degradation on EVM | All except AM62 |
TIDL-4716 | TIDL creation call results in erroneously allocating more DDR shared memory for input/output tensors than required | All except AM62 |
TIDL-4684 | Depthwise separable convolution layer with 2xWxH*elementSize > 224KB (for AM62A/J722S) or 448KB (for all other devices) may result in wrong output on target | All except AM62 |
TIDL-4697 | Depthwise Separable Convolution Layer with large Input Width may give wrong output on Target | All except AM62 |
TIDL-4696 | Using compilation option "advanced_options:inference_mode" = 1 gives following error : "High Throughtput Inference Mode is not supported when partial batch is detected in graph" when the batched model has Transpose Layer | All except AM62 |
TIDL-4681 | TIDL Subgraphs incorrectly have outputs with the same name | All except AM62 |
TIDL-4825 | Feature to schedule TIDL networks based on user specified priority may result in error with status (-1128) on multiple runs of create and process callbacks | All except AM62 |
TIDL-4827 | Transpose of type (0,2,3,1) & (0,3,1,2) are split into layout data convert & DMA transposes in TIDL | All except AM62 |
TIDL-4680 | Feature to schedule TIDL networks based on user specified priority may result in hang/c7x exception on target | All except AM62 |
TIDL-4687 | Layernorm operator (ONNX Opset-17) produces incorrect results | All except AM62 |
TIDL-4702 | Model with multiple subgraphs fails in inference when calibration_frames>1 | All except AM62 |
TIDL-4695 | Reshape Layer may give wrong output on target if size of input buffer is not same as size of its output buffer | All except AM62 |
TIDL-4694 | Model compilation hits a segmentation fault when logFileName option is used but the file is not accessible | All except AM62 |
TIDL-4669 | Matmul with high dimension input may give wrong output on Target | All except AM62 |
TIDL-4691 | Model compilation via TIDL-RT hits a Segmentation fault while using the logFileName option | All except AM62 |
10_00_05_00
New in this Release
Description | Notes |
---|---|
Support for higher precision 16-bit sigmoid on EVM | This requires a firmware update to be utilized (10_00_05_00) |
Support for custom element type for output of the input data convert layer | This requires a firmware update to be utilized (10_00_05_00) |
Fixed in this Release
ID | Description | Affected Platforms |
---|---|---|
TIDL-4839 | Incorrect output datatype to TIDL_DataConvertLayer when "add_data_convert_ops" flag is enabled and the ONNX/TFLITE graph has multiple outputs with different output datatypes | All except AM62 |
TIDL-4671 | Fused combination for layernorm might not get correctly identified resulting in multiple layers getting denied | All except AM62 |
TIDL-4829 | 16-bit Sigmoid kernel results in poor accuracy | All except AM62 |
TIDL-4841 | Gather Layer input may come in DDR, resulting in performance degradation on EVM | All except AM62 |
TIDL-4716 | TIDL creation call results in erroneously allocating more DDR shared memory for input/output tensors than required | All except AM62 |
TIDL-4684 | Depthwise separable convolution layer with 2xWxH*elementSize > 224KB (for AM62A/J722S) or 448KB (for all other devices) may result in wrong output on target | All except AM62 |
TIDL-4697 | Depthwise Separable Convolution Layer with large Input Width may give wrong output on Target | All except AM62 |
TIDL-4696 | Using compilation option "advanced_options:inference_mode" = 1 gives following error : "High Throughtput Inference Mode is not supported when partial batch is detected in graph" when the batched model has Transpose Layer | All except AM62 |
TIDL-4681 | TIDL Subgraphs incorrectly have outputs with the same name | All except AM62 |
TIDL-4825 | Feature to schedule TIDL networks based on user specified priority may result in error with status (-1128) on multiple runs of create and process callbacks | All except AM62 |
TIDL-4827 | Transpose of type (0,2,3,1) & (0,3,1,2) are split into layout data convert & DMA transposes in TIDL | All except AM62 |
TIDL-4680 | Feature to schedule TIDL networks based on user specified priority may result in hang/c7x exception on target | All except AM62 |
TIDL-4687 | Layernorm operator (ONNX Opset-17) produces incorrect results | All except AM62 |
TIDL-4702 | Model with multiple subgraphs fails in inference when calibration_frames>1 | All except AM62 |
TIDL-4695 | Reshape Layer may give wrong output on target if size of input buffer is not same as size of its output buffer | All except AM62 |
TIDL-4694 | Model compilation hits a segmentation fault when logFileName option is used but the file is not accessible | All except AM62 |
TIDL-4669 | Matmul with high dimension input may give wrong output on Target | All except AM62 |
TIDL-4691 | Model compilation via TIDL-RT hits a Segmentation fault while using the logFileName option | All except AM62 |
10_00_04_00
New in this Release
Description | Notes |
---|---|
Support for Centernet model architecture : Added/optimized new operators : Object detection layer for Centernet | |
Support for wrap mode of pad using Slice and Concat operators | |
Support for partially batched networks (networks with part of network having multiple batches and remaining network as single batch) | |
Optimization of TIDLRT-Create for boot time performance improvement | |
Robustness improvement and improved logging for compiler (parser, graph partition and optimizer modules) | |
Performance optimization of non-linear activation functions using iLUT feature ( Tanh, Sigmoid, Softmax, GELU, ELU ) | Specific to J722S/AM67A/TDA4AEN platform |
Support for low latency mode for single neural network by splitting across multiple C7x-MMA | Specific to J722S/AM67A/TDA4AEN platform |
Support for high throughput mode by scheduling multiple instances of network across multiple C7x-MMA (multi-core batch processing) | Specific to J722S/AM67A/TDA4AEN platform |
Fixed in this Release
ID | Description | Affected Platforms |
---|---|---|
TIDL-4672 | 16-bit Softmax produces poorer results than expected | All except AM62 |
TIDL-4670 | 3D OD detection layer output tensor size is not correct | All except AM62 |
TIDL-4665 | Model may give wrong output when it contains reshape layers with batches | All except AM62 |
TIDL-4661 | Models silently fail during compilation when /dev/shm is full | All except AM62 |
TIDL-4660 | Models with Cast as an intermediate operator result in a functional issue due to unintentional offload to TIDL-RT | All except AM62 |
TIDL-4638 | Elementwise layers with height greater than 65535 do not work functionally in host/PC emulation mode | All except AM62 |
TIDL-4480 | Network containing large number of reshape layers may result in longer compilation time | All except AM62 |
Known Issues
ID | Description | Affected Platforms | Occurrence | Workaround in this release |
---|---|---|---|---|
TIDL-4024 | QDQ models with self-attention blocks error out during model compilation with "RUNTIME_EXCEPTION : Non-zero status code returned while running TIDL_0 node. Name:'TIDLExecutionProvider_TIDL_0_0' Status Message: CHECK failed: (index) < (current_size_)") | All except AM62 | Rare | None |
TIDL-3905 | TFLite Prequantized models with "add_dataconvert_ops": 3 fails with error "Unable to split bias" | All except AM62 | Rare | None |
TIDL-3895 | 2x2s2 Max Pooling with ceil_mode=0 and odd input dimensions results in incorrect outputs | All except AM62 | Rare | None |
TIDL-3886 | Maxpool 2x2 with stride 1x1 is considered supported but is incorrectly denied from being offloaded to C7x | All except AM62 | Rare | None |
TIDL-3845 | Running model compilation and inference back to back in the same python script results in a segfault | All except AM62 | Rare | None |
TIDL-3780 | Prototext based scale input may result in slight degradation in quantized output | All except AM62 | Rare | None |
TIDL-3704 | Intermediate subgraphs whose outputs are not 4D result in incorrect outputs | All except AM62 | Rare | None |
TIDL-3622 | Quantization prototxt does not correctly fill information for tflite const layers | All except AM62 | Rare | None |
TIDL-2990 | PReLU layer does not correctly parse the slope parameter and produces incorrect outputs | All except AM62 | Rare | None |
10_00_03_00
New in this Release
Description | Notes |
---|---|
Support for Centernet model architecture : Added/optimized new operators : Object detection layer for Centernet | |
Support for wrap mode of pad using Slice and Concat operators | |
Support for partially batched networks (networks with part of network having multiple batches and remaining network as single batch) | |
Optimization of TIDLRT-Create for boot time performance improvement | |
Robustness improvement and improved logging for compiler (parser, graph partition and optimizer modules) | |
Performance optimization of non-linear activation functions using iLUT feature ( Tanh, Sigmoid, Softmax, GELU, ELU, SiLU ) | Specific to J722S/AM67A/TDA4AEN platform |
Support for low latency mode for single neural network by splitting across multiple C7x-MMA | Specific to J722S/AM67A/TDA4AEN platform |
Support for high throughput mode by scheduling multiple instances of network across multiple C7x-MMA (multi-core batch processing) | Specific to J722S/AM67A/TDA4AEN platform |
Fixed in this Release
ID | Description | Affected Platforms |
---|---|---|
TIDL-4672 | 16-bit Softmax produces poorer results than expected | All except AM62 |
TIDL-4670 | 3D OD detection layer output tensor size is not correct | All except AM62 |
TIDL-4665 | Model may give wrong output when it contains reshape layers with batches | All except AM62 |
TIDL-4661 | Models silently fail during compilation when /dev/shm is full | All except AM62 |
TIDL-4660 | Models with Cast as an intermediate operator result in a functional issue due to unintentional offload to TIDL-RT | All except AM62 |
TIDL-4638 | Elementwise layers with height greater than 65535 do not work functionally in host/PC emulation mode | All except AM62 |
TIDL-4480 | Network containing large number of reshape layers may result in longer compilation time | All except AM62 |
TIDL-3918 | Pooling Layer with K=2x2 and S=2x2 results in a C7x Exception | All except AM62 |
Known Issues
ID | Description | Affected Platforms | Occurrence | Workaround in this release |
---|---|---|---|---|
TIDL-4024 | QDQ models with self-attention blocks error out during model compilation with "RUNTIME_EXCEPTION : Non-zero status code returned while running TIDL_0 node. Name:'TIDLExecutionProvider_TIDL_0_0' Status Message: CHECK failed: (index) < (current_size_)") | All except AM62 | Rare | None |
TIDL-3905 | TFLite Prequantized models with "add_dataconvert_ops": 3 fails with error "Unable to split bias" | All except AM62 | Rare | None |
TIDL-3895 | 2x2s2 Max Pooling with ceil_mode=0 and odd input dimensions results in incorrect outputs | All except AM62 | Rare | None |
TIDL-3886 | Maxpool 2x2 with stride 1x1 is considered supported but is incorrectly denied from being offloaded to C7x | All except AM62 | Rare | None |
TIDL-3845 | Running model compilation and inference back to back in the same python script results in a segfault | All except AM62 | Rare | None |
TIDL-3780 | Prototext based scale input may result in slight degradation in quantized output | All except AM62 | Rare | None |
TIDL-3704 | Intermediate subgraphs whose outputs are not 4D result in incorrect outputs | All except AM62 | Rare | None |
TIDL-3622 | Quantization prototxt does not correctly fill information for tflite const layers | All except AM62 | Rare | None |
TIDL-2990 | PReLU layer does not correctly parse the slope parameter and produces incorrect outputs | All except AM62 | Rare | None |
10_00_02_00
New in this Release
Description | Notes |
---|---|
Support for Centernet model architecture : Added/optimized new operators : Object detection layer for Centernet | |
Support for wrap mode of pad using Slice and Concat operators | |
Support for partially batched networks (networks with part of network having multiple batches and remaining network as single batch) | |
Optimization of TIDLRT-Create for boot time performance improvement | |
Robustness improvement and improved logging for compiler (parser, graph partition and optimizer modules) | |
Performance optimization of non-linear activation functions using iLUT feature ( Tanh, Sigmoid, Softmax, GELU, ELU, SiLU ) | Specific to J722S/AM67A/TDA4AEN platform |
Support for low latency mode for single neural network by splitting across multiple C7x-MMA | Specific to J722S/AM67A/TDA4AEN platform |
Support for high throughput mode by scheduling multiple instances of network across multiple C7x-MMA (multi-core batch processing) | Specific to J722S/AM67A/TDA4AEN platform |
Fixed in this Release
ID | Description | Affected Platforms |
---|---|---|
TIDL-4486 | Model Compilation does not check whether /dev/shm allocations were successful and gets stuck after printing "In TIDL_subgraphRTInvoke" | All except AM62 |
TIDL-4474 | Batch reshape layer with output going to DDR memory results in functional mismatch on target/EVM | All except AM62 |
TIDL-4473 | Global average pool layers functionally failing on target when input feature plane size < 4 bytes | All except AM62 |
TIDL-4463 | Fully grouped convolution with filter size 1x1 and stride 1 results in incorrect output on Target | All except AM62 |
TIDL-4455 | TransposeConvolution(Deconv) output is not functionally correct on host emulation and target if TIDL-RT based config files are used during compilation | All except AM62 |
TIDL-4451 | Subgraphs with input tensors with more than 4 dimensions result in poor quantized outputs | All except AM62 |
TIDL-4438 | Model compilation errors with "coeff cannot be found(or not match) in coef file, Random bias will be generated! Only for evaluation usage! Results are all random!" when an ONNX Add/Sub operator has a constant initializer with size 1 and a 2D variable input | All except AM62 |
TIDL-4429 | Model compilation throws an "Unsupported reshape layer" error when Reshape shape parameter includes '-1' and reshape involves 6 dimensions | All except AM62 |
TIDL-4422 | Model compilation throws "Error: Layer N, LayerName is missing inputs in the network and cannot be topologically sorted" when the network has reshape with parent having multiple consumers | All except AM62 |
TIDL-4415 | Data conversion from float to int16 results in degradation of accuracy | All except AM62 |
TIDL-4391 | Pad layer fused into convolution during model compilation might result in incorrect output | All except AM62 |
TIDL-4388 | Networks with Hardsigmoid layer results in functional issues during inference due to wrong parsing of beta parameter | All except AM62 |
TIDL-4383 | Model with number of layers > TIDL_DIM_MAX (1536) results in undefined behaviour | All except AM62 |
TIDL-4382 | Networks with Softmax layer with axis set to height gives incorrect results during inference | All except AM62 |
TIDL-4378 | Onnx Slice layers with stride were incorrectly being delegated to C7x resulting in undefined behavior | All except AM62 |
TIDL-4372 | Passing allow list or deny list does not work properly when some nodes are supported as part of a combination & results in incorrect allowing/denying of a layer | All except AM62 |
TIDL-4366 | TF-Lite Pre-quantized model may result in incorrect outputs if tensor ranges exceed 8-bit limits | All except AM62 |
TIDL-4356 | Depthwise separable convolution layer with 2xWxH > 224KB (for AM62A/J722S) or 448KB (for all other devices) may result in wrong output on target | All except AM62 |
TIDL-4354 | Resize operator with scale less than 1.0 incorrectly gets offloaded to TIDL-RT and causes functional mismatch during inference | All except AM62 |
TIDL-4275 | Reduce mean operator gets incorrectly delegated to the C7x during compilation and results in error during inference | All except AM62 |
TIDL-4271 | ONNX models with maxpool with 3x3 stride 2x2 with pad [0,0,1,1] hangs on Target | All except AM62 |
TIDL-4270 | Argmax layer with 3 input dimensions and axis attribute value as 1, is incorrectly delegated to TIDL-RT and results in incorrect functional behavior during inference | All except AM62 |
TIDL-4269 | Argmax layer with keepdims = 0, is incorrectly delegated to TIDL-RT and results in incorrect functional behavior during inference | All except AM62 |
TIDL-4268 | Transpose operator without the perm attribute produces incorrect output dimensions | All except AM62 |
TIDL-4267 | Split operator with num_outputs mentioned and uneven split produces inconsistent output dimensions compared to ONNX | All except AM62 |
TIDL-4066 | Models compiled with quantization_style=4 and mixed precision with a convolution layer (in 16-bit) without bias results in incorrect outputs | All except AM62 |
TIDL-4062 | Network with Identity layer before final output produces incorrect size of output | All except AM62 |
TIDL-4059 | Reshape-Transpose-Reshape sequence may get fused to DepthToSpace layer and report error during compilation and segmentation fault | All except AM62 |
TIDL-4055 | Models with batched input and tensors with less than 4 dimensions result in models with incorrect shape information and produce functionally incorrect output | All except AM62 |
TIDL-4053 | GEMM operator with non-flattened input gives warning as "Random coeff will be generated!" during compilation and produces incorrect results during inference | All except AM62 |
TIDL-4039 | Flatten layer with non-zero axis creates incorrect dimensions in TIDL graph resulting in error during inference | All except AM62 |
TIDL-4038 | Reduce Min & Max operators results in wrong functional behavior when the optional input for "axes" is not specified | All except AM62 |
TIDL-4019 | Model compilation incorrectly handles operators expanded internally by ONNX (Which do not have implementations in the ARM execution provider) | All except AM62 |
TIDL-4018 | Resize with attributes fed as variable input results in undefined output dimensions, also refer TIDL-3897 | All except AM62 |
TIDL-3960 | Convolution gives incorrect output when input feature map width < max (left pad, right pad) | All except AM62 |
TIDL-3942 | 16-bit Elementwise multiplication with inputs as unsigned may lead to incorrect results in host emulation | All except AM62 |
TIDL-3936 | Allocation failure during graph creation in ONNXRUNTIME when ConstantOfShape with unknown dimensions are present | All except AM62 |
TIDL-3930 | Limited set of operators support more than 4 dimensions (Reshape, Transpose, Batchnorm, Split/Slice, Softmax, MatMul & Eltwise) - other operators might accidentally get offloaded to C7x and produce incorrect results | All except AM62 |
TIDL-3910 | MatMul operator with signed inputs and unsigned output results in an error during target inference | All except AM62 |
TIDL-3899 | Unsupported cast operators might get incorrectly offloaded and result in accuracy degradation | All except AM62 |
TIDL-3898 | Pow, Div (With 2 variable inputs), Sqrt & Erf may get accidentally offloaded to C7x and result in incorrect outputs | All except AM62 |
TIDL-3897 | Operators with attributes fed as runtime variable results in undefined behavior during compilation stage | All except AM62 |
TIDL-3896 | Bias expansion scale in 16-bit is limited to 15-bits for unsigned data resulting in not best possible accuracy | All except AM62 |
TIDL-3880 | TIDL GPU tools do not support NVIDIA GPUs with compute capability 8.9 | All except AM62 |
TIDL-3872 | Feature to schedule TIDL networks based on user specified priority is not supported | All except AM62 |
TIDL-3870 | Batch processing is supported only if all layers have the same batch dimension | All except AM62 |
TIDL-3869 | Softmax along width axis results in suboptimal graph due to additional transpose operators introduced during compilation | All except AM62 |
TIDL-3866 | 16-bit Softmax & Layernorm can differ by +/- 1 between host emulation and device inference | All except AM62 |
TIDL-3830 | Model Compilation: Incorrect data type selection during float calibration pass when addDataConvert is disabled for ONNX networks | All except AM62 |
TIDL-3711 | Networks with SoftMax having large tensor input may hang on target, but works fine with host emulation mode | All except AM62 |
TIDL-3710 | Convolution layer with asymmetric quantization will functionally not work on target if ( 2 * W * H * elementSize) > (C7x L2 size allocate... |
09_02_09_00
New in this Release
Description | Notes |
---|---|
Support for asymmetrically quantized LeakyRelu | This requires an updated version of C7x/MMA firmware (09_02_09_00) and needs to have the advanced_options:c7x_firmware set |
Fixed in this Release
ID | Description | Affected Platforms |
---|---|---|
TIDL-4364 | Softmax layer with height axis and no output transpose results in functional mismatch on EVM | All except AM62 |
TIDL-4355 | 16-bit Elementwise Mul (Unsigned x Unsigned) results in overflow in the output in target flow | All except AM62 |
TIDL-4266 | Elementwise broadcast fails when both inputs have broadcasted dimension value = 1 | All except AM62 |
TIDL-4034 | Slice layer may functionally mismatch when preceded by other slice/reshape layer combinations | All except AM62 |
TIDL-4031 | Elementwise layers with repetitive inputs result in "segmentation fault" during host emulation | All except AM62 |
TIDL-4395 | [High Precision Sigmoid] Host emulation output is wrong for '-128' input value | All except AM62 |
TIDL-3836 | Resize-nearest neighbour fails when enableHighResOptimization flag is set | All except AM62 |
TIDL-4001 | 16-bit Softmax Kernel generates wrong output if plane width and height are not equal | All except AM62 |
TIDL-4381 | Allowlisting failure for TIDL induced transpose when optimizing consecutive transposes in an ONNX network | All except AM62 |
TIDL-4380 | TIDL Import gives "coeff cannot be found(or not match) in coef file, Random bias will be generated! Only for evaluation usage! Results are all random!" warning when ONNX Add/Mul operator has a constant input of size 1 in higher dimensions ([1], [1,1], .) | All except AM62 |
TIDL-4379 | TIDL Import fails with "All the Tensor Dimensions has to be greater then Zero" when ONNX network has split with negative axis | All except AM62 |
TIDL-4261 | Gather layer hangs on EVM when run in 16-bit | All except AM62 |
TIDL-4042 | Compilation results in following warning and a potential segmentation fault if additional layer is added at the end of network by TIDL during compilation phase: "Warning : Couldn't find corresponding ioBuf tensor for onnx tensor with matching name" | All except AM62 |
TIDL-4029 | ONNX QDQ model fails with "Import Error: Model with all ranges not supplied" message when it has LeakyRelu operator | All except AM62 |
TIDL-4028 | Gather indices not parsed correctly during calibration when its datatype is INT32 | All except AM62 |
TIDL-4012 | Consecutive transpose operators do not fuse into a single transpose operator | All except AM62 |
TIDL-4011 | Network having Reshape/Squeeze layers with (A) Multiple consumer layers and (B) No change in shape fails during compilation with error message as "missing inputs in the network and cannot be topologically sorted" | All except AM62 |
TIDL-4389 | Asymmetric quantization results in lower accuracy compared to symmetric quantization for networks with LeakyReLU | All except AM62 |
TIDL-4377 | Calibration for models which are not properly regularized, results in degraded accuracy | All except AM62 |
TIDL-4274 | 16-bit Elementwise Mul (Unsigned x Unsigned) results in overflow in the output | All except AM62 |
TIDL-4352 | Transpose operator with any configuration other than (0,1,2,3) -> (0,2,3,1) or vice versa may result in functional mismatch on target | All except AM62 |
TIDL-4030 | Model compilation may hang in networks containing 2x2 stride 2 depthwise separable convolution | All except AM62 |
TIDL-4405 | Networks with height-wise/width-wise concatenate layers may result in segmentation fault during compilation | All except AM62 |
TIDL-4402 | Setting option "advanced_options:high_resolution_optimization" = 1 may result in functionally incorrect output on target for some layers leading up to resize layer in the network | All except AM62 |
TIDL-4401 | Layer level debug traces for inference are incorrect for host emulation mode on setting option "advanced_options:high_resolution_optimization" = 1 | All except AM62 |
TIDL-4399 | Depth-wise convolution layer followed by resize layer (not necessarily immediate consumer) may result in functional issue on target | All except AM62 |
TIDL-4376 | 2x2 stride 2 spatial max pooling layer with odd input dimensions functionally incorrect for useCeil = 0 attribute on target | All except AM62 |
TIDL-4044 ... |
09_01_08_00
Fixed in this Release
ID | Description | Affected Platforms |
---|---|---|
TIDL-4028 | Gather indices (INT32 only) are not read correctly during calibration and this results in incorrect model output | All except AM62 |
TIDL-4029 | ONNX QDQ model fails with "Import Error: Model with all ranges not supplied" when the ONNX model has a LeakyRelu operator | All except AM62 |
TIDL-4045 | Gather operator results in intermittent hangs during inference on EVM | All except AM62 |
Note: The fix for TIDL-4045 requires a firmware update. Without a firmware update, the issue can be worked around by allocating 448K of extra data (initialized to zeros) at the end of the indices tensor.
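The padding workaround in the note can be sketched as follows. This assumes the indices are INT32, that "448K" means 448 * 1024 bytes, and that the indices are available as a raw byte buffer; the release notes do not say how the padded buffer is handed to the Gather operator, so that step is left out. The function name is hypothetical.

```python
import struct

# Assumption: "448K" in the note means 448 * 1024 bytes.
EXTRA_BYTES = 448 * 1024


def pad_indices_buffer(indices):
    """Pack int32 indices (little-endian) and append zero-initialized padding."""
    packed = struct.pack(f"<{len(indices)}i", *indices)
    # bytes(n) creates n zero bytes, which matches "initialized to zeros".
    return packed + bytes(EXTRA_BYTES)


padded = pad_indices_buffer([3, 1, 2])
```

The original indices occupy the start of the buffer unchanged, so only the allocation size differs from the unpadded tensor.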