v0.1.6

Latest

Wanwan1996 released this 24 Oct 07:33

e238f37

MegCC

Highlight

Support Int8 quantized convolution implemented with I8MM, a new feature of the ARM V8.6 platform, with approximately 1.7x the performance of the DOT version.
Support inference for the new clip model, with performance on par with megdnn.
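
For context on the I8MM item, the sketch below models in scalar C what one I8MM matrix multiply-accumulate (SMMLA) and one DOT (SDOT) step compute on 128-bit operands: 32 versus 16 int8 multiply-accumulates. The function names `smmla_ref` and `sdot_ref` are illustrative and are not MegCC kernels; how the MegCC convolution kernels actually tile and schedule these instructions is not described in these notes.

```c
#include <assert.h>
#include <stdint.h>

/* Scalar model of one SMMLA step (illustrative, not a MegCC kernel):
 * acc (2x2 int32, row-major) += a (2x8 int8) * b^T, where b is also
 * stored as 2x8 row-major. This performs 32 multiply-accumulates. */
void smmla_ref(int32_t acc[4], const int8_t a[16], const int8_t b[16]) {
    assert(acc != 0 && a != 0 && b != 0);
    for (int i = 0; i < 2; ++i) {
        for (int j = 0; j < 2; ++j) {
            int32_t sum = 0;
            for (int k = 0; k < 8; ++k) {
                sum += (int32_t)a[i * 8 + k] * (int32_t)b[j * 8 + k];
            }
            acc[i * 2 + j] += sum;
        }
    }
}

/* Scalar model of one SDOT step: each of the 4 int32 lanes accumulates
 * a 4-element int8 dot product. This performs 16 multiply-accumulates. */
void sdot_ref(int32_t acc[4], const int8_t a[16], const int8_t b[16]) {
    assert(acc != 0 && a != 0 && b != 0);
    for (int lane = 0; lane < 4; ++lane) {
        for (int k = 0; k < 4; ++k) {
            acc[lane] += (int32_t)a[lane * 4 + k] * (int32_t)b[lane * 4 + k];
        }
    }
}
```

Both functions consume the same 16-byte inputs, so the per-instruction work doubles with I8MM; the measured speedup reported above is about 1.7x.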

Bug Fixes

compiler-kernel

Add NEAREST mode to the resize operator.
Correct the applicability conditions of int8 Winograd F23 so that hybrid conv no longer selects this kernel by mistake.
Correct the applicability conditions of all ConvBias kernels, restricting them to channel-broadcast bias only.
Add support for the sqrt, sin, and cos elemwise operators.
Add a float32 computation path for int8 resize to fix the discrepancy between the results of the Arm implementation and the naive implementation.
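
To illustrate the ConvBias item, here is a minimal sketch of what a channel-broadcast bias means, assuming a plain NCHW float tensor where the bias has shape (1, C, 1, 1). The helpers `is_channel_broadcast_bias` and `add_channel_bias` are hypothetical and do not reflect MegCC's actual applicability checks or its other tensor layouts.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical check: an NCHW bias is channel-broadcast when its shape
 * is (1, C, 1, 1), with C equal to the number of output channels. */
bool is_channel_broadcast_bias(const size_t bias_shape[4], size_t out_channels) {
    assert(bias_shape != NULL);
    return bias_shape[0] == 1 && bias_shape[1] == out_channels &&
           bias_shape[2] == 1 && bias_shape[3] == 1;
}

/* Add one bias value per channel to an NCHW float tensor, in place. */
void add_channel_bias(float *dst, const float *bias,
                      size_t n, size_t c, size_t h, size_t w) {
    assert(dst != NULL && bias != NULL);
    for (size_t in = 0; in < n; ++in) {
        for (size_t ic = 0; ic < c; ++ic) {
            float *plane = dst + (in * c + ic) * h * w;
            for (size_t i = 0; i < h * w; ++i) {
                plane[i] += bias[ic];
            }
        }
    }
}
```

A bias with the full output shape, such as (1, C, H, W), fails the check and would need a different kernel.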

compiler-common

Fix a bug where MegCC could not handle a ConvBias operator without bias.

runtime

Fix a bug where the runtime could free a dynamic tensor whose pointer is NULL, causing the program to crash.
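
The kind of guard this fix implies can be sketched as follows. Everything here is an assumption for illustration: `DynTensor`, `device_free`, and `release_dyn_tensor` are hypothetical names, and `device_free` stands in for a custom deallocator that, unlike the standard `free`, does not tolerate NULL.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical dynamic tensor: storage may not have been allocated yet. */
typedef struct {
    void *ptr;
    size_t size;
} DynTensor;

int g_free_calls = 0; /* counts real deallocations, for demonstration */

/* Stand-in for a custom deallocator that must not receive NULL. */
void device_free(void *p) {
    assert(p != NULL);
    ++g_free_calls;
    free(p);
}

/* Release the tensor storage only if it was actually allocated. */
void release_dyn_tensor(DynTensor *t) {
    if (t == NULL || t->ptr == NULL) {
        return; /* nothing to free */
    }
    device_free(t->ptr);
    t->ptr = NULL;
    t->size = 0;
}
```

Resetting `ptr` after the release also makes a second call harmless.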

New Features

basic components

Upgrade megbrain to the version that fixes the padding channel pass bug, resolving compilation errors for some models in megcc.

compiler-kernel

Add the BatchedMatmul operator for the Arm64 platform.
Add support for multi-dimensional indexing mode to the IndexingMultiAxisVec operator.
Add support for nchw int8 conv1x1x kernel.
Add aarch32 int8 dot nchw conv5x5 kernel.
Add the batched matmul operator for the Float16 data type.
Add naive float32 mod elemwise op.
Support Int8 quantized convolution implemented with I8MM, a new feature of the ARM V8.6 platform, with approximately 1.7x the performance of the DOT version.
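
As a reference for what the batched matmul items compute, the sketch below is a naive float32 version, assuming contiguous row-major matrices and no transposition. The name `batched_matmul_ref` is illustrative; the Arm64 and Float16 kernels added in this release are optimized implementations whose details are not covered here.

```c
#include <assert.h>
#include <stddef.h>

/* Naive batched matmul: for each batch index p, C[p] = A[p] * B[p],
 * with A[p] of shape (m, k), B[p] of shape (k, n), C[p] of shape (m, n).
 * All matrices are contiguous and row-major (an assumption of this sketch). */
void batched_matmul_ref(const float *a, const float *b, float *c,
                        size_t batch, size_t m, size_t k, size_t n) {
    assert(a != NULL && b != NULL && c != NULL);
    for (size_t p = 0; p < batch; ++p) {
        const float *ap = a + p * m * k;
        const float *bp = b + p * k * n;
        float *cp = c + p * m * n;
        for (size_t i = 0; i < m; ++i) {
            for (size_t j = 0; j < n; ++j) {
                float sum = 0.0f;
                for (size_t x = 0; x < k; ++x) {
                    sum += ap[i * k + x] * bp[x * n + j];
                }
                cp[i * n + j] = sum;
            }
        }
    }
}
```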
