MegCC

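Highlights

Optimized FP16 inference support for common models on ARM and RISCV(RVV).

breaking change

runtime

Fix runtime-exported symbols that easily conflict with symbols in the user's pipeline (for example, the DEBUG enumerator clashing with the DEBUG macro that some compilers define). This change breaks compatibility of the log API (LITE_set_log_level): if you use LITE_set_log_level, the compiler will report an error similar to "undefine WARN, do you means LITE_WARN", and you need to update your code following the diff in this PR.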

Bug Fixes

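basic components

Support building the compiler tools on MacOS and building the runtime on a MacOS host.
Fix the incorrect bisection between MegCC and MegBrain when the output is Uint8; improve the bisection script and make it easier to use.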

compiler-kernel

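Fix megcc not creating the kernel export directory automatically, and the model info file missing output information.

compiler-common

Fix the importer failing to recognize ops with duplicate names.
Fix the exporter not allowing a return-op output to be a weight.
Fix the compilation failure when compiling a model with mgb-to-tinynn with the float16 nchw88 optimization enabled.
Fix the MegCC build error on the Apple M1 platform; MegCC can now be built on Apple M1.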

runtime

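Bind the CombineModel module to the vm module to fix the crash that occurs with multiple models when LITE_destroy_network and LITE_make_network are called in a different order.
Fix incorrect model results when the output tensor is a weight.
Fix the illegal memory access caused by incomplete parameter reset when the loader is used multiple times.

others

Fix the runtime failing to build from the release package in arm-linux environments because of a missing dependency.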

New Features

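basic components

Fix the compilation problem caused by llvm not recognizing the fp16 data type.

compiler-kernel

Add an arm int8-dot hybrid convolution operator with stride 1.
Add f32 elemwise for VEC_BCAST111C.
Add support for float16 kernels (mk8 matmul, relayout, and unary elemwise have been added).
Add fp16 f23, f43, and f63 winograd convolution operators.
Add an fp16 im2col convolution kernel.
Add support for the Gaussian blur operator.
Add float16 reduce and pooling operators, and operators that convert between float16 and float32.
Change the model decryption logic: the auxiliary tool hako-to-mgb is removed, and users no longer need to specify the hako version because all decryption algorithms supported by megcc are tried automatically (the hako option is kept but marked as deprecated).
Add binary and ternary elemwise operators for the float16 data type.
Add the Conv1x1 operator for the float16 data type.
Add benchmark support for fp16 models; add naive implementations of conv and matmul.
Add the VEC_BCAST110 elemwise mode on the arm platform.
Add the gaussian blur cv operator.
Add fp16 gi gemv and gevm operators.
Add the fp16 nchw88 channelwise operator.

compiler-common

Support unit tests for the riscv kernel backend.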

Improvements

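documentation

Replace the mgeconvert onnx conversion documentation link with a link to the specific usage location.
Update the bisection script and documentation to make them more accurate and complete.
Update the usage documentation to make it more reasonable and complete.

others

Format the code to keep the cpp and cmake code style in MegCC consistent and standardized.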

MegCC

Highlights

Optimized FP16 inference support for common models under ARM and RISCV(RVV).

breaking change

runtime

Fix runtime-exported symbols that easily conflict with symbols in the user's pipeline (for example, the DEBUG enumerator conflicts with the DEBUG macro that some compilers define). This change breaks compatibility of the log API (LITE_set_log_level): if you use LITE_set_log_level, the compiler will report an error similar to "undefine WARN, do you means LITE_WARN" when compiling.
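
The kind of clash described above, and the caller-side change it implies, can be sketched in a few lines of C. This is only an illustration: `DemoLogLevel`, `demo_set_log_level`, `demo_should_log`, and `demo_migrated_call` are hypothetical stand-ins, the numeric enum values are made up, and only the `LITE_WARN` name comes from the note above. The real header and the signature of LITE_set_log_level may differ.

```c
#include <assert.h>
#include <stdio.h>

/* A user pipeline (or a build flag such as -DDEBUG) may define this macro. */
#define DEBUG 1

/* Hypothetical stand-in for the runtime's log-level enum after the rename.
 * An unprefixed enumerator named DEBUG would be rewritten to `1` by the macro
 * above and fail to compile; the LITE_ prefix avoids the clash.
 * The numeric values are illustrative, not taken from the real header. */
typedef enum {
    LITE_DEBUG = 0,
    LITE_INFO = 1,
    LITE_WARN = 2,
    LITE_ERROR = 3
} DemoLogLevel;

static DemoLogLevel g_demo_level = LITE_ERROR;

/* Hypothetical stand-in for LITE_set_log_level; the real signature may differ. */
static void demo_set_log_level(DemoLogLevel level) {
    g_demo_level = level;
}

/* Returns 1 when a message at `level` passes the current threshold. */
static int demo_should_log(DemoLogLevel level) {
    return level >= g_demo_level;
}

/* Caller-side migration: pass LITE_WARN where WARN was used before. */
static int demo_migrated_call(void) {
    demo_set_log_level(LITE_WARN);
    assert(demo_should_log(LITE_ERROR));
    assert(!demo_should_log(LITE_DEBUG));
    printf("log level set to %d\n", (int)g_demo_level);
    return (int)g_demo_level;
}
```

In user code the change would presumably amount to replacing each unprefixed level name with its `LITE_`-prefixed counterpart, as the compiler hint quoted above suggests.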

Bug Fixes

basic components

Support building the compiler tools on MacOS and building the runtime on a MacOS host.
Fix the incorrect bisection between MegCC and MegBrain when the output is Uint8; improve the bisection script and make it easier to use.

compiler-kernel

Fix the error when the kernel dump directory does not exist, and the empty output info in the model info file, for the exporter and mgb-to-tinynn.

compiler-common

Fix the case where the exporter return-op output is a weight.
Fix the redefinition error in the importer when a model contains two ops with the same name.
Support compiling MegCC on the Apple M1 platform.
Fix the compilation failure when compiling a model with mgb-to-tinynn with the float16 nchw88 optimization enabled.

runtime

Bind vm with CombineModel to solve the crash that occurs when the order of LITE_destroy_network and LITE_make_network is different under multiple models.
Fix incorrect values of a weight-type output tensor during inference.
Fix the illegal memory access bug caused by incomplete parameter reset when the loader is used multiple times.

others

Fix the runtime failing to build from the release package in arm-linux environments because of a missing dependency.

New Features

basic components

Fix the compilation problem caused by llvm not recognizing the fp16 data type.

compiler-kernel

Add int8 dot hybrid conv stride 1 algo.
Add elemwise for VEC_BCAST111C f32 type.
Add support for float16 kernels (mk8 matmul, relayout, and unary elemwise have been added).
Add fp16 f23 f43 and f63 winograd kernel.
Add fp16 im2col kernel.
Support Gaussian blur operator.
Add float16 reduce and pooling operators, and operators that convert between float16 and float32.
Change the model decryption logic: the auxiliary tool hako-to-mgb is removed, and users no longer need to specify the hako version because all decryption algorithms supported by megcc are tried automatically (the hako option is kept but marked as deprecated).
Add binary and ternary elemwise operators for float16 data types.
Add the Conv1x1 operator for the float16 data type.
Add benchmark support for fp16 models; add fp16 naive conv and matmul kernels.
Add VEC_BCAST110 elemwise mode on arm platform.
Add gaussian blur cv op.
Add gi fp16 gemv and gevm kernel.
Add fp16 nchw88 channelwise kernel.

compiler-common

Add riscv kernel unit test.

Improvements

compiler-kernel

Optimize the performance of MegCC naive argsort operator to align with MegBrain.
Optimize the performance of the MegCC naive topK operator to align with MegBrain.
Optimize the Conv 1x1 algorithm for the Group NCHW44 layout on the arm64 platform; performance is aligned with MegBrain after optimization.
Optimize the WarpPerspective transform for the case where the transform type is a resize transform; performance is aligned with MegBrain after optimization.

documentation

Replace the mgeconvert onnx to mge documentation link with a link to the specific usage location.
Update the bisection script and documentation.
Update the usage documentation to make it more reasonable and complete.

others

Format documentation and code to make them more reasonable and standardized.

What's Changed

fix(doc): update doc/first-use.md by @Asthestarsfalll in #36
improvement(readme): unify readme format by @Asthestarsfalll in #35
CI(project): and CI to fix spell by @tpoisonooo in #39
fix(doc): add kernel exporter doc and fix some misc doc problem by @yeasoon in #34
fix some misc by @yeasoon in #37
Update internal by @yeasoon in #41
fix(lib): add the Tiny CC library for Apple M1; by @Qsingle in #42
feat(third_party): add fp16 llvm supported patch by @yeasoon in #45

New Contributors

@Asthestarsfalll made their first contribution in #36
@Qsingle made their first contribution in #42

Full Changelog: v0.1.3...v0.1.4