Commit
Feature/ntt benchmark roofline 3 (#1241)
* update for pack M&N x86
* revert for some performance fallback
* add funroll-loops for gcc
* shutdown for ci
* revise bug for unroll
* add softmax benchmark
* add ctest for softmax
* Fix ctest failure for softmax.
* opt for x86 softmax
* revise benchmark for softmax
* Add rvv optimization of tanh with max_ulp_error = 2.
* add tanh for x86 ulp version
* remove usless headfile
* Apply code-format changes
* Optimize mul_add for rvv(performance boost 15% ~ 32%)
* Remove ntt softmax and fix reduce conflict of x86_64.
* change roofline for reduce x86
* Optimize matmul for rvv and update roofline.
* Update reduce roofline for rvv.
* Update Max_reduceMN_PackN roofline.
* Add ratio for roofline / actual.
* update tanh Roofline
* Specialize max/min for float and update roofline for reduce no_pack.
* problem about x86 roofline
* [NTT] Add ukernel for matmul
* Apply code-format changes
* Fix build
* change some reality for x86
* fallback roofline
* change sequence for test
* add warmup for unary
* add primitive size auto test
* revise bug in daily test
* revise bug in daily test
* avoid bug for temp
* total fallback
* add info for primitive size
* remove tile infor
* change the way
* Add tensor.squeeze
* remove Primitive infor
* Apply code-format changes
* temp change test
* add table name
* merge two table
* remove typo
* typo test
* change back for daily test
* test for table
* change back tor test over
* change for primitive size
* Support odd matmul
* Fix build
* Apply code-format changes
* Optimze erf for rvv.
* Fix build
* Add markdown for ntt mamtul.
* Add u_matmul policy for rvv
* add erf ulp version
* fix typo
* Refactor benchmark ntt py to support both ntt and ntt_matmul.
* Apply code-format changes
* Add ntt.store, optimize u_matmul for RVV
* Fix pack MKN for RVV
* Fix macos build and show gflops with floating point.
* revise typo
* Force compiler do not unroll k loops
* Use pragma unroll 1 instead of volatile
* set performance for cpu0
* Apply code-format changes
* temp fallback for ci test

---------

Co-authored-by: guodongliang <[email protected]>
Co-authored-by: zhangyang2057 <[email protected]>
Co-authored-by: uranus0515 <[email protected]>
Co-authored-by: sunnycase <[email protected]>
Co-authored-by: sunnycase <[email protected]>
Co-authored-by: zhangyang2057 <[email protected]>
1 parent e6d95d2, commit 9f289a3
Showing 22 changed files with 2,871 additions and 638 deletions.