Commit
Optimize MlasComputeSoftmax with prefetch (microsoft#20393)
A prefetch instruction (_mm_prefetch) is used to anticipate memory accesses by prefetching the next row of the input buffer. This optimization is designed to reduce the impact of memory latency and thereby improve the performance of the MlasComputeSoftmax function. As a result, the worst-case latency of the OCR model improved by approximately 50 ms, a 3% improvement.
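The idea can be illustrated with a minimal sketch. This is not the MLAS code from this commit: the function name `SoftmaxRowsWithPrefetch`, the `SKETCH_PREFETCH` macro, the 64-byte cache line size, and the `_MM_HINT_T0` hint are all assumptions made for illustration. It computes a plain scalar row-wise softmax and, before working on each row, issues prefetches for the row that follows.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// _mm_prefetch is an x86 intrinsic; on other targets this sketch falls back to a no-op.
#if defined(__x86_64__) || defined(_M_X64) || defined(__i386__) || defined(_M_IX86)
#include <xmmintrin.h>
#define SKETCH_PREFETCH(p) _mm_prefetch(reinterpret_cast<const char*>(p), _MM_HINT_T0)
#else
#define SKETCH_PREFETCH(p) ((void)0)
#endif

// Hypothetical helper (not the MLAS implementation): row-wise softmax over a
// dense row-major [rows x cols] buffer, prefetching the next row up front.
void SoftmaxRowsWithPrefetch(const float* input, float* output,
                             std::size_t rows, std::size_t cols) {
    if (cols == 0) return;

    // Assumption: 64-byte cache lines, so one prefetch covers 16 floats.
    constexpr std::size_t kFloatsPerLine = 64 / sizeof(float);

    for (std::size_t r = 0; r < rows; ++r) {
        const float* in = input + r * cols;
        float* out = output + r * cols;

        // Ask the hardware to start loading the next row while this row is computed.
        if (r + 1 < rows) {
            const float* next = in + cols;
            for (std::size_t i = 0; i < cols; i += kFloatsPerLine) {
                SKETCH_PREFETCH(next + i);
            }
        }

        // Pass 1: row maximum, for numerical stability.
        float row_max = in[0];
        for (std::size_t c = 1; c < cols; ++c) {
            if (in[c] > row_max) row_max = in[c];
        }

        // Pass 2: exponentiate and accumulate the sum.
        float sum = 0.0f;
        for (std::size_t c = 0; c < cols; ++c) {
            out[c] = std::exp(in[c] - row_max);
            sum += out[c];
        }

        // Pass 3: normalize so the row sums to 1.
        for (std::size_t c = 0; c < cols; ++c) {
            out[c] /= sum;
        }
    }
}
```

A prefetch is only a hint and does not change results, so any benefit depends on row size, cache behavior, and the hardware, and has to be measured on the target workload.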