This repository has been archived by the owner on Aug 30, 2024. It is now read-only.

check isa for UT, use OMP as default #117

Merged 1 commit on Feb 6, 2024

Conversation

ThanatosShinji
Contributor

Type of Change

  1. Turn OMP (OpenMP) on by default for bestla.
  2. Check the AVX512F ISA flag before running the LayerNormalization UT.
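The ISA check in item 2 can be sketched roughly as below. This is a minimal illustration, not the code from this PR: `cpu_has_avx512f`, `layernorm_ref`, `run_layernorm_ut`, and `UtResult` are hypothetical names, and the detection assumes the GCC/Clang builtin `__builtin_cpu_supports` on x86. The idea is that a unit test for an AVX512F kernel reports "skipped" instead of crashing on a CPU that lacks the instruction set.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical result type for a unit test that may be skipped.
enum class UtResult { Passed, Failed, Skipped };

// Hypothetical helper: true when the running CPU reports AVX512F.
// Assumes GCC/Clang on x86; any other toolchain reports "not supported".
inline bool cpu_has_avx512f() {
#if (defined(__GNUC__) || defined(__clang__)) && (defined(__x86_64__) || defined(__i386__))
  return __builtin_cpu_supports("avx512f") != 0;
#else
  return false;
#endif
}

// Scalar reference layer normalization (no scale or bias), for illustration only.
inline std::vector<float> layernorm_ref(const std::vector<float>& x, float eps = 1e-5f) {
  if (x.empty()) return {};
  float mean = 0.f;
  for (float v : x) mean += v;
  mean /= static_cast<float>(x.size());
  float var = 0.f;
  for (float v : x) var += (v - mean) * (v - mean);
  var /= static_cast<float>(x.size());
  const float inv_std = 1.f / std::sqrt(var + eps);
  std::vector<float> y(x.size());
  for (std::size_t i = 0; i < x.size(); ++i) y[i] = (x[i] - mean) * inv_std;
  return y;
}

// Hypothetical UT wrapper: skip when the required ISA is missing.
// A real test would compare the AVX512F kernel output against the reference;
// here only the reference is checked (normalized values must sum to ~0).
inline UtResult run_layernorm_ut(bool has_isa) {
  if (!has_isa) return UtResult::Skipped;
  const std::vector<float> y = layernorm_ref({1.f, 2.f, 3.f, 4.f});
  float sum = 0.f;
  for (float v : y) sum += v;
  return std::fabs(sum) < 1e-4f ? UtResult::Passed : UtResult::Failed;
}
```

A caller would pass the detected flag, e.g. `run_layernorm_ut(cpu_has_avx512f())`, so that machines without AVX512F still get a clean test run.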

Perf on Ultra7-155H, Mistral-7B int4, group=-1, compute_dtype=int8

 Once we have made our plans,
 We can’t follow them.


model_print_timings:        load time =   100.93 ms
model_print_timings:      sample time =     8.00 ms /    16 runs   (    0.50 ms per token)
model_print_timings: prompt eval time =   100.82 ms /     2 tokens (   50.41 ms per token)
model_print_timings:        eval time =  1239.00 ms /    15 runs   (   82.60 ms per token)
model_print_timings:       total time =  1355.56 ms
========== eval time log of each prediction ==========
prediction   0, time: 100.82ms
prediction   1, time: 79.96ms
prediction   2, time: 81.61ms
prediction   3, time: 80.69ms
prediction   4, time: 81.98ms
prediction   5, time: 82.39ms

@a32543254 a32543254 requested a review from luoyu-intel February 5, 2024 01:15
@VincyZhang VincyZhang merged commit c40d116 into intel:main Feb 6, 2024
10 checks passed