⚖️ Surprising performance: gemma2:8b 1.40 times faster than gemma2:2b on ollama 🦙 #372

Open
adriens opened this issue Aug 3, 2024 · 1 comment

Comments

@adriens
Contributor

adriens commented Aug 3, 2024

❔ Context

As mentioned earlier:

I gave semantic-router a try and got really impressive results, see 🔀 Semantic Router w. ollama/gemma2 : real life 10ms hotline challenge 🤯.

... but gemma2:2b was recently released, so I switched to this model, hoping that it would:

  • Be faster
  • Be as good

... but surprisingly, it was just as good, only slower.

👉 The goal of this issue is to understand why... and what could be done to make semantic router run even faster than 10ms.

⚖️ Data

Consider the following runs:

Below are some timings; both models produced the same output quality:

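| Cell N° | gemma2:2b | gemma2:8b |
|---------|-----------|-----------|
| 13      | 44.2 ms   | 33 ms     |
| 14      | 16 ms     | 12.7 ms   |
| 15      | 15.9 ms   | 12 ms     |
| 16      | 17 ms     | 12 ms     |
| 17      | 15.9 ms   | 11.8 ms   |
| 18      | 21.4 ms   | 11.4 ms   |
| 19      | 16.3 ms   | 12.6 ms   |
| 20      | 15.6 ms   | 11.6 ms   |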

ℹ️ On every test, the 8b is faster than the 2b... which is surprising.

📊 Benchmark conclusion

The average speed-up factor of gemma2:8b over gemma2:2b, taken as the mean of the per-cell time ratios (2b time divided by 8b time), is approximately 1.40. In other words, on these runs gemma2:8b is on average 1.40 times faster than gemma2:2b.
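For anyone who wants to check the arithmetic, here is a minimal sketch that recomputes the speed-up from the timings in the table. The variable names are mine, and the second figure (ratio of total times) is an extra I added for comparison, not something measured separately.

```python
from statistics import mean

# Timings in milliseconds, copied from the table: (cell, gemma2:2b, gemma2:8b)
timings_ms = [
    (13, 44.2, 33.0),
    (14, 16.0, 12.7),
    (15, 15.9, 12.0),
    (16, 17.0, 12.0),
    (17, 15.9, 11.8),
    (18, 21.4, 11.4),
    (19, 16.3, 12.6),
    (20, 15.6, 11.6),
]

# Per-cell speed-up: how many times longer the 2b run took than the 8b run.
ratios = [t_2b / t_8b for _, t_2b, t_8b in timings_ms]

# Mean of the per-cell ratios (the 1.40 figure quoted in this issue).
mean_ratio = mean(ratios)

# Alternative aggregate: total 2b time divided by total 8b time.
total_ratio = sum(t for _, t, _ in timings_ms) / sum(t for _, _, t in timings_ms)

print(f"mean of per-cell ratios: {mean_ratio:.2f}")   # 1.40
print(f"ratio of total times:    {total_ratio:.2f}")  # 1.39
```

Both aggregates land close to each other, so the conclusion does not depend much on how the average is taken.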

👉 Do you get the same performance?
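If you want to compare on your own machine, a small timing harness along these lines might help. This is only a sketch: `time_calls_ms`, `summarize` and `fake_route` are hypothetical names, and `fake_route` is a stand-in that you would replace with your actual semantic-router call for each model. The untimed warm-up calls are there on the assumption (not verified here) that the first call pays a one-off cost, which could be why cell 13 is much slower than the others for both models.

```python
import statistics
import time
from typing import Callable, Dict, List


def time_calls_ms(fn: Callable[[], object], runs: int = 20, warmup: int = 1) -> List[float]:
    """Call `fn` `warmup` times untimed, then return wall-clock durations (ms) of `runs` calls."""
    for _ in range(warmup):
        fn()
    durations = []
    for _ in range(runs):
        start = time.perf_counter()
        fn()
        durations.append((time.perf_counter() - start) * 1000.0)
    return durations


def summarize(durations: List[float]) -> Dict[str, float]:
    """Reduce a list of durations to a few robust statistics."""
    return {
        "median_ms": statistics.median(durations),
        "min_ms": min(durations),
        "max_ms": max(durations),
    }


def fake_route() -> int:
    # Placeholder workload: replace with the real routing call for the model under test.
    return sum(range(10_000))


if __name__ == "__main__":
    print(summarize(time_calls_ms(fake_route, runs=20, warmup=1)))
```

Running the same harness once per model, and comparing medians rather than single cells, should make the numbers easier to compare across machines.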

@adriens adriens changed the title from "⚖️ Surprising performance: gemma2:8b always faster than gemma2:2b" to "⚖️ Surprising performance: gemma2:8b 1.40 times faster than gemma2:2b on ollama 🦙" Aug 3, 2024
@adriens
Contributor Author

adriens commented Aug 3, 2024

Depending on your feedback, I'll open an issue on ollama.
