LLaMA 3.3 support! This release also includes a handful of usability improvements.
What's Changed
- Set prepend_bos to false by default for Qwen models by @degenfabian in #815
- Throw error when using attn_in with grouped query attention by @degenfabian in #810
- Feature llama 33 by @bryce13950 in #826
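
As a rough illustration of the `attn_in` change in #810, the fail-fast check can be sketched as below. This is a minimal sketch and not the library's actual code: `SketchConfig`, `validate`, the field names, and the error message are all hypothetical stand-ins chosen for this example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SketchConfig:
    """Hypothetical stand-in for a model config; not the library's real class."""

    n_heads: int
    # Assumption: a non-None value means grouped query attention is in use.
    n_key_value_heads: Optional[int] = None
    # Assumption: flag that enables the attn_in hook.
    use_attn_in: bool = False


def validate(cfg: SketchConfig) -> None:
    """Raise early on the combination this sketch treats as unsupported."""
    uses_gqa = cfg.n_key_value_heads is not None
    if cfg.use_attn_in and uses_gqa:
        raise ValueError("attn_in is not supported with grouped query attention")
```

Under these assumptions, a config with only one of the two options set passes validation, while enabling both raises `ValueError` at setup time instead of failing later during a forward pass.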
Full Changelog: v2.10.0...v2.11.0