Releases · google/gemma.cpp

What's Changed

Clean up docs for developers by @austinvhuang in #102
MQA Implementation for 2B models by @ufownl in #114
Enhancing Utility Functions in ops.h by @enum-class in #105
Added a missing space in app.h by @villesundell in #115
Fix compilation error when HWY_COMPILER_GCC_ACTUAL < 1300 by @ufownl in #120
.bazelversion: Bazel 7.1.1 by @LINKIWI in #122
Add standalone tool to compress weights. by @szabadka in #125
1.07x speedup: merge MQA parallel sections as suggested by @veluca93 by @copybara-service in #126
Fix off-by-one errors in generation code and token streaming callback. by @szabadka in #127

Refactor library interfaces
Fixes to enable Android and Windows builds, plus general build improvements
Bazel builds
CI automation
Allow artifact downloads from either Hugging Face or Kaggle (previously Kaggle only)
Many small fixes and quality-of-life improvements since the initial 0.1.0 release

What's Changed

Dev -> Main sync by @austinvhuang in #24
Update build.yml by @eltociear in #22
Fix typos by @shirayu in #32
Allow building on Windows using clang-cl toolchain by @dcoles in #6
Do not pass explicitly -O2 flag to compiler in Release build by @traversaro in #3
Fix build. by @dan-zheng in #35
reset conversation by @kishida in #34
Rename BUILD to BUILD.bazel. by @dan-zheng in #36
Add --eot_line option by @shirayu in #33
clean up formatting after 129e66a by @austinvhuang in #58
Warning fixes: unused member, cast, unused function by @copybara-service in #61
CLI args + README improvements + cleanup by @austinvhuang in #66
Fix for Android's 32-bit off_t. Fixes #62 by @copybara-service in #63
Add DEVELOPERS notes on using gemma as a library by @austinvhuang in #71
Add clang-tidy, fix narrowing issues, fix constness by @enum-class in #65
Support Bazel builds. Fixes #16 by @copybara-service in #75
Add instructions to download from Hugging Face Hub by @osanseviero in #74
Separate KV cache from GemmaImpl by @ufownl in #81
Avoid fadvise on older Android. Fixes #84 by @copybara-service in #85
use hwy/simd for RMSNorm(f, bf, f) calculation by @enum-class in #78
Use highway simd for SquaredL2 calculation by @enum-class in #77
Detect and print build type. Refs #88 by @copybara-service in #92
libgemma API refactor - decouple from interactive repl demo specifics, add hello world example using libgemma by @austinvhuang in #82
Additional cleanup after libgemma refactor #82 by @austinvhuang in #87
Use bf16-rounded sqrt for scaling embeddings to match Gemma by @copybara-service in #93
Remove unused ascii banner string by @copybara-service in #96
Allow changing k parameter of SampleTopK as a compiler flag by @ufownl in #97
Add missing log that point to a failed Generation by @zeerd in #98