Releases · thomasantony/llamacpp-python
v0.1.14
v0.1.13
- Adds support for "infinite text generation" using context swapping (similar to the `main` example in llama.cpp; see the sketch below)
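For illustration, here is a minimal pure-Python sketch of the context-swapping scheme as llama.cpp's `main` example implements it; the function and parameter names (`swap_context`, `n_ctx`, `n_keep`) are hypothetical, and this is not the package's actual internal code.

```python
def swap_context(tokens, n_ctx, n_keep):
    """Shrink the token history once the context window is full.

    Keeps the first n_keep prompt tokens, drops the oldest half of the
    remainder, and re-uses the most recent half, so generation can
    continue indefinitely within a fixed context size.
    """
    if len(tokens) < n_ctx:
        return tokens  # window not full yet; nothing to swap
    n_left = len(tokens) - n_keep
    recent = tokens[len(tokens) - n_left // 2:]  # most recent half
    return tokens[:n_keep] + recent
```

Keeping the prompt prefix means the model never loses its original instructions, at the cost of re-evaluating the retained recent tokens after each swap.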
v0.1.12
- Makes unit tests more consistent and usable (they still do not run in CI workflows because the model weights are too large)
- Updates llama.cpp submodule
v0.1.11
- Breaking change, but makes model loading practically instantaneous thanks to memory-mapped I/O
- Requires re-generating the weight files with the new convert script (or using the migration script from llama.cpp)
v0.1.10
- Adds back `get_tokenizer()` and `add_bos()`, which were broken in the previous release (see the sketch below)
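A hedged sketch of how the restored helpers might be used; the parameter object, model path, and every method name below are assumptions for illustration, not the package's documented API.

```python
import llamacpp

params = llamacpp.InferenceParams()                    # assumed parameter object
params.path_model = "./models/7B/ggml-model-q4_0.bin"  # assumed path attribute
model = llamacpp.LlamaInference(params)                # high-level interface (see v0.1.9)

tokenizer = model.get_tokenizer()                # restored in this release
tokens = tokenizer.tokenize("A llama is a")      # assumed tokenizer method
model.add_bos()                                  # restored; exact receiver (model vs. tokenizer) is assumed
```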
v0.1.9
- Updates the bindings to work with the new llama.cpp API from ggerganov/llama.cpp#370
- Adds two separate interfaces: `LlamaInference`, which is similar to the bindings in v0.1.8, and the lower-level `LlamaContext` (currently untested); see the sketch below
- The old bindings are still present in `PyLlama.cpp` but are currently not compiled and will be removed at a later date
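As a rough illustration of the higher-level interface, here is a hedged generation-loop sketch; the method names (`tokenize`, `update_input`, `ingest_all_pending_input`, `eval`, `sample`, `token_to_str`) mirror the structure of llama.cpp's `main` loop and are assumptions about these bindings, not a confirmed API.

```python
import llamacpp

params = llamacpp.InferenceParams()   # assumed parameter object
params.path_model = "./models/7B/ggml-model-q4_0.bin"
model = llamacpp.LlamaInference(params)

# Feed the prompt, then generate token by token.
model.update_input(model.tokenize("A llama is a", True))  # True: prepend BOS (assumed)
model.ingest_all_pending_input()

for _ in range(32):
    model.eval()                      # evaluate pending tokens
    token = model.sample()            # sample the next token id
    print(model.token_to_str(token), end="", flush=True)
```

`LlamaContext` would sit below this level, exposing the raw llama.cpp context for callers that want to manage evaluation and sampling themselves.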
v0.1.8
- Adds a "tokenizer" object for use with oobabooga/text-generation-webui
v0.1.7
- Switches from `poetry` to `scikit-build` as the build tool due to problems with cross-compiling on CI
- Adds CI builds for macOS arm64 wheels
- Adds Windows wheel files to PyPI, built on CI
v0.1.6
- Fixes Windows builds on CI (hopefully)
- Removes `torch` and `sentencepiece` as dependencies; they now have to be installed manually if you want to use `llamacpp-convert`
v0.1.5
Includes new `llamacpp-cli` and `llamacpp-chat` entrypoints. There is possibly still a bug that makes the performance of `llamacpp-chat` slightly worse than passing the same arguments directly to `llamacpp-cli`.