Releases · thomasantony/llamacpp-python
v0.1.14
v0.1.13
- Adds support for "infinite text generation" using context swapping (similar to the `main` example in llama.cpp; see the sketch below)
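For illustration, here is a minimal pure-Python sketch of the context-swapping scheme as llama.cpp's `main` example implements it; the function and parameter names (`swap_context`, `n_ctx`, `n_keep`) are hypothetical, and this is not the package's actual internal code.

```python
def swap_context(tokens, n_ctx, n_keep):
    """Shrink the token history once the context window is full.

    Keeps the first n_keep prompt tokens, drops the oldest half of the
    remainder, and re-uses the most recent half, so generation can
    continue indefinitely within a fixed context size.
    """
    if len(tokens) < n_ctx:
        return tokens  # window not full yet; nothing to swap
    n_left = len(tokens) - n_keep
    recent = tokens[len(tokens) - n_left // 2:]  # most recent half
    return tokens[:n_keep] + recent
```

Keeping the prompt prefix means the model never loses its original instructions, at the cost of re-evaluating the retained recent tokens after each swap.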
v0.1.12
- Makes unit tests more consistent and usable (they still do not run in CI workflows because the model weights are too large)
- Updates llama.cpp submodule
v0.1.11
- Breaking change, but makes model loading practically instantaneous thanks to memory-mapped I/O
- Requires re-generating the weight files with the new convert script (or using the migration script from llama.cpp)
v0.1.10
- Adds back `get_tokenizer()` and `add_bos()`, which were broken in the previous release (see the sketch below)
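A hedged sketch of how the restored helpers might be used; the parameter object, model path, and every method name below are assumptions for illustration, not the package's documented API.

```python
import llamacpp

params = llamacpp.InferenceParams()                    # assumed parameter object
params.path_model = "./models/7B/ggml-model-q4_0.bin"  # assumed path attribute
model = llamacpp.LlamaInference(params)                # high-level interface (see v0.1.9)

tokenizer = model.get_tokenizer()                # restored in this release
tokens = tokenizer.tokenize("A llama is a")      # assumed tokenizer method
model.add_bos()                                  # restored; exact receiver (model vs. tokenizer) is assumed
```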
v0.1.9
- Updates the bindings to work with the new llama.cpp API from ggerganov/llama.cpp#370
- Adds two separate interfaces: `LlamaInference`, which is similar to the bindings in v0.1.8, and the lower-level `LlamaContext` (currently untested); see the sketch below
- The old bindings are still present in `PyLlama.cpp` but are currently not compiled and will be removed at a later date
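As a rough illustration of the higher-level interface, here is a hedged generation-loop sketch; the method names (`tokenize`, `update_input`, `ingest_all_pending_input`, `eval`, `sample`, `token_to_str`) mirror the structure of llama.cpp's `main` loop and are assumptions about these bindings, not a confirmed API.

```python
import llamacpp

params = llamacpp.InferenceParams()   # assumed parameter object
params.path_model = "./models/7B/ggml-model-q4_0.bin"
model = llamacpp.LlamaInference(params)

# Feed the prompt, then generate token by token.
model.update_input(model.tokenize("A llama is a", True))  # True: prepend BOS (assumed)
model.ingest_all_pending_input()

for _ in range(32):
    model.eval()                      # evaluate pending tokens
    token = model.sample()            # sample the next token id
    print(model.token_to_str(token), end="", flush=True)
```

`LlamaContext` would sit below this level, exposing the raw llama.cpp context for callers that want to manage evaluation and sampling themselves.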
v0.1.8
- Adds a "tokenizer" object for use with oobabooga/text-generation-webui
v0.1.7
- Switches from `poetry` to `scikit-build` as the build tool due to problems with cross-compiling on CI
- Adds CI builds for macOS arm64 wheels
- Adds Windows wheel files to PyPI, built on CI
v0.1.6
- Fixes Windows builds on CI (hopefully)
- Removes `torch` and `sentencepiece` as dependencies; they now have to be installed manually if you want to use `llamacpp-convert`
v0.1.5
Includes new `llamacpp-cli` and `llamacpp-chat` entrypoints. There is possibly still a bug that makes the performance of `llamacpp-chat` slightly worse than passing the same arguments directly to `llamacpp-cli`.