Releases · simonw/llm

12 Sep 23:20

simonw

0.16

d654c95

0.16 Latest

OpenAI models now use the internal self.get_key() mechanism, which means they can be used from Python code in a way that will pick up keys that have been configured using llm keys set or the OPENAI_API_KEY environment variable. #552. This code now works correctly:
```
import llm
print(llm.get_model("gpt-4o-mini").prompt("hi"))
```
New documented API methods: llm.get_default_model(), llm.set_default_model(alias), llm.get_default_embedding_model(), llm.set_default_embedding_model(alias). #553
Support for OpenAI's new o1 family of preview models, llm -m o1-preview "prompt" and llm -m o1-mini "prompt". These models are currently only available to tier 5 OpenAI API users, though this may change in the future. #570
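The key lookup described in the first item of this release can be pictured with a small sketch. This is not LLM's actual get_key() implementation: resolve_key, its parameters and the precedence shown (explicit key, then keys stored with llm keys set, then the environment variable) are assumptions made for illustration only.

```python
import os


def resolve_key(explicit_key, stored_keys, key_alias, env_var, environ=None):
    """Hypothetical helper: return the first API key found, or None.

    Assumed precedence (not confirmed by the release notes):
    explicit key, then stored keys, then the environment variable.
    """
    environ = os.environ if environ is None else environ
    if explicit_key:
        # An explicit value may itself name an entry in the stored keys.
        return stored_keys.get(explicit_key, explicit_key)
    if key_alias in stored_keys:
        return stored_keys[key_alias]
    return environ.get(env_var)


# Stand-in data: a dict plays the role of keys.json, another the environment.
stored = {"openai": "sk-stored"}
env = {"OPENAI_API_KEY": "sk-env"}
print(resolve_key(None, stored, "openai", "OPENAI_API_KEY", env))
print(resolve_key(None, {}, "openai", "OPENAI_API_KEY", env))
```

Passing the environment in as a mapping keeps the sketch easy to exercise without touching real credentials.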


18 Jul 19:33

simonw

0.15

d075336

0.15

Support for OpenAI's new GPT-4o mini model: llm -m gpt-4o-mini 'rave about pelicans in French' #536
gpt-4o-mini is now the default model if you do not specify your own default, replacing GPT-3.5 Turbo. GPT-4o mini is both cheaper and better than GPT-3.5 Turbo.
Fixed a bug where llm logs -q 'flourish' -m haiku could not combine both the -q search query and the -m model specifier. #515


13 May 20:40

simonw

0.14

9a3236d

0.14

Support for OpenAI's new GPT-4o model: llm -m gpt-4o 'say hi in Spanish' #490
The gpt-4-turbo alias is now a model ID, which indicates the latest version of OpenAI's GPT-4 Turbo text and image model. Your existing logs.db database may contain records under the previous model ID of gpt-4-turbo-preview. #493
New llm logs -r/--response option for outputting just the last captured response, without wrapping it in Markdown and accompanying it with the prompt. #431
Nine new plugins since version 0.13:
- llm-claude-3 supporting Anthropic's Claude 3 family of models.
- llm-command-r supporting Cohere's Command R and Command R Plus API models.
- llm-reka supports the Reka family of models via their API.
- llm-perplexity by Alexandru Geana supporting the Perplexity Labs API models, including llama-3-sonar-large-32k-online which can search for things online and llama-3-70b-instruct.
- llm-groq by Moritz Angermann providing access to fast models hosted by Groq.
- llm-fireworks supporting models hosted by Fireworks AI.
- llm-together adds support for Together AI's extensive family of hosted openly licensed models.
- llm-embed-onnx provides seven embedding models that can be executed using the ONNX model framework.
- llm-cmd accepts a prompt for a shell command, runs that prompt and populates the result in your shell so you can review it, edit it and then hit <enter> to execute or ctrl+c to cancel, see this post for details.


27 Jan 00:28

simonw

0.13.1

8021e12

0.13.1

Fix for No module named 'readline' error on Windows. #407


26 Jan 22:34

simonw

0.13

8e0aff6

0.13

Added support for new OpenAI embedding models: 3-small and 3-large and three variants of those with different dimension sizes, 3-small-512, 3-large-256 and 3-large-1024. See OpenAI embedding models for details. #394
The default gpt-4-turbo model alias now points to gpt-4-turbo-preview, which uses the most recent OpenAI GPT-4 turbo model (currently gpt-4-0125-preview). #396
New OpenAI model aliases gpt-4-1106-preview and gpt-4-0125-preview.
OpenAI models now support a -o json_object 1 option which will cause their output to be returned as a valid JSON object. #373
New plugins since the last release include llm-mistral, llm-gemini, llm-ollama and llm-bedrock-meta.
The keys.json file for storing API keys is now created with 600 file permissions. #351
Documented a pattern for installing plugins that depend on PyTorch using the Homebrew version of LLM, despite Homebrew using Python 3.12 when PyTorch has not yet released a stable package for that Python version. #397
Underlying OpenAI Python library has been upgraded to >1.0. It is possible this could cause compatibility issues with LLM plugins that also depend on that library. #325
Arrow keys now work inside the llm chat command. #376
LLM_OPENAI_SHOW_RESPONSES=1 environment variable now outputs much more detailed information about the HTTP request and response made to OpenAI (and OpenAI-compatible) APIs. #404
Dropped support for Python 3.7.
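The 600 file permissions now applied to keys.json mean that only the file's owner can read or write it. A minimal sketch of creating a file that way on a POSIX system follows; create_keys_file and the temporary directory are hypothetical, and this is not LLM's own code.

```python
import json
import os
import stat
import tempfile


def create_keys_file(path):
    """Create an empty JSON key store that only its owner can read or write."""
    # O_EXCL makes creation fail rather than reuse an existing file.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o600)
    with os.fdopen(fd, "w") as f:
        json.dump({}, f)
    # Explicit chmod so the result does not depend on the process umask.
    os.chmod(path, 0o600)
    return stat.S_IMODE(os.stat(path).st_mode)


# Demonstration in a throwaway directory (POSIX permission semantics assumed).
demo_dir = tempfile.mkdtemp()
mode = create_keys_file(os.path.join(demo_dir, "keys.json"))
print(oct(mode))
```

On Windows the permission bits behave differently, so the reported mode may not match.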


06 Nov 21:33

simonw

0.12

e9a6998

0.12

Support for the new GPT-4 Turbo model from OpenAI. Try it using llm chat -m gpt-4-turbo or llm chat -m 4t. #323
New -o seed 1 option for OpenAI models, which sets a seed in an attempt to make the model evaluate the prompt deterministically. #324


06 Nov 20:08

simonw

0.11.2

10c6cc2

0.11.2

Pin to version of OpenAI Python library prior to 1.0 to avoid breaking. #327


01 Nov 04:31

simonw

0.11.1

ff34fb2

0.11.1

Fixed a bug where llm embed -c "text" did not correctly pick up the configured default embedding model. #317
New plugins: llm-python, llm-bedrock-anthropic and llm-embed-jina (described in Execute Jina embeddings with a CLI using llm-embed-jina).
llm-gpt4all now uses the new GGUF model format. simonw/llm-gpt4all#16


19 Sep 06:35

simonw

0.11

bf22994

0.11

LLM now supports the new OpenAI gpt-3.5-turbo-instruct model, and OpenAI completion (as opposed to chat completion) models in general. #284

llm -m gpt-3.5-turbo-instruct 'Reasons to tame a wild beaver:'

OpenAI completion models like this support a -o logprobs 3 option, which accepts a number between 1 and 5 and will include the log probabilities (for each produced token, what were the top 3 options considered by the model) in the logged response.

llm -m gpt-3.5-turbo-instruct 'Say hello succinctly' -o logprobs 3

You can then view the logprobs that were recorded in the SQLite logs database like this:

sqlite-utils "$(llm logs path)" \
  'select * from responses order by id desc limit 1' | \
  jq '.[0].response_json' -r | jq

Truncated output looks like this:

  [
    {
      "text": "Hi",
      "top_logprobs": [
        {
          "Hi": -0.13706253,
          "Hello": -2.3714375,
          "Hey": -3.3714373
        }
      ]
    },
    {
      "text": " there",
      "top_logprobs": [
        {
          " there": -0.96057636,
          "!\"": -0.5855763,
          ".\"": -3.2574513
        }
      ]
    }
  ]
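Log probabilities are natural logarithms, so exponentiating them gives ordinary probabilities that are easier to read. The sketch below applies that to the truncated output shown above; to_probabilities is a hypothetical helper name, not part of LLM.

```python
import math

# The truncated response_json from the example above.
logged = [
    {
        "text": "Hi",
        "top_logprobs": [
            {"Hi": -0.13706253, "Hello": -2.3714375, "Hey": -3.3714373}
        ],
    },
    {
        "text": " there",
        "top_logprobs": [
            {" there": -0.96057636, "!\"": -0.5855763, ".\"": -3.2574513}
        ],
    },
]


def to_probabilities(top_logprobs):
    """Convert a {token: logprob} mapping into {token: probability}."""
    return {token: math.exp(logprob) for token, logprob in top_logprobs.items()}


for entry in logged:
    probabilities = to_probabilities(entry["top_logprobs"][0])
    # List the candidates for this position, most likely first.
    ranked = sorted(probabilities.items(), key=lambda item: item[1], reverse=True)
    for token, probability in ranked:
        print(f"{entry['text']!r}: {token!r} {probability:.3f}")
```

Note that in the second position the token that was actually produced, " there", is not the one with the highest logged probability.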

Also in this release:

The llm.user_dir() function, used by plugins, now ensures the directory exists before returning it. #275
New LLM_OPENAI_SHOW_RESPONSES=1 environment variable for displaying the full HTTP response returned by OpenAI compatible APIs. #286
The llm embed-multi command now has a --batch-size X option for setting the batch size to use when processing embeddings - useful if you have limited memory available. #273
The collection.embed_multi() method also now accepts an optional batch_size=int argument.
Fixed two bugs with llm embed-multi --files relating to handling of directories. Thanks, ealvar3z. #274, #280
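The idea behind a batch size is to process items in fixed-size groups so that only one group is held in memory at a time. The following is a generic sketch of that pattern, not the code used by llm embed-multi or collection.embed_multi(); the batches function is a hypothetical name.

```python
from itertools import islice


def batches(items, batch_size):
    """Yield successive lists of at most batch_size items from any iterable."""
    iterator = iter(items)
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch


# Five items in batches of two: the final batch is shorter.
print(list(batches(range(5), 2)))
```

Because the input is consumed lazily, this works for generators as well as lists.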


12 Sep 19:13

simonw

0.10

e83d205

0.10

The two major features in this release are the llm chat command and support for embedding binary data.

llm chat

The new llm chat command starts an ongoing chat conversation with a model in your terminal. It works with all models supported by LLM and its plugins, including locally installed models such as Llama 2. #231

This offers a big performance boost for local models, since they don't need to be freshly loaded into memory for each prompt.

Here's an example chat with Llama 2 13B, provided by the llm-mlc plugin.

llm chat -m mlc-chat-Llama-2-13b-chat-hf-q4f16_1

This starts a chat session:

Type 'exit' or 'quit' to exit
Type '!multi' to enter multiple lines, then '!end' to finish
> Who are you?
Hello! I'm just an AI, here to assist you with any questions you may have.
My name is LLaMA, and I'm a large language model trained to provide helpful
and informative responses to a wide range of questions and topics. I'm here
to help you find the information you need, while ensuring a safe and
respectful conversation. Please feel free to ask me anything, and I'll do my
best to provide a helpful and accurate response.
> Tell me a joke about otters
Sure, here's a joke about otters:

Why did the otter go to the party?

Because he heard it was a "whale" of a time!

(Get it? Whale, like a big sea mammal, but also a "wild" or "fun" time.
Otters are known for their playful and social nature, so it's a lighthearted
and silly joke.)

I hope that brought a smile to your face! Do you have any other questions or
topics you'd like to discuss?
> exit

Chat sessions are logged to SQLite - use llm logs to view them. They can accept system prompts, templates and model options - consult the chat documentation for details.
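The input rules printed at the start of the session ('exit' or 'quit' to leave, '!multi' and '!end' around a multi-line prompt) can be modelled as a small loop. This is a simplified sketch of that behaviour, not the implementation inside llm chat; read_prompts is a hypothetical function and it reads from a list instead of the terminal.

```python
def read_prompts(lines):
    """Yield complete prompts from an iterable of typed lines."""
    buffer = None  # None means we are in single-line mode
    for line in lines:
        if buffer is not None:
            # Multi-line mode: collect lines until '!end'.
            if line.strip() == "!end":
                yield "\n".join(buffer)
                buffer = None
            else:
                buffer.append(line)
        elif line.strip() in ("exit", "quit"):
            return
        elif line.strip() == "!multi":
            buffer = []
        else:
            yield line


typed = [
    "Who are you?",
    "!multi",
    "Tell me a joke",
    "about otters",
    "!end",
    "exit",
    "never reached",
]
print(list(read_prompts(typed)))
```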

Binary embedding support

LLM's embeddings feature has been expanded to provide support for embedding binary data, in addition to text. #254

This enables models like CLIP, supported by the new llm-clip plugin.

CLIP is a multi-modal embedding model which can embed images and text into the same vector space. This means you can use it to create an embedding index of photos, and then search for the embedding vector for "a happy dog" and get back images that are semantically closest to that string.

To create embeddings for every JPEG in a directory stored in a photos collection, run:

llm install llm-clip
llm embed-multi photos --files photos/ '*.jpg' --binary -m clip

Now you can search for photos of raccoons using:

llm similar photos -c 'raccoon'

This spits out a list of images, ranked by how similar they are to the string "raccoon":

{"id": "IMG_4801.jpeg", "score": 0.28125139257127457, "content": null, "metadata": null}
{"id": "IMG_4656.jpeg", "score": 0.26626441704164294, "content": null, "metadata": null}
{"id": "IMG_2944.jpeg", "score": 0.2647445926996852, "content": null, "metadata": null}
...
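A ranking like the one above can be reproduced in miniature by scoring each stored vector against the query vector and sorting by score. The sketch below assumes cosine similarity as the score and uses made-up two-dimensional vectors and file names; real CLIP embeddings have many more dimensions.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def rank(query, stored):
    """Return [{'id': ..., 'score': ...}] sorted from most to least similar."""
    scored = [
        {"id": item_id, "score": cosine_similarity(query, vector)}
        for item_id, vector in stored.items()
    ]
    return sorted(scored, key=lambda row: row["score"], reverse=True)


# Hypothetical embeddings standing in for three photos and one text query.
stored = {"a.jpeg": [1.0, 0.0], "b.jpeg": [0.6, 0.8], "c.jpeg": [0.0, 1.0]}
results = rank([1.0, 0.0], stored)
for row in results:
    print(row)
```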

Also in this release

The LLM_LOAD_PLUGINS environment variable can be used to control which plugins are loaded when llm starts running. #256
The llm plugins --all option includes builtin plugins in the list of plugins. #259
The llm embed-db family of commands has been renamed to llm collections. #229
llm embed-multi --files now has an --encoding option and defaults to falling back to latin-1 if a file cannot be processed as utf-8. #225
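The utf-8 to latin-1 fallback mentioned in the last item follows a common decoding pattern: try the strict encoding first, then fall back to one that accepts any byte sequence. A minimal sketch, assuming a hypothetical decode_with_fallback helper and not taken from LLM's source:

```python
def decode_with_fallback(data, encodings=("utf-8", "latin-1")):
    """Decode bytes using the first encoding in the list that succeeds."""
    for encoding in encodings:
        try:
            return data.decode(encoding)
        except UnicodeDecodeError:
            continue
    raise ValueError("none of the encodings could decode the data")


# Valid UTF-8 is decoded as UTF-8.
print(decode_with_fallback("café".encode("utf-8")))
# A lone 0xE9 byte is invalid UTF-8, so latin-1 is used instead.
print(decode_with_fallback(b"caf\xe9"))
```

latin-1 maps every byte to a character, so the fallback always yields a string, although the text may be wrong if the file was really in some other encoding.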

