
2.3.28 Satellite: Promptfoo


Handle: promptfoo
URL: http://localhost:34233

Promptfoo example screenshot


promptfoo is a tool for testing, evaluating, and red-teaming LLM apps.

With promptfoo, you can:

  • Build reliable prompts, models, and RAGs with benchmarks specific to your use-case
  • Secure your apps with automated red teaming and pentesting
  • Speed up evaluations with caching, concurrency, and live reloading
  • Score outputs automatically by defining metrics
  • Use as a CLI, library, or in CI/CD
  • Use OpenAI, Anthropic, Azure, Google, HuggingFace, open-source models like Llama, or integrate custom API providers for any LLM API

Starting

# [Optional] Pre-pull the image
harbor pull promptfoo

You'll be running the Promptfoo CLI most of the time; it's available as:

# Full name
harbor promptfoo --help

# Alias
harbor pf --help

Whenever the CLI is called, it'll also automatically start the local Promptfoo backend.

# Run a CLI command
harbor pf --help

# Promptfoo backend started
harbor ps # harbor.promptfoo

The Promptfoo backend serves all recorded results in the web UI:

# Open the web UI
harbor open promptfoo
harbor promptfoo view
harbor pf o

Usage

Most of the time, your workflow will center around creating prompts and assets, writing an eval config, running it, and viewing the results.

Harbor runs the pf CLI from the directory where you invoke the Harbor CLI, so you can use it from any folder on your machine.

# Ensure a dedicated folder for the eval
cd /path/to/your/eval

# Init the eval (here)
harbor pf init

# Edit the configuration, prompts as needed
# Run the eval
harbor pf eval

# View the results
harbor pf view
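
For a sense of what the config can look like, here's a minimal illustrative sketch (the prompt, model, and assertion below are placeholders, not what pf init generates):

# Write a tiny eval config into the current folder
cat > promptfooconfig.yaml <<'EOF'
prompts:
  - "Answer in one sentence: {{question}}"

providers:
  - ollama:chat:llama3.1:8b

tests:
  - vars:
      question: "What is the capital of France?"
    assert:
      # Case-insensitive substring check scores the output automatically
      - type: icontains
        value: "paris"
EOF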

Note

If you're seeing file system permission errors, you'll need to ensure that files written from within the container are accessible to your user.
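
On Linux, a common fix is to reclaim ownership of the eval folder after the container has written to it (a generic sketch; the right remedy depends on your Docker setup):

# Take ownership of files created by the container
sudo chown -R "$(id -u):$(id -g)" /path/to/your/eval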

Configuration

Harbor pre-configures promptfoo to run against ollama out of the box (ollama must be running before pf eval). Any other provider can be configured via environment variables:

# For example, use vLLM API
harbor env promptfoo OPENAI_BASE_URL $(harbor url -i vllm)
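
This override applies to providers that go through promptfoo's OpenAI-compatible adapter. As a sketch, the matching entry in your config would look something like this (the model id is a placeholder for whatever your backend actually serves):

# In promptfooconfig.yaml, an OpenAI-compatible provider entry
# picks up the OPENAI_BASE_URL set above:
#
#   providers:
#     - openai:chat:meta-llama/Meta-Llama-3.1-8B-Instruct  # placeholder model id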

Promptfoo is a rich and extensive tool; we recommend reading through its excellent official documentation to get the most out of it.

Harbor comes with two (basic) built-in examples.

Promptfoo hello-world

# Navigate to eval folder
cd $(harbor home)/promptfoo/examples/hello-promptfoo

# Start ollama and pull the target model
harbor up ollama
harbor ollama pull llama3.1:8b

# Run the eval
harbor pf eval

# View the results
harbor pf view

Promptfoo temp-test

Promptfoo temp-test example screenshot

Evaluate a model across a range of temperatures to see if there's a sweet spot for a given prompt.
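
Conceptually, the eval lists the same model several times with different temperature settings so the outputs can be compared side by side. A minimal sketch of the idea, which you could drop into a fresh eval folder of your own (not the shipped example's exact config, and option names can vary by provider):

# Same model, three temperatures, compared in one eval
cat > promptfooconfig.yaml <<'EOF'
prompts:
  - "Write a one-line tagline for a coffee shop."

providers:
  - id: ollama:chat:llama3.1:8b
    label: temp-0.0
    config:
      temperature: 0.0
  - id: ollama:chat:llama3.1:8b
    label: temp-0.7
    config:
      temperature: 0.7
  - id: ollama:chat:llama3.1:8b
    label: temp-1.2
    config:
      temperature: 1.2
EOF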

# Navigate to eval folder
cd $(harbor home)/promptfoo/examples/temp-test

# Start ollama and pull the target model
harbor up ollama
harbor ollama pull llama3.1:8b

# Run the eval
harbor pf eval

# View the results
harbor pf view