
add a basic bfcl command-line interface #621

Merged: 22 commits into ShishirPatil:main on Oct 17, 2024
Conversation

@mattf (Contributor) commented Sep 4, 2024

add a simple CLI wrapping openfunctions_evaluation.py (`bfcl run`) and eval_runner.py (`bfcl evaluate`).

```
➜ bfcl
                                                                                                                             
 Usage: bfcl [OPTIONS] COMMAND [ARGS]...                                                                                     
                                                                                                                             
╭─ Options ─────────────────────────────────────────────────────────────────────────────────────────────────────────────────╮
│ --install-completion            Install completion for the current shell.                                                 │
│ --show-completion               Show completion for the current shell, to copy it or customize the installation.          │
│ --help                -h        Show this message and exit.                                                               │
╰───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
╭─ Commands ────────────────────────────────────────────────────────────────────────────────────────────────────────────────╮
│ models            List available models.                                                                                  │
│ test-categories   List available test categories.                                                                         │
│ run               Run one or more models on a test-category (same as openfunctions_evaluation).                           │
│ results           List the results available for evaluation.                                                              │
│ evaluate          Evaluate results from run of one or more models on a test-category (same as eval_runner).               │
│ scores            Display the leaderboard.                                                                                │
╰───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯

➜ bfcl run -h
                                                                                                                    
 Usage: bfcl run [OPTIONS]                                                                                          
                                                                                                                    
 Run one or more models on a test-category (same as openfunctions_evaluation).                                      
                                                                                                                    
╭─ Options ────────────────────────────────────────────────────────────────────────────────────────────────────────╮
│ --model                           TEXT     A list of model names to evaluate.                                    │
│                                            [default: gorilla-openfunctions-v2]                                   │
│ --test-category                   TEXT     A list of test categories to run the evaluation on. [default: all]    │
│ --api-sanity-check        -c               Perform the REST API status sanity check before running the           │
│                                            evaluation.                                                           │
│ --temperature                     FLOAT    The temperature parameter for the model. [default: 0.001]             │
│ --top-p                           FLOAT    The top-p parameter for the model. [default: 1.0]                     │
│ --max-tokens                      INTEGER  The maximum number of tokens for the model. [default: 1200]           │
│ --num-gpus                        INTEGER  The number of GPUs to use. [default: 1]                               │
│ --timeout                         INTEGER  The timeout for the model in seconds. [default: 60]                   │
│ --num-threads                     INTEGER  The number of threads to use. [default: 1]                            │
│ --gpu-memory-utilization          FLOAT    The GPU memory utilization. [default: 0.9]                            │
│ --help                    -h               Show this message and exit.                                           │
╰──────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯


➜ bfcl evaluate -h
                                                                                                                    
 Usage: bfcl evaluate [OPTIONS]                                                                                     
                                                                                                                    
 Evaluate results from run of one or more models on a test-category (same as eval_runner).                          
                                                                                                                    
╭─ Options ────────────────────────────────────────────────────────────────────────────────────────────────────────╮
│ *  --model                     TEXT  A list of model names to evaluate. [default: None] [required]               │
│ *  --test-category             TEXT  A list of test categories to run the evaluation on. [default: None]         │
│                                      [required]                                                                  │
│    --api-sanity-check  -c            Perform the REST API status sanity check before running the evaluation.     │
│    --help              -h            Show this message and exit.                                                 │
╰──────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
```
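The `--install-completion` / `--show-completion` flags in the help output are the stock options Typer adds to every app, so the wrapper is presumably a Typer app. As a rough, hypothetical sketch of how such a command tree could be wired (the option names and defaults come from the help text above; the function bodies and delegation points are placeholders, not the PR's actual code):

```python
from typing import List

import typer

# -h as an alias for --help, matching the help output above
app = typer.Typer(context_settings={"help_option_names": ["-h", "--help"]})


@app.command()
def run(
    model: List[str] = typer.Option(
        ["gorilla-openfunctions-v2"], help="A list of model names to evaluate."
    ),
    test_category: List[str] = typer.Option(
        ["all"], help="A list of test categories to run the evaluation on."
    ),
    temperature: float = typer.Option(0.001, help="The temperature parameter for the model."),
) -> None:
    """Run one or more models on a test-category (same as openfunctions_evaluation)."""
    ...  # placeholder: delegate to the existing openfunctions_evaluation logic


@app.command()
def evaluate(
    model: List[str] = typer.Option(..., help="A list of model names to evaluate."),
    test_category: List[str] = typer.Option(
        ..., help="A list of test categories to run the evaluation on."
    ),
) -> None:
    """Evaluate results from a run of one or more models (same as eval_runner)."""
    ...  # placeholder: delegate to the existing eval_runner logic


if __name__ == "__main__":
    app()
```

Declaring the options as `List[str]` lets a flag repeat, e.g. `bfcl run --model A --model B`, which matches the "A list of model names" wording in the help text; an `...` default (as in `evaluate`) is how Typer marks an option required, matching the `*` markers above.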

@HuanzhiMao (Collaborator)

Hi @mattf,

Thank you so much for your PR and welcome! I really appreciate your contribution – this feature has been on our TODO list for a while, and it’s great to see it implemented.

I noticed a few TODOs left in the code. I'll take care of finishing those up and handle any merge conflicts. After that, we’ll be ready to move forward!

@mattf (Contributor, author) commented Sep 23, 2024

@HuanzhiMao I'm glad you like it. I have a few more commands I'll push up.

@HuanzhiMao (Collaborator)

> @HuanzhiMao I'm glad you like it. I have a few more commands I'll push up.

Perfect.

@mattf (Contributor, author) commented Sep 23, 2024

My plan was to put a simple CLI around the runner / evaluator / model-definition code, then propose refactoring changes that make the CLI simpler.

I've found the CLI helpful for my own runs, which means it has only had one user so far.

@HuanzhiMao
Copy link
Collaborator

I agree. CLI entry points will be easier than cd-ing into different directories and running each script via `python xxx`.
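To illustrate the point, the before/after workflow looks roughly like this (the directory name and exact invocations are assumptions for illustration, not commands quoted from this thread):

```
# before: cd into the script's directory and run it in place
cd berkeley-function-call-leaderboard
python openfunctions_evaluation.py --model gorilla-openfunctions-v2

# after: a single entry point, runnable from anywhere
bfcl run --model gorilla-openfunctions-v2
```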

HuanzhiMao added a commit that referenced this pull request Oct 9, 2024
…_credential_config.py (#675)

This PR addresses the issue of hard-coded relative file paths in BFCL,
which previously made it impossible to run the script from different
entry locations/directories. With this update, the script can now be
executed from any directory, unblocking #621.

Additionally, this PR automates the
`apply_function_credential_config.py` step, removing the need for users
to manually trigger the script to apply the credential files.


Part of the effort to merge #510.

---------

Co-authored-by: Devansh Amin <[email protected]>
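For readers unfamiliar with the underlying problem: a hard-coded relative path only resolves correctly when the process starts in the expected directory. A common fix, sketched here as a hypothetical illustration (the file names are made up), anchors paths to the source file instead:

```python
from pathlib import Path

# Brittle: resolves against the current working directory, so it breaks
# when the script is launched from anywhere else.
data_file = Path("data/questions.json")

# Robust: anchored to this source file's location, so the script can be
# executed from any directory.
SCRIPT_DIR = Path(__file__).resolve().parent
data_file = SCRIPT_DIR / "data" / "questions.json"
```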
@HuanzhiMao added the `BFCL-General` (General BFCL Issue) label on Oct 9, 2024
@HuanzhiMao (Collaborator)

@mattf, I have resolved all the TODOs in the code and polished it a bit. Anything else you would like to add before we merge this PR?

P.S. I changed `bfcl run` to `bfcl generate` for a more intuitive name.
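If the CLI is a Typer app, as assumed in the sketch above, a rename like this is a one-line change at the command registration, e.g.:

```python
import typer

app = typer.Typer()

# Hypothetical sketch: only the registered command name changes, so
# `bfcl generate` replaces `bfcl run` without touching the handler itself.
@app.command(name="generate")
def generate() -> None:
    """Run one or more models on a test-category (formerly `bfcl run`)."""
    ...
```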

@CharlieJCJ (Collaborator) left a comment:

Tested, and suggested fixes in `bfcl generate` for a consistent `.env` path. Everything works in my local testing.

The CLI looks super clean! Love it @mattf, thanks for the PR, and thanks @HuanzhiMao for the code changes.

@CharlieJCJ (Collaborator) left a comment:

Re-tested with the same commands. LGTM.

@ShishirPatil merged commit 0a33e97 into ShishirPatil:main on Oct 17, 2024
VishnuSuresh27 pushed a commit to VishnuSuresh27/gorilla that referenced this pull request Nov 11, 2024
…_credential_config.py (ShishirPatil#675)
VishnuSuresh27 pushed a commit to VishnuSuresh27/gorilla that referenced this pull request Nov 11, 2024
add a simple cli wrapping openfunctions_evaluation.py (`bfcl run`) and eval_runner.py (`bfcl evaluate`).

Co-authored-by: Huanzhi (Hans) Mao <[email protected]>