Release GoEx and Berkeley Function Calling Leaderboard Updates · ShishirPatil/gorilla

😍 v0.3 release 🚀

Highlights

⚡️ Released GoEx: a runtime that provides abstractions for safely executing LLM-generated code, APIs, actions, etc.

🏆 Updates to the Berkeley Function Calling Leaderboard (aka Berkeley Tool Calling Leaderboard): newer models including GPT-4o, Gemini Flash and Gemini 1.5 Pro, Hermes-2-Pro, etc. All models are now measured on P95 and P99 latency and on cost, in addition to accuracy.
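
For readers unfamiliar with tail-latency metrics, here is a minimal sketch of how P95 and P99 could be computed from a list of per-request latencies. It uses the nearest-rank method, which is an assumption and not necessarily what the BFCL pipeline does. The `percentile` helper and the sample latencies are hypothetical and for illustration only.

```python
import math


def percentile(values, pct):
    """Nearest-rank percentile (hypothetical helper, not BFCL code).

    Returns the smallest sample such that at least pct percent of
    all samples are less than or equal to it.
    """
    if not values:
        raise ValueError("values must be non-empty")
    if not 0 < pct <= 100:
        raise ValueError("pct must be in the range (0, 100]")
    ordered = sorted(values)
    # Nearest-rank: 1-based rank is ceil(pct/100 * n).
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[rank - 1]


# Made-up per-request latencies in seconds, for illustration only.
latencies = [0.8, 1.1, 0.9, 1.3, 0.7, 4.2, 1.0, 0.95, 1.2, 9.5]

p50 = percentile(latencies, 50)
p95 = percentile(latencies, 95)
p99 = percentile(latencies, 99)
print(f"P50={p50}s P95={p95}s P99={p99}s")
```

With only ten samples, P95 and P99 both land on the single slowest request, which is why tail percentiles are more informative over larger request counts.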

What's Changed

Fix Typos in Evaluation Script and System Prompt. Identify Errors in a Dataset by @zuxin666 in #335
BFCL April 8th Release by @HuanzhiMao in #330
Initial goex commit by @ShishirPatil in #336
BFCL April 9th Release (Dataset Bug Fix) by @HuanzhiMao in #338
BFCL April 10th Release (API Sanity Check) by @HuanzhiMao in #339
Add Support for NousResearch/Hermes-2-Pro-Mistral-7B Function Calling by @Fanjia-Yan in #327
Update raft.py with default p to match paper by @ShishirPatil in #353
GoEx Import Issues by @royh02 in #354
BFCL April 11th Patch. Add Latency Statistics. by @HuanzhiMao in #347
GoEx Gitignore User Credentials by @royh02 in #344
Fix Circular Import Issue for BFCL evaluation pipeline by @HuanzhiMao in #356
Added Docker to README by @Noppapon in #355
[Bug fix] Add Hermes-2-Pro-Mistral-7B model to UNDERSCORE_TO_DOT to parse API properly by @JasonZhu1313 in #364
Update requirements.txt by @viniciuslazzari in #343
Fix script argument by @ricklamers in #367
BFCL April 16th Release by @HuanzhiMao in #366
Log error messages from API validation by @eitanturok in #369
Update .gitignore by @eitanturok in #370
BFCL April 18th Release (Pipeline only) by @HuanzhiMao in #375
Add missing argument to OSSHandler's _format_prompt function by @eitanturok in #373
Add FC + Prompt for Cohere command-r-plus by @harry-cohere in #350
BFCL April 19th Release (Dataset & Pipeline) by @HuanzhiMao in #377
Azure OpenAI support in raft.py by @cedricvidal in #381
BFCL April 25th Release (New Models) by @HuanzhiMao in #386
Colored logging configuration + displaying progress in logs by @cedricvidal in #384
BFCL April 27th Release (Bug Fix in Cost/Latency Calculation) by @HuanzhiMao in #390
BFCL April 28th Release (New Model: snowflake/arctic) by @Fanjia-Yan in #397
RAFT Recovery Mode for interruptions by @kaiwen129 in #410
Small corrections to possible_answers for simple test category by @aastroza in #405
BFCL May 6th Release (Dataset Bug Fix) by @HuanzhiMao in #412
RAFT DevContainer for GitHub Codespaces by @cedricvidal in #379
RAFT Add support for configuring separate completion and embedding endpoints + pytest by @cedricvidal in #396
RAFT Fix arbitrary code execution vulnerability in checkpoint feature by @cedricvidal in #415
handle parallel function calls from gemini by @vandyxiaowei in #406
RAFT Support for chat and completion model formats by @cedricvidal in #417
[RAFT] Edit encode prompt to include <ANSWER>: tag in label by @kaiwen129 in #422
[BFCL] Patch Gemini Handler by @HuanzhiMao in #421
BFCL May 14th Release (GPT-4o and Gemini) by @Fanjia-Yan in #426
[BFCL] update tree_sitter version in requirements.txt by @justinwangx in #433
Fix indentation in leaderboard README by @polm-stability in #449
Fix breaking changes due to updated Anthropic SDK by @eitanturok in #452

New Contributors

@zuxin666 made their first contribution in #335
@JasonZhu1313 made their first contribution in #364
@ricklamers made their first contribution in #367
@eitanturok made their first contribution in #369
@harry-cohere made their first contribution in #350
@cedricvidal made their first contribution in #381
@aastroza made their first contribution in #405
@vandyxiaowei made their first contribution in #406
@justinwangx made their first contribution in #433
@polm-stability made their first contribution in #449

Full Changelog: v0.2...v0.3
