v0.3: Claude 3, Sandboxed Python #238

ianarawjo · 2024-03-06T04:30:09Z

ianarawjo
Mar 6, 2024
Maintainer

Adds new Anthropic Claude 3 models.

Backend now uses the messages API for Claude 2.1+ models.
Adds the system message parameter in Claude settings.

Adds browser-sandboxed Python with Pyodide

You can now run Python in a safe sandbox entirely in the browser, provided you do not need to import third-party libraries.
The web-hosted version at chainforge.ai/play now has Python evaluators unlocked.

The local version of ChainForge includes a toggle to turn sandboxing on or off.

Turning sandboxing off restores the previous Python evaluator, which runs on your local machine through the Flask backend. In the non-sandboxed eval node you can import any library available in your Python environment.
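
As a rough sketch, here is the kind of evaluator that should work with sandboxing on, since it only touches the standard library. The evaluate(response) signature and the response.text attribute are assumptions about the evaluator interface, and SimpleNamespace is only a stand-in so the snippet runs on its own outside ChainForge.

```python
import json
import re
from types import SimpleNamespace


def evaluate(response):
    """Return True if the response text contains a parseable JSON object."""
    # Assumption: the LLM output is exposed as `response.text`.
    match = re.search(r"\{.*\}", response.text, re.DOTALL)
    if match is None:
        return False
    try:
        json.loads(match.group(0))
        return True
    except json.JSONDecodeError:
        return False


# Stand-in objects for trying the function outside ChainForge (hypothetical).
with_json = SimpleNamespace(text='Here you go: {"answer": 42}')
without_json = SimpleNamespace(text="Sorry, I cannot help with that.")
```

An evaluator that imports a third-party package instead would need sandboxing turned off, so that it runs in your local Python environment.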

Why sandboxing?

The benefit of sandboxing is that ChainForge can now safely execute Python code generated by LLMs, using eval() or exec() in your evaluation function. This was possible before, but unsafe, because the generated code ran directly on your local machine. Benchmarks that do not rely on third-party libraries, such as HumanEval scored by pass@1, could be run within ChainForge entirely in the web browser (if anyone wants to set this up, let me know!).
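
To make that concrete, here is a minimal sketch of an exec()-based evaluator in the style of a HumanEval check. It is not a tested ChainForge setup: the evaluate(response) signature, the response.text and response.var attributes, the "test" and "entry_point" variable names, and the FakeResponse class are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class FakeResponse:
    """Hypothetical stand-in for the object an evaluator receives."""
    text: str                                 # code generated by the LLM
    var: dict = field(default_factory=dict)   # per-problem variables (assumed)


def evaluate(response):
    """Return True if the generated code passes its test, else False."""
    namespace = {}
    try:
        exec(response.text, namespace)           # defines the generated function
        exec(response.var["test"], namespace)    # defines check(candidate)
        candidate = namespace[response.var["entry_point"]]
        namespace["check"](candidate)            # raises AssertionError on failure
        return True
    except Exception:
        return False


def pass_at_1(responses):
    """Fraction of problems solved, with one generated sample per problem."""
    results = [evaluate(r) for r in responses]
    return sum(results) / len(results)


# Two toy samples for the same toy problem: one correct, one buggy.
problem = {
    "entry_point": "add",
    "test": "def check(candidate):\n    assert candidate(1, 2) == 3\n",
}
good = FakeResponse(text="def add(a, b):\n    return a + b\n", var=problem)
bad = FakeResponse(text="def add(a, b):\n    return a - b\n", var=problem)
```

Run this kind of evaluator only with sandboxing on. With sandboxing off, the exec() calls run the generated code on your own machine.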

This discussion was created from the release v0.3: Claude 3, Sandboxed Python.
