153 changes: 153 additions & 0 deletions
colabs/lm-eval-harness/use_weights_and_biases_with_lm_eval_harness.ipynb
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,153 @@ | ||
{ | ||
"cells": [ | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": { | ||
"id": "d6YuliF8qfsy" | ||
}, | ||
"source": [ | ||
"<img src=\"https://wandb.me/logo-im-png\" width=\"400\" alt=\"Weights & Biases\" />\n", | ||
"<!--- @wandbcode{lm-eval-harness} -->\n", | ||
"\n", | ||
"# Visualizing Results in Weights and Biases\n", | ||
"\n", | ||
"<!--- @wandbcode{lm-eval-harness-colab} -->\n", | ||
"\n", | ||
"With the Weights and Biases integration, you can now spend more time extracting deeper insights into your evaluation results. The integration is designed to streamline the process of logging and visualizing experiment results using the Weights & Biases (W&B) platform.\n", | ||
"\n", | ||
"The integration provide functionalities\n", | ||
"\n", | ||
"- to automatically log the evaluation results,\n", | ||
"- log the samples as W&B Tables for easy visualization,\n", | ||
"- log the `results.json` file as an artifact for version control,\n", | ||
"- log the `<task_name>_eval_samples.json` file if the samples are logged,\n", | ||
"- generate a comprehensive report for analysis and visualization with all the important metric,\n", | ||
"- log task and cli configs,\n", | ||
"- and more out of the box like the command used to run the evaluation, GPU/CPU counts, timestamp, etc.\n", | ||
"\n", | ||
"The integration is super easy to use with the eval harness. Let's see how!" | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"metadata": { | ||
"id": "tJRBdlTrqgVg" | ||
}, | ||
"outputs": [], | ||
"source": [ | ||
"# Install this project if you did not already have it.\n", | ||
"# This is all that is needed to be installed to start using Weights and Biases\n", | ||
"\n", | ||
"!git clone -b wandb-logging https://github.com/ayulockin/lm-evaluation-harness\n", | ||
"%cd lm-evaluation-harness\n", | ||
"!pip -qq install -e .[wandb]" | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"metadata": { | ||
"id": "Eb4eGgJIxCki" | ||
}, | ||
"outputs": [], | ||
"source": [ | ||
"# Getting an error `no module named transformers.cache_utils` while running eval.\n", | ||
"# Installing transformers from the repo is solving the issue.\n", | ||
"# Ref: https://huggingface.co/DiscoResearch/mixtral-7b-8expert/discussions/9#6576edcd0370e52e3b2c0620\n", | ||
"!pip uninstall -y transformers\n", | ||
"!pip install git+https://github.com/huggingface/transformers" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": { | ||
"id": "5c-t1T2vrxQw" | ||
}, | ||
"source": [ | ||
"# Run the Eval Harness\n", | ||
"\n", | ||
"Run the eval harness as usual with a `wandb_args` flag. This flag is used to provide arguments for initializing a wandb run ([wandb.init](https://docs.wandb.ai/ref/python/init)) as comma separated string arguments.\n", | ||
"\n", | ||
"If `wandb_args` flag is used, the metrics and all other goodness will be automatically logged to Weights and Biases. In the stdout, you will find the link to the W&B run page as well as link to the generated report." | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": { | ||
"id": "YkeORhFXry8o" | ||
}, | ||
"source": [ | ||
"## Set your API Key\n", | ||
"\n", | ||
"Before you can use W&B, you need to authenticate your machine with an authentication key. Visit https://wandb.ai/authorize to get one." | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"metadata": { | ||
"id": "E7YTd5OUrpnO" | ||
}, | ||
"outputs": [], | ||
"source": [ | ||
"import wandb\n", | ||
"wandb.login()" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": { | ||
"id": "_L5yCLtGt5Tu" | ||
}, | ||
"source": [ | ||
 Note that if you">
"> Note that if you are using the command line, you can simply authenticate your machine by running `wandb login` in your terminal. For more info, check out the [documentation](https://docs.wandb.ai/quickstart#2-log-in-to-wb)." | |
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": { | ||
"id": "R4wGEB7ot7OZ" | ||
}, | ||
"source": [ | ||
"## Run and log to W&B" | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"metadata": { | ||
"id": "fkMGw712sg8f" | ||
}, | ||
"outputs": [], | ||
"source": [ | ||
"!lm_eval \\\n", | ||
" --model hf \\\n", | ||
" --model_args pretrained=microsoft/phi-2,trust_remote_code=True \\\n", | ||
" --tasks hellaswag,mmlu_abstract_algebra \\\n", | ||
" --device cuda:0 \\\n", | ||
" --batch_size 8 \\\n", | ||
" --output_path output/phi-2 \\\n", | ||
" --limit 10 \\\n", | ||
" --wandb_args project=lm-eval-harness-integration \\\n", | ||
" --log_samples" | ||
] | ||
} | ||
], | ||
"metadata": { | ||
"accelerator": "GPU", | ||
"colab": { | ||
"gpuType": "V100", | ||
"provenance": [] | ||
}, | ||
"kernelspec": { | ||
"display_name": "Python 3", | ||
"name": "python3" | ||
}, | ||
"language_info": { | ||
"name": "python" | ||
} | ||
}, | ||
"nbformat": 4, | ||
"nbformat_minor": 0 | ||
} |
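
The notebook passes `--wandb_args` as a comma-separated string such as `project=lm-eval-harness-integration`. As a rough illustration of how a string in that shape could be turned into keyword arguments for `wandb.init`, here is a minimal sketch. The helper name `parse_wandb_args` is hypothetical and is not the harness's own parser, and the `job_type=eval` pair in the usage line is an added example value, not something taken from the notebook.

```python
def parse_wandb_args(arg_string: str) -> dict[str, str]:
    """Split 'k1=v1,k2=v2' into a dict of strings.

    Hypothetical helper for illustration only; the eval harness may
    parse this flag differently (for example, by casting value types).
    """
    kwargs: dict[str, str] = {}
    for pair in arg_string.split(","):
        pair = pair.strip()
        if not pair:
            # Tolerate empty segments such as a trailing comma.
            continue
        key, sep, value = pair.partition("=")
        if not sep or not key.strip():
            raise ValueError(f"expected key=value, got {pair!r}")
        kwargs[key.strip()] = value.strip()
    return kwargs


# Example: the project name comes from the notebook; job_type is an assumed extra.
kwargs = parse_wandb_args("project=lm-eval-harness-integration,job_type=eval")
print(kwargs)
# The resulting dict could then be forwarded as wandb.init(**kwargs).
```

All values stay strings in this sketch, which is enough for arguments like `project` or `job_type`; anything that needs another type would have to be converted separately.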