Make inputs and outputs more flexible in model builder #261

Merged
merged 3 commits into from
Apr 12, 2024
2 changes: 1 addition & 1 deletion VERSION_INFO
@@ -1 +1 @@
0.2.0-dev
57 changes: 41 additions & 16 deletions src/python/py/models/README.md
@@ -3,18 +3,21 @@
This folder contains the model builder, which creates optimized and quantized ONNX models that run with ONNX Runtime GenAI in just a few minutes.

# Contents
- [Current Support](#current-support)
- [Usage](#usage)
- [Full Usage](#full-usage)
- [Original PyTorch Model from Hugging Face](#original-pytorch-model-from-hugging-face)
- [Original PyTorch Model from Disk](#original-pytorch-model-from-disk)
- [Customized or Finetuned PyTorch Model](#customized-or-finetuned-pytorch-model)
- [GGUF Model](#gguf-model)
- [Extra Options](#extra-options)
- [Config Only](#config-only)
- [Exclude Embedding Layer](#exclude-embedding-layer)
- [Exclude Language Modeling Head](#exclude-language-modeling-head)
- [Unit Testing Models](#unit-testing-models)
- [Option 1: Use the model builder directly](#option-1-use-the-model-builder-directly)
- [Option 2: Edit the config.json file](#option-2-edit-the-configjson-file-on-disk-and-then-run-the-model-builder)
- [Design](#design)

## Current Support
The tool currently supports the following model architectures.
@@ -89,7 +92,7 @@ python3 builder.py -m model_name -o path_to_output_folder -p precision -e execut
```
To see all available options through `--extra_options`, please use the `help` commands in the `Full Usage` section above.

#### Config Only
This scenario is for when you already have your optimized and/or quantized ONNX model and you need to create the config files to run with ONNX Runtime GenAI.
```
# From wheel:
@@ -101,6 +104,28 @@ python3 builder.py -m model_name -o path_to_output_folder -p precision -e execut

Afterwards, please open the `genai_config.json` file in the output folder and modify the fields as needed for your model. You should store your ONNX model in the output folder as well.
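Editing the generated `genai_config.json` can also be scripted with the standard library. A minimal sketch, using a temporary folder as a stand-in for the output folder you passed via `-o`; the field names shown are illustrative only, so check your generated file for the real ones:

```python
import json
from pathlib import Path
from tempfile import TemporaryDirectory

# Stand-in for the model builder's output folder; in practice, point this at
# the folder you passed via -o. The field names below are illustrative only.
with TemporaryDirectory() as out_dir:
    config_path = Path(out_dir) / "genai_config.json"
    config_path.write_text(json.dumps({"model": {"context_length": 2048}}))

    config = json.loads(config_path.read_text())
    config["model"]["context_length"] = 4096      # edit fields as needed
    config_path.write_text(json.dumps(config, indent=4))

    config = json.loads(config_path.read_text())  # re-read to confirm the edit
```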

#### Exclude Embedding Layer
This scenario is for when you want to exclude the embedding layer from your ONNX model (for example, so that you can compute the input embeddings separately and feed them to the model directly).

```
# From wheel:
python3 -m onnxruntime_genai.models.builder -i path_to_local_folder_on_disk -o path_to_output_folder -p precision -e execution_provider -c cache_dir_to_store_temp_files --extra_options exclude_embeds=true

# From source:
python3 builder.py -i path_to_local_folder_on_disk -o path_to_output_folder -p precision -e execution_provider -c cache_dir_to_store_temp_files --extra_options exclude_embeds=true
```
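With the embedding layer excluded, the exported model no longer maps token IDs to vectors itself; the caller supplies precomputed embeddings instead. Conceptually this is just a table lookup. A minimal NumPy sketch with toy dimensions (in practice the embedding table would come from your model's actual weights):

```python
import numpy as np

# Toy dimensions; real models use far larger vocabulary/hidden sizes.
vocab_size, hidden_size = 1000, 64
rng = np.random.default_rng(0)
embedding_table = rng.standard_normal((vocab_size, hidden_size)).astype(np.float32)

input_ids = np.array([[1, 42, 7]])          # (batch_size, sequence_length)
inputs_embeds = embedding_table[input_ids]  # (batch_size, sequence_length, hidden_size)
```

These precomputed embeddings are then what you pass to the model in place of token IDs.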

#### Exclude Language Modeling Head
This scenario is for when you want to exclude the language modeling head from your ONNX model (for example, so that the model outputs the final hidden states instead of logits).

```
# From wheel:
python3 -m onnxruntime_genai.models.builder -i path_to_local_folder_on_disk -o path_to_output_folder -p precision -e execution_provider -c cache_dir_to_store_temp_files --extra_options exclude_lm_head=true

# From source:
python3 builder.py -i path_to_local_folder_on_disk -o path_to_output_folder -p precision -e execution_provider -c cache_dir_to_store_temp_files --extra_options exclude_lm_head=true
```
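With the language modeling head excluded, the model stops at the final hidden states, and turning those into token logits is a single linear projection you apply yourself. A minimal NumPy sketch with toy dimensions (in practice the projection weight would come from your model's actual LM head):

```python
import numpy as np

# Toy dimensions; real models use far larger vocabulary/hidden sizes.
hidden_size, vocab_size = 64, 1000
rng = np.random.default_rng(0)
lm_head_weight = rng.standard_normal((vocab_size, hidden_size)).astype(np.float32)

# Stand-in for the final hidden states returned by the truncated model.
hidden_states = rng.standard_normal((1, 3, hidden_size)).astype(np.float32)
logits = hidden_states @ lm_head_weight.T   # (batch_size, sequence_length, vocab_size)
next_token = int(logits[0, -1].argmax())    # greedy pick at the last position
```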

### Unit Testing Models
This scenario is where your PyTorch model is already downloaded locally (either in the default Hugging Face cache directory or in a local folder on disk). If it is not already downloaded locally, here is an example of how you can download it.

@@ -117,7 +142,7 @@ tokenizer = AutoTokenizer.from_pretrained(model_name, cache_dir=cache_dir)
tokenizer.save_pretrained(cache_dir)
```
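Both options below work by shrinking `num_hidden_layers`, and the manual edit in Option 2 can also be scripted. A minimal standard-library sketch, using a temporary folder as a stand-in for the model folder on disk (a real `config.json` contains many more fields than shown):

```python
import json
from pathlib import Path
from tempfile import TemporaryDirectory

# Stand-in for the folder holding the downloaded PyTorch model's config.json.
with TemporaryDirectory() as model_dir:
    config_path = Path(model_dir) / "config.json"
    config_path.write_text(json.dumps({"num_hidden_layers": 32}))

    config = json.loads(config_path.read_text())
    config["num_hidden_layers"] = 4               # shrink the model for unit tests
    config_path.write_text(json.dumps(config, indent=2))

    config = json.loads(config_path.read_text())  # re-read to confirm the edit
```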

#### Option 1: Use the model builder directly
This option is the simplest, but it will download another copy of the PyTorch model onto disk to accommodate the change in the number of hidden layers.
```
# From wheel:
@@ -127,11 +152,11 @@ python3 -m onnxruntime_genai.models.builder -m model_name -o path_to_output_fold
python3 builder.py -m model_name -o path_to_output_folder -p precision -e execution_provider --extra_options num_hidden_layers=4
```

#### Option 2: Edit the config.json file on disk and then run the model builder

1. Navigate to where the PyTorch model and its associated files are saved on disk.
2. Modify `num_hidden_layers` in `config.json` to your desired target (e.g. 4 layers).
3. Run the below command for the model builder.

```
# From wheel: