Commit 04e78f6: Add phi-3 tutorial

natke committed Apr 23, 2024
1 parent 0ebd6b4 commit 04e78f6
Showing 3 changed files with 84 additions and 3 deletions.
10 changes: 8 additions & 2 deletions docs/genai/howto/install.md
@@ -13,11 +13,17 @@ nav_order: 1
 * TOC placeholder
 {:toc}
 
-## Python package release candidates
+## Python package
 
 ```bash
 pip install numpy
-pip install onnxruntime-genai --pre --index-url=https://aiinfra.pkgs.visualstudio.com/PublicPackages/_packaging/onnxruntime-genai/pypi/simple/
+pip install onnxruntime-genai --pre
 ```
+Append `-directml` for the library that is optimized for DirectML on Windows
+
+```bash
+pip install numpy
+pip install onnxruntime-genai-directml --pre
+```
 
 Append `-cuda` for the library that is optimized for CUDA environments
2 changes: 1 addition & 1 deletion docs/genai/tutorials/phi2-python.md
@@ -4,7 +4,7 @@ description: Learn how to write a language generation application with ONNX Runt
 has_children: false
 parent: Tutorials
 grand_parent: Generative AI (Preview)
-nav_order: 1
+nav_order: 2
 ---
 
 # Language generation in Python with phi-2
75 changes: 75 additions & 0 deletions docs/genai/tutorials/phi3-python.md
@@ -0,0 +1,75 @@
---
title: Python phi-3 tutorial
description: Small but mighty. Run Phi-3 with ONNX Runtime.
has_children: false
parent: Tutorials
grand_parent: Generative AI (Preview)
nav_order: 1
---

# Run the Phi-3 Mini models with the ONNX Runtime generate() API


## Download the model

Phi-3 ONNX models are published on Hugging Face.

Download either or both of the [short](https://aka.ms/phi3-mini-4k-instruct-onnx) and [long](https://aka.ms/phi3-mini-128k-instruct-onnx) context Phi-3 mini models.


For the short context model:

```bash
git clone https://huggingface.co/microsoft/Phi-3-mini-4k-instruct-onnx
```

For the long context model:

```bash
git clone https://huggingface.co/microsoft/Phi-3-mini-128k-instruct-onnx
```

Each of these repositories contains model variants that run with DirectML, on CPU, and with CUDA.
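
Cloning fetches every variant and requires Git LFS for the weight files. To download just one variant, a sketch using the Hugging Face CLI (the `--include` pattern and folder names are illustrative; check the repository file layout for the exact paths):

```bash
# Install the Hugging Face CLI, then fetch only the DirectML variant
# of the short context model (folder pattern is an assumption; adjust
# it to match the repository's actual layout)
pip install "huggingface_hub[cli]"
huggingface-cli download microsoft/Phi-3-mini-4k-instruct-onnx \
  --include "directml/*" --local-dir models/phi3-mini-4k-instruct
```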

## Install the generate() API package

### DirectML

```bash
pip install numpy
pip install --pre onnxruntime-genai-directml
```

### CPU

```bash
pip install numpy
pip install --pre onnxruntime-genai
```

### CUDA

```bash
pip install numpy
pip install --pre onnxruntime-genai-cuda --index-url=https://aiinfra.pkgs.visualstudio.com/PublicPackages/_packaging/onnxruntime-genai/pypi/simple/
```

## Run the model

Run the model with [this script](https://github.com/microsoft/onnxruntime-genai/blob/main/examples/python/model-qa.py).

The script accepts a model folder and reads the generation parameters from the config in that folder. You can also override these parameters on the command line, as shown after the run command below.

This example uses the long context model running with DirectML on Windows.

```bash
python model-qa.py -m models/phi3-mini-128k-instruct-directml-int4-awq-block-128
```
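
Generation parameters can be overridden with additional flags. A hypothetical example, assuming the script exposes a `--max_length` option (run `python model-qa.py --help` to see the flags it actually supports):

```bash
# Cap the total sequence length at 2048 tokens (flag name assumed;
# check the script's --help output for the supported options)
python model-qa.py -m models/phi3-mini-128k-instruct-directml-int4-awq-block-128 --max_length 2048
```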

Once the script has loaded the model, it asks you for input in a loop, streaming the output as it is produced by the model. For example:

```bash
Input: <|user|>Tell me a joke about creative writing<|end|><|assistant|>

Output: Why don't writers ever get lost? Because they always follow the plot!
```
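
To drive the generate() API directly rather than through the script, the token-streaming loop looks roughly like this minimal sketch, based on the onnxruntime-genai Python examples; the model path, prompt, and `max_length` value are illustrative:

```python
import onnxruntime_genai as og

# Load a downloaded model folder (path is illustrative)
model = og.Model('models/phi3-mini-128k-instruct-directml-int4-awq-block-128')
tokenizer = og.Tokenizer(model)
tokenizer_stream = tokenizer.create_stream()

# Wrap the user input in the Phi-3 chat template shown above
prompt = '<|user|>Tell me a joke about creative writing<|end|><|assistant|>'

params = og.GeneratorParams(model)
params.set_search_options(max_length=512)
params.input_ids = tokenizer.encode(prompt)

# Generate one token at a time, decoding and printing as we go
generator = og.Generator(model, params)
while not generator.is_done():
    generator.compute_logits()
    generator.generate_next_token()
    new_token = generator.get_next_tokens()[0]
    print(tokenizer_stream.decode(new_token), end='', flush=True)
print()
```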
