Commit 514493b: Improve tutorial (#355)
natke authored Apr 30, 2024
Showing 1 changed file (examples/python/phi-3-tutorial.md) with 23 additions and 5 deletions.

These model repositories have models that run with DirectML, CPU and CUDA.

## Install the generate() API package

**Unsure about which installation instructions to follow?** Here's a bit more guidance:

Are you on a Windows machine with a GPU?
* I don't know → Review [this guide](https://www.microsoft.com/en-us/windows/learning-center/how-to-check-gpu) to see whether you have a GPU in your Windows machine.
* Yes → Follow the instructions for [DirectML](#directml).
* No → Do you have an NVIDIA GPU?
* I don't know → Review [this guide](https://docs.nvidia.com/cuda/cuda-installation-guide-microsoft-windows/index.html#verify-you-have-a-cuda-capable-gpu) to see whether you have a CUDA-capable GPU.
* Yes → Follow the instructions for [NVIDIA CUDA GPU](#nvidia-cuda-gpu).
* No → Follow the instructions for [CPU](#cpu).

*Note: only one package is needed; install the one that matches your hardware.*
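The decision steps above can be sketched as a small helper. This is a rough heuristic, not part of the tutorial: it assumes that `nvidia-smi` being on the PATH implies a CUDA-capable GPU, and that a Windows machine has a DirectML-capable GPU.

```python
import platform
import shutil

def recommended_package() -> str:
    # Heuristic following the flowchart above. Assumes a Windows machine
    # has a DirectML-capable GPU, and that nvidia-smi on the PATH implies
    # a CUDA-capable GPU; verify against the guides linked above.
    if platform.system() == "Windows":
        return "onnxruntime-genai-directml"
    if shutil.which("nvidia-smi"):
        return "onnxruntime-genai-cuda"
    return "onnxruntime-genai"

print(recommended_package())
```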

### DirectML

```bash
pip install numpy
pip install --pre onnxruntime-genai-directml
```

### NVIDIA CUDA GPU

```bash
pip install numpy
pip install --pre onnxruntime-genai-cuda --index-url=https://aiinfra.pkgs.visualstudio.com/PublicPackages/_packaging/onnxruntime-genai/pypi/simple/
```

### CPU

```bash
pip install numpy
pip install --pre onnxruntime-genai
```
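Whichever of the three packages you install, a quick sanity check is to look for its module from Python. This snippet assumes all three packages expose the same import name, `onnxruntime_genai`:

```python
import importlib.util

# All three packages are assumed to expose the same module name,
# onnxruntime_genai; find_spec returns None when it is not installed.
spec = importlib.util.find_spec("onnxruntime_genai")
print("onnxruntime-genai importable:", spec is not None)
```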

## Run the model
The script accepts a model folder and takes the generation parameters from the command line.

This example uses the long context model running with DirectML on Windows.

The `-m` argument is the path to the model you downloaded from HuggingFace above.
The `-l` argument is the length of output you would like to generate with the model.

```bash
curl https://raw.githubusercontent.com/microsoft/onnxruntime-genai/main/examples/python/model-qa.py -o model-qa.py
python model-qa.py -m Phi-3-mini-128k-instruct-onnx/directml/directml-int4-awq-block-128 -l 2048
```

Once the script has loaded the model, it will ask you for input in a loop, streaming the output as it is generated.

```
Input: <|user|>Tell me a joke about creative writing<|end|><|assistant|>

Output: Why don't writers ever get lost? Because they always follow the plot!
```
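The `Input:` line above follows the Phi-3 chat template, wrapping the user message in `<|user|>`, `<|end|>`, and `<|assistant|>` tags. A hypothetical helper for building such prompts:

```python
def build_phi3_prompt(user_message: str) -> str:
    # Wraps a message in the Phi-3 chat template shown in the example
    # above; helper name and signature are illustrative only.
    return f"<|user|>{user_message}<|end|><|assistant|>"

print(build_phi3_prompt("Tell me a joke about creative writing"))
# -> <|user|>Tell me a joke about creative writing<|end|><|assistant|>
```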
