From 9c4c08cfe5594a393d3cba1735f72012feed914a Mon Sep 17 00:00:00 2001
From: Patrice Vignola
Date: Tue, 23 Apr 2024 14:10:33 -0700
Subject: [PATCH] Revert "Update README"

This reverts commit 96984c59d9f45f7e0d94f5bf838bd7eb74b31111.
---
 examples/python/phi-3-tutorial.md | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/examples/python/phi-3-tutorial.md b/examples/python/phi-3-tutorial.md
index 7e69db5bc..33e422368 100644
--- a/examples/python/phi-3-tutorial.md
+++ b/examples/python/phi-3-tutorial.md
@@ -49,15 +49,14 @@ pip install --pre onnxruntime-genai-cuda --index-url=https://aiinfra.pkgs.visual
 
 ## Run the model
 
-Run the model with [model-qa.py](https://github.com/microsoft/onnxruntime-genai/blob/main/examples/python/model-qa.py).
+Run the model with [this script](https://github.com/microsoft/onnxruntime-genai/blob/main/examples/python/model-qa.py).
 
 The script accepts a model folder and takes the generation parameters from the config in that model folder. You can also override the parameters on the command line.
 
 This example is using the long context model running with DirectML on Windows.
 
 ```bash
-curl https://github.com/microsoft/onnxruntime-genai/blob/main/examples/python/model-qa.py -o model-qa.py
-python model-qa.py -m Phi-3-mini-4k-instruct-onnx/directml/directml-int4-awq-block-128
+python model-qa.py -m models/phi3-mini-128k-instruct-directml-int4-awq-block-128
 ```
 
 Once the script has loaded the model, it will ask you for input in a loop, streaming the output as it is produced the model. For example: