From 514493be37eebbfcdf387c3d83ac71650303d18a Mon Sep 17 00:00:00 2001
From: "Nat Kershaw (MSFT)"
Date: Mon, 29 Apr 2024 21:05:23 -0700
Subject: [PATCH] Improve tutorial (#355)

---
 examples/python/phi-3-tutorial.md | 28 +++++++++++++++++++++++-----
 1 file changed, 23 insertions(+), 5 deletions(-)

diff --git a/examples/python/phi-3-tutorial.md b/examples/python/phi-3-tutorial.md
index 4e8a5660c..6d2f5f728 100644
--- a/examples/python/phi-3-tutorial.md
+++ b/examples/python/phi-3-tutorial.md
@@ -26,25 +26,40 @@ These model repositories have models that run with DirectML, CPU and CUDA.
 
 ## Install the generate() API package
 
+**Unsure about which installation instructions to follow?** Here's a bit more guidance:
+
+Are you on a Windows machine with a GPU?
+* I don't know → Review [this guide](https://www.microsoft.com/en-us/windows/learning-center/how-to-check-gpu) to see whether you have a GPU in your Windows machine.
+* Yes → Follow the instructions for [DirectML](#directml).
+* No → Do you have an NVIDIA GPU?
+  * I don't know → Review [this guide](https://docs.nvidia.com/cuda/cuda-installation-guide-microsoft-windows/index.html#verify-you-have-a-cuda-capable-gpu) to see whether you have a CUDA-capable GPU.
+  * Yes → Follow the instructions for [NVIDIA CUDA GPU](#nvidia-cuda-gpu).
+  * No → Follow the instructions for [CPU](#cpu).
+
+*Note: Install only the package that matches your hardware.*
+
 ### DirectML
+
 ```
 pip install numpy
 pip install --pre onnxruntime-genai-directml
 ```
 
-### CPU
+### NVIDIA CUDA GPU
+
 ```
 pip install numpy
-pip install --pre onnxruntime-genai
+pip install --pre onnxruntime-genai-cuda --index-url=https://aiinfra.pkgs.visualstudio.com/PublicPackages/_packaging/onnxruntime-genai/pypi/simple/
 ```
 
-### CUDA
+### CPU
+
 ```
 pip install numpy
-pip install --pre onnxruntime-genai-cuda --index-url=https://aiinfra.pkgs.visualstudio.com/PublicPackages/_packaging/onnxruntime-genai/pypi/simple/
+pip install --pre onnxruntime-genai
 ```
 
 ## Run the model
 
@@ -55,6 +70,9 @@ The script accepts a model folder and takes the generation parameters from the c
 
 This example is using the long context model running with DirectML on Windows.
 
+The `-m` argument is the path to the model you downloaded from Hugging Face above.
+The `-l` argument is the length of output you would like to generate with the model.
+
 ```bash
 curl https://raw.githubusercontent.com/microsoft/onnxruntime-genai/main/examples/python/model-qa.py -o model-qa.py
 python model-qa.py -m Phi-3-mini-128k-instruct-onnx/directml/directml-int4-awq-block-128 -l 2048
@@ -66,4 +84,4 @@ Once the script has loaded the model, it will ask you for input in a loop, strea
 
 Input: <|user|>Tell me a joke about creative writing<|end|><|assistant|>
 Output: Why don't writers ever get lost? Because they always follow the plot!
-```
+```
\ No newline at end of file