
Update install docs and add troubleshooting page #21210

Merged 4 commits on Jul 1, 2024
2 changes: 1 addition & 1 deletion docs/genai/howto/build-model.md
@@ -4,7 +4,7 @@ description: How to build models with ONNX Runtime generate() API
has_children: false
parent: How to
grand_parent: Generate API (Preview)
nav_order: 2
nav_order: 3
---

# Generate models using Model Builder
30 changes: 30 additions & 0 deletions docs/genai/howto/install.md
@@ -13,28 +13,58 @@ nav_order: 1
* TOC placeholder
{:toc}

## Prerequisites

### CUDA

If you are installing the CUDA variant of onnxruntime-genai, the CUDA toolkit must be installed.

The CUDA toolkit can be downloaded from the [CUDA Toolkit Archive](https://developer.nvidia.com/cuda-toolkit-archive).

Ensure that the `CUDA_PATH` environment variable is set to the location of your CUDA installation.
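Before installing the CUDA package, it can help to fail fast when `CUDA_PATH` is missing rather than hit an opaque DLL error later. A minimal sketch (the helper name `check_cuda_path` is ours, not part of onnxruntime-genai):

```python
import os

def check_cuda_path() -> str:
    """Return CUDA_PATH, or raise a clear error if it is not set.

    Illustrative helper only; onnxruntime-genai does not ship this function.
    """
    cuda_path = os.environ.get("CUDA_PATH")
    if not cuda_path:
        raise RuntimeError(
            "CUDA_PATH is not set; point it at your CUDA toolkit "
            "installation before using onnxruntime-genai-cuda."
        )
    return cuda_path
```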

## Python packages

Note: only one of these packages should be installed in your application.

### CPU

```bash
pip install numpy
pip install onnxruntime-genai --pre
```
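After installing, a quick sanity check confirms the package is visible to your interpreter. This sketch (the helper name `genai_available` is ours) probes the import system without crashing if the package is absent:

```python
import importlib.util

def genai_available() -> bool:
    """True if the onnxruntime_genai module can be found by the import system."""
    return importlib.util.find_spec("onnxruntime_genai") is not None

print("onnxruntime-genai importable:", genai_available())
```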

### DirectML

Append `-directml` to install the library that is optimized for DirectML on Windows:

```bash
pip install numpy
pip install onnxruntime-genai-directml --pre
```

### CUDA

Append `-cuda` to install the library that is optimized for CUDA environments:

#### CUDA 11

```bash
pip install numpy
pip install onnxruntime-genai-cuda --pre --index-url=https://aiinfra.pkgs.visualstudio.com/PublicPackages/_packaging/onnxruntime-genai/pypi/simple/
```

#### CUDA 12

```bash
pip install numpy
pip install onnxruntime-genai-cuda --pre --index-url=https://aiinfra.pkgs.visualstudio.com/PublicPackages/_packaging/onnxruntime-cuda-12/pypi/simple/
```
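The two CUDA commands above differ only in the package feed. As a sketch, the mapping from CUDA major version to the index URLs shown on this page can be expressed as (the helper name `genai_cuda_index_url` is ours):

```python
def genai_cuda_index_url(cuda_major: int) -> str:
    """Return the pip --index-url used above for a given CUDA major version.

    Illustrative helper; the URLs are the ones listed on this page.
    """
    urls = {
        11: "https://aiinfra.pkgs.visualstudio.com/PublicPackages/_packaging/onnxruntime-genai/pypi/simple/",
        12: "https://aiinfra.pkgs.visualstudio.com/PublicPackages/_packaging/onnxruntime-cuda-12/pypi/simple/",
    }
    try:
        return urls[cuda_major]
    except KeyError:
        raise ValueError(f"No onnxruntime-genai-cuda feed listed for CUDA {cuda_major}")
```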

## NuGet packages

Note: only one of these packages should be installed in your application.

```bash
dotnet add package Microsoft.ML.OnnxRuntimeGenAI --prerelease
```
24 changes: 0 additions & 24 deletions docs/genai/howto/setup-cuda-env.md

This file was deleted.

34 changes: 34 additions & 0 deletions docs/genai/howto/troubleshoot.md
@@ -0,0 +1,34 @@
---
title: Troubleshoot
description: How to troubleshoot common problems
has_children: false
parent: How to
grand_parent: Generate API (Preview)
nav_order: 4
---

# Troubleshoot issues with ONNX Runtime generate() API
{: .no_toc }

* TOC placeholder
{:toc}

## Installation issues

### Windows Conda import error

```
ImportError: DLL load failed while importing onnxruntime_genai: A dynamic link library (DLL) initialization routine failed.
```

If you see this error in a Conda environment on Windows, you need to upgrade the C++ runtime for Visual Studio. In the Conda environment, run the following command:

```bash
conda install conda-forge::vs2015_runtime
```

The onnxruntime-genai Python package should run without error after this extra step.

### Windows CUDA import error

After the CUDA toolkit installation completes on Windows, ensure that the `CUDA_PATH` system environment variable is set to the path where the toolkit was installed. This variable is used when the onnxruntime_genai Python module is imported on Windows. An unset or incorrectly set `CUDA_PATH` variable may lead to a `DLL load failed while importing onnxruntime_genai` error.
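One way to make this failure mode easier to diagnose is to wrap the import and attach a hint when `CUDA_PATH` is missing. A sketch (the wrapper name `import_genai_with_hint` is ours; it assumes nothing beyond the standard library when the package is absent):

```python
import os

def import_genai_with_hint():
    """Import onnxruntime_genai, adding a CUDA_PATH hint if the import fails.

    Illustrative wrapper only; not part of the onnxruntime-genai package.
    """
    try:
        import onnxruntime_genai as og
        return og
    except ImportError as err:
        hint = ""
        if not os.environ.get("CUDA_PATH"):
            hint = " (hint: CUDA_PATH is not set; point it at your CUDA toolkit install)"
        raise ImportError(str(err) + hint) from err
```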
21 changes: 16 additions & 5 deletions docs/genai/tutorials/phi3-v.md
@@ -62,17 +62,28 @@ Support for Windows machines with GPUs other than NVIDIA is coming soon!
```
This command downloads the model into a folder called `cuda-int4-rtn-block-32`.

2. Set up your CUDA environment

Install the [CUDA toolkit](https://developer.nvidia.com/cuda-toolkit-archive).

Ensure that the `CUDA_PATH` environment variable is set to the location of your CUDA installation.


3. Install the generate() API

* CUDA 11

```bash
pip install numpy
pip install --pre onnxruntime-genai-cuda --index-url=https://aiinfra.pkgs.visualstudio.com/PublicPackages/_packaging/onnxruntime-genai/pypi/simple/
```

* CUDA 12

```bash
pip install numpy
pip install onnxruntime-genai-cuda --pre --index-url=https://aiinfra.pkgs.visualstudio.com/PublicPackages/_packaging/onnxruntime-cuda-12/pypi/simple/
```

4. Run the model
