Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update install instructions for onnxruntime-genai separate packages #21646

Merged
merged 6 commits into from
Aug 14, 2024
Merged
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
54 changes: 32 additions & 22 deletions docs/genai/howto/install.md
Original file line number Diff line number Diff line change
Expand Up @@ -13,69 +13,79 @@ nav_order: 1
* TOC placeholder
{:toc}

## Pre-requisites

### CUDA

If you are installing the CUDA variant of onnxruntime-genai, the CUDA toolkit must be installed.

The CUDA toolkit can be downloaded from the [CUDA Toolkit Archive](https://developer.nvidia.com/cuda-toolkit-archive).

Ensure that the `CUDA_PATH` environment variable is set to the location of your CUDA installation.
## Python package installation

## Python packages

Note: only one of these packages should be installed in your application.
Note: only one of these sets of packages (CPU, DirectML, CUDA) should be installed in your environment.

### CPU

```bash
pip install numpy
pip install onnxruntime-genai --pre
pip install onnxruntime-genai
```

### DirectML

Append `-directml` for the library that is optimized for DirectML on Windows

```bash
pip install numpy
pip install onnxruntime-genai-directml --pre
pip install onnxruntime-genai-directml
```

### CUDA

Append `-cuda` for the library that is optimized for CUDA environments
If you are installing the CUDA variant of onnxruntime-genai, the CUDA toolkit must be installed.

The CUDA toolkit can be downloaded from the [CUDA Toolkit Archive](https://developer.nvidia.com/cuda-toolkit-archive).

Ensure that the `CUDA_PATH` environment variable is set to the location of your CUDA installation.

#### CUDA 11

```bash
pip install numpy
pip install onnxruntime-genai-cuda --pre --index-url=https://aiinfra.pkgs.visualstudio.com/PublicPackages/_packaging/onnxruntime-genai/pypi/simple/
pip install onnxruntime-genai-cuda --index-url=https://aiinfra.pkgs.visualstudio.com/PublicPackages/_packaging/onnxruntime-genai/pypi/simple/
```

#### CUDA 12

```bash
pip install numpy
pip install onnxruntime-genai-cuda --pre --index-url=https://aiinfra.pkgs.visualstudio.com/PublicPackages/_packaging/onnxruntime-cuda-12/pypi/simple/
pip install onnxruntime-genai-cuda --index-url=https://aiinfra.pkgs.visualstudio.com/PublicPackages/_packaging/onnxruntime-cuda-12/pypi/simple/
```

## Nuget packages

Note: only one of these packages should be installed in your application.
## Nuget package installation

Note: install only one of these packages (CPU, DirectML, CUDA) in your project.

### Pre-requisites

#### ONNX Runtime dependency

ONNX Runtime generate() versions 0.3.0 and earlier came bundled with the core ONNX Runtime binaries. From version 0.4.0 onwards, the packages are separated to allow a more flexible developer experience.

Version 0.4.0-rc1 depends on the ONNX Runtime version 1.19.0 RC. To install 0.4.0-rc1, add the following nuget source *before* installing the ONNX Runtime generate() nuget package.

```
dotnet nuget add source https://aiinfra.pkgs.visualstudio.com/PublicPackages/_packaging/ORT-Nightly/nuget/v3/index.json --name ORT-Nightly
```

### CPU

```bash
dotnet add package Microsoft.ML.OnnxRuntimeGenAI --prerelease
```

For the package that has been optimized for CUDA:
### CUDA

Note: only CUDA 11 is supported for versions 0.3.0 and earlier, and only CUDA 12 is supported for versions 0.4.0 and later.

```bash
dotnet add package Microsoft.ML.OnnxRuntimeGenAI.Cuda --prerelease
```

For the package that has been optimized for DirectML:
### DirectML

```bash
dotnet add package Microsoft.ML.OnnxRuntimeGenAI.DirectML --prerelease
Expand Down
Loading