Add ACPT docs to ORT docs #20839

Merged
merged 6 commits into from
May 30, 2024
Changes from 5 commits
48 changes: 48 additions & 0 deletions docs/ecosystem/acpt.md
@@ -0,0 +1,48 @@
---
title: Azure Container for PyTorch (ACPT)
description: Learn more about Azure Container for PyTorch (ACPT) and how it utilizes ONNX Runtime
nav_order: 1
redirect_from: /docs/tutorials/ecosystem/acpt
---
# Azure Container for PyTorch (ACPT)
{: .no_toc }

Azure Container for PyTorch (ACPT) is a lightweight, standalone environment that includes the components needed to run optimized training for large models effectively. It helps reduce preparation costs and shorten deployment time. ACPT can be used to get started quickly with various deep learning tasks with PyTorch on Azure.

## Contents
{: .no_toc }

* TOC placeholder
{:toc}


## Why should I use ACPT?
* **Flexibility:** Use as-is with preinstalled packages or build on top of the curated environment.
* **Ease of use:** All components are installed and validated against dozens of Microsoft workloads to reduce setup costs and accelerate time to value.
* **Efficiency:** Avoid unnecessary image builds; the image/container ships with only the required dependencies, ready to use.
* **Optimized training framework:** Set up, develop, and accelerate PyTorch models on large workloads, and improve training and deployment success rate.
* **Up-to-date stack:** Access the latest compatible versions of Ubuntu, Python, PyTorch, CUDA/ROCm, etc.
* **Latest training optimization technologies:** Make use of ONNX Runtime, DeepSpeed, MSCCL, and more.

## Supported configurations for Azure Container for PyTorch (ACPT)
The following configurations are supported in the Microsoft Container Registry (MCR):

| OS | GPU Type | Python Version | PyTorch Version | ORT-training Version | DeepSpeed Version | torch-ort Version | Nebula Version |
| - | - | - | - | - | - | - | - |
|ubuntu2004|cu117|3.8|1.13.1|1.18.0|0.14.2|1.17.0|0.16.13|
|ubuntu2004|cu117|3.9|1.13.1|1.18.0|0.14.2|1.17.0|0.16.13|
|ubuntu2004|cu117|3.10|1.13.1|1.18.0|0.14.2|1.17.0|0.16.13|
|ubuntu2004|cu118|3.8|2.0.1|1.18.0|0.14.2|1.17.0|0.16.13|
|ubuntu2004|cu118|3.10|2.0.1|1.18.0|0.14.2|1.17.0|0.16.13|
|ubuntu2004|cu118|3.8|2.2.2|1.18.0|0.14.2|1.17.0|0.16.13|
|ubuntu2004|cu118|3.10|2.2.2|1.18.0|0.14.2|1.17.0|0.16.13|
|ubuntu2004|cu121|3.8|2.2.2|1.18.0|0.14.2|1.17.0|0.16.13|
|ubuntu2004|cu121|3.10|2.2.2|1.18.0|0.14.2|1.17.0|0.16.13|
|ubuntu2004|cu118|3.10|2.3.0|1.18.0|0.14.2|1.17.0|0.16.13|
|ubuntu2004|cu121|3.10|2.3.0|1.18.0|0.14.2|1.17.0|0.16.13|
|ubuntu2004|cu121|3.8|2.1.2|1.18.0|0.14.2|1.17.0|0.16.13|

Other packages like fairscale, horovod, msccl, protobuf, pyspark, pytest, pytorch-lightning, tensorboard, NebulaML, torchvision, and torchmetrics are provided to support all training needs.
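
To confirm which of these packages a given container actually ships, the installed versions can be queried from inside the running container. The following is a minimal sketch, not part of ACPT itself: the distribution names in `PACKAGES` (for example `onnxruntime-training` and `torch-ort`) are assumptions about how the packages are registered with pip, and `installed_versions` is a hypothetical helper name.

```python
from importlib.metadata import PackageNotFoundError, version

# Assumed pip distribution names for a few components from the table above.
PACKAGES = ["torch", "onnxruntime-training", "deepspeed", "torch-ort"]


def installed_versions(names):
    """Map each distribution name to its installed version, or None if absent."""
    found = {}
    for name in names:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            # The package is not installed in this environment.
            found[name] = None
    return found


if __name__ == "__main__":
    for name, ver in installed_versions(PACKAGES).items():
        print(f"{name}: {ver or 'not installed'}")
```

Comparing the printed versions against the row for the chosen image is a quick sanity check before starting a long training run.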

## Support
Version updates for supported environments, including the base images they reference, are released every two weeks to address vulnerabilities no older than 30 days. Based on usage, some environments may be deprecated (hidden from the product but usable) to support more common machine learning scenarios.
4 changes: 4 additions & 0 deletions docs/ecosystem/index.md
@@ -16,6 +16,10 @@ ONNX Runtime functions as part of an ecosystem of tools and platforms to deliver
{:toc}


## Azure Container for PyTorch (ACPT)
* [Azure Container for PyTorch (ACPT) docs](https://onnxruntime.ai/docs/ecosystem/acpt.html){:target="_blank"}
* [Azure Container for PyTorch (ACPT) - Azure Machine Learning](https://learn.microsoft.com/en-us/azure/machine-learning/resource-azure-container-for-pytorch?view=azureml-api-2){:target="_blank"}

## Azure Machine Learning Services
* [Azure Container Instance: BERT](https://github.com/microsoft/onnxruntime/tree/main/onnxruntime/python/tools/transformers/notebooks/Inference_Bert_with_OnnxRuntime_on_AzureML.ipynb){:target="_blank"}
* [Azure Kubernetes Services: FER+](https://github.com/microsoft/onnxruntime/blob/main/docs/python/notebooks/onnx-inference-byoc-gpu-cpu-aks.ipynb){:target="_blank"}
14 changes: 14 additions & 0 deletions docs/ecosystem/ptca_image_list.md
@@ -0,0 +1,14 @@
| OS | GPU Type | Python Version | PyTorch Version | ORT-training Version | DeepSpeed Version | torch-ort Version | Nebula Version | Image Name | MCR Image Name |
| - | - | - | - | - | - | - | - | - | - |
|ubuntu2004|cu117|3.8|1.13.1|1.18.0|0.14.2|1.17.0|0.16.13|ptebic.azurecr.io/public/aifx/acpt/stable-ubuntu2004-cu117-py38-torch1131|mcr.microsoft.com/aifx/acpt/stable-ubuntu2004-cu117-py38-torch1131|
|ubuntu2004|cu117|3.9|1.13.1|1.18.0|0.14.2|1.17.0|0.16.13|ptebic.azurecr.io/public/aifx/acpt/stable-ubuntu2004-cu117-py39-torch1131|mcr.microsoft.com/aifx/acpt/stable-ubuntu2004-cu117-py39-torch1131|
|ubuntu2004|cu117|3.10|1.13.1|1.18.0|0.14.2|1.17.0|0.16.13|ptebic.azurecr.io/public/aifx/acpt/stable-ubuntu2004-cu117-py310-torch1131|mcr.microsoft.com/aifx/acpt/stable-ubuntu2004-cu117-py310-torch1131|
|ubuntu2004|cu118|3.8|2.0.1|1.18.0|0.14.2|1.17.0|0.16.13|ptebic.azurecr.io/public/aifx/acpt/stable-ubuntu2004-cu118-py38-torch201|mcr.microsoft.com/aifx/acpt/stable-ubuntu2004-cu118-py38-torch201|
|ubuntu2004|cu118|3.10|2.0.1|1.18.0|0.14.2|1.17.0|0.16.13|ptebic.azurecr.io/public/aifx/acpt/stable-ubuntu2004-cu118-py310-torch201|mcr.microsoft.com/aifx/acpt/stable-ubuntu2004-cu118-py310-torch201|
|ubuntu2004|cu118|3.8|2.2.2|1.18.0|0.14.2|1.17.0|0.16.13|ptebic.azurecr.io/public/aifx/acpt/stable-ubuntu2004-cu118-py38-torch222|mcr.microsoft.com/aifx/acpt/stable-ubuntu2004-cu118-py38-torch222|
|ubuntu2004|cu118|3.10|2.2.2|1.18.0|0.14.2|1.17.0|0.16.13|ptebic.azurecr.io/public/aifx/acpt/stable-ubuntu2004-cu118-py310-torch222|mcr.microsoft.com/aifx/acpt/stable-ubuntu2004-cu118-py310-torch222|
|ubuntu2004|cu121|3.8|2.2.2|1.18.0|0.14.2|1.17.0|0.16.13|ptebic.azurecr.io/public/aifx/acpt/stable-ubuntu2004-cu121-py38-torch222|mcr.microsoft.com/aifx/acpt/stable-ubuntu2004-cu121-py38-torch222|
|ubuntu2004|cu121|3.10|2.2.2|1.18.0|0.14.2|1.17.0|0.16.13|ptebic.azurecr.io/public/aifx/acpt/stable-ubuntu2004-cu121-py310-torch222|mcr.microsoft.com/aifx/acpt/stable-ubuntu2004-cu121-py310-torch222|
|ubuntu2004|cu118|3.10|2.3.0|1.18.0|0.14.2|1.17.0|0.16.13|ptebic.azurecr.io/public/aifx/acpt/stable-ubuntu2004-cu118-py310-torch230|mcr.microsoft.com/aifx/acpt/stable-ubuntu2004-cu118-py310-torch230|
|ubuntu2004|cu121|3.10|2.3.0|1.18.0|0.14.2|1.17.0|0.16.13|ptebic.azurecr.io/public/aifx/acpt/stable-ubuntu2004-cu121-py310-torch230|mcr.microsoft.com/aifx/acpt/stable-ubuntu2004-cu121-py310-torch230|
|ubuntu2004|cu121|3.8|2.1.2|1.18.0|0.14.2|1.17.0|0.16.13|ptebic.azurecr.io/public/aifx/acpt/stable-ubuntu2004-cu121-py38-torch212|mcr.microsoft.com/aifx/acpt/stable-ubuntu2004-cu121-py38-torch212|
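
The MCR image names in the table appear to follow a regular pattern: `stable-<os>-<gpu>-py<python version without dots>-torch<PyTorch version without dots>` under `mcr.microsoft.com/aifx/acpt/`. The sketch below rebuilds a name from a table row. The pattern is inferred from the rows above rather than from a documented naming contract, `acpt_image_name` is a hypothetical helper, and only the combinations listed in the table are known to exist.

```python
def acpt_image_name(os_name, gpu, python_version, torch_version,
                    registry="mcr.microsoft.com"):
    """Build an ACPT image name using the pattern inferred from the table."""
    # Versions appear in the image name with their dots removed,
    # e.g. Python "3.10" -> "py310" and PyTorch "2.3.0" -> "torch230".
    py_tag = "py" + python_version.replace(".", "")
    torch_tag = "torch" + torch_version.replace(".", "")
    return f"{registry}/aifx/acpt/stable-{os_name}-{gpu}-{py_tag}-{torch_tag}"


if __name__ == "__main__":
    # Corresponds to the ubuntu2004 / cu121 / Python 3.10 / PyTorch 2.3.0 row.
    print(acpt_image_name("ubuntu2004", "cu121", "3.10", "2.3.0"))
```

The resulting string can be passed to a container runtime or used as the base image of a custom environment, provided the corresponding row exists in the table.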