Add DiffusersModel (#1)
* Add `Model` with shared code

* Add `PIPELINE_TASKS` mapping for `diffusers`

* Run `pre-commit autoupdate`

* Add `DiffusersModel` (WIP)

* Add `HF_PACKAGE` in `CustomCprModelServer`

* Add `DiffusersPredictor`

* Handle `model_task` in `DiffusersModel`

* Update `Dockerfile.{cpu,gpu}` and format `docker.py`

* Add `diffusers` extra

* Add `Dockerfile.{cpu,gpu}` for `diffusers`

* Move `transformers` files under `_internal/transformers/*`

* Add `_path_prefix` for `Dockerfile` via `importlib.resources`

* Add `huggingface_framework` and `huggingface_framework_version` args

* Add `DiffusersModel` import in `__init__`

* Update `README.md` and `docs/index.md`

* Add more `ignore_patterns` to `snapshot_download`

* Add `huggingface_*` args to `{Diffusers,Transformers}Model`

* Set `torch_dtype` using `torch`

* Fix outdated `pip install` in `docs/index.md`

* Fix `mypy` errors
alvarobartt authored Mar 20, 2024
1 parent 005eef1 commit c5a6ca7
Showing 21 changed files with 828 additions and 289 deletions.
2 changes: 1 addition & 1 deletion .pre-commit-config.yaml
@@ -7,7 +7,7 @@ repos:
- id: check-yaml

- repo: https://github.com/charliermarsh/ruff-pre-commit
-rev: v0.3.0
+rev: v0.3.2
hooks:
- id: ruff
args:
5 changes: 3 additions & 2 deletions README.md
@@ -9,6 +9,7 @@
* 🐳 Automatically build Custom Prediction Routines (CPR) for Hugging Face Hub models using `transformers.pipeline`
* 📦 Everything is packaged within a single method, providing more flexibility and ease of use than the former `google-cloud-aiplatform` SDK for custom models
* 🔌 Seamless integration for running inference on top of any model from the Hugging Face Hub in Vertex AI thanks to `transformers`
* 🌅 Support for `diffusers` models too!
* 🔍 Includes custom `logging` messages for better monitoring and debugging via Google Cloud Logging

## Get started
@@ -23,13 +24,13 @@ gcloud auth login
Then install `vertex-ai-huggingface-inference-toolkit` via `pip install`:

```bash
-pip install vertex-ai-huggingface-inference-toolkit>=0.1.0
+pip install vertex-ai-huggingface-inference-toolkit>=0.0.2
```

Or via `uv pip install` for faster installations using [`uv`](https://astral.sh/blog/uv):

```bash
-uv pip install vertex-ai-huggingface-inference-toolkit>=0.1.0
+uv pip install vertex-ai-huggingface-inference-toolkit>=0.0.2
```
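Since this commit also adds a `diffusers` extra to `pyproject.toml`, the image-generation stack can be pulled in at install time as well (a sketch; the extra name comes from this commit's `pyproject.toml` change):

```shell
# Installs the toolkit together with `accelerate` and `diffusers`,
# as declared under [project.optional-dependencies].
pip install "vertex-ai-huggingface-inference-toolkit[diffusers]>=0.0.2"
```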

## Example
5 changes: 3 additions & 2 deletions docs/index.md
@@ -9,6 +9,7 @@
* 🐳 Automatically build Custom Prediction Routines (CPR) for Hugging Face Hub models using `transformers.pipeline`
* 📦 Everything is packaged within a single method, providing more flexibility and ease of use than the former `google-cloud-aiplatform` SDK for custom models
* 🔌 Seamless integration for running inference on top of any model from the Hugging Face Hub in Vertex AI thanks to `transformers`
* 🌅 Support for `diffusers` models too!
* 🔍 Includes custom `logging` messages for better monitoring and debugging via Google Cloud Logging

## Get started
@@ -23,13 +24,13 @@ gcloud auth login
Then install `vertex-ai-huggingface-inference-toolkit` via `pip install`:

```bash
-pip install vertex-ai-huggingface-inference-toolkit>=0.1.0
+pip install vertex-ai-huggingface-inference-toolkit>=0.0.2
```

Or via `uv pip install` for faster installations using [`uv`](https://astral.sh/blog/uv):

```bash
-uv pip install vertex-ai-huggingface-inference-toolkit>=0.1.0
+uv pip install vertex-ai-huggingface-inference-toolkit>=0.0.2
```

## Example
1 change: 1 addition & 0 deletions pyproject.toml
@@ -43,6 +43,7 @@ path = "src/vertex_ai_huggingface_inference_toolkit/__init__.py"

[project.optional-dependencies]
transformers = ["accelerate", "transformers"]
diffusers = ["accelerate", "diffusers"]
docs = [
"mkdocs",
"mkdocs-material",
3 changes: 2 additions & 1 deletion src/vertex_ai_huggingface_inference_toolkit/__init__.py
@@ -3,6 +3,7 @@
__author__ = "Alvaro Bartolome <[email protected]>"
__version__ = "0.0.2"

from vertex_ai_huggingface_inference_toolkit.diffusers import DiffusersModel
from vertex_ai_huggingface_inference_toolkit.transformers import TransformersModel

-__all__ = ["TransformersModel"]
+__all__ = ["DiffusersModel", "TransformersModel"]
@@ -0,0 +1,33 @@
ARG PYTHON_VERSION="3.10"
FROM python:${PYTHON_VERSION}-slim AS build
LABEL maintainer="Alvaro Bartolome"

ARG DEBIAN_FRONTEND=noninteractive
ENV PYTHONDONTWRITEBYTECODE=1
ENV PYTHONUNBUFFERED=1

RUN mkdir -m 777 -p /usr/app /home
WORKDIR /usr/app
ENV HOME=/home

RUN python -m pip install --no-cache-dir --upgrade pip && \
python -m pip install --no-cache-dir --force-reinstall "google-cloud-aiplatform[prediction]>=1.27.0" && \
python -m pip install --no-cache-dir --force-reinstall "vertex_ai_huggingface_inference_toolkit[diffusers]>=0.0.2" --upgrade

ARG FRAMEWORK="torch"
ARG FRAMEWORK_VERSION="2.2.0"
RUN python -m pip install --no-cache-dir ${FRAMEWORK}==${FRAMEWORK_VERSION}

ARG DIFFUSERS_VERSION="0.27.2"
RUN python -m pip install --no-cache-dir diffusers==${DIFFUSERS_VERSION}

ARG EXTRA_REQUIREMENTS
RUN if [ -n "${EXTRA_REQUIREMENTS}" ]; then python -m pip install --no-cache-dir --force-reinstall ${EXTRA_REQUIREMENTS}; fi

ENV HANDLER_MODULE=google.cloud.aiplatform.prediction.handler
ENV HANDLER_CLASS=PredictionHandler
ENV PREDICTOR_MODULE=vertex_ai_huggingface_inference_toolkit.predictors.diffusers
ENV PREDICTOR_CLASS=DiffusersPredictor

EXPOSE 8080
ENTRYPOINT ["python", "-m", "google.cloud.aiplatform.prediction.model_server"]
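A build for the image above might look like the following (the tag and `-f` path are illustrative; the `--build-arg` names are the `ARG`s declared in the Dockerfile):

```shell
# The version pins can be overridden at build time through the declared ARGs.
docker build \
  -f Dockerfile.cpu \
  --build-arg PYTHON_VERSION="3.10" \
  --build-arg FRAMEWORK="torch" \
  --build-arg FRAMEWORK_VERSION="2.2.0" \
  --build-arg DIFFUSERS_VERSION="0.27.2" \
  -t diffusers-cpr-cpu .
```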
@@ -0,0 +1,47 @@
ARG CUDA_VERSION="12.3.0"
ARG UBUNTU_VERSION="22.04"
FROM nvidia/cuda:${CUDA_VERSION}-base-ubuntu${UBUNTU_VERSION} AS build
LABEL maintainer="Alvaro Bartolome"

ARG DEBIAN_FRONTEND=noninteractive
ENV PYTHONDONTWRITEBYTECODE=1
ENV PYTHONUNBUFFERED=1

RUN mkdir -m 777 -p /usr/app /home
WORKDIR /usr/app
ENV HOME=/home

ARG PYTHON_VERSION="3.10"
RUN apt-get update && \
apt-get install software-properties-common --no-install-recommends -y && \
add-apt-repository ppa:deadsnakes/ppa && \
apt-get install python${PYTHON_VERSION} python3-pip --no-install-recommends -y && \
apt-get autoremove -y && \
apt-get clean && \
rm -rf /var/lib/apt/lists/*

RUN ln -s "/usr/bin/python${PYTHON_VERSION}" /usr/bin/python
ENV PYTHON=/usr/bin/python

RUN python -m pip install --no-cache-dir --upgrade pip && \
python -m pip install --no-cache-dir --force-reinstall "google-cloud-aiplatform[prediction]>=1.27.0" && \
python -m pip install --no-cache-dir --force-reinstall "vertex_ai_huggingface_inference_toolkit[diffusers]>=0.0.2" --upgrade

ARG FRAMEWORK="torch"
ARG FRAMEWORK_VERSION="2.2.0"
RUN python -m pip install --no-cache-dir ${FRAMEWORK}==${FRAMEWORK_VERSION}

ARG DIFFUSERS_VERSION="0.27.2"
RUN python -m pip install --no-cache-dir diffusers==${DIFFUSERS_VERSION}

ARG EXTRA_REQUIREMENTS
RUN if [ -n "${EXTRA_REQUIREMENTS}" ]; then python -m pip install --no-cache-dir --force-reinstall ${EXTRA_REQUIREMENTS}; fi

ENV HANDLER_MODULE=google.cloud.aiplatform.prediction.handler
ENV HANDLER_CLASS=PredictionHandler
ENV PREDICTOR_MODULE=vertex_ai_huggingface_inference_toolkit.predictors.diffusers
ENV PREDICTOR_CLASS=DiffusersPredictor

EXPOSE 8080
ENTRYPOINT ["python", "-m", "google.cloud.aiplatform.prediction.model_server"]

@@ -18,7 +18,7 @@ ARG FRAMEWORK="torch"
ARG FRAMEWORK_VERSION="2.2.0"
RUN python -m pip install --no-cache-dir ${FRAMEWORK}==${FRAMEWORK_VERSION}

-ARG TRANSFORMERS_VERSION="4.11.3"
+ARG TRANSFORMERS_VERSION="4.38.2"
RUN python -m pip install --no-cache-dir transformers==${TRANSFORMERS_VERSION}

ARG EXTRA_REQUIREMENTS
@@ -31,7 +31,7 @@ ARG FRAMEWORK="torch"
ARG FRAMEWORK_VERSION="2.2.0"
RUN python -m pip install --no-cache-dir ${FRAMEWORK}==${FRAMEWORK_VERSION}

-ARG TRANSFORMERS_VERSION="4.11.3"
+ARG TRANSFORMERS_VERSION="4.38.2"
RUN python -m pip install --no-cache-dir transformers==${TRANSFORMERS_VERSION}

ARG EXTRA_REQUIREMENTS
127 changes: 127 additions & 0 deletions src/vertex_ai_huggingface_inference_toolkit/diffusers.py
@@ -0,0 +1,127 @@
from typing import Any, Dict, List, Literal, Optional

from vertex_ai_huggingface_inference_toolkit.model import Model


class DiffusersModel(Model):
    """Class that manages the whole lifecycle of a Hugging Face model, either from the Hub
    or from an existing Google Cloud Storage bucket, to be deployed to Google Cloud Vertex AI
    as an endpoint, running a Custom Prediction Routine (CPR) on top of a Hugging Face optimized
    Docker image pushed to Google Cloud Artifact Registry.

    This class is responsible for:
    - Downloading the model from the Hub if `model_name_or_path` is provided.
    - Uploading the model to Google Cloud Storage if `model_name_or_path` is provided.
    - Building a Docker image with the prediction code, handler and the required dependencies if `image_uri` is not provided.
    - Pushing the Docker image to Google Cloud Artifact Registry if `image_uri` is not provided.
    - Registering the model in Google Cloud Vertex AI.
    - Deploying the model as an endpoint with the provided environment variables.

    Note:
        This class is intended to be a high-level abstraction to simplify the process of deploying
        models from the Hugging Face Hub to Google Cloud Vertex AI, and is built on top of `google-cloud-aiplatform`
        and the rest of the required Google Cloud Python SDKs.
    """

    def __init__(
        self,
        # Google Cloud
        project_id: Optional[str] = None,
        location: Optional[str] = None,
        # Google Cloud Storage
        model_name_or_path: Optional[str] = None,
        model_kwargs: Optional[Dict[str, Any]] = None,
        model_task: Literal[
            "text-to-image", "image-to-image", "inpainting"
        ] = "text-to-image",
        model_target_bucket: str = "vertex-ai-huggingface-inference-toolkit",
        # Exclusive arg for Google Cloud Storage
        model_bucket_uri: Optional[str] = None,
        # Google Cloud Artifact Registry (Docker)
        framework: Literal["torch", "tensorflow", "flax"] = "torch",
        framework_version: Optional[str] = None,
        diffusers_version: str = "0.26.3",
        python_version: str = "3.10",
        cuda_version: str = "12.3.0",
        ubuntu_version: str = "22.04",
        extra_requirements: Optional[List[str]] = None,
        image_target_repository: str = "vertex-ai-huggingface-inference-toolkit",
        # Exclusive arg for Google Cloud Artifact Registry
        image_uri: Optional[str] = None,
        # Google Cloud Vertex AI
        environment_variables: Optional[Dict[str, str]] = None,
    ) -> None:
        """Initializes the `DiffusersModel` class, setting up the required attributes to
        deploy a model from the Hugging Face Hub to Google Cloud Vertex AI.

        Args:
            project_id: is either the name or the identifier of the project in Google Cloud.
            location: is the identifier of the region and zone where the resources will be created.
            model_name_or_path: is the name of the model to be downloaded from the Hugging Face Hub.
            model_kwargs: is the dictionary of keyword arguments to be passed to the model's `from_pretrained` method.
            model_task: is the task of the model to be used by the `diffusers` library. It can be one of the following:
                - `text-to-image`: `AutoPipelineForText2Image`
                - `image-to-image`: `AutoPipelineForImage2Image`
                - `inpainting`: `AutoPipelineForInpainting`
            model_target_bucket: is the name of the bucket in Google Cloud Storage where the model will be uploaded to.
            model_bucket_uri: is the URI to the model tar.gz file in Google Cloud Storage.
            framework: is the framework to be used to build the Docker image, e.g. `torch`, `tensorflow`, `flax`.
            framework_version: is the version of the framework to be used to build the Docker image.
            diffusers_version: is the version of the `diffusers` library to be used to build the Docker image.
            python_version: is the version of Python to be used to build the Docker image.
            cuda_version: is the version of CUDA to be used to build the Docker image.
            ubuntu_version: is the version of Ubuntu to be used to build the Docker image.
            extra_requirements: is the list of extra requirements to be installed in the Docker image.
            image_target_repository: is the name of the repository in Google Cloud Artifact Registry where the Docker image will be pushed to.
            image_uri: is the URI to the Docker image in Google Cloud Artifact Registry.
            environment_variables: is the dictionary of environment variables to be set in the Docker image.

        Raises:
            ValueError: if neither `model_name_or_path` nor `model_bucket_uri` is provided.
            ValueError: if both `model_name_or_path` and `model_bucket_uri` are provided.

        Examples:
            >>> from vertex_ai_huggingface_inference_toolkit import DiffusersModel
            >>> model = DiffusersModel(
            ...     project_id="my-gcp-project",
            ...     location="us-central1",
            ...     model_name_or_path="stabilityai/stable-diffusion-2",
            ...     model_task="text-to-image",
            ... )
            >>> model.deploy(
            ...     machine_type="n1-standard-8",
            ...     accelerator_type="NVIDIA_TESLA_T4",
            ...     accelerator_count=1,
            ... )
        """

        if environment_variables is None:
            environment_variables = {}

        if model_task and environment_variables.get("HF_TASK"):
            raise ValueError(
                "Both `model_task` and `environment_variables['HF_TASK']` cannot be provided."
            )

        if model_task:
            environment_variables["HF_TASK"] = model_task

        super().__init__(
            project_id=project_id,
            location=location,
            model_name_or_path=model_name_or_path,
            model_kwargs=model_kwargs,
            model_target_bucket=model_target_bucket,
            model_bucket_uri=model_bucket_uri,
            framework=framework,
            framework_version=framework_version,
            huggingface_framework="diffusers",  # type: ignore
            huggingface_framework_version=diffusers_version,
            python_version=python_version,
            cuda_version=cuda_version,
            ubuntu_version=ubuntu_version,
            extra_requirements=extra_requirements,
            image_target_repository=image_target_repository,
            image_uri=image_uri,
            environment_variables=environment_variables,
        )
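The `HF_TASK` handling in `__init__` can be sketched as a standalone helper (`merge_task_env` is a hypothetical name for illustration; the logic mirrors the guard above):

```python
from typing import Dict, Optional


def merge_task_env(
    environment_variables: Optional[Dict[str, str]],
    model_task: Optional[str],
) -> Dict[str, str]:
    """Merge `model_task` into the environment variables passed to the container.

    `HF_TASK` may come from `model_task` or from `environment_variables`, but not both.
    """
    env = dict(environment_variables or {})
    if model_task and env.get("HF_TASK"):
        raise ValueError(
            "Both `model_task` and `environment_variables['HF_TASK']` cannot be provided."
        )
    if model_task:
        env["HF_TASK"] = model_task
    return env


# `model_task` lands in the environment that the predictor reads at serving time.
print(merge_task_env(None, "text-to-image"))  # {'HF_TASK': 'text-to-image'}
```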
11 changes: 11 additions & 0 deletions src/vertex_ai_huggingface_inference_toolkit/diffusers_utils.py
@@ -0,0 +1,11 @@
from diffusers.pipelines.auto_pipeline import (
    AutoPipelineForImage2Image,
    AutoPipelineForInpainting,
    AutoPipelineForText2Image,
)

PIPELINE_TASKS = {
    "text-to-image": AutoPipelineForText2Image,
    "image-to-image": AutoPipelineForImage2Image,
    "inpainting": AutoPipelineForInpainting,
}
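How the mapping is meant to be consumed can be sketched without importing `diffusers` (stand-in class *names* mirror the mapping above, and `resolve_pipeline_class` is a hypothetical helper; the real predictor would call `.from_pretrained` on the resolved class — note that `AutoPipelineForImage2Image` corresponds to the image-to-image task):

```python
# Stand-in for PIPELINE_TASKS: task string -> auto-pipeline class *name*,
# so this sketch runs without `diffusers` installed.
PIPELINE_CLASS_NAMES = {
    "text-to-image": "AutoPipelineForText2Image",
    "image-to-image": "AutoPipelineForImage2Image",
    "inpainting": "AutoPipelineForInpainting",
}


def resolve_pipeline_class(task: str) -> str:
    """Return the auto-pipeline class name for a task, failing fast on typos."""
    try:
        return PIPELINE_CLASS_NAMES[task]
    except KeyError:
        raise ValueError(
            f"Unsupported task {task!r}; expected one of {sorted(PIPELINE_CLASS_NAMES)}"
        ) from None


print(resolve_pipeline_class("inpainting"))  # AutoPipelineForInpainting
```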