
Process text-to-image requested image count sequentially #66

Merged
merged 2 commits into from
May 21, 2024

Conversation

ad-astra-video
Collaborator

Updated the text-to-image pipeline to process images sequentially when more than one is requested.

This makes GPU memory usage more stable, and I believe inference time is roughly linear in the number of images for text-to-image models. If a user wants faster inference, they can split the request into separate requests that go to separate orchestrators.

This is a quick fix for issue #49.
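The sequential approach can be sketched as follows (a minimal illustration only, not the actual pipeline code; `generate_one` stands in for a hypothetical single-image inference call):

```python
def generate_sequentially(generate_one, prompt, seeds):
    """Run one inference call per requested image instead of a single
    batched call, so peak GPU memory stays roughly constant regardless
    of how many images are requested."""
    images = []
    for seed in seeds:
        # Hypothetical single-image call; in the real pipeline this would
        # invoke the diffusion model with num_images_per_prompt=1.
        images.append(generate_one(prompt, seed))
    return images
```

Total wall-clock time grows linearly with the image count, which is the trade-off accepted here in exchange for stable VRAM usage.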

Some questions:

  1. Should we set a sensible limit per request? The timeout seems to be 30s, so 15 sounds like a good limit that would allow most graphics cards to respond within the timeout.
  2. Is there a check for whether the number of seeds provided matches the number of images requested? For example, a batch of 10 is requested but only 5 seeds are provided. I did not see where the seeds would be filled out to match the requested image count.
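On question 2, one way to fill out missing seeds would be something like this (a sketch only; `fill_seeds` is a hypothetical helper, not code from this PR):

```python
import random

def fill_seeds(seeds, num_images, max_seed=2**32 - 1):
    """Extend a possibly short (or empty) seed list so that every
    requested image has its own seed."""
    seeds = list(seeds or [])
    while len(seeds) < num_images:
        seeds.append(random.randint(0, max_seed))
    # Truncate in case more seeds than images were supplied.
    return seeds[:num_images]
```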

@ad-astra-video
Collaborator Author

I did some testing, and when I go past 12 images the first ones processed return an ErrNotFound from the broadcaster: 15 images returns 3 ErrNotFound, 13 returns 1, and 20 returns 8.

Is there a limit of 12 somewhere, or is this just my setup?

@ad-astra-video
Collaborator Author

ad-astra-video commented Apr 21, 2024

> I did some testing, and when I go past 12 images the first ones processed return an ErrNotFound from the broadcaster: 15 images returns 3 ErrNotFound, 13 returns 1, and 20 returns 8.
>
> Is there a limit of 12 somewhere, or is this just my setup?

I think I found it: the MemoryDriver in go-tools has a cache length of 12.
The node uses a session-based cache, set up here: https://github.com/livepeer/go-livepeer/blob/930533388e62d21b339cccd8d6f4348dc14c5e75/cmd/livepeer/starter/starter.go#L1170

The memory driver is in the livepeer go-tools repo: https://github.com/livepeer/go-tools/blob/ac33a3c30a694a743f0077630aaf62423b5fd1f9/drivers/local.go#L17
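The error counts reported above (n − 12 errors for n > 12, always on the first images processed) are consistent with a fixed-size cache that evicts its oldest entries when full. A minimal Python sketch of that behavior (the real MemoryDriver is Go code in go-tools; this is only an illustration):

```python
from collections import OrderedDict

class FixedSizeCache:
    """FIFO cache with a hard capacity: once full, the oldest entry is
    evicted, which is why the *first* images of a large batch come back
    as not-found."""

    def __init__(self, capacity=12):
        self.capacity = capacity
        self._data = OrderedDict()

    def put(self, key, value):
        self._data[key] = value
        if len(self._data) > self.capacity:
            self._data.popitem(last=False)  # drop the oldest entry

    def get(self, key):
        return self._data.get(key)  # None plays the role of ErrNotFound
```

Storing 15 results in a 12-entry cache leaves the first 3 unreachable, matching the 3 ErrNotFound responses observed for a 15-image request.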

@eliteprox
Collaborator

> 1. Should we set a sensible limit per request? The timeout seems to be 30s, so 15 sounds like a good limit that would allow most graphics cards to respond within the timeout.

I think we should set a reasonable limit for the number of files sent to a given orchestrator and send the remainder to the next orchestrator in the pool, if one is available.

> 2. Is there a check for whether the number of seeds provided matches the number of images requested? For example, a batch of 10 is requested but only 5 seeds are provided. I did not see where the seeds would be filled out to match the requested image count.

I'll have to debug/learn the multi-file workflow first, but this line should generate a seed when one is not provided: https://github.com/livepeer/ai-worker/pull/66/files#diff-2952f66b536acb78e9bb5ee0337a00485762b802438a3209508b3b0ee088212dR62

@ad-astra-video force-pushed the t2i-sequential-processing branch from 19129a2 to 620505d on May 13, 2024 22:15
@ad-astra-video force-pushed the t2i-sequential-processing branch from f07e2e8 to 5ab6f10 on May 20, 2024 18:48
@ad-astra-video
Collaborator Author

Updated to the latest main branch and force-pushed to overwrite the prior commits.

@ad-astra-video
Collaborator Author

> 1. Should we set a sensible limit per request? The timeout seems to be 30s, so 15 sounds like a good limit that would allow most graphics cards to respond within the timeout.
>
> I think we should set a reasonable limit for the number of files sent to a given orchestrator and send the remainder to the next orchestrator in the pool, if one is available.
>
> 2. Is there a check for whether the number of seeds provided matches the number of images requested? For example, a batch of 10 is requested but only 5 seeds are provided. I did not see where the seeds would be filled out to match the requested image count.
>
> I'll have to debug/learn the multi-file workflow first, but this line should generate a seed when one is not provided: https://github.com/livepeer/ai-worker/pull/66/files#diff-2952f66b536acb78e9bb5ee0337a00485762b802438a3209508b3b0ee088212dR62

@rickstaa added a generator to create enough seeds when more than one image is requested.

My understanding of the OS memory driver is that it will fail to retrieve images from the gateway if more than 12 are requested, so I believe we effectively have a limit of 12, but no error is returned to indicate that a request for more than 12 will fail.

@eliteprox
Collaborator

It sounds like we need to set a cap of 12 since that is the technical limit, but for reasonable processing time that might need to be even lower.
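A request-validation cap like the one discussed might look like this (an illustrative sketch only; the names and error handling are hypothetical, and the value 12 comes from the MemoryDriver cache length found earlier in the thread):

```python
MAX_IMAGES_PER_REQUEST = 12  # go-tools MemoryDriver caches only 12 entries

def validate_image_count(num_images):
    """Reject over-sized requests up front instead of silently losing
    the earliest images to cache eviction."""
    if num_images < 1:
        raise ValueError("at least one image must be requested")
    if num_images > MAX_IMAGES_PER_REQUEST:
        raise ValueError(
            f"requested {num_images} images; maximum per request is "
            f"{MAX_IMAGES_PER_REQUEST}"
        )
    return num_images
```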

This commit cleans up the sequential images code a bit.
@rickstaa merged commit 8b5cd1e into livepeer:main on May 21, 2024
1 check passed
@rickstaa
Member

@ad-astra-video merged into the main branch. Thanks a lot 🚀!

@rickstaa
Member

> It sounds like we need to set a cap of 12 since that is the technical limit, but for reasonable processing time that might need to be even lower.

Let's deal with this in a subsequent pull request. I created https://linear.app/livepeer-ai-spe/issue/LIV-380/throw-warning-when-user-requests-a-bathc-with-more-than-12-images to track this issue.

eliteprox pushed a commit to eliteprox/ai-worker that referenced this pull request Jun 9, 2024
…ors (livepeer#66)

This commit ensures that batches in the T2I pipeline are processed sequentially. This change is necessary because we currently lack the ability to estimate a GPU's VRAM capacity and manage requests accordingly.

* process text-to-image requested image count sequentially

* refactor: cleanup sequential images code

This commit cleans up the sequential images code a bit.

---------

Co-authored-by: Rick Staa <[email protected]>
@ad-astra-video deleted the t2i-sequential-processing branch on July 25, 2024 01:59
eliteprox pushed a commit to eliteprox/ai-worker that referenced this pull request Jul 26, 2024
…ors (livepeer#66)

This commit ensures that batches in the T2I pipeline are processed sequentially. This change is necessary because we currently lack the ability to estimate a GPU's VRAM capacity and manage requests accordingly.

* process text-to-image requested image count sequentially

* refactor: cleanup sequential images code

This commit cleans up the sequential images code a bit.

---------

Co-authored-by: Rick Staa <[email protected]>
rickstaa added a commit that referenced this pull request Sep 18, 2024
This commit removes the 'batch_size' argument from the benchmarking
script since our current pipelines don't support batching requests, as
we do not yet have a way to estimate VRAM and prevent out-of-memory
errors. For more information see
#66. We can add this option back
in when we have solved this.
rickstaa added a commit that referenced this pull request Sep 18, 2024
This commit removes the 'batch_size' argument from the benchmarking script since our current pipelines don't support batching requests, as we do not yet have a way to estimate VRAM and prevent out-of-memory errors. For more information see #66. We can add this option back in when we have solved this.
JJassonn69 pushed a commit to JJassonn69/ai-worker that referenced this pull request Sep 18, 2024
This commit removes the 'batch_size' argument from the benchmarking script since our current pipelines don't support batching requests, as we do not yet have a way to estimate VRAM and prevent out-of-memory errors. For more information see livepeer#66. We can add this option back in when we have solved this.
JJassonn69 added a commit to JJassonn69/ai-worker that referenced this pull request Sep 20, 2024
* feat(model): add Realistic Vision model T2I support (livepeer#136)

This commit ensures that the https://huggingface.co/SG161222/Realistic_Vision_V6.0_B1_noVAE
model is supported in the T2I pipeline.

* ci: add JS/TS SDK update trigger (livepeer#138)

This commit adds an update trigger to the OpenAPI sync action that
triggers an update of the JS/TS SDK.

* ci: add TS/JS SDK OpenAPI spec update trigger (livepeer#139)

This commit adds a trigger to update the OpenAPI spec in https://github.com/livepeer/ai-sdk-js. Further, it improves the OpenAPI spec upstream sync action to forward more information.

* refactor: add T2I parameter annotations (livepeer#141)

This commit adds parameter annotations to the T2I pipeline similar to
how it is done in the rest of the pipelines. Descriptions will be added
in a subsequent commit.

* refactor: sort imports using isort (livepeer#142)

This commit sorts the python imports using the
https://pypi.org/project/isort package.

* ci: update OpenAPI spec trigger repos (livepeer#143)

This commit ensures the right upstream repos are triggered in the
trigger upstream OpenAPI sync action.

* feat: improve prompt splitter (livepeer#146)

This commit ensures that an empty dict is returned by the prompt
splitter when no valid prompt was found.

* feat(T2I): add Black Forest Labs Flux 1 support (livepeer#147)

This commit adds support for the [Black Forest Labs Flux 1 Schnell](https://huggingface.co/black-forest-labs/FLUX.1-schnell) model to the T2I pipeline. It is important to note that this model can only run on GPUs with more than 33 GB of VRAM.

* refactor: fix unet load deprecation warnings (livepeer#151)

This commit fixes some unet deprecation warnings by chaning the way the
stable diffusion base model is loaded.

* refactor: resolve CLIPFeatureExtractor deprecation warning (livepeer#152)

This commit resolves a CLIPFeatureExtractor deprecation warning thrown
by the NSFW check logic.

* refactor: added descriptions to the pipeline parameters. (livepeer#144)

* Added descriptions to the parameters.

All parameters needing descriptions across A2T, I2I, I2V, T2I, and Upscale have had their descriptions added.

* The descriptions have been updated to better apply to the current implementation.

* refactor: shorten parameter descriptions

This commit shortens some of the parameter descriptions since the longer
description is found on huggingface.

* chore: update OpenAPI spec and golang bindings

This commit ensures that the OpenAPI spec and golang bindings are
updated with the new descriptions.

---------

Co-authored-by: Rick Staa <[email protected]>

* chore: apply black formatter (livepeer#153)

This commit applies the black formatter to the codebase to ensure the
code formatting is consistent.

* feat: improve OpenAPI spec generation and naming (livepeer#155)

This commit improves the naming and generation for the OpenAPI spec so
that they are easier to work with.

* chore: remove redundant OpenAPI spec (livepeer#156)

This commit removes a redundant OpenAPI spec that was introduced some
time ago.

* Revert "chore: remove redundant OpenAPI spec (livepeer#156)" (livepeer#157)

This reverts commit 41fa3f4.

* chore: remove redundant OpenAPI spec (livepeer#158)

This commit removes a redundant OpenAPI spec file that was introduced
some time ago.

* refactor: cleanup Gateway OpenAPI spec (livepeer#160)

This commit removes the health endpoint schema from the generated
Gateway OpenAPI spec.

* chore: fix flake8 errors (livepeer#159)

This commit fixes the flake8 errors that were introduced into the
codebase in the last months.

* refactor: remove old OpenAPI spec (livepeer#161)

This commit removes the old unused OpenAPI spec.

* feat: update ByteDance/SDXL-Lighting to default to 8step (livepeer#162)

* update ByteDance/SDXL-Lightning to default to 8 step unet

* update I2I to 8step default for ByteDance/SDXL-Lightning model

* feat: apply git release version to OpenAPI spec (livepeer#164)

This commit ensures that the latest git release flag is applied to the OpenAPI spec.

* refactor: add pipeline descriptions (livepeer#169)

This commit adds pipeline descriptions so that each pipeline is clearly
explained on the docs.

* refactor(openapi): replace json with yaml (livepeer#170)

This commit replaces the default OpenAPI spec with yaml.

* refactor: add response type descriptions (livepeer#171)

This commit ensures that descriptions show up for the route response
types in the docs.

* chore(worker): update go bindings (livepeer#172)

This commit updates the go bindings to include the right docstrings.

* ci: fix OpenAPI spec check action (livepeer#173)

This commit fixes the OpenAPI spec check action. This action can be used
to ensure the OpenAPI spec and go bindings are up to date.

* ci: remove manual SDK/Docs update trigger (livepeer#174)

This commit replaces the manual update trigger for the docs and SDKs by
Speakeasy actions.

* refactor: type gen_openapi file (livepeer#175)

This commit ensures that the functions in the gen_openapi file are
typed.

* chore: remove redundant OpenAPI specs (livepeer#177)

This commit removes the JSON versions of the OpenAPI spec since they are
no longer used.

* refactor: rename A2T pipeline attribute (livepeer#179)

This commit renames the self.ldm (Latent Diffusion Model) to self.tm
(Transformer model) to make the distinction clearer.

* chore: update make go-bindings generation command (livepeer#180)

This commit ensures that the make file uses the right OpenAPI spec to
generate the go bindings.

* add studio api url (livepeer#178)

* feat: add Studio Gateway

This commit adds the studio Gateway to the list of servers.

* chore: update OpenAPI spec

This commit updates the OpenAPI spec to add the Studio gateway to the
list of servers and thus the documentation.

* feat: enable multiple containers for pipeline/model_id (livepeer#148)

This commit makes the container map more unique, supporting the use case
of running multiple pipelines behind one external endpoint.

Co-authored-by: Rick Staa <[email protected]>

* feat: add OpenAPI gen version arg (livepeer#184)

This commit provides developers with a `--version` argument they can use
when generating the OpenAPI spec using the `gen_openapi.py` script.

* Segment anything 2 pipeline image (livepeer#185)

* feat(pipeline): add SAM2 image segmentation prototype

This commit introduces a prototype implementation of the
[Segment Anything v2](https://github.com/facebookresearch/segment-anything-2)
(SAM2) pipeline within the AI worker. The prototype demonstrates the basic
functionality needed to perform segmentation on an image. Note that video
segmentation is not yet implemented. Additionally, the dependencies were
updated quickly, which may temporarily break other pipelines.

* revert Dockerfile, requirements, add sam2 Dockerfile

* refactor: enhance SAM2 input handling and error management

This commit allows nested arrays to be supplied as JSON strings for SAM2
input. It also implements robust error handling to return a 400 error with
a descriptive message when incorrect parameters are provided.

* refactor: improve SAM2 return time

This commit ensures that we return the masks, iou_predictions and
low_res_masks in json format.

* Sam2 -> SegmentAnything2

* update go bindings

* update multipart.go binding with NewSegmentAnything2Writer

* update worker and multipart methods

* predictions -> scores, mask -> logits

* add sam2 specific multipartwriter fields

* add segment-anything-2 to containerHostPorts

* fix pipeline name in worker.go

* revert Dockerfile, requirements, add sam2 Dockerfile

* Sam2 -> SegmentAnything2

* predictions -> scores, mask -> logits

* feat: replace JSON.dump with str

This commit replaces the JSON.dump method with a simple str method since
it is highly unlikely that the string contains invalid data.

Co-authored-by: Peter Schroedl <[email protected]>

* move pipeline-specific dockerfile

* update openapi yaml

* add segment anything specific readme

* update go bindings

* refactor: move SAM2 docker

This commit moves the SAM2 docker file inside the docker container.

* refactor: add FastAPI descriptions

This commit cleans up the codebase and adds FastAPI parameter and
pipeline descriptions.

* refactor: improve sam2 route function name

This commit improves the sam2 route function name so that it is more
pythonic and shows up nicer in the OpenAPI spec pipeline summary.

* chore(worker): update golang bindings

This commit updates the golang bindings so that the runner changes are
reflected.

* refactor(runner): add media_type

This commit adds the media type content MIME type to the segment
anything 2 pipeline.

* chore(worker): remove debug patch

This commit removes the debug patch which was accidentally added to the
code.

* feat(runner): add SAM2 model download command

This commit adds the SAM2 model download command so that orchestrators
can pre-download the model.

* refactor(worker): change SAM2 multipart reader param order

This commit ensures that the parameters are in the same order as the
pipeline parameters.

* determine docker image in createContainer

* fix: fix examples

This commit fixes the example scripts.

---------

Co-authored-by: Rick Staa <[email protected]>
Co-authored-by: Elite Encoder <[email protected]>
Co-authored-by: Peter Schroedl <[email protected]>

* fix(pipeline): add FLUX.1-dev and disable negative_prompt on flux (livepeer#167)

This commit adds the black-forest-labs/FLUX.1 model download commands.
The dev model is placed under the `--restricted` flag since it can not be
used for commercial purposes.

Co-authored-by: Rick Staa <[email protected]>

* chore: update OpenAPI spec version

This commit updates the version set in the OpenAPI spec.

* ci(docker): add ai-runner base Docker tag (livepeer#194)

This commit ensures that the main Docker container is also tagged as the
base container so that it can be used as the base for the pipeline
specific containers.

Co-authored-by: ad-astra-video <[email protected]>

* ci(docker): add workflow dispatch (livepeer#195)

This commit ensures that developers can trigger docker image building.

* ci(docker): ensure docker ci dispatch works (livepeer#196)

This commit ensures that the workflow dispatch triggers the docker job.

* ci: add pipeline docker ci (livepeer#193)

* chore(docker): add 'base' tag and segment-anything-2 docker image build

* update segment-anything-2 to dynamic base image

* make more space on runner

* refactor(ci): split Docker CI

This commit ensures that the pipeline docker build ci is found in a
separate action from the base.

* ci(docker): enable pipeline docker workflow dispatch

This commit ensures that maintainers can trigger the pipeline specific
Docker action using a workflow dispatch.

* ci(docker): fix out of space error

This commit switches to the oxfort runner to see if it can fix the OS
storage error.

* ci: cleanup hosted runner

This commit cleans up the hosted runner so that we don't run into OS
storage issues when trying to build the container.

---------

Co-authored-by: Brad P <[email protected]>

* ci(docker): add sam2 docker tags (livepeer#197)

This commit ensures that the SAM2 docker has the right tags.

* ci(docker): enable pipeline docker workflow dispatch (livepeer#198)

This commit ensures that maintainers can manually run the pipeline
docker ci.

* feat(sdks): implement SDK-specific API customizations (livepeer#191)

* feat(sdks): implement SDK-specific API customizations

This commit introduces several SDK-specific OpenAPI configurations to the runner
API configuration. These customizations will enhance the SDKs we are planning
to release.

Co-authored-by: Victor Elias <[email protected]>

* feat: enable speakeasy retries (livepeer#201)

This commit enables the [speakeasy
retries](https://www.speakeasy.com/docs/customize-sdks/retries#global-retries)
feature for the SDKs.

* Revert "feat: enable speakeasy retries (livepeer#201)" (livepeer#202)

This reverts commit caa4bb7.

* chore: release v0.5.0

This commit releases a new minor version since we had to revert the SDK
related changes.

* chore: update alpha to beta phase (livepeer#203)

This commit updates the code and documentation to signal we are entering
the Beta phase of the AI network journey.

* fix(runner): improve 'num_inference_steps' logic (livepeer#205)

This commit prevents a KeyError from being thrown when the pipelines
are called directly.

* fix(runner): fix benchmarking script (livepeer#206)

This commit removes the 'batch_size' argument from the benchmarking script since our current pipelines don't support batching requests, as we do not yet have a way to estimate VRAM and prevent out-of-memory errors. For more information see livepeer#66. We can add this option back in when we have solved this.

* readme: update with note (livepeer#207)

* docs: remove AI Realtime Video note from main branch

This commit removes the AI Realtime Video warning note from the
main branch, as it should have been on the
https://github.com/livepeer/ai-worker/tree/realtime-ai-experimental
branch.

---------

Co-authored-by: Rick Staa <[email protected]>
Co-authored-by: ea_superstar <[email protected]>
Co-authored-by: ad-astra-video <[email protected]>
Co-authored-by: PSchroedl <[email protected]>
Co-authored-by: Elite Encoder <[email protected]>
Co-authored-by: Peter Schroedl <[email protected]>
Co-authored-by: Brad P <[email protected]>
Co-authored-by: Victor Elias <[email protected]>
Co-authored-by: Emran M <[email protected]>