Allow client to request subset of ensemble model outputs #763

Open
Vincouux opened this issue Jul 24, 2024 · 1 comment

Comments

@Vincouux

Problem:

A common use case for an ensemble model is preprocess -> inference -> postprocess. In most cases, the user requests only the output of the last step (the postprocessed inference), since this can, for example, reduce network traffic between Triton and the application. Often, though, the application also needs the raw inference result (an intermediate output). The current configuration and client interface (at least the Python one) appear to support requesting either a subset of the outputs or all of them: the Python client takes a list of InferRequestedOutput. However, requesting only a subset of the outputs fails with InferenceServerException: [StatusCode.INVALID_ARGUMENT] in ensemble, [request id: ] unexpected deadlock, at least one output is not set while no more ensemble steps can be made.
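The call pattern described above can be sketched roughly as follows. This is only an illustration: `infer_subset` is a hypothetical helper, not part of the Triton client, and the client object and requested-output class are passed in by the caller so the same sketch works with either `tritonclient.grpc` or `tritonclient.http`. Model and tensor names in the usage note are assumptions as well.

```python
def infer_subset(client, model_name, inputs, output_names, requested_output_cls):
    """Request only `output_names` from `model_name` and return them by name.

    `client` is expected to behave like a Triton InferenceServerClient, and
    `requested_output_cls` like InferRequestedOutput (both supplied by the caller).
    """
    # One requested-output object per tensor the application actually wants.
    outputs = [requested_output_cls(name) for name in output_names]

    # Passing `outputs` asks the server for just this subset of tensors.
    result = client.infer(model_name, inputs, outputs=outputs)

    # Collect the returned tensors into a plain dict keyed by tensor name.
    return {name: result.as_numpy(name) for name in output_names}
```

With the gRPC client, a call might look like `infer_subset(grpcclient.InferenceServerClient(url="localhost:8001"), "my_ensemble", inputs, ["RAW_OUTPUT"], grpcclient.InferRequestedOutput)`, where the URL, `my_ensemble`, and `RAW_OUTPUT` are placeholders. According to the report, it is this kind of request, naming only some of the ensemble's outputs, that triggers the exception.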

Solution:

As of today, the only workaround I have found is to create two ensemble models (or, in general, N - 1 ensembles, where N is the total number of steps):

  • preprocess -> inference
  • preprocess -> inference -> postprocess

The client then requests outputs from either the first or the second ensemble. This is not ideal, as it introduces unnecessary complexity.
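
On the client side, this workaround amounts to choosing an ensemble based on which outputs are wanted. A minimal sketch, assuming hypothetical ensemble names (`ensemble_raw`, `ensemble_full`) and hypothetical tensor names (`RAW_OUTPUT`, `POST_OUTPUT`), none of which come from the issue. The lookup is an exact match on purpose, since the report says that requesting only a subset of an ensemble's outputs fails:

```python
# Hypothetical mapping: the exact set of outputs an ensemble declares -> its name.
ENSEMBLES = {
    frozenset({"RAW_OUTPUT"}): "ensemble_raw",    # preprocess -> inference
    frozenset({"POST_OUTPUT"}): "ensemble_full",  # preprocess -> inference -> postprocess
}


def pick_ensemble(requested, ensembles=ENSEMBLES):
    """Return the ensemble whose declared outputs match `requested` exactly."""
    key = frozenset(requested)
    if not key:
        raise ValueError("at least one output name is required")
    try:
        return ensembles[key]
    except KeyError:
        # No ensemble declares exactly this set, so another one would be needed.
        raise LookupError(f"no ensemble exposes exactly {sorted(key)}") from None
```

Every additional combination of outputs needs its own entry, and its own ensemble on the server, which is the extra complexity the issue is pointing at.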
@Vincouux changed the title from "Allow client to request subset of ensemble model output" to "Allow client to request subset of ensemble model outputs" on Jul 24, 2024
@dyastremsky
Contributor

CC: @GuanLuo
