Hello,
I am working on a project to serve an encoder-based model in a Triton Inference Server ensemble. The pipeline will be a preprocessing node feeding directly into an encoder (roberta-base, producing an embedding feature), which then fans out to K lightweight classification heads (think a few linear layers each).
How far can I reasonably push K? Can the ensemble scheduler handle inference with K=100 classifiers?
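For context, here is a minimal sketch of the ensemble `config.pbtxt` I have in mind. All model names, tensor names, and shapes below are placeholders for illustration, not my actual config:

```
# Ensemble config sketch (names/shapes are assumptions):
# preprocess -> roberta_encoder -> classifier_0 ... classifier_{K-1}
name: "roberta_ensemble"
platform: "ensemble"
max_batch_size: 32
input [
  { name: "RAW_TEXT", data_type: TYPE_STRING, dims: [ 1 ] }
]
output [
  { name: "SCORES_0", data_type: TYPE_FP32, dims: [ 2 ] }
  # ...one output per classifier head, up to SCORES_{K-1}
]
ensemble_scheduling {
  step [
    {
      model_name: "preprocess"
      model_version: -1
      input_map  { key: "TEXT", value: "RAW_TEXT" }
      output_map { key: "INPUT_IDS", value: "ids" }
      output_map { key: "ATTENTION_MASK", value: "mask" }
    },
    {
      model_name: "roberta_encoder"
      model_version: -1
      input_map  { key: "input_ids", value: "ids" }
      input_map  { key: "attention_mask", value: "mask" }
      output_map { key: "embedding", value: "emb" }
    },
    {
      # Each head reads the same intermediate tensor "emb",
      # so the encoder runs once per request and only the
      # small heads are duplicated K times.
      model_name: "classifier_0"
      model_version: -1
      input_map  { key: "embedding", value: "emb" }
      output_map { key: "logits", value: "SCORES_0" }
    }
    # ...repeat one step per classifier head
  ]
}
```

Since the step list grows linearly with K, I would generate the K classifier steps (and the matching `output` entries) from a template script rather than writing them by hand.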