@fsatka -- ModelMesh was designed to optimize resource utilization. Why would you want to load additional instances of the same model/predictor/ISVC on all serving runtime pods regardless of inference request traffic? Just for testing purposes?
Currently the model is loaded on only one instance, and the other pods load it lazily when a request arrives.
Can we modify internal ModelMesh parameters so that the model is loaded on all ServingRuntime instances by default?
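For context, ModelMesh decides how many copies of a model to keep loaded based on request volume; a second copy is only loaded when traffic warrants it. A minimal sketch of the kind of tuning being asked about, assuming the runtime exposes a scale-up threshold via a container environment variable (the variable name `MM_SCALEUP_RPM_THRESHOLD` and its placement here are assumptions, not confirmed API):

```yaml
# Hypothetical sketch: lower the per-model scale-up threshold so additional
# copies are loaded on other runtime pods sooner under traffic.
# MM_SCALEUP_RPM_THRESHOLD is an assumed parameter name -- verify against
# the ModelMesh configuration docs before relying on it.
apiVersion: serving.kserve.io/v1alpha1
kind: ServingRuntime
metadata:
  name: example-runtime
spec:
  containers:
    - name: mm
      env:
        - name: MM_SCALEUP_RPM_THRESHOLD
          value: "1"   # assumed: requests/min before a second copy is loaded
```

Even with such tuning, this would only make additional copies load earlier under traffic; eagerly loading a model on every pod regardless of traffic runs against ModelMesh's utilization-driven design, as noted above.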