How to boot up the inference server faster? #2340
Unanswered
chicsfever asked this question in Q&A
Replies: 1 comment 3 replies
Kindly share the information after running |
The server takes about 1 minute to boot up; ideally it would take around 10 to 12 seconds. I am not converting to AWQ or doing any other heavy operation at serve time. Here is the command I use:
Options passed:
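One generic way to pin down where the minute goes (a sketch, not from this thread) is to time the gap between launching the serve command and the first successful health-check response. The port and `/health` endpoint below are assumptions; substitute whatever your server actually exposes.

```shell
#!/bin/sh
# Hypothetical timing harness: measure wall-clock time from launch
# until the server first answers a health check.
start=$(date +%s)

# Launch the serve command in the background (placeholder -- use your own):
# my_serve_command &

# Poll until the health endpoint responds (endpoint/port are assumptions):
# until curl -sf http://localhost:8000/health > /dev/null; do sleep 1; done

end=$(date +%s)
echo "startup took $((end - start))s"
```

Running this with and without individual serve options removed can show which option accounts for most of the boot time.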