docker-compose multi-gpu example #1127
Replies: 5 comments
-
Since we don't do tensor parallelism at the moment, the only way to utilize multiple GPUs is to split the workload into separate processes / different CUDA devices. In this particular case, you might:
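The original snippet isn't reproduced here, but a minimal sketch of that approach is one container per GPU, pinned via the Compose `deploy.resources.reservations.devices` syntax. The image tag, model name, and volume path below are assumptions for illustration:

```yaml
# Hypothetical two-container setup: one Tabby instance per GPU.
# Model name, image tag, and volume path are placeholders.
services:
  tabby-gpu0:
    image: tabbyml/tabby
    command: serve --model StarCoder-1B --device cuda
    ports:
      - "8080:8080"          # host port 8080 -> container 8080
    volumes:
      - "$HOME/.tabby:/data"
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              device_ids: ["0"]   # pin this instance to GPU 0
              capabilities: [gpu]
  tabby-gpu1:
    image: tabbyml/tabby
    command: serve --model StarCoder-1B --device cuda
    ports:
      - "8081:8080"          # different host port to avoid the 8080 conflict
    volumes:
      - "$HOME/.tabby:/data"
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              device_ids: ["1"]   # pin this instance to GPU 1
              capabilities: [gpu]
```

Mapping the two containers to different host ports (8080 and 8081) sidesteps the port collision; clients then point at whichever instance they want, or at a load balancer in front.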
-
@wsxiaoys Any chance I could get a docker-compose example? Not sure how that would work if both containers are using port 8080.
-
Just separating the models into different containers, one per GPU, seems to work. I was hoping for a single model to use two GPUs, but that does not seem possible. For anyone looking for a multi-GPU setup, here is the docker-compose setup:
-
Putting a reverse proxy in front (e.g. Caddy) is a natural choice in this case; on the other hand, if you're interested in Tabby's built-in distributed worker support...
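As a sketch of the reverse-proxy route, a Caddyfile can round-robin requests across the two per-GPU containers using the `reverse_proxy` directive. The upstream service names here are hypothetical and would need to match whatever the Compose services are called:

```
# Hypothetical Caddyfile: expose one port, balance across both Tabby instances.
:8080 {
	reverse_proxy tabby-gpu0:8080 tabby-gpu1:8080 {
		lb_policy round_robin
	}
}
```

Run Caddy as a third service on the same Compose network so the service names resolve, and point clients at Caddy's single port instead of the individual containers.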
-
A blog post has been added here: https://tabby.tabbyml.com/blog/2024/03/26/tabby-with-replicas-behind-reverse-proxy
-
Can anyone give an example on how to get TabbyML working with two GPUs?
I have two 3060s. This is what I have so far, but I'm not sure whether TabbyML is actually using both GPUs.
I also get this error: