How to run Tabby in an offline environment with a local model? #3517
-
Situation: I want to run Tabby in an offline environment, using a local model served through an OpenAI-compatible API. Attempts: I wrote the following config.toml:
```toml
# Chat model
[model.chat.http]
kind = "openai/chat"
model_name = "my_model"
api_endpoint = "http://localhost:8049/v1"
api_key = "my_token"

# Completion model
[model.completion.http]
kind = "vllm/completion"
model_name = "my_model"
api_endpoint = "http://localhost:8049/v1"
api_key = "my_token"
prompt_template = "<|fim_prefix|><|fim_suffix|>{suffix}<|fim_middle|>{prefix}" # SPM
```

Then I ran Tabby with this configuration.
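(A side note not raised in the thread, but worth checking if completions come back garbled: the prompt_template has to match the FIM token ordering the served model was trained with. The line above uses an SPM arrangement; a model that expects the more common PSM ordering, e.g. the Qwen coder family, would instead be configured roughly as below. This is only an illustration of the alternative, not a confirmed fix.)

```toml
# Hypothetical alternative: PSM (prefix-suffix-middle) ordering of the same FIM tokens
prompt_template = "<|fim_prefix|>{prefix}<|fim_suffix|>{suffix}<|fim_middle|>"
```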
I've come to realize that this setup might still need to go online. Consequently, I visited the blog post at https://tabby.tabbyml.com/blog/2024/03/25/deploy-tabby-in-air-gapped-environment-with-docker/ and gave it a try. My question: how can I get Tabby running fully offline with a local model?
Replies: 2 comments
-
Is there anyone who can offer some help?
-
Hi @wapleeeeee, Tabby requires the use of an embedding model in addition to the other models. You need to add the HTTP embedding model to the configuration for Tabby to operate in an offline environment. For example:

```toml
[model.embedding.http]
kind = "openai/embedding"
model_name = "text-embedding-3-small"
api_endpoint = "http://localhost:8099/v1"
api_key = "apikey"
```

We have also created an example for vLLM at https://tabby.tabbyml.com/docs/references/models-http-api/vllm/.
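Putting the reply together with the original attempt, a fully offline config.toml would then carry all three HTTP model sections. The sketch below simply merges the snippets from this thread; the endpoints, model names, and API keys are the placeholder values used above, so substitute your own.

```toml
# Chat model served by a local OpenAI-compatible server
[model.chat.http]
kind = "openai/chat"
model_name = "my_model"
api_endpoint = "http://localhost:8049/v1"
api_key = "my_token"

# Completion model served by vLLM
[model.completion.http]
kind = "vllm/completion"
model_name = "my_model"
api_endpoint = "http://localhost:8049/v1"
api_key = "my_token"
prompt_template = "<|fim_prefix|><|fim_suffix|>{suffix}<|fim_middle|>{prefix}" # SPM

# Embedding model (per the reply above, required for Tabby to operate offline)
[model.embedding.http]
kind = "openai/embedding"
model_name = "text-embedding-3-small"
api_endpoint = "http://localhost:8099/v1"
api_key = "apikey"
```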