[GenAIOrchestrator] add huggingface tgi - 1665 #1702
base: master
Conversation
Do not merge until the PR on huggingface_hub is merged. The pyproject.toml points to a fork of this library in the meantime.
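For context, here is a minimal sketch of how a pyproject.toml dependency can temporarily point at a git fork while waiting for an upstream merge; the fork URL and branch names are hypothetical placeholders, not the actual references used in this PR:

```toml
[tool.poetry.dependencies]
# Hypothetical sketch: pin huggingface_hub to a fork until the upstream PR is merged.
# Replace the URL and branch with the real fork reference.
huggingface-hub = { git = "https://github.com/some-fork/huggingface_hub.git", branch = "fix-branch" }
```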
The PR on huggingface_hub is merged; this is now ready for review.
(Branch force-pushed from 328fcdf to def2e65.)
This PR installs a ton of dependencies, for instance:
It seems quite huge; I need to take a deeper look at that. A simple "huggingface hub client" shouldn't require any NVIDIA (CUDA) dependencies, so maybe you can isolate a subgroup of dependencies. Otherwise the Docker image will become too large.
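One way to isolate such a subgroup, assuming the project uses Poetry (the extra name below is hypothetical), is to mark the heavy integration as optional and expose it as an extra so the base image stays slim:

```toml
[tool.poetry.dependencies]
# Hypothetical sketch: make the heavy integration optional so it is not installed by default.
langchain-huggingface = { version = "^0.1", optional = true }

[tool.poetry.extras]
# Installed only on demand, e.g. `pip install <package>[tgi]` (extra name is hypothetical).
tgi = ["langchain-huggingface"]
```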
The langchain_huggingface library heavily relies on other libraries, such as CUDA and PyTorch. I found a discussion on GitHub about this: langchain-ai/langchain#24482
A discussion was created on the langchain repository about this problem.
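A lighter-weight alternative, sketched here under the assumption that only text generation against a TGI endpoint is needed, is to call huggingface_hub's InferenceClient directly instead of pulling in langchain_huggingface; the endpoint URL is a placeholder:

```python
from huggingface_hub import InferenceClient

# Point the client at a TGI server; this pulls in no PyTorch/CUDA packages.
# The URL is a placeholder for an actual TGI deployment.
client = InferenceClient(model="http://localhost:8080")

# TGI exposes the text-generation task natively.
answer = client.text_generation("What is Tock?", max_new_tokens=128)
print(answer)
```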
We are discussing this integration internally at CM Arkéa; we are no longer sure that TGI will be used as our inference server, meaning we wouldn't be able to maintain this integration. We are waiting for a clear position on this subject on our side (a test is in progress to use the OpenAI integration for vLLM). If any Tock community user uses TGI and wants this integration, feel free to leave a comment here.
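For reference, vLLM exposes an OpenAI-compatible API, so the existing OpenAI integration can target it directly; here is a minimal sketch with langchain_openai, where the base URL and model name are placeholders for an actual vLLM deployment:

```python
from langchain_openai import ChatOpenAI

# vLLM serves an OpenAI-compatible endpoint under /v1; no OpenAI account is needed.
# Base URL and model name are placeholders for an actual vLLM deployment.
llm = ChatOpenAI(
    base_url="http://localhost:8000/v1",
    api_key="EMPTY",  # vLLM ignores the key unless authentication is configured
    model="mistralai/Mistral-7B-Instruct-v0.2",
)

print(llm.invoke("Hello!").content)
```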
add LLM provider Hugging Face TGI
issue #1665