HuggingFace Endpoint Inference Model Deployer #86
Conversation
Wow how good is this! I love it!
I think the next step would be to use this deployer in the deployment step... maybe we can return the service in the step and mark it as a deployment_artifact.
It might also make sense to test this out on some sort of base model already available on the Hugging Face Hub.
This is so awesome and you put in such a fantastic effort @dudeperf3ct! I left some comments which are mostly nits and questions.
endpoint_name: Optional[str] = None
repository: Optional[str] = None
framework: Optional[str] = None
accelerator: Optional[str] = None
instance_size: Optional[str] = None
instance_type: Optional[str] = None
region: Optional[str] = None
vendor: Optional[str] = None
This is a nitpick, but as these are now all optional, there is no way for the user to really know what to pass in at the step level. Do you think it makes sense to rework this "BaseConfig" idea and just make separate configs for the deployer and the service, so we can mark fields as optional or required as appropriate?
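To make the suggestion concrete, here is a minimal sketch of what splitting the single config might look like. The class and field names (`HFDeployerConfig`, `HFServiceConfig`) are hypothetical illustrations, not the actual ZenML classes; the idea is simply that infrastructure-level fields become required on the deployer config while per-deployment fields stay optional on the service config.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical split of the single "BaseConfig": fields every deployment
# needs become required on the deployer config, while per-endpoint
# overrides stay optional on the service config.


@dataclass
class HFDeployerConfig:
    # Required when the deployer is registered, so the user knows
    # these must be supplied up front.
    region: str
    vendor: str
    instance_size: str
    instance_type: str


@dataclass
class HFServiceConfig:
    # Optional per-deployment settings, resolved at step time.
    endpoint_name: Optional[str] = None
    repository: Optional[str] = None
    framework: Optional[str] = None
    accelerator: Optional[str] = None


deployer_cfg = HFDeployerConfig(
    region="us-east-1",
    vendor="aws",
    instance_size="small",
    instance_type="c6i",
)
service_cfg = HFServiceConfig(repository="gpt2")
```

With this split, forgetting a required field fails loudly at construction time instead of surfacing as a confusing `None` deep inside the deployment step.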
logger = get_logger(__name__)

POLLING_TIMEOUT = 1200
should this be a parameter somewhere for the user to control?
This is tricky. We want some way to pass `timeout` to the `provision` method, where `POLLING_TIMEOUT` is being used. `service.start(timeout=timeout)` calls the `provision` method underneath but does not provide a way to pass `timeout` along. We could pass the same timeout constant `POLLING_TIMEOUT` when we call the `start` function here. Right now it uses `DEFAULT_DEPLOYMENT_START_STOP_TIMEOUT` as the default value.
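One possible workaround for the plumbing problem described above is to store the timeout on the service instance so that `provision` can read it, since `start` does not forward its argument. This is a hypothetical sketch, not the actual ZenML `BaseService` implementation; the names `provision`, `start`, and `POLLING_TIMEOUT` follow the discussion, while the attribute-based plumbing is an assumed design.

```python
from typing import Optional

POLLING_TIMEOUT = 1200  # default polling timeout in seconds


class HFEndpointService:
    """Sketch of a service that threads a user timeout down to provision()."""

    def __init__(self, timeout: int = POLLING_TIMEOUT) -> None:
        # Store the timeout on the instance so provision() can see it,
        # since start(timeout=...) does not forward it as an argument.
        self.timeout = timeout
        self.provisioned_with: Optional[int] = None

    def provision(self) -> None:
        # Here the real implementation would poll the Hugging Face
        # endpoint for up to self.timeout seconds; stubbed for the sketch.
        self.provisioned_with = self.timeout

    def start(self, timeout: Optional[int] = None) -> None:
        # Mirror of service.start(timeout=...): record the override
        # before delegating to provision().
        if timeout is not None:
            self.timeout = timeout
        self.provision()


svc = HFEndpointService()
svc.start(timeout=600)
```

The trade-off is that the timeout becomes mutable instance state rather than a call argument, which is slightly less explicit but avoids changing the `provision` signature inherited from the base class.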
)
else:
    raise NotImplementedError(
        "Tasks other than text-generation are not implemented."
    )
How much work is it to actually implement other tasks? This list here gives so many tasks and I am wondering whether we can easily support them? https://huggingface.co/docs/inference-endpoints/supported_tasks
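Supporting more tasks would essentially mean turning the single `else` branch into a dispatch over the task type, since each task has its own payload and response shape. This is a hypothetical sketch of that structure, not the PR's code; `SUPPORTED_TASKS` and `predict` are illustrative names, and the endpoint call is stubbed.

```python
# Hypothetical dispatch over inference-endpoint tasks: only
# "text-generation" is handled today; each additional task from the
# supported-tasks list would need its own request/response handling.
SUPPORTED_TASKS = {"text-generation"}


def predict(task: str, payload: str) -> str:
    if task not in SUPPORTED_TASKS:
        # Mirrors the NotImplementedError raised in the PR for
        # unsupported tasks.
        raise NotImplementedError(
            f"Tasks other than text-generation are not implemented "
            f"(got {task!r})."
        )
    # Stubbed call to the text-generation endpoint; the real code would
    # POST the payload to the Hugging Face inference endpoint here.
    return f"generated: {payload}"
```

Adding a task then becomes: add it to the set, and add a branch (or handler function) that formats its payload and parses its response, so the per-task effort is mostly in knowing each task's request schema.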
In this PR, we implement a custom model deployer that uses huggingface inference endpoint.