HuggingFace Endpoint Inference Model Deployer #86
Conversation
Wow how good is this! I love it!
I think the next step would be to use this deployer in the deployment step... maybe we can return the service in the step and mark it as a deployment_artifact.
It might also make sense to test this out on some sort of base model already available on the Hugging Face Hub.
This is so awesome and you put in such a fantastic effort @dudeperf3ct! I left some comments which are mostly nits and questions.
endpoint_name: Optional[str] = None
repository: Optional[str] = None
framework: Optional[str] = None
accelerator: Optional[str] = None
instance_size: Optional[str] = None
instance_type: Optional[str] = None
region: Optional[str] = None
vendor: Optional[str] = None
This is a nitpick, but as these are now all optional, there is no way for the user to really know what to pass in at the step level. Do you think it makes sense to rework this "BaseConfig" idea and just make separate configs for the deployer and the service, so we can mark fields as optional or required as appropriate?
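To make the suggestion concrete, here is a minimal sketch of what splitting the single config might look like. The class and field names (`HFDeployerConfig`, `HFServiceConfig`) are hypothetical illustrations, not the actual ZenML classes; the idea is simply that infrastructure-level fields become required on the deployer config while per-deployment fields stay optional on the service config.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical split of the single "BaseConfig": fields every deployment
# needs become required on the deployer config, while per-endpoint
# overrides stay optional on the service config.


@dataclass
class HFDeployerConfig:
    # Required when the deployer is registered, so the user knows
    # these must be supplied up front.
    region: str
    vendor: str
    instance_size: str
    instance_type: str


@dataclass
class HFServiceConfig:
    # Optional per-deployment settings, resolved at step time.
    endpoint_name: Optional[str] = None
    repository: Optional[str] = None
    framework: Optional[str] = None
    accelerator: Optional[str] = None


deployer_cfg = HFDeployerConfig(
    region="us-east-1",
    vendor="aws",
    instance_size="small",
    instance_type="c6i",
)
service_cfg = HFServiceConfig(repository="gpt2")
```

With this split, forgetting a required field fails loudly at construction time instead of surfacing as a confusing `None` deep inside the deployment step.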
logger = get_logger(__name__)

POLLING_TIMEOUT = 1200
should this be a parameter somewhere for the user to control?
This is tricky. We want some way to pass `timeout` to the `provision` method, where `POLLING_TIMEOUT` is being used. `service.start(timeout=timeout)` calls the `provision` method underneath but does not provide a way to pass `timeout` along. We could pass the same timeout constant `POLLING_TIMEOUT` when we call the `start` function here. Right now it uses `DEFAULT_DEPLOYMENT_START_STOP_TIMEOUT` as the default value.
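One possible workaround for the plumbing problem described above is to store the timeout on the service instance so that `provision` can read it, since `start` does not forward its argument. This is a hypothetical sketch, not the actual ZenML `BaseService` implementation; the names `provision`, `start`, and `POLLING_TIMEOUT` follow the discussion, while the attribute-based plumbing is an assumed design.

```python
from typing import Optional

POLLING_TIMEOUT = 1200  # default polling timeout in seconds


class HFEndpointService:
    """Sketch of a service that threads a user timeout down to provision()."""

    def __init__(self, timeout: int = POLLING_TIMEOUT) -> None:
        # Store the timeout on the instance so provision() can see it,
        # since start(timeout=...) does not forward it as an argument.
        self.timeout = timeout
        self.provisioned_with: Optional[int] = None

    def provision(self) -> None:
        # Here the real implementation would poll the Hugging Face
        # endpoint for up to self.timeout seconds; stubbed for the sketch.
        self.provisioned_with = self.timeout

    def start(self, timeout: Optional[int] = None) -> None:
        # Mirror of service.start(timeout=...): record the override
        # before delegating to provision().
        if timeout is not None:
            self.timeout = timeout
        self.provision()


svc = HFEndpointService()
svc.start(timeout=600)
```

The trade-off is that the timeout becomes mutable instance state rather than a call argument, which is slightly less explicit but avoids changing the `provision` signature inherited from the base class.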
)
else:
    raise NotImplementedError(
        "Tasks other than text-generation are not implemented."
    )
How much work is it to actually implement other tasks? This list here gives so many tasks and I am wondering whether we can easily support them? https://huggingface.co/docs/inference-endpoints/supported_tasks
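Supporting more tasks would essentially mean turning the single `else` branch into a dispatch over the task type, since each task has its own payload and response shape. This is a hypothetical sketch of that structure, not the PR's code; `SUPPORTED_TASKS` and `predict` are illustrative names, and the endpoint call is stubbed.

```python
# Hypothetical dispatch over inference-endpoint tasks: only
# "text-generation" is handled today; each additional task from the
# supported-tasks list would need its own request/response handling.
SUPPORTED_TASKS = {"text-generation"}


def predict(task: str, payload: str) -> str:
    if task not in SUPPORTED_TASKS:
        # Mirrors the NotImplementedError raised in the PR for
        # unsupported tasks.
        raise NotImplementedError(
            f"Tasks other than text-generation are not implemented "
            f"(got {task!r})."
        )
    # Stubbed call to the text-generation endpoint; the real code would
    # POST the payload to the Hugging Face inference endpoint here.
    return f"generated: {payload}"
```

Adding a task then becomes: add it to the set, and add a branch (or handler function) that formats its payload and parses its response, so the per-task effort is mostly in knowing each task's request schema.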
In this PR, we implement a custom model deployer that uses huggingface inference endpoint.