
GMC: Add GPU support for GMC. #292

Merged: 2 commits into opea-project:main on Aug 14, 2024

Conversation

PeterYang12 (Collaborator) commented on Aug 12, 2024

Description

Enable NVIDIA GPU support for GMC, including sequence and switch modes. Note that switch mode may fail due to insufficient GPU memory.

Issues

List the issue or RFC link this PR is working on. If there is no such link, please mark it as n/a.

Type of change

List the applicable types of change below. Please delete options that are not relevant.

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds new functionality)
  • Breaking change (fix or feature that would break existing design and interface)

Dependencies

List any newly introduced third-party dependencies, if they exist.

Tests

Describe the tests that you ran to verify your changes.

```yaml
serviceName: tgi-service-llama
config:
  endpoint: /generate
  MODEL_ID: Intel/neural-chat-7b-v3-3
```
Collaborator commented:

this is supposed to be the llama model

PeterYang12 (Author) replied:

Thank you for your review. This is a workaround: with the llama model, one GPU card may fail to launch two instances due to limited GPU memory. I am OK with keeping the llama model here; what do you think?

Collaborator replied:

This is just a workaround for the current NV machine. For the example we want to provide to end users, it is better to use a meaningful example, so let's keep the llama model.

PeterYang12 (Author) replied:

> This is just a workaround for the current NV machine. For the example we want to provide to end users, it is better to use a meaningful example, so let's keep the llama model.

Fixed
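For reference, the fix presumably points the tgi-service-llama entry back at an actual llama checkpoint. A minimal sketch of what the corrected entry might look like; the specific MODEL_ID below is an assumption for illustration and is not taken from this PR:

```yaml
# Hypothetical corrected entry: the MODEL_ID now matches the service name.
# The exact checkpoint is assumed for illustration only.
serviceName: tgi-service-llama
config:
  endpoint: /generate
  MODEL_ID: meta-llama/Llama-2-7b-chat-hf
```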

````diff
@@ -39,7 +43,7 @@ kubectl create deployment client-test -n chatqa --image=python:3.8.13 -- sleep i
 **Access the pipeline using the above URL from the client pod**

 ```bash
-export CLIENT_POD=$(kubectl get pod -l app=client-test -o jsonpath={.items..metadata.name})
+export CLIENT_POD=$(kubectl get pod -n chatqa -l app=client-test -o jsonpath={.items..metadata.name})
````
Collaborator commented:

good catch!
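The namespace flag matters because the client-test pod is created in the chatqa namespace; without -n, kubectl queries the current context's default namespace and the variable silently ends up empty. A quick way to sanity-check the lookup, sketched under the assumption that the client-test deployment from the README above is running:

```bash
# Look up the client pod in the chatqa namespace; without -n chatqa this
# would search the default namespace and return an empty string.
export CLIENT_POD=$(kubectl get pod -n chatqa -l app=client-test \
  -o jsonpath='{.items..metadata.name}')
echo "client pod: ${CLIENT_POD:-<not found>}"
```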

Enable NVIDIA GPU support for GMC, including sequence and switch
modes. Note that switch mode may fail due to insufficient GPU memory.

Signed-off-by: PeterYang12 <[email protected]>
daisy-ycguo (Collaborator) left a comment:

lgtm

daisy-ycguo merged commit 119941e into opea-project:main on Aug 14, 2024
7 checks passed
PeterYang12 (Author) commented:

Thank you, Iris and Daisy. :)
