# Predict on an InferenceService using Triton Inference Server

## Setup

1. Your `~/.kube/config` should point to a cluster with KFServing installed.
2. Your cluster's Istio ingress gateway must be network accessible.

## Create the InferenceService

Apply the CRD:

```bash
kubectl apply -f triton.yaml
```

Expected output:

```
inferenceservice.serving.kubeflow.org/triton-simple-string created
```
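The `triton.yaml` manifest itself is not shown here. As a rough sketch only, a v1alpha2 `InferenceService` with a Triton predictor looks something like the following; the API group/version is inferred from the resource name printed above, and the `storageUri` is a placeholder, not this sample's actual model location:

```bash
# Hypothetical sketch of triton.yaml (assumed v1alpha2 schema); replace the
# storageUri placeholder with a real Triton model repository before applying.
cat <<EOF > triton.yaml
apiVersion: serving.kubeflow.org/v1alpha2
kind: InferenceService
metadata:
  name: triton-simple-string
spec:
  default:
    predictor:
      triton:
        storageUri: gs://your-bucket/path/to/triton-model-repository
EOF
```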

## Run a prediction

This example uses the client described at https://docs.nvidia.com/deeplearning/triton-inference-server/user-guide/docs/client_example.html.

1. Set up variables. First, determine the ingress IP and ports and set `INGRESS_HOST` and `INGRESS_PORT` (a sketch follows below), then capture the service hostname:

```bash
SERVICE_HOSTNAME=$(kubectl get inferenceservice triton-simple-string -o jsonpath='{.status.url}' | cut -d "/" -f 3)
```
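The original does not show the ingress lookup commands; the following is a minimal sketch, assuming the standard `istio-ingressgateway` service of type LoadBalancer in the `istio-system` namespace:

```bash
# Hypothetical lookup, assuming a LoadBalancer-type istio-ingressgateway
# service in the istio-system namespace exposes an http2 port.
INGRESS_HOST=$(kubectl -n istio-system get service istio-ingressgateway \
  -o jsonpath='{.status.loadBalancer.ingress[0].ip}')
INGRESS_PORT=$(kubectl -n istio-system get service istio-ingressgateway \
  -o jsonpath='{.spec.ports[?(@.name=="http2")].port}')
```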
2. Check the server status:

```bash
curl -H "Host: ${SERVICE_HOSTNAME}" http://${INGRESS_HOST}:${INGRESS_PORT}/api/status
```
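If the status call fails, one way (not in the original steps) to confirm the `InferenceService` itself is ready is:

```bash
# Check that the InferenceService reports READY=True before curling it.
kubectl get inferenceservice triton-simple-string
```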
3. Edit `/etc/hosts` to map the cluster ingress IP to `triton-simple-string.default.example.com`, for example as sketched below.
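A sketch of one way to add that mapping, assuming `INGRESS_HOST` from step 1 holds the cluster's ingress IP:

```bash
# Append a hosts entry so the client can resolve the virtual host
# (assumes $INGRESS_HOST holds the cluster ingress IP from step 1).
echo "$INGRESS_HOST triton-simple-string.default.example.com" | sudo tee -a /etc/hosts
```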
4. Run the client SDK container:

```bash
docker run -e SERVICE_HOSTNAME=$SERVICE_HOSTNAME -it --rm --net=host nvcr.io/nvidia/tritonserver:20.03-py3-clientsdk
```

Then, from inside the container, run the client:

```bash
./build/simple_string_client -u $SERVICE_HOSTNAME
```

You should see output like:

```
root@trantor:/workspace# ./build/simple_string_client -u triton-simple-string.default.example.com
0 + 1 = 1
0 - 1 = -1
1 + 1 = 2
1 - 1 = 0
2 + 1 = 3
2 - 1 = 1
3 + 1 = 4
3 - 1 = 2
4 + 1 = 5
4 - 1 = 3
5 + 1 = 6
5 - 1 = 4
6 + 1 = 7
6 - 1 = 5
7 + 1 = 8
7 - 1 = 6
8 + 1 = 9
8 - 1 = 7
9 + 1 = 10
9 - 1 = 8
```