diff --git a/README.md b/README.md
index 0c0a02f..aca10ac 100644
--- a/README.md
+++ b/README.md
@@ -27,7 +27,7 @@ Thank you.
 $ pip install -r requirements.txt
 ```
 ### 2. Install required environment for inference using Triton server
-Check [./README_Triton.md](./README_Triton.md) for details. Install tools/packages included:
+Check [./README_ENV.md](./README_ENV.md) for details. Install tools/packages included:
 - TensorRT
 - Docker
 - nvidia-docker
@@ -48,7 +48,7 @@ IC15 | SynthText, IC15 | Eng | For IC15 only | [Click](https://drive.google.com/
 LinkRefiner | CTW1500 | - | Used with the General Model | [Click](https://drive.google.com/open?id=1XSaFwBkOaFOdtk4Ane3DFyJGPRw6v5bO)
 ### 5. Model preparation before run Triton server:
-a. Triton Inference Server inference: see details at [./README_Triton.md](./README_Triton.md)
+a. Triton Inference Server inference: see details at [./README_ENV.md](./README_ENV.md)
 Initially, you need to run a (.sh) script to prepare Model Repo, then, you just need to run Docker image when inferencing. Script get things ready for Triton server, steps covered:
 - Convert downloaded pretrain into mutiple formats
 - Locate converted model formats into Triton's Model Repository
@@ -66,7 +66,7 @@ $ curl -v localhost:8000/v2/health/ready
 Now everythings ready, start inference by:
 - Run docker image of Triton server (replace mount -v path to your full path to model_repository):
 ```
-$ sudo docker run --gpus all --rm -p8000:8000 -p8001:8001 -p8002:8002 -v /home/maverick911/repo/triton-server-CRAFT-pytorch/model_repository:/models nvcr.io/nvidia/tritonserver:21.05-py3 tritonserver --model-repository=/models
+$ sudo docker run --gpus all --rm -p8000:8000 -p8001:8001 -p8002:8002 -v /home/maverick911/repo/Triton-TensorRT-Inference-CRAFT-pytorch/model_repository:/models nvcr.io/nvidia/tritonserver:21.05-py3 tritonserver --model-repository=/models
 ...
 +------------+---------+--------+
 | Model      | Version | Status |
diff --git a/README_Triton.md b/README_ENV.md
similarity index 97%
rename from README_Triton.md
rename to README_ENV.md
index ae93658..2364856 100644
--- a/README_Triton.md
+++ b/README_ENV.md
@@ -216,9 +216,9 @@ $ curl https://get.docker.com | sh \
 Pull repo, image, and prepare models (Where <xx.yy> is the version of Triton that you want to use):
 ```
 $ sudo docker pull nvcr.io/nvidia/tritonserver:<xx.yy>-py3
-$ git clone https://github.com/huukim911/triton-server-CRAFT-pytorch.git
+$ git clone https://github.com/huukim911/Triton-TensorRT-Inference-CRAFT-pytorch.git
 Run the .sh script to convert model into target formats, prepare Model Repo and start Triton server container:
-$ cd triton-server-CRAFT-pytorch
+$ cd Triton-TensorRT-Inference-CRAFT-pytorch
 $ sh prepare.sh
 Convert source model into target formats and copy into Triton's Model Repository successfully.
 ```
@@ -227,9 +227,9 @@ Run server in container and client in cmd
 ```
 $ sudo docker run --gpus all --rm -p8000:8000 -p8001:8001 -p8002:8002 -v <full_path_to_model_repository>:/models nvcr.io/nvidia/tritonserver:<xx.yy>-py3 tritonserver --model-repository=/models
-For example, run on server with full path "/home/maverick911/repo/triton-server-CRAFT-pytorch
+For example, run on server with full path "/home/maverick911/repo/Triton-TensorRT-Inference-CRAFT-pytorch
 /model_repository":
-$ sudo docker run --gpus all --rm -p8000:8000 -p8001:8001 -p8002:8002 -v /home/maverick911/repo/triton-server-CRAFT-pytorch
+$ sudo docker run --gpus all --rm -p8000:8000 -p8001:8001 -p8002:8002 -v /home/maverick911/repo/Triton-TensorRT-Inference-CRAFT-pytorch
 /model_repository:/models nvcr.io/nvidia/tritonserver:21.05-py3 tritonserver --model-repository=/models
 +----------------------+---------+--------+
@@ -244,7 +244,7 @@ I0611 04:10:23.080860 1 http_server.cc:2906] Started Metrics Service at 0.0.0.0:8002
 ```
 2. Infer by client in cmd (this repo), with method (triton), model name (detec_<format>), version (not required). For ex:
 ```
-$ cd triton-server-CRAFT-pytorch/
+$ cd Triton-TensorRT-Inference-CRAFT-pytorch/
 $ python infer_triton.py -m='detec_trt' -x=1 --test_folder='./images'
 Request 1, batch size 1
 ./images/sample.jpg
 elapsed time : 0.9521937370300293s
@@ -258,7 +258,7 @@ elapsed time : 1.244419813156128s
 Run server in container and client sdk in container:
 1. Start the server side:
 ```
-$ sudo docker run --gpus all --rm -p8000:8000 -p8001:8001 -p8002:8002 -v /home/maverick911/repo/triton-server-CRAFT-pytorch/model_repository:/models nvcr.io/nvidia/tritonserver:21.05-py3 tritonserver --model-repository=/models
+$ sudo docker run --gpus all --rm -p8000:8000 -p8001:8001 -p8002:8002 -v /home/maverick911/repo/Triton-TensorRT-Inference-CRAFT-pytorch/model_repository:/models nvcr.io/nvidia/tritonserver:21.05-py3 tritonserver --model-repository=/models
 +----------------------+---------+--------+
 | Model                | Version | Status |
diff --git a/prepare.sh b/prepare.sh
index 80bf4a6..071e76a 100644
--- a/prepare.sh
+++ b/prepare.sh
@@ -24,4 +24,4 @@ if [ ${failed} -ne 0 ]; then
 # III. Start Triton server image in container, mount Model Repo prepared into container volume
 # Update the full path to data/model_repository follow deploy server path: "-v /model_repository:/models"
-sudo docker run --gpus all --rm -p8000:8000 -p8001:8001 -p8002:8002 -v /home/maverick911/repo/triton-server-CRAFT-pytorch/model_repository:/models nvcr.io/nvidia/tritonserver:21.05-py3 tritonserver --model-repository=/models
\ No newline at end of file
+sudo docker run --gpus all --rm -p8000:8000 -p8001:8001 -p8002:8002 -v /home/maverick911/repo/Triton-TensorRT-Inference-CRAFT-pytorch/model_repository:/models nvcr.io/nvidia/tritonserver:21.05-py3 tritonserver --model-repository=/models
\ No newline at end of file
diff --git a/test.py b/test.py
index e6b24c1..48779d3 100644
--- a/test.py
+++ b/test.py
@@ -176,4 +176,4 @@ def test_net(net, image, text_threshold, link_threshold, low_text, cuda, poly, r
 print("Done, elapsed time : {}s. Check at folder result/".format(time.time() - t))
 # Example cmd:
-# python test.py --weight /home/maverick911/repo/triton-server-CRAFT-pytorch/weight/craft_mlt_25k.pth
\ No newline at end of file
+# python test.py --weight /home/maverick911/repo/Triton-TensorRT-Inference-CRAFT-pytorch/weight/craft_mlt_25k.pth
\ No newline at end of file
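
Aside (editor's sketch, not part of the diff): the readiness check done above with `curl -v localhost:8000/v2/health/ready` can also be scripted with the official `tritonclient` Python package (`pip install tritonclient[http]`). The model name `detec_trt` is taken from the client example in the diff; the URL assumes the `-p8000:8000` mapping used in the `docker run` commands — everything else here is an illustration, not the repo's API:

```python
# Minimal sketch, assuming `pip install tritonclient[http]` and the
# -p8000:8000 HTTP port mapping from the docker run commands above.
import tritonclient.http as httpclient

client = httpclient.InferenceServerClient(url="localhost:8000")

# Same check as `curl -v localhost:8000/v2/health/ready`, plus a per-model
# readiness check for the converted detection model in the Model Repository.
print("server ready:", client.is_server_ready())
print("detec_trt ready:", client.is_model_ready("detec_trt"))
```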