TritonModelAnalyzerException: [StatusCode.UNAVAILABLE] failed to connect to all addresses; Failed to connect to remote host: Timeout occurred: FD Shutdown
#950
Open
C-Dhayananthan opened this issue
Nov 26, 2024
· 0 comments
I am using this command to run model-analyzer inside the container: model-analyzer profile -f sweep.yaml
ISSUE
[Model Analyzer] Initializing GPUDevice handles
[Model Analyzer] Using GPU 0 NVIDIA RTX A4000 with UUID GPU-3fca6544-2e5c-de67-d283-a37b68e716bb
[Model Analyzer] Using GPU 1 NVIDIA RTX A4000 with UUID GPU-6dced96e-d063-1bf2-dcb8-f5d94e67f6a9
Traceback (most recent call last):
File "/workspace/model_analyzer/entrypoint.py", line 198, in create_output_model_repository
os.mkdir(config.output_model_repository_path)
FileExistsError: [Errno 17] File exists: '/workspace/output_model_repository'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/usr/local/bin/model-analyzer", line 8, in <module>
sys.exit(main())
File "/workspace/model_analyzer/entrypoint.py", line 266, in main
create_output_model_repository(config)
File "/workspace/model_analyzer/entrypoint.py", line 201, in create_output_model_repository
raise TritonModelAnalyzerException(
model_analyzer.model_analyzer_exceptions.TritonModelAnalyzerException: Path "/workspace/output_model_repository" already exists. Please set or modify "--output-model-repository-path" flag or remove this directory. You can also allow overriding of the output directory using the "--override-output-model-repository" flag.
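The FileExistsError above comes from a plain os.mkdir call in entrypoint.py; as the message notes, either removing the stale directory (done below) or passing --override-output-model-repository resolves it. A minimal, self-contained sketch of why the second run fails (using a temporary directory in place of /workspace/output_model_repository):

```python
import os
import tempfile

# Stand-in for /workspace/output_model_repository.
repo = os.path.join(tempfile.mkdtemp(), "output_model_repository")

os.mkdir(repo)  # first run: directory is created successfully
try:
    os.mkdir(repo)  # second run: the directory is left over, so mkdir raises
except FileExistsError as e:
    print(f"Errno {e.errno}: directory already exists")
```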
root@test-MS-7D70:/workspace# rm -rf /workspace/output_model_repository
root@test-MS-7D70:/workspace# model-analyzer profile -m examples/quick-start --profile-models add_sub
[Model Analyzer] Initializing GPUDevice handles
[Model Analyzer] Using GPU 0 NVIDIA RTX A4000 with UUID GPU-3fca6544-2e5c-de67-d283-a37b68e716bb
[Model Analyzer] Using GPU 1 NVIDIA RTX A4000 with UUID GPU-6dced96e-d063-1bf2-dcb8-f5d94e67f6a9
[Model Analyzer] Starting a local Triton Server
[Model Analyzer] Loaded checkpoint from file /workspace/checkpoints/6.ckpt
[Model Analyzer] GPU devices match checkpoint - skipping server metric acquisition
[Model Analyzer]
[Model Analyzer] Starting automatic brute search
[Model Analyzer]
[Model Analyzer] Creating model config: add_sub_config_default
[Model Analyzer]
[Model Analyzer] Saved checkpoint to /workspace/checkpoints/7.ckpt
Traceback (most recent call last):
File "/workspace/model_analyzer/triton/client/client.py", line 60, in wait_for_server_ready
if self._client.is_server_ready():
File "/usr/local/lib/python3.10/dist-packages/tritonclient/grpc/_client.py", line 344, in is_server_ready
raise_error_grpc(rpc_error)
File "/usr/local/lib/python3.10/dist-packages/tritonclient/grpc/_utils.py", line 77, in raise_error_grpc
raise get_error_grpc(rpc_error) from None
tritonclient.utils.InferenceServerException: [StatusCode.UNAVAILABLE] failed to connect to all addresses; last error: UNKNOWN: ipv6:%5B::1%5D:8001: Failed to connect to remote host: Timeout occurred: FD Shutdown
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/usr/local/bin/model-analyzer", line 8, in <module>
sys.exit(main())
File "/workspace/model_analyzer/entrypoint.py", line 278, in main
analyzer.profile(
File "/workspace/model_analyzer/analyzer.py", line 131, in profile
self._profile_models()
File "/workspace/model_analyzer/analyzer.py", line 251, in _profile_models
self._model_manager.run_models(models=[model])
File "/workspace/model_analyzer/model_manager.py", line 154, in run_models
measurement = self._metrics_manager.execute_run_config(run_config)
File "/workspace/model_analyzer/record/metrics_manager.py", line 238, in execute_run_config
if not self._load_model_variants(run_config):
File "/workspace/model_analyzer/record/metrics_manager.py", line 452, in _load_model_variants
if not self._load_model_variant(variant_config=mrc.model_config_variant()):
File "/workspace/model_analyzer/record/metrics_manager.py", line 467, in _load_model_variant
retval = self._do_load_model_variant(variant_config)
File "/workspace/model_analyzer/record/metrics_manager.py", line 474, in _do_load_model_variant
self._client.wait_for_server_ready(
File "/workspace/model_analyzer/triton/client/client.py", line 72, in wait_for_server_ready
raise TritonModelAnalyzerException(e)
model_analyzer.model_analyzer_exceptions.TritonModelAnalyzerException: [StatusCode.UNAVAILABLE] failed to connect to all addresses; last error: UNKNOWN: ipv6:%5B::1%5D:8001: Failed to connect to remote host: Timeout occurred: FD Shutdown
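Incidentally, the target address in the gRPC error is percent-encoded. Decoding it with the standard library shows the client was trying to reach Triton's gRPC endpoint on the IPv6 loopback address, port 8001:

```python
from urllib.parse import unquote

# Target as reported in the UNAVAILABLE error above.
target = "ipv6:%5B::1%5D:8001"

# %5B / %5D decode to '[' / ']': the IPv6 loopback [::1] on gRPC port 8001.
print(unquote(target))  # ipv6:[::1]:8001
```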
I need help to fix this issue.
I was trying to run model analyzer with triton launch mode set to local (the default).
The command below runs the container (model-analyzer is the image name):
docker run -it --rm --gpus all -v $(pwd):/workspace --net=host model-analyzer
sweep.yaml is given below.