multiGPU infer hanging. #2523

yufang67 · 2024-12-02T20:19:26Z

I am using version 0.15.0.dev. The model is initialized with:
llm = tensorrt_llm.LLM(path, tokenizer, dtype="float16", build_config=build_config, tensor_parallel_size=2, pipeline_parallel_size=1)
Inference then runs on a 4xA100 node via llm.generate(input, sampling_params).
It processes all samples and reaches the end of the main program, but the process hangs there instead of exiting.
Did I miss any config?

Thanks
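
A minimal sketch of the setup described above, written as a standalone script so the hang can be reproduced and narrowed down. The model path and prompt are hypothetical placeholders, and the tokenizer and build_config arguments from the post are omitted because their contents are not given. The `if __name__ == "__main__":` guard is an assumption worth checking, not a confirmed fix: multi-GPU runs start extra worker processes, and an unguarded entry point is one possible cause of trouble at startup or exit.

```python
import importlib.util

# Hypothetical placeholders: the post does not give the model path or prompts.
MODEL_PATH = "/path/to/model"
PROMPTS = ["Hello, my name is"]


def llm_kwargs():
    """Keyword arguments matching the constructor call in the post."""
    return {
        "dtype": "float16",
        "tensor_parallel_size": 2,
        "pipeline_parallel_size": 1,
    }


def main():
    # Imported lazily so this file can be loaded without TensorRT-LLM installed.
    from tensorrt_llm import LLM, SamplingParams

    llm = LLM(MODEL_PATH, **llm_kwargs())
    # Default sampling parameters; the post's sampling_params are not shown.
    for output in llm.generate(PROMPTS, SamplingParams()):
        print(output.outputs[0].text)


# Assumption: guarding the entry point keeps worker processes from
# re-running the script body. Whether this resolves the hang is untested.
if __name__ == "__main__":
    if importlib.util.find_spec("tensorrt_llm") is None:
        print("tensorrt_llm is not installed; skipping the run")
    else:
        main()
```

If the script still hangs with the guard in place, comparing a run with tensor_parallel_size=1 against the two-GPU run would show whether the hang is specific to the multi-GPU path.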
