PyTorch Native Tensor Parallel for HuggingFace models inference

HamidShojanazeri/HF-TP-inference

HF-TP-inference

PyTorch Native Tensor Parallel for HuggingFace models inference

Run inference with the tensor-parallelized (TP) model + compile

Example with the HF Llama 2 7B chat model:

torchrun --nnodes 1 --nproc_per_node 2 llama-simple.py --model_name meta-llama/Llama-2-7b-chat-hf --compile
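
As background for the command above, the following is a minimal single-process sketch of the idea behind tensor parallelism for an MLP block. It is not this repository's code and does not use PyTorch's tensor-parallel API. It assumes a common sharding scheme (first weight split by columns, second weight split by rows, partial outputs summed as an all-reduce would do), and all names in it (`matmul`, `split_cols`, `split_rows`, `world_size`) are hypothetical helpers chosen for illustration.

```python
# Single-process simulation of 2-way tensor parallelism for a tiny MLP.
# Pure Python: no GPUs, no torch.distributed. Illustration only.

def matmul(a, b):
    """Multiply matrix a (n x k) by matrix b (k x m); both are lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def relu(m):
    """Elementwise ReLU."""
    return [[max(0, v) for v in row] for row in m]

def add(a, b):
    """Elementwise sum of two equally shaped matrices."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def split_cols(m, parts):
    """Split a matrix into `parts` equal column blocks (column-wise sharding)."""
    width = len(m[0]) // parts
    return [[row[i * width:(i + 1) * width] for row in m] for i in range(parts)]

def split_rows(m, parts):
    """Split a matrix into `parts` equal row blocks (row-wise sharding)."""
    height = len(m) // parts
    return [m[i * height:(i + 1) * height] for i in range(parts)]

x = [[1, -2, 3], [0, 4, -1]]                           # batch 2, hidden size 3
w1 = [[1, 0, -1, 2], [2, 1, 0, -3], [0, -1, 1, 1]]     # 3 x 4 up-projection
w2 = [[1, 2, 0], [0, -1, 1], [3, 0, -2], [-1, 1, 1]]   # 4 x 3 down-projection

# Unsharded reference: y = relu(x @ w1) @ w2
reference = matmul(relu(matmul(x, w1)), w2)

# Each simulated rank holds one column block of w1 and one row block of w2.
world_size = 2  # mirrors --nproc_per_node 2 in the command above
w1_shards = split_cols(w1, world_size)
w2_shards = split_rows(w2, world_size)

# Every rank computes a partial output from its own shards only.
partials = [
    matmul(relu(matmul(x, w1_shards[rank])), w2_shards[rank])
    for rank in range(world_size)
]

# Summing the partial outputs plays the role of the all-reduce.
tp_output = partials[0]
for partial in partials[1:]:
    tp_output = add(tp_output, partial)

assert tp_output == reference
```

Because the ReLU is elementwise, each rank can apply it to its own column block without communication; only the final sum requires combining results across ranks.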

Convert HF checkpoints to DTensor checkpoints

torchrun --nnodes 1 --nproc_per_node 2 hf_convertor.py --model_name meta-llama/Llama-2-7b-chat-hf --save_checkpoint_dir hf-dtensor-checkpoints

Run inference with a deferred-init (meta device), tensor-parallelized (TP) model + compile, loading the converted DTensor checkpoints

torchrun --nnodes 1 --nproc_per_node 2 llama-simple.py --model_name meta-llama/Llama-2-7b-chat-hf --checkpoint_dir hf-dtensor-checkpoints --compile --meta_device
