PyTorch-native tensor parallelism for Hugging Face model inference
Example: Hugging Face Llama 2 7B (meta-llama/Llama-2-7b-chat-hf)
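Run inference with 2 processes on a single node, with --compile:
torchrun --nnodes 1 --nproc_per_node 2 llama-simple.py --model_name meta-llama/Llama-2-7b-chat-hf --compile
Convert the Hugging Face checkpoint and save the result to hf-dtensor-checkpoints:
torchrun --nnodes 1 --nproc_per_node 2 hf_convertor.py --model_name meta-llama/Llama-2-7b-chat-hf --save_checkpoint_dir hf-dtensor-checkpoints
Run inference from the saved checkpoints in hf-dtensor-checkpoints, with --compile and --meta_device:
torchrun --nnodes 1 --nproc_per_node 2 llama-simple.py --model_name meta-llama/Llama-2-7b-chat-hf --checkpoint_dir hf-dtensor-checkpoints --compile --meta_device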
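For readers new to tensor parallelism, the following is a minimal conceptual sketch of what splitting a model across 2 processes means. It is not taken from llama-simple.py or hf_convertor.py. It simulates two ranks in a single process using plain Python lists, and every name in it (matmul, split_columns, split_rows, WORLD_SIZE) is a hypothetical helper chosen for illustration. It assumes the common pattern in which the first weight matrix of a two-layer block is sharded by columns and the second by rows, so that each rank computes a partial result and the partial results are summed, which stands in for an all-reduce.

```python
# Conceptual sketch only: simulates 2-way tensor parallelism in one process.
# Assumption: column-wise sharding of the first weight, row-wise of the second.
# None of these helpers come from the scripts above.

def matmul(a, b):
    """Multiply matrix a (m x k) by matrix b (k x n); both are lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def split_columns(w, parts):
    """Split matrix w into `parts` equal blocks of columns."""
    step = len(w[0]) // parts
    return [[row[i * step:(i + 1) * step] for row in w] for i in range(parts)]

def split_rows(w, parts):
    """Split matrix w into `parts` equal blocks of rows."""
    step = len(w) // parts
    return [w[i * step:(i + 1) * step] for i in range(parts)]

def add(a, b):
    """Element-wise sum of two matrices of the same shape."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

WORLD_SIZE = 2  # mirrors --nproc_per_node 2 in the commands above

x = [[1.0, 2.0]]                      # input activations, 1 x 2
w1 = [[1.0, 0.0, 2.0, 1.0],
      [0.0, 1.0, 1.0, 3.0]]           # first weight, 2 x 4
w2 = [[1.0, 0.0],
      [0.0, 1.0],
      [1.0, 1.0],
      [2.0, 0.0]]                     # second weight, 4 x 2

# Unsharded reference: x @ w1 @ w2 computed on a single "device".
reference = matmul(matmul(x, w1), w2)

# Each simulated rank holds one column block of w1 and one row block of w2.
w1_shards = split_columns(w1, WORLD_SIZE)
w2_shards = split_rows(w2, WORLD_SIZE)
partials = [
    matmul(matmul(x, w1_shards[rank]), w2_shards[rank])
    for rank in range(WORLD_SIZE)
]

# Summing the per-rank partial outputs plays the role of an all-reduce.
tp_output = partials[0]
for partial in partials[1:]:
    tp_output = add(tp_output, partial)

print(reference)
print(tp_output)
```

Both printed matrices are identical, which is the property that lets each rank store only a fraction of the weights. In a real run the ranks are separate processes launched by torchrun, and the sum is a collective communication step instead of a Python loop.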