sh hf_to_megatron.sh
Namespace(megatron_path='/workspace/megatron', convert_checkpoint_from_megatron_to_transformers=False, load_path='/workspace/checkpoints/LLM-Research/Meta-Llama-3-8B', save_path='/workspace/checkpoints/megatron/Meta-Llama-3-8B', print_checkpoint_structure=True, target_tensor_model_parallel_size=2, target_pipeline_model_parallel_size=1, target_data_parallel_size=2, target_params_dtype='fp16', make_vocab_size_divisible_by=1, use_distributed_optimizer=False, tokenizer_name=None, max_shard_size='10GB')
No CUDA runtime is found, using CUDA_HOME='/usr/local/cuda'
Converting
converting embedding layer
Keys in state_dict: dict_keys([])
Traceback (most recent call last):
  File "/workspace/megatron/tools/checkpoint_conversion/llama_checkpoint_conversion.py", line 855, in <module>
    main()
  File "/workspace/megatron/tools/checkpoint_conversion/llama_checkpoint_conversion.py", line 851, in main
    convert_checkpoint_from_transformers_to_megatron(args)
  File "/workspace/megatron/tools/checkpoint_conversion/llama_checkpoint_conversion.py", line 651, in convert_checkpoint_from_transformers_to_megatron
    word_embedding = state_dict["model.embed_tokens.weight"].to(dtype)
KeyError: 'model.embed_tokens.weight'
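The line `Keys in state_dict: dict_keys([])` shows the converter loaded no weight tensors at all before the KeyError. One plausible cause (an assumption, not confirmed from the script itself) is a shard-format mismatch: the downloaded Meta-Llama-3-8B checkpoint may contain only `*.safetensors` shards while the conversion script gathers `*.bin` shards. A minimal diagnostic sketch (`find_weight_shards` is a hypothetical helper, not part of the converter):

```python
import os

def find_weight_shards(ckpt_dir):
    """Report which weight-shard format a HF checkpoint directory contains.

    If the converter only reads *.bin shards, a safetensors-only download
    would produce an empty state_dict, matching the dict_keys([]) output.
    Returns a (format, files) tuple.
    """
    files = sorted(os.listdir(ckpt_dir))
    bins = [f for f in files if f.endswith(".bin")]
    safes = [f for f in files if f.endswith(".safetensors")]
    if bins:
        return "bin", bins
    if safes:
        return "safetensors", safes
    return "none", []
```

If this reports `safetensors` for the `load_path` directory, one workaround is to load the model with `transformers` and re-save it with `save_pretrained(save_dir, safe_serialization=False)` so that `*.bin` shards exist for the conversion script to pick up.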