System Info / 系統信息

transformers==4.42.3

Who can help? / 谁可以帮助到您?

No response
Reproduction / 复现过程

Start multiple glm4 instances at the same time:

```shell
# assumes MODEL is exported beforehand
CUDA_LIST=(0 1 2 3 4 5 6 7)
PORTS=(18008 18009 18010 18011 18012 18013 18014 18015)
CUDA_LIST_LENGTH=${#CUDA_LIST[@]}

for i in $(seq 0 $((CUDA_LIST_LENGTH - 1))); do
    export PORT=${PORTS[i]}
    echo "Start vllm on ${PORT}"
    export CUDA_VISIBLE_DEVICES=${CUDA_LIST[i]}
    nohup python -m vllm.entrypoints.openai.api_server \
        --model ${MODEL} \
        --port ${PORT} \
        --trust-remote-code \
        --gpu-memory-utilization 0.9 \
        --max-model-len 8192 &
done
```
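Both failure modes below point at files under the Hugging Face dynamic-module cache (`transformers_modules`), which every instance writes on its first load when `--trust-remote-code` is used. A possible mitigation (an assumption, not a confirmed fix) is to give each instance its own module cache via the `HF_MODULES_CACHE` environment variable that transformers reads; the `/tmp/hf_modules_*` paths are placeholders:

```shell
# Sketch: per-instance dynamic-module cache, so concurrent first-time loads
# of the remote ChatGLM code cannot clobber each other's files.
for i in 0 1 2 3 4 5 6 7; do
    export HF_MODULES_CACHE="/tmp/hf_modules_${i}"  # placeholder location
    mkdir -p "${HF_MODULES_CACHE}"
    # ...launch the vllm api_server for GPU ${i} here, as in the loop above;
    # each backgrounded process inherits its own HF_MODULES_CACHE.
done
```

Each process launched inside the loop inherits the value of `HF_MODULES_CACHE` current at launch time, so the instances no longer share a cache directory.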
With some probability, some of the instances fail to start. The possible errors observed so far are:

Cause 1:

```
module 'transformers_modules.configuration_chatglm' has no attribute 'ChatGLMConfig'
```
Cause 2:
```
[2024-09-04 17:27:43]   File "OpenRLHF/examples/scripts/../../openrlhf/cli/train_sft.py", line 219, in <module>
[2024-09-04 17:27:43]     train(args)
[2024-09-04 17:27:43]   File "OpenRLHF/examples/scripts/../../openrlhf/cli/train_sft.py", line 34, in train
[2024-09-04 17:27:43]     tokenizer = get_tokenizer(args.pretrain, model.model, "right", strategy, use_fast=not args.disable_fast_tokenizer)
[2024-09-04 17:27:43]   File "OpenRLHF/openrlhf/utils/utils.py", line 16, in get_tokenizer
[2024-09-04 17:27:43]     tokenizer = AutoTokenizer.from_pretrained(pretrain, trust_remote_code=True, use_fast=use_fast)
[2024-09-04 17:27:43]   File "/usr/local/lib/python3.10/dist-packages/transformers/models/auto/tokenization_auto.py", line 870, in from_pretrained
[2024-09-04 17:27:43]     tokenizer_class = get_class_from_dynamic_module(class_ref, pretrained_model_name_or_path, **kwargs)
[2024-09-04 17:27:43]   File "/usr/local/lib/python3.10/dist-packages/transformers/dynamic_module_utils.py", line 514, in get_class_from_dynamic_module
[2024-09-04 17:27:43]     return get_class_in_module(class_name, final_module)
[2024-09-04 17:27:43]   File "/usr/local/lib/python3.10/dist-packages/transformers/dynamic_module_utils.py", line 212, in get_class_in_module
[2024-09-04 17:27:43]     module_spec.loader.exec_module(module)
[2024-09-04 17:27:43]   File "<frozen importlib._bootstrap_external>", line 879, in exec_module
[2024-09-04 17:27:43]   File "<frozen importlib._bootstrap_external>", line 1017, in get_code
[2024-09-04 17:27:43]   File "<frozen importlib._bootstrap_external>", line 947, in source_to_code
[2024-09-04 17:27:43]   File "<frozen importlib._bootstrap>", line 241, in _call_with_frames_removed
[2024-09-04 17:27:43]   File "/root/.cache/huggingface/modules/transformers_modules/tokenization_chatglm.py", line 240
[2024-09-04 17:27:43]     """
[2024-09-04 17:27:43]       ^
[2024-09-04 17:27:43] SyntaxError: unterminated triple-quoted string literal (detected at line 254)
```
This error occurs intermittently, and only when fine-tuning GLM; it has not been observed with any other model family so far.
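Both symptoms (a truncated `tokenization_chatglm.py` with an unterminated string, and a `configuration_chatglm.py` missing its class) are consistent with several processes executing the remote model code for the first time at once. Under that assumption, another workaround sketch is to serialize the first load across processes with an inter-process lock, e.g. `flock(1)` from util-linux; the lock-file path is a placeholder, and the real warm-up command is shown only as a comment:

```shell
# Sketch: hold an exclusive lock while populating the shared dynamic-module
# cache, so only one process at a time runs the first trust-remote-code load.
LOCK=/tmp/hf_modules.lock
(
    flock -x 9                            # block until fd 9's lock is ours
    echo "holder $$ populating cache"     # stand-in for the warm-up step
    # python -c "from transformers import AutoTokenizer; \
    #     AutoTokenizer.from_pretrained('${MODEL}', trust_remote_code=True)"
) 9>"${LOCK}"
```

Once one process has finished the warm-up, the cached module files are complete, and the remaining instances only read them.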
Expected behavior / 期待表现

See above.