I installed the environment exactly as described in INSTALLING.md. When I ran the test command from TRAINING.md, the following error occurred:
horovodrun -np 2 -H 192.168.31.6:2 --verbose python examples/torch/pytorch_mnist.py
output:
Filtering local host names.
Remote host found:
All hosts are local, finding the interfaces with address 127.0.0.1
Local interface found lo
mpirun --allow-run-as-root --tag-output -np 2 -H 192.168.31.6:2 -bind-to none -map-by slot -mca btl_tcp_if_include lo -x NCCL_SOCKET_IFNAME=lo -x ADDR2LINE -x AR -x AS -x BROWSER -x CC -x CFLAGS -x CMAKE_PREFIX_PATH -x COLORTERM -x CONDA_BACKUP_ADDR2LINE -x CONDA_BACKUP_AR -x CONDA_BACKUP_AS -x CONDA_BACKUP_CC -x CONDA_BACKUP_CFLAGS -x CONDA_BACKUP_CMAKE_PREFIX_PATH -x CONDA_BACKUP_CONDA_BUILD_SYSROOT -x CONDA_BACKUP_CPP -x CONDA_BACKUP_CPPFLAGS -x CONDA_BACKUP_CXX -x CONDA_BACKUP_CXXFILT -x CONDA_BACKUP_CXXFLAGS -x CONDA_BACKUP_DEBUG_CFLAGS -x CONDA_BACKUP_DEBUG_CPPFLAGS -x CONDA_BACKUP_DEBUG_CXXFLAGS -x CONDA_BACKUP_ELFEDIT -x CONDA_BACKUP_GCC -x CONDA_BACKUP_GCC_AR -x CONDA_BACKUP_GCC_NM -x CONDA_BACKUP_GCC_RANLIB -x CONDA_BACKUP_GPROF -x CONDA_BACKUP_GXX -x CONDA_BACKUP_HOST -x CONDA_BACKUP_LD -x CONDA_BACKUP_LDFLAGS -x CONDA_BACKUP_LD_GOLD -x CONDA_BACKUP_NM -x CONDA_BACKUP_OBJCOPY -x CONDA_BACKUP_OBJDUMP -x CONDA_BACKUP_RANLIB -x CONDA_BACKUP_READELF -x CONDA_BACKUP_SIZE -x CONDA_BACKUP_STRINGS -x CONDA_BACKUP_STRIP -x CONDA_BACKUP__CONDA_PYTHON_SYSCONFIGDATA_NAME -x CONDA_BUILD_SYSROOT -x CONDA_CUPY_CUDA_PATH -x CONDA_DEFAULT_ENV -x CONDA_EXE -x CONDA_PREFIX -x CONDA_PREFIX_1 -x CONDA_PREFIX_2 -x CONDA_PREFIX_3 -x CONDA_PREFIX_4 -x CONDA_PREFIX_5 -x CONDA_PREFIX_6 -x CONDA_PREFIX_7 -x CONDA_PROMPT_MODIFIER -x CONDA_PYTHON_EXE -x CONDA_SHLVL -x CPP -x CPPFLAGS -x CUDA_PATH -x CXX -x CXXFILT -x CXXFLAGS -x DBUS_SESSION_BUS_ADDRESS -x DEBUG_CFLAGS -x DEBUG_CPPFLAGS -x DEBUG_CXXFLAGS -x ELFEDIT -x GCC -x GCC_AR -x GCC_NM -x GCC_RANLIB -x GIT_ASKPASS -x GPROF -x GXX -x HOME -x HOROVOD_CCL_BGT_AFFINITY -x HOROVOD_GLOO_TIMEOUT_SECONDS -x HOROVOD_NUM_NCCL_STREAMS -x HOROVOD_STALL_CHECK_TIME_SECONDS -x HOROVOD_STALL_SHUTDOWN_TIME_SECONDS -x HOST -x LANG -x LANGUAGE -x LD -x LDFLAGS -x LD_GOLD -x LESSCLOSE -x LESSOPEN -x LOGNAME -x LS_COLORS -x MOTD_SHOWN -x NCCL_SOCKET_IFNAME -x NM -x OBJCOPY -x OBJDUMP -x PATH -x PWD -x RANLIB -x READELF -x SHELL -x SHLVL -x SIZE -x 
SSH_CLIENT -x SSH_CONNECTION -x STRINGS -x STRIP -x TERM -x TERM_PROGRAM -x TERM_PROGRAM_VERSION -x USER -x VSCODE_GIT_ASKPASS_EXTRA_ARGS -x VSCODE_GIT_ASKPASS_MAIN -x VSCODE_GIT_ASKPASS_NODE -x VSCODE_GIT_IPC_HANDLE -x VSCODE_IPC_HOOK_CLI -x XDG_DATA_DIRS -x XDG_RUNTIME_DIR -x XDG_SESSION_CLASS -x XDG_SESSION_ID -x XDG_SESSION_TYPE -x _ -x _CE_CONDA -x _CE_M -x _CONDA_PYTHON_SYSCONFIGDATA_NAME python examples/torch/pytorch_mnist.py
[mpiexec@gpu-server-1] match_arg (lib/utils/args.c:166): unrecognized argument allow-run-as-root
[mpiexec@gpu-server-1] HYDU_parse_array (lib/utils/args.c:181): argument matching returned error
[mpiexec@gpu-server-1] parse_args (mpiexec/get_parameters.c:315): error parsing input array
[mpiexec@gpu-server-1] HYD_uii_mpx_get_parameters (mpiexec/get_parameters.c:47): unable to parse user arguments
[mpiexec@gpu-server-1] main (mpiexec/mpiexec.c:49): error parsing parameters
I could not find a solution to this error online. I suspect that my MPI version is too new: mpirun --version reports 4.1.1. However, I don't know how to fix this. I have tried several things, including switching to an older server with a completely different configuration, but the same error still occurs.
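One thing that may be worth checking besides the version number is which MPI implementation mpirun belongs to. The HYDU_ and HYD_ prefixes in the log above are what MPICH's Hydra launcher prints, while --allow-run-as-root is an Open MPI option, so the launcher on PATH may simply not be the Open MPI one that the generated command line expects. The snippet below is a minimal sketch of that check, not a confirmed diagnosis. The helper names `classify_mpi` and `detect_mpirun` are made up for illustration, and the string matching assumes the usual `--version` banners ("Open MPI" for Open MPI, "HYDRA build details" for MPICH).

```python
import shutil
import subprocess


def classify_mpi(version_output: str) -> str:
    """Guess the MPI implementation from the text printed by `mpirun --version`.

    Assumption: Open MPI banners contain "Open MPI", and MPICH's Hydra
    launcher prints "HYDRA build details". Anything else is "unknown".
    """
    text = version_output.lower()
    if "open mpi" in text or "open-mpi" in text:
        return "openmpi"
    if "hydra" in text or "mpich" in text:
        return "mpich"
    return "unknown"


def detect_mpirun() -> str:
    """Locate the first mpirun on PATH and report which implementation it looks like."""
    path = shutil.which("mpirun")
    if path is None:
        return "not found"
    # Some launchers print the banner on stderr, so both streams are inspected.
    result = subprocess.run(
        [path, "--version"], capture_output=True, text=True, check=False
    )
    return f"{path}: {classify_mpi(result.stdout + result.stderr)}"


if __name__ == "__main__":
    print(detect_mpirun())
```

If this reports mpich while an Open MPI install also exists on the machine, the order of directories in PATH (for example, a conda environment that ships its own mpirun) would be the next thing to look at.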
I would appreciate any help. Thank you.