ERROR:torch.distributed.elastic.multiprocessing.api:failed (exitcode: 1) local_rank: 0 (pid: 1428) of binary: /opt/conda/bin/python #177

Open
zcqacqwc opened this issue Oct 25, 2024 · 3 comments

zcqacqwc commented Oct 25, 2024

I’ve set up the environment and configured the dataset, but when I run the training script, the error below appears. How can I resolve it?

root@2700da6f9136:/style/Co-DETR# bash tools/dist_train.sh
/opt/conda/lib/python3.8/site-packages/torch/distributed/launch.py:178: FutureWarning: The module torch.distributed.launch is deprecated
and will be removed in future. Use torchrun.
Note that --use_env is set by default in torchrun.
If your script expects `--local_rank` argument to be set, please
change it to read from `os.environ['LOCAL_RANK']` instead. See
https://pytorch.org/docs/stable/distributed.html#launch-utility for
further instructions

  warnings.warn(
WARNING:torch.distributed.run:
*****************************************
Setting OMP_NUM_THREADS environment variable for each process to be 1 in default, to avoid your system being overloaded, please further tune the variable for optimal performance in your application as needed.
*****************************************
Traceback (most recent call last):
  File "tools/train.py", line 26, in <module>
    from projects import *
  File "/style/Co-DETR/projects/__init__.py", line 1, in <module>
    from .models import *
  File "/style/Co-DETR/projects/models/__init__.py", line 1, in <module>
    from .co_detr import *
  File "/style/Co-DETR/projects/models/co_detr.py", line 10, in <module>
    from mmdet.models.losses.cross_entropy_loss import generate_block_target
ImportError: cannot import name 'generate_block_target' from 'mmdet.models.losses.cross_entropy_loss' (/opt/conda/lib/python3.8/site-packages/mmdet/models/losses/cross_entropy_loss.py)
Traceback (most recent call last):
  File "tools/train.py", line 26, in <module>
    from projects import *
  File "/style/Co-DETR/projects/__init__.py", line 1, in <module>
    from .models import *
  File "/style/Co-DETR/projects/models/__init__.py", line 1, in <module>
    from .co_detr import *
  File "/style/Co-DETR/projects/models/co_detr.py", line 10, in <module>
    from mmdet.models.losses.cross_entropy_loss import generate_block_target
ImportError: cannot import name 'generate_block_target' from 'mmdet.models.losses.cross_entropy_loss' (/opt/conda/lib/python3.8/site-packages/mmdet/models/losses/cross_entropy_loss.py)
Traceback (most recent call last):
  File "tools/train.py", line 26, in <module>
    from projects import *
  File "/style/Co-DETR/projects/__init__.py", line 1, in <module>
    from .models import *
  File "/style/Co-DETR/projects/models/__init__.py", line 1, in <module>
    from .co_detr import *
  File "/style/Co-DETR/projects/models/co_detr.py", line 10, in <module>
    from mmdet.models.losses.cross_entropy_loss import generate_block_target
ImportError: cannot import name 'generate_block_target' from 'mmdet.models.losses.cross_entropy_loss' (/opt/conda/lib/python3.8/site-packages/mmdet/models/losses/cross_entropy_loss.py)
ERROR:torch.distributed.elastic.multiprocessing.api:failed (exitcode: 1) local_rank: 0 (pid: 1428) of binary: /opt/conda/bin/python
Traceback (most recent call last):
  File "/opt/conda/lib/python3.8/runpy.py", line 194, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/opt/conda/lib/python3.8/runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "/opt/conda/lib/python3.8/site-packages/torch/distributed/launch.py", line 193, in <module>
    main()
  File "/opt/conda/lib/python3.8/site-packages/torch/distributed/launch.py", line 189, in main
    launch(args)
  File "/opt/conda/lib/python3.8/site-packages/torch/distributed/launch.py", line 174, in launch
    run(args)
  File "/opt/conda/lib/python3.8/site-packages/torch/distributed/run.py", line 715, in run
    elastic_launch(
  File "/opt/conda/lib/python3.8/site-packages/torch/distributed/launcher/api.py", line 131, in __call__
    return launch_agent(self._config, self._entrypoint, list(args))
  File "/opt/conda/lib/python3.8/site-packages/torch/distributed/launcher/api.py", line 245, in launch_agent
    raise ChildFailedError(
torch.distributed.elastic.multiprocessing.errors.ChildFailedError:
============================================================
tools/train.py FAILED
------------------------------------------------------------
Failures:
[1]:
  time      : 2024-10-25_12:00:25
  host      : 2700da6f9136
  rank      : 1 (local_rank: 1)
  exitcode  : 1 (pid: 1429)
  error_file: <N/A>
  traceback : To enable traceback see: https://pytorch.org/docs/stable/elastic/errors.html
[2]:
  time      : 2024-10-25_12:00:25
  host      : 2700da6f9136
  rank      : 2 (local_rank: 2)
  exitcode  : 1 (pid: 1430)
  error_file: <N/A>
  traceback : To enable traceback see: https://pytorch.org/docs/stable/elastic/errors.html
------------------------------------------------------------
Root Cause (first observed failure):
[0]:
  time      : 2024-10-25_12:00:25
  host      : 2700da6f9136
  rank      : 0 (local_rank: 0)
  exitcode  : 1 (pid: 1428)
  error_file: <N/A>
  traceback : To enable traceback see: https://pytorch.org/docs/stable/elastic/errors.html
============================================================

Please help me.

TempleX98 (Collaborator) commented

Please check the file mmdet/models/losses/cross_entropy_loss.py in your environment's mmdetection package. Is it identical to the version of that file in this repo?
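
One way to run this check without triggering the failing import is to parse the installed file and look for the missing name. The sketch below is only an illustration: the helper `find_definition` is hypothetical (it is not part of mmdet or Co-DETR), and it only detects top-level `def`/`class` statements, so a name that is re-exported through an import would not be found.

```python
import ast
import importlib.util
from pathlib import Path


def find_definition(package, relative_path, name):
    """Return (source_file, defined) without importing the package.

    `package` is a top-level package name, `relative_path` is a file path
    inside it, and `name` is the function or class to look for.
    """
    # find_spec on a top-level name locates the package without executing it.
    spec = importlib.util.find_spec(package)
    if spec is None or not spec.submodule_search_locations:
        return None, False
    source = Path(list(spec.submodule_search_locations)[0]) / relative_path
    if not source.is_file():
        return source, False
    tree = ast.parse(source.read_text(encoding="utf-8"))
    # Only top-level function and class definitions are considered.
    defined = any(
        isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef))
        and node.name == name
        for node in tree.body
    )
    return source, defined


if __name__ == "__main__":
    path, defined = find_definition(
        "mmdet", "models/losses/cross_entropy_loss.py", "generate_block_target"
    )
    print("source file:", path)
    print("defines generate_block_target:", defined)
```

If the printed path points into site-packages (as in the traceback above) and the second line prints `False`, the Python interpreter is picking up an mmdet installation that does not contain the function `co_detr.py` tries to import.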

zcqacqwc (Author) commented

Yes, that's correct. The file mmdet/models/losses/cross_entropy_loss.py exists. However, I am encountering the following error:

ImportError: cannot import name 'generate_block_target' from 'mmdet.models.losses.cross_entropy_loss' (/opt/conda/lib/python3.8/site-packages/mmdet/models/losses/cross_entropy_loss.py).

How can I resolve this issue?
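
Note that the file existing is not the same as the file matching the repo's version. A byte-for-byte comparison would answer the collaborator's question directly. The following is a minimal sketch; the helper `files_match` is hypothetical, and the repo-side path is an assumption that the Co-DETR checkout contains its own `mmdet/` directory at `/style/Co-DETR/mmdet`.

```python
import filecmp
from pathlib import Path


def files_match(installed, repo_copy):
    """Return True/False for a content comparison, or None if a file is missing."""
    installed, repo_copy = Path(installed), Path(repo_copy)
    if not (installed.is_file() and repo_copy.is_file()):
        return None
    # shallow=False compares file contents rather than only os.stat() metadata.
    return filecmp.cmp(installed, repo_copy, shallow=False)


if __name__ == "__main__":
    # Path taken from the ImportError message in this issue.
    installed = "/opt/conda/lib/python3.8/site-packages/mmdet/models/losses/cross_entropy_loss.py"
    # Assumed location of the repo's own copy; adjust to the actual checkout.
    repo_copy = "/style/Co-DETR/mmdet/models/losses/cross_entropy_loss.py"
    print("files match:", files_match(installed, repo_copy))
```

A result of `False` would indicate that the interpreter is importing a different mmdet than the one the repo's code was written against.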

xinyuestudent commented

> Yes, that's correct. The file mmdet/models/losses/cross_entropy_loss.py exists. However, I am encountering the following error:
>
> ImportError: cannot import name 'generate_block_target' from 'mmdet.models.losses.cross_entropy_loss' (/opt/conda/lib/python3.8/site-packages/mmdet/models/losses/cross_entropy_loss.py).
>
> How can I resolve this issue?

I'm running into the same error. Any idea how to fix it?
