
Hi, this is probably a small obstacle for many, but what went wrong? Can you help? Thanks #3

Closed
Jackyinuo opened this issue Mar 16, 2021 · 3 comments
Labels: good first issue (Good for newcomers), see README.md (Didn't follow the README.md)

Comments

@Jackyinuo

cycles = [0, 1, 2, 3, 4, 5, 6]
work_directory = './work_dirs/MIAL'
gpu_ids = range(0, 1)

2021-03-16 11:48:19,160 - mmdet - INFO - Set random seed to 666, deterministic: False
2021-03-16 11:48:19,192 - mmdet - INFO - Set random seed to 666, deterministic: False
2021-03-16 11:48:19,539 - mmdet - INFO - load model from: torchvision://resnet50
2021-03-16 11:48:19,698 - mmdet - WARNING - The model and loaded state dict do not match exactly

unexpected key in source state_dict: fc.weight, fc.bias

2021-03-16 11:48:24,285 - mmdet - INFO - Start running, host: phzhou@ubuntu-MW51-HP0-00, work_dir: /disk1/huihui/MIAL/work_dirs/MIAL/20210316_114818
2021-03-16 11:48:24,285 - mmdet - INFO - workflow: [('train', 1)], max: 3 epochs
Traceback (most recent call last):
  File "./tools/train.py", line 267, in <module>
    main()
  File "./tools/train.py", line 180, in main
    distributed=distributed, validate=(not args.no_validate), timestamp=timestamp, meta=meta)
  File "/disk1/huihui/MIAL/mmdet/apis/train.py", line 120, in train_detector
    runner.run(data_loaders_L, cfg.workflow, cfg.total_epochs)
  File "/home/phzhou/anaconda3/envs/mial/lib/python3.7/site-packages/mmcv/runner/epoch_based_runner.py", line 122, in run
    epoch_runner(data_loaders[i], **kwargs)
  File "/home/phzhou/anaconda3/envs/mial/lib/python3.7/site-packages/mmcv/runner/epoch_based_runner.py", line 32, in train
    **kwargs)
  File "/home/phzhou/anaconda3/envs/mial/lib/python3.7/site-packages/mmcv/parallel/distributed.py", line 36, in train_step
    output = self.module.train_step(*inputs[0], **kwargs[0])
  File "/disk1/huihui/MIAL/mmdet/models/detectors/base.py", line 228, in train_step
    losses = self(**data)
  File "/home/phzhou/anaconda3/envs/mial/lib/python3.7/site-packages/torch/nn/modules/module.py", line 550, in __call__
    result = self.forward(*input, **kwargs)
  File "/disk1/huihui/MIAL/mmdet/core/fp16/decorators.py", line 51, in new_func
    return old_func(*args, **kwargs)
TypeError: forward() missing 1 required positional argument: 'x'
Traceback (most recent call last):
  File "/home/phzhou/anaconda3/envs/mial/lib/python3.7/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "/home/phzhou/anaconda3/envs/mial/lib/python3.7/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/home/phzhou/anaconda3/envs/mial/lib/python3.7/site-packages/torch/distributed/launch.py", line 263, in <module>
    main()
  File "/home/phzhou/anaconda3/envs/mial/lib/python3.7/site-packages/torch/distributed/launch.py", line 259, in main
    cmd=cmd)
subprocess.CalledProcessError: Command '['/home/phzhou/anaconda3/envs/mial/bin/python', '-u', './tools/train.py', '--local_rank=0', 'configs/MIAL.py', '--launcher', 'pytorch']' returned non-zero exit status 1.

@xiaosa96

I also had this problem. My solution was to make sure that the epoch_based_runner.py file from this repository had been copied to the appropriate place in the mmcv installation. The README contains the command cp -v epoch_based_runner.py ~balabala/runner/

For reference only.

@Jackyinuo
Author

Thank you so much. I had updated to Python 3.8; the original files were in Python 3.7.

@yuantn
Owner

yuantn commented Mar 16, 2021

Yes, just as @xiaosa96 mentioned: if you have changed anything that affects the mmcv package (including but not limited to updating or re-installing Python, PyTorch, mmdetection, mmcv, mmcv-full, or the conda environment), you need to copy the epoch_based_runner.py provided in this repository to the mmcv directory again (as described in installation.md).
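
For anyone who wants to check this from Python instead of running cp by hand, here is a minimal sketch. It is not part of this repository: the helper names find_runner_dir and sync_runner are hypothetical, and it assumes epoch_based_runner.py sits in the current working directory and that the installed mmcv has a runner subdirectory, as the traceback paths above suggest.

```python
import filecmp
import importlib.util
import shutil
from pathlib import Path
from typing import Optional


def find_runner_dir(package: str = "mmcv") -> Optional[Path]:
    """Locate <package>/runner in the active environment without importing it.

    Hypothetical helper: returns None when the package is not installed.
    """
    spec = importlib.util.find_spec(package)
    if spec is None or not spec.submodule_search_locations:
        return None
    return Path(next(iter(spec.submodule_search_locations))) / "runner"


def sync_runner(src: Path, runner_dir: Path) -> bool:
    """Copy src into runner_dir unless an identical file is already there.

    Returns True if a copy was made, False if the installed file already matched.
    """
    target = runner_dir / src.name
    if target.is_file() and filecmp.cmp(src, target, shallow=False):
        return False
    shutil.copy2(src, target)
    return True


if __name__ == "__main__":
    # Assumed location of the repository's copy; adjust to your checkout.
    src = Path("epoch_based_runner.py")
    runner_dir = find_runner_dir()
    if runner_dir is None or not runner_dir.is_dir():
        print("mmcv/runner not found in this environment")
    elif not src.is_file():
        print(f"{src} not found; run this from the directory that contains it")
    else:
        changed = sync_runner(src, runner_dir)
        print(f"{runner_dir / src.name}: {'updated' if changed else 'already up to date'}")
```

Run it with the same interpreter that launches training (here, the one in the mial conda environment). Each environment has its own site-packages, so a fresh or upgraded environment starts with the stock mmcv file again.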

yuantn closed this as completed Mar 16, 2021
yuantn added the "see README.md" and "good first issue" labels Mar 16, 2021