Loading pre-trained weights fails with an error in train.py, even though the resolution matches my training set. Training works fine without pre-trained weights. Which file do I need to change? #39

999789 · 2024-04-09T06:58:10Z

Traceback (most recent call last):
  File "train.py", line 369, in <module>
    main() # pylint: disable=no-value-for-parameter
  File "/root/miniconda3/lib/python3.8/site-packages/click/core.py", line 1157, in __call__
    return self.main(*args, **kwargs)
  File "/root/miniconda3/lib/python3.8/site-packages/click/core.py", line 1078, in main
    rv = self.invoke(ctx)
  File "/root/miniconda3/lib/python3.8/site-packages/click/core.py", line 1434, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/root/miniconda3/lib/python3.8/site-packages/click/core.py", line 783, in invoke
    return __callback(*args, **kwargs)
  File "train.py", line 362, in main
    launch_training(c=c, desc=desc, outdir=opts.outdir, dry_run=opts.dry_run)
  File "train.py", line 94, in launch_training
    torch.multiprocessing.spawn(fn=subprocess_fn, args=(c, temp_dir), nprocs=c.num_gpus)
  File "/root/miniconda3/lib/python3.8/site-packages/torch/multiprocessing/spawn.py", line 230, in spawn
    return start_processes(fn, args, nprocs, join, daemon, start_method='spawn')
  File "/root/miniconda3/lib/python3.8/site-packages/torch/multiprocessing/spawn.py", line 188, in start_processes
    while not context.join():
  File "/root/miniconda3/lib/python3.8/site-packages/torch/multiprocessing/spawn.py", line 150, in join
    raise ProcessRaisedException(msg, error_index, failed_process.pid)
torch.multiprocessing.spawn.ProcessRaisedException:

-- Process 0 terminated with the following error:
Traceback (most recent call last):
  File "/root/miniconda3/lib/python3.8/site-packages/torch/multiprocessing/spawn.py", line 59, in _wrap
    fn(i, *args)
  File "/root/autodl-tmp/stylegan3-fun-main/train.py", line 50, in subprocess_fn
    training_loop.training_loop(rank=rank, **c)
  File "/root/autodl-tmp/stylegan3-fun-main/training/training_loop.py", line 163, in training_loop
    misc.copy_params_and_buffers(resume_data[name], module, require_all=False)
  File "/root/autodl-tmp/stylegan3-fun-main/torch_utils/misc.py", line 162, in copy_params_and_buffers
    tensor.copy_(src_tensors[name].detach()).requires_grad_(tensor.requires_grad)
RuntimeError: The size of tensor a (4) must match the size of tensor b (3) at non-singleton dimension 1

999789 · 2024-04-09T06:58:55Z

python train.py --outdir=training-runs --cfg=stylegan3-t --data=/root/autodl-tmp/stylegan3-fun-main/hechengtupianrgba.zip --gpus=4 --batch=16 --gamma=6 --mirror=1 --kimg=5000 --snap=25 --batch-gpu=4 --metrics=none --resume=/root/autodl-tmp/stylegan3-fun-main/network-snapshot-011000.pkl
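
The dataset file name above hints at RGBA images. Whether that is related to the 4-vs-3 mismatch is not confirmed anywhere in this thread, but a channel-count difference can be ruled in or out by inspecting the archive. A minimal sketch, assuming the zip contains PNG files; the helper names `png_channels` and `count_channels_in_zip` are made up for illustration and are not part of the repository:

```python
import struct
import zipfile
from collections import Counter

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"
# PNG colour type -> stored channels per pixel (per the PNG specification).
# Type 3 is a palette index; decoders commonly expand it to RGB.
CHANNELS_BY_COLOR_TYPE = {0: 1, 2: 3, 3: 1, 4: 2, 6: 4}


def png_channels(header: bytes) -> int:
    """Return the channel count declared in the IHDR chunk of a PNG header.

    Needs the first 26 bytes: 8 signature, 4 length, 4 chunk type,
    4 width, 4 height, 1 bit depth, 1 colour type.
    """
    if len(header) < 26 or header[:8] != PNG_SIGNATURE or header[12:16] != b"IHDR":
        raise ValueError("not a PNG header")
    return CHANNELS_BY_COLOR_TYPE[header[25]]


def count_channels_in_zip(path) -> Counter:
    """Count how many PNG files in the archive declare each channel count."""
    counts = Counter()
    with zipfile.ZipFile(path) as zf:
        for name in zf.namelist():
            if name.lower().endswith(".png"):
                with zf.open(name) as f:
                    counts[png_channels(f.read(26))] += 1
    return counts


# Self-contained demonstration on a hand-built RGBA header (colour type 6).
demo_header = PNG_SIGNATURE + struct.pack(">I4sIIBB", 13, b"IHDR", 256, 256, 8, 6)
print("channels in demo header:", png_channels(demo_header))
```

To check the real dataset, call `count_channels_in_zip` with the path passed to `--data`. If every image reports 3 channels, the mismatch has to come from somewhere else.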

PDillis · 2024-04-10T14:35:38Z

Basically, the size mismatch occurs when the pre-trained .pkl is loaded into the newly constructed stylegan3-t networks. I'll try to fix it, as it also failed for me with a pre-trained StyleGAN3-T model, so perhaps the construction of the new networks is wrong. I'll update this once I have a fix.
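
Until then, one way to narrow it down is to list which tensors differ in shape between the pickle and the new network before anything is copied. A rough sketch only: the helper `find_shape_mismatches` and the example tensor names and shapes are hypothetical, chosen to mirror the "4 vs 3 at dimension 1" message, and do not come from the actual checkpoint.

```python
from typing import Dict, List, Tuple

Shape = Tuple[int, ...]


def find_shape_mismatches(
    src_shapes: Dict[str, Shape], dst_shapes: Dict[str, Shape]
) -> List[Tuple[str, Shape, Shape]]:
    """Return (name, src_shape, dst_shape) for every tensor that exists in
    both dictionaries but has a different shape in each."""
    mismatches = []
    for name, dst_shape in dst_shapes.items():
        src_shape = src_shapes.get(name)
        if src_shape is not None and tuple(src_shape) != tuple(dst_shape):
            mismatches.append((name, tuple(src_shape), tuple(dst_shape)))
    return mismatches


# Hypothetical shapes for illustration. With real PyTorch modules the
# dictionaries could be built from named_parameters() and named_buffers(),
# e.g. {n: tuple(t.shape) for n, t in module.named_parameters()}.
pretrained = {
    "b4.fromrgb.weight": (512, 3, 1, 1),
    "b4.conv.weight": (512, 512, 3, 3),
}
new_network = {
    "b4.fromrgb.weight": (512, 4, 1, 1),
    "b4.conv.weight": (512, 512, 3, 3),
}

for name, src, dst in find_shape_mismatches(pretrained, new_network):
    print(f"{name}: pickle {src} vs new network {dst}")
```

Knowing the name of the offending tensor should show whether the difference comes from the dataset (for example the number of image channels) or from how the networks are constructed.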

999789 · 2024-04-11T02:44:36Z

Thanks for the reply.

PDillis mentioned this issue Apr 10, 2024

Error when training Stylegan2-ext #38

Open
