Error reported while training model # 4. Please help me solve it. #17

Open
Wt990310 opened this issue Apr 5, 2024 · 1 comment

Wt990310 commented Apr 5, 2024

Traceback (most recent call last):
File "train.py", line 288, in <module>
mp.spawn(init_processes, args=(args,), nprocs=args.gpus)
File "/home/WangTing/anaconda3/envs/prototype/lib/python3.6/site-packages/torch/multiprocessing/spawn.py", line 200, in spawn
return start_processes(fn, args, nprocs, join, daemon, start_method='spawn')
File "/home/WangTing/anaconda3/envs/prototype/lib/python3.6/site-packages/torch/multiprocessing/spawn.py", line 158, in start_processes
while not context.join():
File "/home/WangTing/anaconda3/envs/prototype/lib/python3.6/site-packages/torch/multiprocessing/spawn.py", line 119, in join
raise Exception(msg)
Exception:

-- Process 0 terminated with the following error:
Traceback (most recent call last):
File "/home/WangTing/anaconda3/envs/prototype/lib/python3.6/site-packages/torch/multiprocessing/spawn.py", line 20, in _wrap
fn(i, *args)
File "/home/WangTing/program/PycharmProgram/copyisallyouneed-master_1/copyisallyouneed-master/train.py", line 266, in init_processes
main(args, local_rank)
File "/PycharmProgram/copyisallyouneed-master_1/copyisallyouneed-master/train.py", line 192, in main
loss, acc = model(batch, update_mem_bias=(global_step > args.update_retriever_after))
File "/anaconda3/envs/prototype/lib/python3.6/site-packages/torch/nn/modules/module.py", line 550, in __call__
result = self.forward(*input, **kwargs)
File "/PycharmProgram/copyisallyouneed-master_1/copyisallyouneed-master/generator.py", line 363, in forward
src_repr, src_mask, mem_repr, mem_mask, copy_seq, mem_bias = self.encode_step(data, update_mem_bias=update_mem_bias)
File "/PycharmProgram/copyisallyouneed-master_1/copyisallyouneed-master/generator.py", line 270, in encode_step
src_repr, src_mask, mem_ret = self.retrieve_step(inp, work)
File "/PycharmProgram/copyisallyouneed-master_1/copyisallyouneed-master/generator.py", line 264, in retrieve_step
src, src_mask, mem_ret = self.retriever.work(inp, allow_hit=work)
File "/PycharmProgram/copyisallyouneed-master_1/copyisallyouneed-master/retriever.py", line 247, in work
all_mem_feats = self.mem_feat_or_feat_maker[indices].to(src_feat.device)
IndexError: index 112898 is out of bounds for dimension 0 with size 87928


rangehow commented Apr 5, 2024

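I don't remember exactly how the source code works here, but judging from the error log alone, the vectors in your faiss index and the ones in self.mem_feat_or_feat_maker are probably not the same set of vectors. The faiss index contains more vectors than the feature table you hold in memory, so the lookup goes out of bounds. I'd suggest checking that the two match.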
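The check suggested above could look roughly like the sketch below. It is not code from the repository: `check_retrieval_consistency` is a hypothetical helper, and it takes plain integers so it runs without faiss or PyTorch. In the real script you would presumably pass the faiss index's `ntotal` as `index_size` and the length of `self.mem_feat_or_feat_maker` as `num_feats`. The feature count 87928 and the id 112898 come from the traceback; the index size 112899 is only an assumed lower bound for illustration.

```python
def check_retrieval_consistency(index_size, num_feats, retrieved_ids=()):
    """Return a list of problems found; an empty list means the sizes agree.

    index_size    -- number of vectors in the search index (e.g. faiss `index.ntotal`)
    num_feats     -- number of rows in the in-memory feature table
    retrieved_ids -- ids returned by a search, to be used as row indices
    """
    problems = []
    if index_size != num_feats:
        problems.append(
            f"index holds {index_size} vectors but feature table holds {num_feats}"
        )
    # Any id outside [0, num_feats) would raise IndexError when used as a row index.
    bad = [i for i in retrieved_ids if not 0 <= i < num_feats]
    if bad:
        problems.append(f"{len(bad)} retrieved id(s) out of range, e.g. {bad[0]}")
    return problems


if __name__ == "__main__":
    # 87928 and 112898 are from the traceback; 112899 is an assumed index size.
    for problem in check_retrieval_consistency(112899, 87928, [5, 112898]):
        print(problem)
```

If the sizes differ, the usual cause is that the index and the feature table were built from different versions of the memory data, so rebuilding both from the same file is the first thing to try.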
