Error when running yelp/train.py #16

Open
jiwoongim opened this issue Jul 24, 2018 · 9 comments

Comments

@jiwoongim

I followed README.md and ran
python train.py --data_path ./data

But then I got the following errors:

{'dropout': 0.0, 'lr_ae': 1, 'load_vocab': '', 'nlayers': 1, 'batch_size': 64, 'beta1': 0.5, 'gan_gp_lambda': 0.1, 'nhidden': 128, 'vocab_size': 30000, 'niters_gan_schedule': '', 'niters_gan_d': 5, 'lr_gan_d': 0.0001, 'grad_lambda': 0.01, 'sample': False, 'arch_classify': '128-128', 'clip': 1, 'hidden_init': False, 'cuda': True, 'log_interval': 200, 'device_id': '0', 'temp': 1, 'seed': 1111, 'maxlen': 25, 'lowercase': True, 'data_path': './data', 'lambda_class': 1, 'lr_classify': 0.0001, 'outf': 'yelp_example', 'noise_r': 0.1, 'noise_anneal': 0.9995, 'lr_gan_g': 0.0001, 'niters_gan_g': 1, 'arch_g': '128-128', 'z_size': 32, 'epochs': 25, 'niters_ae': 1, 'arch_d': '128-128', 'emsize': 128, 'niters_gan_ae': 1}
Original vocab 9599; Pruned to 9603
Number of sentences dropped from ./data/valid1.txt: 0 out of 38205 total
Number of sentences dropped from ./data/valid2.txt: 0 out of 25278 total
Number of sentences dropped from ./data/train1.txt: 0 out of 267314 total
Number of sentences dropped from ./data/train2.txt: 0 out of 176787 total
Vocabulary Size: 9603
382 batches
252 batches
4176 batches
2762 batches
Loaded data!
Seq2Seq2Decoder(
  (embedding): Embedding(9603, 128)
  (embedding_decoder1): Embedding(9603, 128)
  (embedding_decoder2): Embedding(9603, 128)
  (encoder): LSTM(128, 128, batch_first=True)
  (decoder1): LSTM(256, 128, batch_first=True)
  (decoder2): LSTM(256, 128, batch_first=True)
  (linear): Linear(in_features=128, out_features=9603, bias=True)
)
MLP_G(
  (layer1): Linear(in_features=32, out_features=128, bias=True)
  (bn1): BatchNorm1d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
  (activation1): ReLU()
  (layer2): Linear(in_features=128, out_features=128, bias=True)
  (bn2): BatchNorm1d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
  (activation2): ReLU()
  (layer7): Linear(in_features=128, out_features=128, bias=True)
)
MLP_D(
  (layer1): Linear(in_features=128, out_features=128, bias=True)
  (activation1): LeakyReLU(negative_slope=0.2)
  (layer2): Linear(in_features=128, out_features=128, bias=True)
  (bn2): BatchNorm1d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
  (activation2): LeakyReLU(negative_slope=0.2)
  (layer6): Linear(in_features=128, out_features=1, bias=True)
)
MLP_Classify(
  (layer1): Linear(in_features=128, out_features=128, bias=True)
  (activation1): ReLU()
  (layer2): Linear(in_features=128, out_features=128, bias=True)
  (bn2): BatchNorm1d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
  (activation2): ReLU()
  (layer6): Linear(in_features=128, out_features=1, bias=True)
)
Training...
Traceback (most recent call last):
  File "train.py", line 574, in <module>
    train_ae(1, train1_data[niter], total_loss_ae1, start_time, niter)
  File "train.py", line 400, in train_ae
    output = autoencoder(whichdecoder, source, lengths, noise=True)
  File "/localhome/imd/anaconda2/envs/Pytorch/lib/python3.5/site-packages/torch/nn/modules/module.py", line 491, in __call__
    result = self.forward(*input, **kwargs)
  File "/groups/branson/home/imd/Documents/project/ARAE/yelp/models.py", line 143, in forward
    hidden = self.encode(indices, lengths, noise)
  File "/groups/branson/home/imd/Documents/project/ARAE/yelp/models.py", line 160, in encode
    batch_first=True)
  File "/localhome/imd/anaconda2/envs/Pytorch/lib/python3.5/site-packages/torch/onnx/__init__.py", line 56, in wrapper
    if not might_trace(args):
  File "/localhome/imd/anaconda2/envs/Pytorch/lib/python3.5/site-packages/torch/onnx/__init__.py", line 130, in might_trace
    first_arg = args[0]
IndexError: tuple index out of range
@jakezhaojb
Owner

Hmm, could you maybe try running it with Python 3?

@vineetjohn

I've run into the same issue.
Python 3.5.2
torch==0.4.1

Training...     
Traceback (most recent call last):
  File "train.py", line 574, in <module>
    train_ae(1, train1_data[niter], total_loss_ae1, start_time, niter)                                                       
  File "train.py", line 400, in train_ae
    output = autoencoder(whichdecoder, source, lengths, noise=True)
  File "/home/v2john/.pyenv/lib/python3.5/site-packages/torch/nn/modules/module.py", line 477, in __call__                   
    result = self.forward(*input, **kwargs)
  File "/home/v2john/ARAE/yelp/models.py", line 143, in forward
    hidden = self.encode(indices, lengths, noise)                                                                            
  File "/home/v2john/ARAE/yelp/models.py", line 160, in encode
    batch_first=True)
  File "/home/v2john/.pyenv/lib/python3.5/site-packages/torch/onnx/__init__.py", line 67, in wrapper                         
    if not might_trace(args):
  File "/home/v2john/.pyenv/lib/python3.5/site-packages/torch/onnx/__init__.py", line 141, in might_trace
    first_arg = args[0]                                                                                                      
IndexError: tuple index out of range

Python 3 clearly isn't the fix. It seems like something about the PyTorch + ONNX interop is broken.
Is there a specific version of PyTorch that's needed to run this?

@vineetjohn

@jiwoongim

You can try using my forked version of the repository to see if it fixes the issue for you.
I've verified it to be working for Python 3.5.2 and PyTorch 0.4.1
https://github.com/vineetjohn/arae

I've not identified the actual problem yet, but I've added a workaround that avoids having to deal with ONNX altogether. The pack_padded_sequence method in torch.nn.utils.rnn seems to be buggy.
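
For anyone hitting this: on torch 0.4.x the ONNX tracing wrapper around pack_padded_sequence inspects args[0], so a call made entirely with keyword arguments (which is what the traceback suggests models.py does) leaves args empty and raises the IndexError above. Here is a minimal, illustrative sketch of the difference; the tensor shapes are made up and this is not the actual ARAE code:

import torch
from torch.nn.utils.rnn import pack_padded_sequence

embeddings = torch.randn(2, 5, 128)   # (batch, seq_len, emsize), illustrative values
lengths = [5, 3]                      # sequence lengths, sorted in descending order

# Fails on torch 0.4.x with "IndexError: tuple index out of range",
# because the ONNX wrapper only inspects positional arguments:
# packed = pack_padded_sequence(input=embeddings, lengths=lengths, batch_first=True)

# Works: pass the tensors positionally.
packed = pack_padded_sequence(embeddings, lengths, batch_first=True)
print(packed.data.shape)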

@jakezhaojb
Owner

Guys, can you try Python 3.6? @jiwoongim @vineetjohn

@rainyrainyguo

@jiwoongim
You can try using my forked version of the repository; I have resolved the issue by making several changes to the original code.
I have verified it to be working with Python 3.6.5 and PyTorch 0.4.1
https://github.com/rainyrainyguo/ARAE

@vineetjohn

@jakezhaojb

This doesn't look like a Python version issue.
The named arguments used in this project are inconsistent with those accepted by PyTorch 0.4.1.

You should consider adding the version of PyTorch used for your experiments to the project README.

@jakezhaojb
Owner

@vineetjohn Good point! I used PyTorch 0.3.1. I'm adding this to the README.

@dangvanthin

@rainyrainyguo
I have run your forked version on Python 3.6.5 with PyTorch 0.4.1 (cuDNN 7.1.3, CUDA toolkit 8.0) and I get the following error:
Training ....
run_oneb.py:256: UserWarning: torch.nn.utils.clip_grad_norm is now deprecated in favor of torch.nn.utils.clip_grad_norm_.
run_oneb.py:259: UserWarning: invalid index of a 0-dim tensor. This will be an error in PyTorch 0.5. Use tensor.item() to convert a 0-dim tensor to a Python number
run_oneb.py:263: UserWarning: invalid index of a 0-dim tensor. This will be an error in PyTorch 0.5. Use tensor.item() to convert a 0-dim tensor to a Python number
| epoch 1 | 0/ 765 batches | ms/batch 0.61 | loss 0.05 | ppl 1.05 | acc 0.00
Traceback (most recent call last):
  File "run_oneb.py", line 102, in <module>
    exec(open("train.py").read())
  File "<string>", line 434, in <module>
  File "<string>", line 395, in train
  File "<string>", line 324, in train_gan_d
  File "/home/thindv/anaconda3/envs/ARAE/lib/python3.6/site-packages/torch/tensor.py", line 93, in backward
    torch.autograd.backward(self, gradient, retain_graph, create_graph)
  File "/home/thindv/anaconda3/envs/ARAE/lib/python3.6/site-packages/torch/autograd/__init__.py", line 90, in backward
    allow_unreachable=True)  # allow_unreachable flag
RuntimeError: invalid gradient at index 0 - expected shape [] but got [1]

Can you give me some advice?
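
Not a confirmed fix, but this RuntimeError usually means backward() is being handed a gradient tensor of shape [1] while the loss is a 0-dim scalar under PyTorch 0.4; the 0.3-era WGAN-style one/mone tensors in train_gan_d are the likely culprit. A minimal sketch of the mismatch and the adjustment, with illustrative names only (the actual variables in the code may differ):

import torch

# Under PyTorch 0.3 a scalar loss had shape [1]; under 0.4 it is 0-dim,
# so the gradient passed to backward() must be 0-dim as well.
fake_score = torch.randn(4, 1, requires_grad=True)   # stand-in for the critic output
errD = fake_score.mean()                              # 0-dim scalar on torch >= 0.4

one = torch.FloatTensor([1])   # shape [1]: triggers "expected shape [] but got [1]"
# errD.backward(one)

one = torch.ones_like(errD)    # 0-dim, matches the loss
errD.backward(one)             # equivalent to errD.backward() for a scalar loss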

@V-Enzo

V-Enzo commented Mar 12, 2020

@dangvanthin Hi, I ran into the same problem. Have you found a solution yet? Thank you.
