When decoding, something goes wrong #17
Could you please share your command?
My command is below:
Yes, if you get the same results during decoding, your trained model still hasn't converged. You should check convergence by looking at the loss on the evaluation dataset. If after some point this loss stops decreasing, your model has converged and is ready to decode.
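The convergence check described above can be sketched as a simple patience-based test on the recorded evaluation losses. This is an illustrative helper, not part of the repository; the function name and thresholds are assumptions:

```python
def has_converged(eval_losses, patience=3, min_delta=1e-3):
    """Return True if the eval loss has not improved by at least
    min_delta over the last `patience` evaluations.

    eval_losses: list of evaluation-set losses, oldest first.
    """
    if len(eval_losses) <= patience:
        return False
    # Best loss seen before the patience window.
    best_before = min(eval_losses[:-patience])
    # Converged if no recent loss beats that best by min_delta.
    return all(l > best_before - min_delta for l in eval_losses[-patience:])
```

For example, a loss curve that flattens around 3.85 would be flagged as converged, while one that is still dropping would not.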
Hello, I ran into some confusion while training the model.
For the CNN/DM dataset with a train size of 287,226 examples, I'd suggest the following setup:
Thank you for your reply, but if I only want to use RL training, should I set eta=1, scheduled_sampling=True, and sampling_probability=1?
Yes, but this only works if you have a very well-trained model based on MLE loss, and what researchers usually do is not to use eta=1 but a value close to 1, e.g. eta=0.9984. Check out the Paulus et al. paper for more information.
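The role of eta above is to mix the RL and MLE objectives, as in Paulus et al. (2017): the total loss is eta times the RL loss plus (1 - eta) times the MLE loss, so eta just below 1 keeps a small MLE anchor. A minimal sketch (the function name is illustrative, not the repo's API):

```python
def mixed_loss(loss_rl, loss_ml, eta=0.9984):
    """Mixed objective from Paulus et al. (2017):
    L = eta * L_rl + (1 - eta) * L_ml.

    eta close to (but below) 1 trains mostly on the RL reward while
    the small MLE term keeps the outputs fluent; eta=1 is pure RL.
    """
    return eta * loss_rl + (1.0 - eta) * loss_ml
```

With eta=0.9984 the MLE term contributes only 0.16% of the gradient signal, which is why a well-trained MLE model is needed before switching.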
In the third step (coverage for 3 epochs), should I set rl_training=True and eta = 1/269274 ≈ 3.71368e-06 at the same time?
Yes, you still need rl_training to be true since you are using RL for training, and make sure to set eta to the right value for the coverage phase.
I used the parameters from "Get To The Point: Summarization with Pointer-Generator Networks" to train the model for 600,000 iterations, and finally the pgen_loss converged to 3.7–4.3.
However, all decoded results are the same. What am I doing wrong? How can I fix this problem?
Excuse me, have you solved this problem?
Could you please share your decoding command and some of the outputs?
Thank you very much. The issue has been solved. Thanks for your reply.
I met the same problem: all the decoded results for different examples are identical regardless of whether I set rl_training=False or True. I have trained the model for about 300,000 steps and the loss has stopped falling. How did you solve the issue? This is very important to me and I am looking forward to your reply.
I met the same problem when I trained the model on a Chinese dataset.
Hello, when I decode using the eval model, something goes wrong.
Could you help me?
The main information is:
Traceback (most recent call last):
File "run_summarization.py", line 845, in
tf.app.run()
File "/home/ices/anaconda3/lib/python3.6/site-packages/tensorflow/python/platform/app.py", line 126, in run
_sys.exit(main(argv))
File "run_summarization.py", line 841, in main
seq2seq.main(unused_argv)
File "run_summarization.py", line 810, in main
decoder.decode() # decode indefinitely (unless single_pass=True, in which case deocde the dataset exactly once)
File "/home/ices/zhangbowen/RLSeq2Seq/src/decode.py", line 115, in decode
best_hyp = beam_search.run_beam_search(self._sess, self._model, self._vocab, batch)
File "/home/ices/zhangbowen/RLSeq2Seq/src/beam_search.py", line 144, in run_beam_search
prev_encoder_es = encoder_es if FLAGS.use_temporal_attention else tf.stack([], axis=0))
File "/home/ices/zhangbowen/RLSeq2Seq/src/model.py", line 855, in decode_onestep
results = sess.run(to_return, feed_dict=feed) # run the decoder step
File "/home/ices/anaconda3/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 900, in run
run_metadata_ptr)
File "/home/ices/anaconda3/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1111, in _run
str(subfeed_t.get_shape())))
ValueError: Cannot feed value of shape (4, 256) for Tensor 'prev_decoder_outputs:0', which has shape '(?, 4, 256)'
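The ValueError above says the feed for prev_decoder_outputs is rank 2, shape (4, 256), while the placeholder expects rank 3, shape (?, 4, 256): a variable number of previous decoder steps, each holding beam_size=4 vectors of size 256. A minimal NumPy sketch of the mismatch and a dimension-expanding workaround (illustrative only, not the repository's actual fix):

```python
import numpy as np

# A single decoder step's output: beam_size=4 hidden states of 256 dims.
single_step = np.zeros((4, 256), dtype=np.float32)

# Feeding this directly reproduces the error: rank 2 vs. the
# placeholder's expected rank 3, shape (?, 4, 256).
# Adding a leading "step" axis makes the shapes compatible:
feedable = np.expand_dims(single_step, axis=0)  # shape (1, 4, 256)

assert feedable.shape == (1, 4, 256)
```

In the thread's setup the mismatch is likely triggered in beam_search.py at the first decode step, when there are no previous decoder outputs to stack yet; checking how prev_decoder_outputs is initialized there would be the place to start.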