
Bugs preventing reproducing results #5

Open
felix-schneider opened this issue Jun 23, 2022 · 5 comments

Comments

@felix-schneider

felix-schneider commented Jun 23, 2022

The code is broken in a number of places, making it impossible to reproduce results:

  • There is no way to specify args.mode in run.py other than editing it into the script.
    • Passing --mode train to run.py does not work, because that argument will be stored in args.train.mode, not args.mode.
    • The same is true for args.checkpoint_dir for inference.
  • In gen_summary/inference.py:38, you assign self.bart.cfg.dataset.batch_size_valid = bsz. I'm not sure what this is supposed to do, but the bart object comes from fairseq and has no cfg member.
  • In stage 2, run.py does not actually use the output from stage 1 and will just create the same input/output as stage 1 again.
    • The others are one-line fixes (a rough sketch follows this list), but for this one I am not sure how to fix it.
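
To make the first two points concrete, here is a minimal, untested sketch of the kind of one-line workaround I mean. Everything beyond the names quoted in the bullets above (args.train.mode, self.bart.cfg.dataset.batch_size_valid) is an assumption, not the repo's actual code.

# Hypothetical workarounds; names beyond those quoted above are assumptions.

# run.py: fall back to the nested attribute that the argument parser actually populates.
mode = getattr(args, "mode", None) or args.train.mode

# gen_summary/inference.py:38: only set the valid batch size if the fairseq BART object
# actually exposes a cfg attribute (the report above says it does not with fairseq 0.10.0).
if hasattr(self.bart, "cfg"):
    self.bart.cfg.dataset.batch_size_valid = bsz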

For others who are also trying to reproduce:

  • The required fairseq version is actually 0.10.0; there is no 1.10.0.
    • You also need to copy the examples folder somewhere and add its parent folder to PYTHONPATH. You must move it out of the main fairseq dir, or the fairseq logging module will shadow the Python built-in one.
  • Since AnyROUGE does not provide any installation instructions: clone the library and add its root folder to PYTHONPATH.
  • Then clone https://github.com/pltrdy/rouge into the ThirdParty folder inside AnyROUGE. If you put it somewhere else or install it with pip, you will need to edit the scripts to change the imports. (A path-setup sketch follows this list.)
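
For reference, here is roughly the sys.path equivalent of the PYTHONPATH steps above, in case it is easier to drop into a driver script. The directory names are placeholders, not a layout the repo requires.

# Hypothetical path setup (placeholder paths); the sys.path equivalent of exporting PYTHONPATH.
import sys

# Parent of the copied fairseq "examples" folder, moved out of the fairseq tree so that
# fairseq's logging module does not shadow the standard-library one.
sys.path.insert(0, "/path/to/examples_parent")

# Root of the AnyROUGE checkout (it provides no installation instructions, as noted above).
sys.path.insert(0, "/path/to/AnyROUGE")
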
@Jinhyeong-Lim

  • In stage 2, run.py does not actually use the output from stage 1 and will just create the same input/output as stage 1 again.

    • The others are one-line fixes, but for this one I am not sure how to fix it.

Hi,
I reproduced almost everything and modified the code to get similar performance.

I think you need to modify the data preprocessing code.
The hyperparameters should also be adjusted to match the details given in the paper.

@WeixiangYAN

Hello, can you share your modified code? Thanks.

@xianglous

  • In stage 2, run.py does not actually use the output from stage 1 and will just create the same input/output as stage 1 again.

I think that at line 100 of run.py, it is actually getting the output data from stage 1, because args.cur_stage is still 1 at that point:

source_path = os.path.join(args.train.output_path, f"stage_{args.cur_stage}")
cur_source = load_split_aslist(source_path, suffix='hypo')

But the other mentioned problems do exist, and also at lines 87 and 123 of run.py, the path to the checkpoint should be something like:

os.path.join(training_args.checkpoint_dir, f"stage_{args.cur_stage}/trainer_output")

if we want to use the checkpoints after training without moving any files or folders around. (A sketch of how these pieces might fit together is below.)
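
Putting the two fragments together, this is a hedged sketch of what that part of run.py could look like after the fix; only the lines quoted above come from the repo, and the surrounding names and comments are assumptions.

# Hypothetical sketch; only the quoted lines are from run.py, the rest is assumed context.
import os

# Stage-2 input: load the stage-1 hypotheses (args.cur_stage is still 1 at this point).
source_path = os.path.join(args.train.output_path, f"stage_{args.cur_stage}")
cur_source = load_split_aslist(source_path, suffix='hypo')

# Checkpoint path matching where training actually wrote it (run.py lines 87/123),
# so nothing has to be moved by hand before inference.
checkpoint_path = os.path.join(training_args.checkpoint_dir, f"stage_{args.cur_stage}/trainer_output")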

@CAH9487

CAH9487 commented Oct 12, 2022

I only achieved around 70% of the paper's ROUGE score on the AMI dataset.
I used fairseq 0.12.2 or 0.10.0 and commented out the code involving self.bart.cfg.xxxxx in inference.py.
Is there anything besides what @felix-schneider & @xianglous said that should be modified?

@heatherzheng

Hi @CAH9487, I would like to know if you reproduced the result. Do we need to change anything in the config file? Thanks!
