Code for the paper "Mitigating the Learning Bias towards Repetition by Self-Contrastive Training for Open-Ended Generation" (Findings of ACL 2023, short paper).
The code is written in PyTorch. The main dependencies are as follows:
- Python: 3.6.9
- torch: 1.8.1
- transformers: 4.6.1
Other dependencies are listed in requirements.txt.
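Before training, it may be worth confirming that the installed packages match the pinned versions. The following is a minimal sketch, not part of this repository: the helper names `installed_version` and `check_versions` are hypothetical, and the check assumes each package exposes a `__version__` attribute (both torch and transformers do).

```python
import importlib

# Versions pinned in this README; the remaining packages are in requirements.txt.
PINNED = {"torch": "1.8.1", "transformers": "4.6.1"}


def installed_version(name):
    """Return the module's __version__ string, or None if it cannot be imported."""
    try:
        module = importlib.import_module(name)
    except ImportError:
        return None
    return getattr(module, "__version__", None)


def check_versions(pinned, get_version=installed_version):
    """Return {package: (expected, installed)} for every package that does not match."""
    mismatches = {}
    for name, expected in pinned.items():
        installed = get_version(name)
        # Local build tags such as "1.8.1+cu101" still count as a match.
        if installed is None or installed.split("+")[0] != expected:
            mismatches[name] = (expected, installed)
    return mismatches
```

Calling `check_versions(PINNED)` returns an empty dict when both packages match; otherwise each entry shows the expected and the installed version.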
We trained the model on the following platform:
- OS: Ubuntu 16.04.3 LTS (GNU/Linux 4.4.0-98-generic x86_64)
- CUDA Version: 10.1
- GPU: NVIDIA Tesla V100
The full data can be downloaded from THUcloud.
The initial checkpoint of GPT2 can be downloaded from HuggingFace. We provide our checkpoints on THUcloud.
The 1st stage (get the premature checkpoint): Execute the following command (or run bash ./ directly):
data_name=wikitext
env CUDA_VISIBLE_DEVICES=0 python3 -u ./ \
    --model_name_or_path gpt2 \
    --train_file ./${data_name}_data/train.txt \
    --validation_file ./${data_name}_data/val.txt \
    --do_train \
    --do_eval \
    --num_train_epochs 100 \
    --max_eval_samples 1000 \
    --dataloader_num_workers 64 \
    --per_device_train_batch_size 8 \
    --per_device_eval_batch_size 8 \
    --gradient_accumulation_steps 2 \
    --output_dir ./${data_name}_f0_ckpt \
    --logging_steps 5 \
    --learning_rate 1e-4 \
    --lr_scheduler_type linear \
    --evaluation_strategy epoch \
    --save_strategy epoch \
    --cache ../cache
The 1st training stage is exactly the same as fine-tuning the standard GPT2 model.
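As a quick worked example of the batch settings in the training command: with a per-device batch size of 8, 2 gradient accumulation steps, and a single visible GPU (CUDA_VISIBLE_DEVICES=0), each optimizer update sees 16 examples. The helper below is a hypothetical illustration of that arithmetic, not code from this repository.

```python
def effective_batch_size(per_device_batch_size, gradient_accumulation_steps, num_devices=1):
    """Number of training examples contributing to one optimizer update."""
    return per_device_batch_size * gradient_accumulation_steps * num_devices


# Values taken from the training command: one GPU, batch size 8, 2 accumulation steps.
print(effective_batch_size(8, 2, num_devices=1))  # prints 16
```

If you train on more GPUs or change the per-device batch size, adjusting the accumulation steps so that this product stays the same is one way to keep the setup comparable.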
The 2nd stage (get the final model): Execute the following command (or run bash ./ directly):
data_name=wikitext
env CUDA_VISIBLE_DEVICES=0 python3 -u ./ \
    --model_name_or_path ./${data_name}_f0_ckpt \
    --model_name_or_path2 ./${data_name}_f0_ckpt \
    --train_file ./${data_name}_data/train.txt \
    --validation_file ./${data_name}_data/val.txt \
    --do_train \
    --do_eval \
    --num_train_epochs 100 \
    --max_eval_samples 1000 \
    --dataloader_num_workers 64 \
    --per_device_train_batch_size 8 \
    --per_device_eval_batch_size 8 \
    --gradient_accumulation_steps 2 \
    --output_dir ./${data_name}_selfcont_ckpt \
    --logging_steps 5 \
    --learning_rate 1e-4 \
    --lr_scheduler_type linear \
    --evaluation_strategy epoch \
    --save_strategy epoch \
    --cache ../cache
Execute the following command to generate texts (or run bash ./ directly):
bsz=16 # batch size
topp=0 # p of top-p sampling, 0 means greedy decoding
python3 ./ $model_ckpt_path $result_file $device $bsz $task_name $data_ipt_file $topp
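To illustrate what the topp argument controls, here is a minimal pure-Python sketch of top-p (nucleus) sampling over a single next-token distribution, with topp=0 falling back to greedy decoding as described above. The function name `sample_top_p` is hypothetical, and the generation script in this repository may implement the same idea differently (for example on tensors and in batches).

```python
import random


def sample_top_p(probs, top_p, rng=random):
    """Pick a token index from `probs`, a list of probabilities that sums to 1."""
    if top_p <= 0:
        # Greedy decoding: always take the most likely token.
        return max(range(len(probs)), key=probs.__getitem__)
    # Sort token indices from most to least likely.
    order = sorted(range(len(probs)), key=probs.__getitem__, reverse=True)
    nucleus, cumulative = [], 0.0
    for idx in order:
        nucleus.append(idx)
        cumulative += probs[idx]
        if cumulative >= top_p:
            # The smallest set of top tokens whose total mass reaches top_p.
            break
    # Sample within the nucleus, proportionally to the original probabilities.
    weights = [probs[i] for i in nucleus]
    return rng.choices(nucleus, weights=weights, k=1)[0]
```

With `probs = [0.1, 0.6, 0.3]`, `sample_top_p(probs, 0)` always returns index 1, while `sample_top_p(probs, 0.8)` samples only from indices 1 and 2, because the least likely token falls outside the nucleus.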
Execute the following command for evaluation:
cd ./eval
python3 ./
You can change result_list in the script to specify the results you want to evaluate.
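This README does not list the metrics that the evaluation script reports. Since the paper targets repetition, one commonly used measure is the fraction of duplicated n-grams in a generated text. The sketch below is an assumption for illustration only: the function `repetition_rate` is hypothetical and is not taken from the evaluation script.

```python
def repetition_rate(tokens, n):
    """Fraction of n-grams in `tokens` that duplicate an earlier n-gram."""
    ngrams = [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]
    if not ngrams:
        # Texts shorter than n tokens contain no n-grams, hence no repetition.
        return 0.0
    return 1.0 - len(set(ngrams)) / len(ngrams)


# 6 bigrams, 4 of them unique, so one third of the bigrams are repeats.
print(repetition_rate("the cat sat on the cat sat".split(), 2))
```

A value of 0 means every n-gram is unique; values closer to 1 indicate heavier repetition.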
Please kindly cite our paper if you find the paper or this code helpful.
title={Mitigating the Learning Bias towards Repetition by Self-Contrastive Training for Open-Ended Generation},
author={Jian Guan and Minlie Huang},