NYCU-MLLab/Learning-Diffusion-Transitions-for-Continuous-Discrete-Denoising-Process

Setup

The code is based on PyTorch and Hugging Face Transformers. Install the dependencies with:

pip install -r requirements.txt

DiffuSeq Training

cd scripts
bash train.sh

Arguments (an example invocation follows the list):

  • --dataset: the name of the dataset, used for notation only
  • --data_dir: the path to the saved dataset folder, containing train.jsonl, valid.jsonl, and test.jsonl
  • --seq_len: the maximum length of the sequence $z$ ($x \oplus y$, the source concatenated with the target)
  • --resume_checkpoint: if not none, restore this checkpoint and continue training
  • --vocab: the tokenizer: initialize from bert, or load your own preprocessed vocabulary (e.g., built with BPE)
  • --learned_mean_embed: whether to use the learned soft absorbing state
  • --denoise: whether to add discrete noise
  • --use_fp16: whether to use mixed-precision training
  • --denoise_rate: the denoise rate (default 0.5); has no effect in this version
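
For reference, a training invocation inside train.sh might look like the following sketch. The entry-point name (train.py) and the dataset values are illustrative assumptions, not taken from this repo; check train.sh for the actual command.

# Hypothetical example; entry point and values are placeholders
python train.py \
  --dataset qqp \
  --data_dir datasets/qqp \
  --seq_len 128 \
  --vocab bert \
  --learned_mean_embed True \
  --denoise True \
  --use_fp16 True \
  --denoise_rate 0.5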

Decoding

Performs the full 2000-step diffusion process. This achieves higher quality than Speed-up Decoding, at the cost of slower sampling.

cd scripts
bash run_decode.sh
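
The number of diffusion steps is set inside run_decode.sh. A sketch of what the underlying sampling command might look like is shown below; the script name (sample.py) and flag names are assumptions for illustration only.

# Hypothetical example; see run_decode.sh for the actual entry point and flags
python sample.py \
  --model_path checkpoints/model.pt \
  --step 2000 \
  --seed 123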

Speed-up Decoding

We adapt DPM-Solver++ to DiffuSeq to accelerate sampling: the high-order solver lets decoding use far fewer steps than the full 2000-step process, trading a small amount of quality for speed.

cd scripts
bash run_decode_solver.sh
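
The solver variant uses the same kind of invocation but with a much smaller step budget; DPM-Solver++ typically needs on the order of 10-50 steps instead of 2000. As above, the entry point and flag names are illustrative assumptions, not taken from this repo.

# Hypothetical example; only the step budget changes vs. full decoding
python sample.py \
  --model_path checkpoints/model.pt \
  --step 20 \
  --seed 123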
