
# Multi-turn-Dialogue-Response


## Installation

Install the dependencies from `requirements.txt`:

```shell
conda install --yes --file requirements.txt
```

## Pre-processing

- Download pretrained GloVe embeddings and save them in `/vectors`.
- The preprocessed dataset is saved as `/data/ED/dataset_preproc.p`. If you want to build the dataset yourself, or change the knowledge types generated by COMET, delete this file, then download the COMET checkpoint and place it in `/data/ED/Comet`. The preprocessed dataset will be regenerated when you run the training script. We use the BART-based COMET checkpoint, since the GPT-2 version cannot be used here.
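The preprocessed dataset is a single pickle file loaded with Python's `pickle` module. The sketch below is an illustration only — the real schema is whatever the preprocessing script emits, and the split/field names here are assumptions:

```python
import os
import pickle
import tempfile

# Toy stand-in for /data/ED/dataset_preproc.p. The real file is produced
# by the preprocessing script; these split and field names are illustrative.
toy_dataset = {
    "train": [{"context": ["Hi there"], "target": "Hello!"}],
    "valid": [],
    "test": [],
}

path = os.path.join(tempfile.mkdtemp(), "dataset_preproc.p")
with open(path, "wb") as f:
    pickle.dump(toy_dataset, f)

# Loading mirrors what the training script does once the file exists:
with open(path, "rb") as f:
    data = pickle.load(f)

print(sorted(data.keys()))  # ['test', 'train', 'valid']
```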

## Dataset

In this study we use two different datasets: EmpatheticDialogues and DailyDialog. In this repository, the config and mappings are set to EmpatheticDialogues.

### Data Resource

### Data Structure

The data folder is organized as follows:

```
data
|--- ED
|    |--- Comet
|    |--- emp.pkl
|    |--- train.csv
|    |--- valid.csv
|    |--- test.csv
|
|--- DD
|    |--- Comet
|    |--- dd.pkl
|    |--- train.csv
|    |--- valid.csv
|    |--- test.csv
```

### Directory and File Descriptions

- `data/`: The root directory containing all data files.
  - `ED/`: Files for the EmpatheticDialogues dataset.
    - `Comet/`: The COMET package used to generate commonsense knowledge.
    - `emp.pkl`: A pickle file of topic appearance probabilities for EmpatheticDialogues.
    - `train.csv`: The training split.
    - `valid.csv`: The validation split.
    - `test.csv`: The test split.
  - `DD/`: Files for the DailyDialog dataset.
    - `Comet/`: The COMET package used to generate commonsense knowledge.
    - `dd.pkl`: A pickle file of topic appearance probabilities for DailyDialog.
    - `train.csv`: The training split.
    - `valid.csv`: The validation split.
    - `test.csv`: The test split.
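To make the file roles concrete, here is a minimal sketch of reading a split CSV and a topic-probability pickle. The column names (`context`, `response`) and the topic keys are hypothetical — the real ones come from the dataset releases and the preprocessing code:

```python
import csv
import io
import pickle

# Hypothetical miniature of data/ED/train.csv; real column names may differ.
csv_text = "context,response\nI lost my keys,That sounds frustrating!\n"
rows = list(csv.DictReader(io.StringIO(csv_text)))
print(rows[0]["response"])  # That sounds frustrating!

# emp.pkl / dd.pkl store topic appearance probabilities; a toy stand-in:
topic_probs = {"afraid": 0.031, "proud": 0.029}
blob = pickle.dumps(topic_probs)
restored = pickle.loads(blob)
print(restored["proud"])  # 0.029
```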

## Training

```shell
python main.py --cuda --save_path save/your_dir
```

## Testing

```shell
python main.py --cuda --test --save_path save/your_dir --model_path save/dir_save/KITM_XXXX.XXX
```

Please feel free to contact us at [email protected].