Multi-turn-Dialogue-Response


Install requirements.txt

conda install --yes --file requirements.txt

Pre-processing

  • Download the pretrained GloVe embeddings and save them in /vectors.
  • The preprocessed dataset is saved as /data/ED/dataset_preproc.p. If you want to create the dataset yourself or change the knowledge types generated by COMET, delete this file, download the COMET checkpoint, and place it in /data/ED/Comet. The preprocessed dataset will then be regenerated when the training script runs. Here, we use the BART version, since the GPT-2 version cannot be used.
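The regeneration step described above can be sketched as a small helper. This is only an illustration, not part of the repository: the function name `reset_preprocessed_cache` and its return value are hypothetical, and it assumes the paths /vectors and /data/ED are relative to the repository root.

```python
from pathlib import Path

# Paths as given in this README, assumed to be relative to the repository root.
CACHE_FILE = Path("data/ED/dataset_preproc.p")
COMET_DIR = Path("data/ED/Comet")
GLOVE_DIR = Path("vectors")


def reset_preprocessed_cache(root="."):
    """Delete the cached dataset so that the next training run rebuilds it.

    Hypothetical helper. Returns a list of warnings about prerequisites
    that appear to be missing (GloVe vectors, COMET checkpoint folder).
    """
    root = Path(root)
    warnings = []

    glove = root / GLOVE_DIR
    if not glove.is_dir() or not any(glove.iterdir()):
        warnings.append(f"no GloVe embeddings found in {glove}")

    comet = root / COMET_DIR
    if not comet.is_dir():
        warnings.append(f"no COMET checkpoint folder at {comet}")

    cache = root / CACHE_FILE
    if cache.exists():
        cache.unlink()  # the training script is expected to regenerate this file

    return warnings
```

Calling `reset_preprocessed_cache()` from the repository root and checking that the returned list is empty before training is one way to catch a missing download early.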

Dataset

In this study, we use two datasets: EmpatheticDialogues and DailyDialog. In this repository, the config and mapping are set to EmpatheticDialogues.

Data Resource

Data Structure

The data folder is organized as follows:

data
|--- ED
|    |--- Comet
|    |--- emp.pkl
|    |--- train.csv
|    |--- valid.csv
|    |--- test.csv
|    
|--- DD
|    |--- Comet
|    |--- dd.pkl
|    |--- train.csv
|    |--- valid.csv
|    |--- test.csv

Directory and File Descriptions

  • data/: The root directory containing all data files.

    • ED/: This directory contains files for the "EmpatheticDialogues" dataset.

      • Comet/: A subdirectory containing the "Comet" package used to generate common-sense knowledge.

      • emp.pkl: A pickle file containing topic appearance probabilities for the "EmpatheticDialogues" dataset.

      • train.csv: The training data file for the "EmpatheticDialogues" dataset.

      • valid.csv: The validation data file for the "EmpatheticDialogues" dataset.

      • test.csv: The testing data file for the "EmpatheticDialogues" dataset.

    • DD/: This directory contains files for the "DailyDialog" dataset.

      • Comet/: A subdirectory containing the "Comet" package used to generate common-sense knowledge.

      • dd.pkl: A pickle file containing topic appearance probabilities for the "DailyDialog" dataset.

      • train.csv: The training data file for the "DailyDialog" dataset.

      • valid.csv: The validation data file for the "DailyDialog" dataset.

      • test.csv: The testing data file for the "DailyDialog" dataset.
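Before training, it may help to confirm that the data folder matches the layout described above. The following is a minimal sketch under that assumption; `find_missing` is a hypothetical helper, not something provided by this repository, and it only checks that the entries exist, not that their contents are valid.

```python
from pathlib import Path

# Expected entries per dataset folder, copied from the tree in this README.
EXPECTED = {
    "ED": ["Comet", "emp.pkl", "train.csv", "valid.csv", "test.csv"],
    "DD": ["Comet", "dd.pkl", "train.csv", "valid.csv", "test.csv"],
}


def find_missing(data_root="data"):
    """Return the expected entries (as 'ED/train.csv' style strings) that do not exist."""
    root = Path(data_root)
    missing = []
    for dataset, entries in EXPECTED.items():
        for entry in entries:
            if not (root / dataset / entry).exists():
                missing.append(f"{dataset}/{entry}")
    return missing


if __name__ == "__main__":
    for path in find_missing("data"):
        print(f"missing: data/{path}")
```

An empty result means every file and folder listed in the tree is present.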

Training

python main.py --cuda --save_path save/your_dir

Testing

python main.py --cuda --test --save_path save/your_dir --model_path save/dir_save/KITM_XXXX.XXX
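If you launch runs from another script, the two commands above can be assembled programmatically. This is a sketch only: `build_command` is a hypothetical helper, it uses only the flags shown in this README, and treating `--cuda` as optional is an assumption that has not been checked against main.py.

```python
def build_command(save_path, model_path=None, cuda=True):
    """Build the argument list for a training run, or a test run if model_path is given.

    Hypothetical helper; mirrors the flag order of the commands in this README.
    """
    cmd = ["python", "main.py"]
    if cuda:  # assumption: the script may also run without --cuda
        cmd.append("--cuda")
    if model_path is not None:
        cmd.append("--test")
    cmd += ["--save_path", save_path]
    if model_path is not None:
        cmd += ["--model_path", model_path]
    return cmd


# Example: pass the list to subprocess.run(cmd, check=True) from the repository root.
train_cmd = build_command("save/your_dir")
```

Passing the arguments as a list avoids shell quoting problems when the save path contains spaces.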

Please feel free to contact us via [email protected]

About

Master Thesis
