Repository for NeSy2024 paper Valid Text-to-SQL Generation with Unification-based DeepStochLog

ML-KULeuven/deepstochlog-lm

DeepStochLog-LM

DeepStochLog-LM is a neuro-symbolic framework that combines grammars, logic, probabilities, and language models. It is an extension of DeepStochLog. By writing a DeepStochLog-LM program, one can train a language model with the background knowledge given in the program. We demonstrate DeepStochLog-LM on the text-to-SQL task, where it provides a validity guarantee. For more information, consult the NeSy2024 paper "Valid Text-to-SQL Generation with Unification-based DeepStochLog".

Installation

Installing SWI Prolog

sudo apt-add-repository ppa:swi-prolog/stable
sudo apt-get update
sudo apt-get install swi-prolog
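
Before building the Conda environment, it can help to confirm that SWI-Prolog is actually reachable. The following is a minimal sketch, not part of this repository: it assumes the package installs its usual `swipl` executable on your PATH, and the helper names `find_swipl` and `swipl_version` are hypothetical.

```python
import shutil
import subprocess


def find_swipl(executable="swipl"):
    """Return the full path of the SWI-Prolog executable, or None if it is not on PATH.

    Assumption: the apt package installs the interpreter under the name `swipl`.
    """
    return shutil.which(executable)


def swipl_version(path):
    """Run `<path> --version` and return the first line it prints (empty string if none)."""
    result = subprocess.run(
        [path, "--version"], capture_output=True, text=True, check=True
    )
    lines = result.stdout.strip().splitlines()
    return lines[0] if lines else ""


if __name__ == "__main__":
    path = find_swipl()
    if path is None:
        print("swipl not found on PATH; install SWI-Prolog first")
    else:
        print(f"found {path}: {swipl_version(path)}")
```

If the check reports that `swipl` is missing, rerun the three installation commands above and open a new shell before retrying.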

Build Conda environment

conda env create -f deepstochlog-lm.yml

Data

Download the Spider dataset and put it under "data/".
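
A quick way to verify the download is to check for the files you expect before running any preprocessing. The sketch below is illustrative only: the entry names listed are an assumption based on the standard Spider release, and this README does not state the exact layout the scripts expect under "data/", so adjust the root path and the list to match your copy. The helper name `missing_entries` is hypothetical.

```python
from pathlib import Path

# Assumption: top-level entries of the standard Spider release.
# The exact location the repository expects under data/ is not stated here.
ASSUMED_SPIDER_ENTRIES = ("tables.json", "train_spider.json", "dev.json", "database")


def missing_entries(root, expected=ASSUMED_SPIDER_ENTRIES):
    """Return the names in `expected` that do not exist directly under `root`."""
    root = Path(root)
    return [name for name in expected if not (root / name).exists()]


if __name__ == "__main__":
    missing = missing_entries("data")
    if missing:
        print("missing under data/:", ", ".join(missing))
    else:
        print("all assumed Spider entries are present")
```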

Run tasks

Schema and data preparation

  • For task 1
python src/process.py --task 1
  • For task 2
python src/process.py --task 2

Task 1

  • For setting 1, with BERT as the selection switcher
python src/task1/task1.py --ss_type "bert"
  • For setting 2, with T5 as the selection switcher
python src/task1/task1.py
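
The preparation step and the Task 1 run can be chained from a small driver script. This is a sketch under stated assumptions, not something shipped with the repository: it only assembles the commands documented above (`--task` for "src/process.py", `--ss_type` for "src/task1/task1.py"), it assumes it is launched from the repository root, and the helper names are hypothetical.

```python
import subprocess
import sys


def preprocess_command(task, python=sys.executable):
    """Build the documented preparation command for the given task number."""
    return [python, "src/process.py", "--task", str(task)]


def task1_command(ss_type=None, python=sys.executable):
    """Build the Task 1 command.

    Pass ss_type="bert" for setting 1; omit it for setting 2 (T5),
    matching the two invocations documented above.
    """
    cmd = [python, "src/task1/task1.py"]
    if ss_type is not None:
        cmd += ["--ss_type", ss_type]
    return cmd


def run_all(commands, runner=subprocess.run):
    """Run each command in order, stopping at the first failure."""
    for cmd in commands:
        runner(cmd, check=True)


if __name__ == "__main__":
    # Setting 1: prepare the Task 1 data, then run with BERT as the selection switcher.
    run_all([preprocess_command(1), task1_command("bert")])
```

Passing `check=True` makes a failed preparation step abort the run instead of starting Task 1 on incomplete data.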

Task 2

  • Train the language models using the data in "data/lms_task2" and the notebook "src/task2/train_lms.ipynb". Download the saved checkpoints and put them under "models/".
  • Generate outputs
python src/task2/task2.py
  • Evaluation
output/evaluate.bat

Outputs from the vanilla T5-small baseline are generated with "data/train_t5_baseline.json", "data/test_t5_baseline.json", and "src/task2/train_lms.ipynb". DAIL-SQL, DIN-SQL, Graphix-T5, and C3 outputs are from their original repos. T5-small+CFGs outputs are generated by "src/task2/task2.py" with context_sensitive = False.

Credits & Paper citation

The paper is also accepted to NeSy24. Please cite that version of the paper when the proceedings are out.

Acknowledgments

We would like to acknowledge the following sources for their contributions to this project:

  • Spider for the dataset and the official evaluation metrics
  • DeepStochLog for its implementation of NDCGs
