
Graph2Seq

This repository contains the code to replicate the results of the Graph2Seq paper.

Resources:

In addition to the main resources, I also checked the following references:

Requirements

  • python==3.9 was used for the implementation.
  • Other dependencies:
    • torch==1.13
    • torch_geometric==2.2
    • numpy==1.23
    • pandas==1.5
  • Please refer to requirements.txt for all the dependencies.

Project Structure

The Graph2Seq model consists of two main components. Below is an overview of the files implementing these components and their current state.

Graph2Seq:

  • Graph Encoder & Graph Embedding
    • graph_encoder.py: Two variations of a GNN model (GCN & Bi-GCN) are implemented. Bi-GCN follows the GNN architecture described in the paper; its underlying convolution layer is implemented in conv_layer.py. The graph encoder is complete, and its functionality can be tested separately by running the graph_encoder.py file.
  • Attention-Based Decoder
    • attention_decoder.py: This file contains the implementation of the attention-based decoder. To verify its functionality, it has been tested in Seq2Seq_model.py as the decoder of a sequence-to-sequence translation task.
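To illustrate the bidirectional idea behind Bi-GCN, here is a minimal, framework-agnostic sketch of one layer in NumPy. The function name, weight shapes, and mean aggregation are my assumptions for illustration; the actual implementation in graph_encoder.py and conv_layer.py may differ (e.g. in aggregator choice and number of hops). The core idea it shows is the one from the paper: aggregate over forward (successor) and backward (predecessor) neighbors separately, then concatenate the two directional views.

```python
import numpy as np

def bi_gcn_layer(h, adj, w_fwd, w_bwd):
    # h:   (n, d) node features
    # adj: (n, n) adjacency matrix of a directed graph, adj[i, j] = 1 for edge i -> j
    deg_out = adj.sum(axis=1, keepdims=True).clip(min=1)    # number of successors per node
    deg_in = adj.T.sum(axis=1, keepdims=True).clip(min=1)   # number of predecessors per node
    h_fwd = (adj @ h / deg_out) @ w_fwd    # mean over successors, forward-direction weights
    h_bwd = (adj.T @ h / deg_in) @ w_bwd   # mean over predecessors, backward-direction weights
    # concatenate the two directional views into one node embedding
    return np.tanh(np.concatenate([h_fwd, h_bwd], axis=1))
```

With `w_fwd` and `w_bwd` of shape `(d, k)`, the output has shape `(n, 2k)`; stacking several such layers yields the node embeddings consumed by the decoder.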
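The attention step at each decoding position can be sketched as follows. This is a simplified dot-product variant for illustration; attention_decoder.py may use a different score function (e.g. additive/Bahdanau-style) and batched tensors.

```python
import numpy as np

def attention_step(dec_state, node_embs):
    # dec_state: (d,) current decoder hidden state
    # node_embs: (n, d) node embeddings produced by the graph encoder
    scores = node_embs @ dec_state            # dot-product alignment score per node
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()                  # softmax over the n nodes
    context = weights @ node_embs             # attention-weighted sum of node embeddings
    return context, weights
```

The returned context vector is combined with the decoder state to predict the next output token, and the weights show which graph nodes the decoder attends to.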

Other files and their functionalities are as follows:

  • params.py: contains different parameters.
  • parser.py: parses the required arguments.
  • utils.py: contains some utility functions and classes.
  • 👀main.py: controls the main flow of the procedure that consists of:
    • Data loading and processing: The data should be loaded and converted to the format expected by the model in this part of the code. This part is incomplete 👀. However, I have written down the assumptions about the data format, which also specify the steps needed to prepare the data.
    • Model definition: Here, different components of the model, their corresponding optimizers, and the criterion are defined.
    • Training & validation: Here, training and validation take place. The procedure is implemented in the train.py file.
    • Testing: Here, the trained model is evaluated on the test split of the data. The evaluation is implemented in the eval.py file.
  • train.py: This file contains the training and validation procedure.
  • eval.py: This file contains the evaluation procedure.
  • 👀data_proc/data_loading.py: Here, the data should be loaded and processed into the correct format. The original data contains natural language questions, SQL queries, and SQL tables. The SQL queries need to be converted to graphs so that they can be used by the graph encoder.
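The SQL-to-graph step could look roughly like the sketch below. This is a hypothetical placeholder, not the repo's actual conversion scheme: it simply treats tokens as nodes and links adjacent tokens, whereas a real scheme would likely add structure-aware edges (e.g. clause- or table-level nodes).

```python
def sql_to_graph(sql):
    # Hypothetical sketch: nodes are SQL tokens, edges connect adjacent tokens.
    # The actual conversion in data_proc/data_loading.py may differ substantially.
    tokens = sql.replace(",", " , ").split()
    edges = [(i, i + 1) for i in range(len(tokens) - 1)]
    return tokens, edges
```

For example, `sql_to_graph("SELECT name FROM users WHERE age > 30")` yields one node per token and a chain of edges, which could then be fed to the graph encoder as node features plus an edge list.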