InterroLang

A TalkToModel (Slack et al., 2022) adaptation to NLP use cases (question answering, hate speech detection, dialogue act classification).
The name is a word-play on Interrobang, a ligature of question mark and exclamation mark, and the interrogation of Language models.
Our tool offers a dialogue-based exploration of NLP interpretability methods (feature attribution, counterfactuals and perturbations, free-text rationalization) and dataset analyses (similar examples, keywords, label distribution).
Our accompanying paper is accepted at EMNLP 2023 Findings.


InterroLang Interface

We consider 7 categories of operations.

  • About: Capabilities of our system.
  • Metadata: Information about the data, model, and labels.
  • Prediction: Various operations related to gold labels and model predictions.
  • Understanding: Keyword-based analysis and retrieval of similar instances.
  • Explanation: Feature attribution methods (local-, class-, global-level) and free-text rationalization.
  • Perturbation: Methods to change some parts of an instance, e.g. such that the label of the instance would change.
  • Custom input: Besides instances from the dataset, we also support instances provided by users. (See "How to use custom input" below for details.)

They are defined in actions and prompts.

Dataset Viewer

We provide a dataset view in which users can explore the instances of the pre-defined dataset (the screenshot uses the BoolQ dataset). Users can search for instances that contain the entered string.

Datasets / Use cases

Running with conda / virtualenv

Create the environment and install dependencies.

Conda

conda create -n interrolang python=3.9
conda activate interrolang

venv

python -m venv venv
source venv/bin/activate

Install the requirements

python -m pip install --upgrade pip
pip install -r requirements.txt

# Download omw-1.4
python -m nltk.downloader omw-1.4

# Punkt
python -m nltk.downloader punkt

# Install spacy
pip install -U pip setuptools wheel
pip install -U spacy
python -m spacy download en_core_web_sm
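After running the commands above, a quick import check can confirm that the packages landed in the active environment. The following is a minimal sketch, not part of the repository: the helper name `missing_modules` and the module list are assumptions for illustration. It only checks that the Python packages can be found; it does not verify that the NLTK data (omw-1.4, punkt) was downloaded.

```python
from importlib.util import find_spec

# Top-level modules installed by the steps above (assumed list, for illustration).
# en_core_web_sm is installed as a regular Python package by `spacy download`.
REQUIRED_MODULES = ["nltk", "spacy", "en_core_web_sm"]


def missing_modules(names=REQUIRED_MODULES):
    """Return the module names that cannot be found in the current environment."""
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules()
    if missing:
        print("Not installed:", ", ".join(missing))
    else:
        print("All required modules were found.")
```

An empty result means every listed module can be located by the interpreter of the active conda or venv environment.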

Because of known installation issues with polyjuice, install polyjuice-nlp and its dependencies with the provided script:

cd utils
bash dependency.sh

Download models

In our tool, we currently use the following Transformer models:

For the BoolQ and OLID models:

Put them under ./data and name the folders boolq_model and olid_model respectively.

For the Daily Dialog model:

Put the file 5e_5e-06lr under ./explained_models/da_classifier/saved_model
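Misplaced model files are an easy mistake to make, so it can help to check the layout before starting the app. The sketch below is not part of the repository; the helper name `missing_model_paths` is hypothetical, and the three paths are copied from the instructions above and interpreted relative to the repository root.

```python
from pathlib import Path

# Locations taken from the instructions above, relative to the repository root.
EXPECTED_MODEL_PATHS = [
    "data/boolq_model",
    "data/olid_model",
    "explained_models/da_classifier/saved_model/5e_5e-06lr",
]


def missing_model_paths(repo_root="."):
    """Return the expected model paths that do not exist under repo_root."""
    root = Path(repo_root)
    return [p for p in EXPECTED_MODEL_PATHS if not (root / p).exists()]


if __name__ == "__main__":
    missing = missing_model_paths()
    if missing:
        print("Missing model files or folders:")
        for path in missing:
            print(f"  {path}")
    else:
        print("All expected model paths are present.")
```

Run it from the repository root. Only the model for the dataset you actually configure is needed, so a missing entry is a problem only if it belongs to that dataset.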

Set up configuration

./configs contains the gin config files for all three datasets, each combined with different parsing models. Choose one of them and set its path in ./global_config.gin:

GlobalArgs.config = "./configs/boolq_adapter.gin"
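A typo in this path usually only surfaces when the app starts, so a small pre-flight check can be useful. The following sketch does not use gin itself; it reads the binding with a regular expression. The helper names `configured_path` and `check_global_config` are hypothetical, and it assumes the path is relative to the directory that contains global_config.gin (the repository root).

```python
import re
from pathlib import Path

# Matches a line such as: GlobalArgs.config = "./configs/boolq_adapter.gin"
# Commented-out lines (starting with #) are not matched.
CONFIG_LINE = re.compile(
    r'^\s*GlobalArgs\.config\s*=\s*["\']([^"\']+)["\']', re.MULTILINE
)


def configured_path(global_config_text):
    """Return the path bound to GlobalArgs.config, or None if no binding is found."""
    match = CONFIG_LINE.search(global_config_text)
    return match.group(1) if match else None


def check_global_config(global_config_file="global_config.gin"):
    """Return (path, exists) for the config that the global config file points to."""
    config_file = Path(global_config_file)
    path = configured_path(config_file.read_text(encoding="utf-8"))
    if path is None:
        return None, False
    # Assumption: the path is relative to the folder holding global_config.gin.
    return path, (config_file.parent / path).exists()


if __name__ == "__main__":
    path, exists = check_global_config()
    print(f"GlobalArgs.config = {path!r}, file exists: {exists}")
```

If the second value is `False`, the configured file name most likely does not match any file in ./configs.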

Run the application

You can launch the Flask web app via

python flask_app.py

Running with Docker

If you want to run with Docker, first build the image:

sudo docker build -t interrolang .

And then run the image

sudo docker run -d -p 4000:4000 interrolang
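Because the container starts in detached mode (`-d`), the app may need some time before it accepts requests. The sketch below polls the published port until it responds. It is not part of the repository: the function name `wait_for_app` is hypothetical, and it assumes the app answers plain HTTP requests on the root path of port 4000, which follows from the port mapping above but is not confirmed here.

```python
import time
import urllib.error
import urllib.request


def wait_for_app(url="http://localhost:4000/", timeout=60.0, interval=2.0):
    """Poll url until it returns any HTTP response or timeout seconds have passed."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            with urllib.request.urlopen(url, timeout=interval):
                return True
        except urllib.error.HTTPError:
            # The server answered, even if with an error status, so it is up.
            return True
        except (urllib.error.URLError, OSError):
            pass  # Not reachable yet; retry until the deadline.
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)


if __name__ == "__main__":
    if wait_for_app():
        print("The app is reachable.")
    else:
        print("The app did not respond in time; check `docker logs`.")
```

Any HTTP response counts as success here, since the goal is only to detect that the server is listening.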

User Guide

Once the project is set up, you can run an optional user guide that uses Selenium to demonstrate how to use our system. It requires the Chrome browser.

How to use custom input

Supported operations

  1. Feature importance on token level
  2. Feature importance on sentence level
  3. Prediction
  4. Similarity
  5. Rationalization

Process

1. Enter your custom input in the text area and click the send button. Note that you have to choose "Custom input" in the selection box.

2. After clicking the button, you can see your custom input in the terminal.

3. Then enter a prompt for one of the operations listed above. Operations that support custom input are highlighted with a yellow border.

4. Finally, click the send button to get the result. (In this example, we show the result of token-level feature importance for the given custom input.)

How to use the include operation

The include operation works similarly to custom input:

  1. Enter a single token in the text area and choose "Include" in the selection box. Then click the send button.
  2. After clicking the button, you can see the entered token in the interface.
  3. Then you can enter a prompt.
    • Supported operations:
      • countdata
      • label
      • mistake
      • predict
      • score
      • show

Cite InterroLang

@inproceedings{feldhus-etal-2023-interrolang,
    title = "{I}nterro{L}ang: Exploring {NLP} Models and Datasets through Dialogue-based Explanations",
    author = {Feldhus, Nils and Wang, Qianli and Anikina, Tatiana and Chopra, Sahil and Oguz, Cennet and M{\"o}ller, Sebastian},
    booktitle = "Findings of the Association for Computational Linguistics: EMNLP 2023",
    month = dec,
    year = "2023",
    address = "Singapore, Singapore",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/2310.05592",
}