SPIDer: Summarization Analysis with Partial Information Decomposition

Description

SPIDer is a framework that decomposes the mutual information contained in a multi-document summary into redundant, synergistic, union, and unique information. Our implementation is based on the Partial Information Decomposition (PID) approach defined in the paper "A Novel Approach to the Partial Information Decomposition".
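
As a rough orientation, the four quantities are commonly related as sketched below. The notation is an assumption made for illustration and is not taken from this repository: X_1, ..., X_n stand for the source documents, Y for the summary, I_\cap for the redundant information, and I_\cup for the union information. Unique information U_i and synergy S are then obtained as differences from ordinary mutual information terms. Please check the referenced paper for the exact definitions used by the implementation.

```latex
\begin{align}
U_i &= I(Y; X_i) - I_\cap(X_1; \dots; X_n \to Y) \\
S   &= I(Y; X_1, \dots, X_n) - I_\cup(X_1; \dots; X_n \to Y)
\end{align}
```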

Installation

Create a Python environment and run the following command to install the required packages:

pip install -r requirements.txt

Usage

The class MDSPID works as follows:

1. Define the dataset and the underlying language model you want to use.
2. Compute the needed sentence probabilities.
3. Run the computation of the PIDs (partial information decompositions).

Important: Before running, create the output folders outputs/precomputed, outputs/preprocessed_data, and outputs/results.
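
The three folders can be created by hand or with a few lines of Python. The snippet below is a minimal sketch using only the standard library. The helper name create_output_dirs and its root parameter are hypothetical, and the directory the folders should be relative to is an assumption, so adjust root to wherever you launch the run from.

```python
from pathlib import Path

# Folder names taken from this README.
OUTPUT_DIRS = (
    "outputs/precomputed",
    "outputs/preprocessed_data",
    "outputs/results",
)


def create_output_dirs(root="."):
    """Create the output folders under `root` (hypothetical helper).

    Existing folders are left untouched, so calling this twice is safe.
    """
    created = []
    for rel in OUTPUT_DIRS:
        path = Path(root) / rel
        # parents=True also creates the intermediate "outputs" folder.
        path.mkdir(parents=True, exist_ok=True)
        created.append(path)
    return created
```

Calling create_output_dirs() with no arguments creates the folders under the current working directory.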

Run:

python run_pid.py --mode run_all --config ../configs/config.yaml

After the run completes, the MDSPID instance with the computed PID is stored as a pickle file under outputs/results. To analyze the results, use the Jupyter notebook notebooks/pid_results.ipynb or the script spider_results.py.
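
If you prefer to inspect a stored result directly, the following sketch loads the most recently modified pickle from a results folder. It relies only on the standard library. The function name load_latest_result and the "*.pkl" file pattern are assumptions, since the exact file names written by the run are not documented here. Note that unpickling an MDSPID instance requires the class definition to be importable in your session, and that pickle files should only be loaded from sources you trust.

```python
import pickle
from pathlib import Path


def load_latest_result(results_dir="outputs/results", pattern="*.pkl"):
    """Load the most recently modified pickle in `results_dir`.

    `pattern` is an assumed file extension; change it to match the
    files your run actually produced.
    """
    candidates = sorted(
        Path(results_dir).glob(pattern),
        key=lambda p: p.stat().st_mtime,  # oldest first, newest last
    )
    if not candidates:
        raise FileNotFoundError(
            f"no files matching {pattern!r} in {results_dir}"
        )
    with candidates[-1].open("rb") as fh:
        return pickle.load(fh)
```

The returned object is whatever was pickled, so its attributes depend on the MDSPID class in this repository.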

The code to convert MultiRC into a multi-document summarization (MDS) dataset and compute the synergy scores is under notebooks/multiRC.

Citation

@inproceedings{mascarell-2024-which,
    title = "Which Information Matters? Dissecting Human-written Multi-document Summaries with Partial Information Decomposition",
    author = "Mascarell, Laura and L'Homme, Yan and El Helou, Majed",
    booktitle = "Findings of the Association for Computational Linguistics: ACL 2024",
    month = aug,
    year = "2024",
    address = "Bangkok, Thailand",
    publisher = "Association for Computational Linguistics",
}
