
# Familiarity-aware Evidence Compression for Retrieval Augmented Generation

We propose FaviComp (Familiarity-aware Evidence Compression), a novel training-free evidence compression technique that makes retrieved evidence more familiar to the target model while seamlessly integrating the model's parametric knowledge.
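Concretely, FaviComp decodes the compressed evidence by ensembling, at every step, the next-token distribution of the compression model (which sees the retrieved documents) with that of the target model (which does not, so its distribution reflects parametric knowledge). The sketch below is illustrative rather than this repository's implementation: the prompts are placeholders, and the direction in which `alpha` weights the two distributions is an assumption.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

name = "meta-llama/Meta-Llama-3-8B-Instruct"
tok = AutoTokenizer.from_pretrained(name)
model = AutoModelForCausalLM.from_pretrained(name, torch_dtype=torch.bfloat16)

# Hypothetical prompts: the compression prompt contains the retrieved
# evidence; the target prompt contains only the question.
comp_ids = tok("Compress the evidence. <docs> ... Question: ...",
               return_tensors="pt").input_ids
tgt_ids = tok("Background: ... Question: ...", return_tensors="pt").input_ids

alpha, generated = 0.5, []
with torch.no_grad():
    for _ in range(64):  # greedy decoding, kept short for brevity
        logp_c = model(comp_ids).logits[:, -1].log_softmax(-1)
        logp_t = model(tgt_ids).logits[:, -1].log_softmax(-1)
        # Token-level ensemble of the two next-token distributions.
        next_id = ((1 - alpha) * logp_c + alpha * logp_t).argmax(-1, keepdim=True)
        comp_ids = torch.cat([comp_ids, next_id], dim=-1)
        tgt_ids = torch.cat([tgt_ids, next_id], dim=-1)
        generated.append(next_id.item())
        if generated[-1] == tok.eos_token_id:
            break

print(tok.decode(generated, skip_special_tokens=True))
```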

## Installation

```bash
conda create -n favicomp python=3.12
conda activate favicomp
pip install -r requirements.txt
```

## Data

Data can be downloaded from this link. Place `data/` under the root directory.

## Run FaviComp

Below is an example script for the NQ dataset, with Llama-3-8B-Instruct as both the compression and target model. Make sure the compression and target models use the same tokenizer (a quick check is sketched after the parameter list). Change the parameters to run on other datasets and models.

```bash
python main.py \
    --model_name meta-llama/Meta-Llama-3-8B-Instruct \
    --target_model_name meta-llama/Meta-Llama-3-8B-Instruct \
    --alpha 0.5 \
    --batch_size 28 \
    --dataset nq
```
- `model_name`: Compression model name (e.g. `meta-llama/Llama-3.2-3B-Instruct`)
- `target_model_name`: Target model name (e.g. `meta-llama/Meta-Llama-3-8B-Instruct`)
- `alpha`: Ensemble coefficient balancing the compression and target models' next-token distributions
- `batch_size`: Batch size
- `dataset`: Dataset (`nq`, `tqa`, `hotpotqa`, `wiki`, `musique`)
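Because FaviComp ensembles the two models' next-token distributions, both models must map text to the same token ids. A minimal smoke test, using the example checkpoints above:

```python
from transformers import AutoTokenizer

comp_tok = AutoTokenizer.from_pretrained("meta-llama/Llama-3.2-3B-Instruct")
tgt_tok = AutoTokenizer.from_pretrained("meta-llama/Meta-Llama-3-8B-Instruct")

sample = "Familiarity-aware evidence compression."
# Same vocabulary size and identical ids on a sample string is a quick
# sanity check that the token-level ensemble is well defined.
assert len(comp_tok) == len(tgt_tok), "vocabulary sizes differ"
assert comp_tok.encode(sample) == tgt_tok.encode(sample), "tokenizations differ"
```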

## Evaluation

After running FaviComp, run the performance evaluation script below using the same parameters.

```bash
python evaluate.py \
    --model_name meta-llama/Llama-3.2-3B-Instruct \
    --target_model_name meta-llama/Meta-Llama-3-8B-Instruct \
    --alpha 0.5 \
    --dataset nq
```
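The five supported datasets are open-domain QA benchmarks, which are typically scored with exact match and token-level F1. For reference only (not necessarily the exact metrics `evaluate.py` reports), a SQuAD-style normalized exact-match check looks like this:

```python
import re
import string

def normalize(text: str) -> str:
    """Lowercase, drop punctuation and articles, collapse whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())

def exact_match(prediction: str, gold_answers: list[str]) -> bool:
    return any(normalize(prediction) == normalize(g) for g in gold_answers)

print(exact_match("The Eiffel Tower", ["Eiffel Tower"]))  # True
```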

Calculate perplexity of the compressed evidence using the script below.

```bash
python eval_ppl.py \
    --model_name meta-llama/Llama-3.2-3B-Instruct \
    --target_model_name meta-llama/Meta-Llama-3-8B-Instruct \
    --alpha 0.5 \
    --dataset nq
```
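Here, perplexity is the exponentiated average next-token negative log-likelihood of the compressed evidence under the target model; a lower value means the evidence is more familiar to that model. A minimal sketch, assuming a plain unconditional scoring pass (`eval_ppl.py`'s exact prompting may differ):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

name = "meta-llama/Meta-Llama-3-8B-Instruct"
tok = AutoTokenizer.from_pretrained(name)
model = AutoModelForCausalLM.from_pretrained(name, torch_dtype=torch.bfloat16)
model.eval()

compressed_evidence = "..."  # placeholder: one compressed evidence string

ids = tok(compressed_evidence, return_tensors="pt").input_ids
with torch.no_grad():
    # With labels=input_ids, the model returns the mean next-token
    # cross-entropy; exponentiating it gives perplexity.
    loss = model(ids, labels=ids).loss
print(torch.exp(loss).item())
```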