Authors' source code for the ACM MM 2024 paper "Few-Shot Multimodal Explanation for Visual Question Answering" (FS-MEVQA).
We release the SME dataset in `dataset/dataset.zip`, including questions, answers, and multimodal explanations. You can also download the dataset from Hugging Face Datasets. The images should be downloaded from the GQA source.
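To get started locally, here is a minimal sketch that inspects and unpacks the released archive; it makes no assumption about the file names inside the zip, so list them first:

```python
# Inspect and extract the released SME dataset; the inner file names are
# not assumed here -- list them before loading anything.
import zipfile

with zipfile.ZipFile("dataset/dataset.zip") as zf:
    print(zf.namelist())        # see which question/answer/explanation files exist
    zf.extractall("dataset/")   # unpack next to the archive
```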
If you are interested in our dataset construction, you can refer to `1extract_semantic_structure.py`, `2semantic_tree_to_text.py`, `3correct_some_issues.py`, `4separate_text_and_box.py`, `5complete_grammar.py`, `6GPT_check_grammar.py`, and `7Add_mannual_corrections.py` in the `dataset` folder.
We provide an easy-to-use end-to-end pipeline in `pipeline.py`. Simply set the image path and question in the script and run it, as sketched below.
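A sketch of the intended usage, assuming the inputs are set as top-level variables near the top of the script; the variable names below are hypothetical, not the script's real identifiers:

```python
# Hypothetical configuration edit at the top of pipeline.py; the variable
# names are assumptions about where the inputs are set.
IMAGE_PATH = "images/n161313.jpg"          # a GQA image on disk
QUESTION = "What is the person holding?"   # the question to answer and explain
```

Running `python pipeline.py` afterwards should produce the answer together with its multimodal explanation.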
If you want to reproduce our experiments on the SME dataset, run `1generate_program.py`, `2generate_process.py`, `3intepretation.py`, and `4revover_boxes.py` in their numbered order, which is much faster than running samples one by one.
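The numeric prefixes indicate the execution order, so a small driver can run all four stages back to back. This is a convenience sketch, not part of the release; the file names are copied from the repository as-is:

```python
# Run the four experiment stages in their numbered order.
import subprocess

STAGES = [
    "1generate_program.py",
    "2generate_process.py",
    "3intepretation.py",
    "4revover_boxes.py",
]

for script in STAGES:
    print(f"--- running {script} ---")
    subprocess.run(["python", script], check=True)  # stop on the first failure
```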
We provide our results in `results/MEAgent_results.json`.
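The results file is plain JSON, so it can be inspected directly; the record structure is not assumed here, so check a sample entry before relying on specific fields:

```python
import json

with open("results/MEAgent_results.json") as f:
    results = json.load(f)

# Field layout is an assumption-free inspection step: look at one entry first.
print(type(results), len(results))
```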
We provide the evaluation script in `evaluation.py`. The detailed metrics are implemented in `language_metrics.py`, `visual_metrics.py`, and `attribution_metric.py` in the `metrics` folder.
Furthermore, you should download the `pycocoevalcap` package and put it in `metrics`, which is needed for language evaluation.
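For reference, the `pycocoevalcap` scorers share a common interface: each takes two dicts mapping sample IDs to lists of strings (references and a single candidate). A minimal BLEU sketch with toy data, not our evaluation setup:

```python
# Minimal BLEU scoring sketch with pycocoevalcap; the sentences are toy examples.
from pycocoevalcap.bleu.bleu import Bleu

# Both dicts map a sample ID to a list of strings; candidates must have one entry.
gts = {"0": ["the man is holding a red umbrella"]}
res = {"0": ["a man holds a red umbrella"]}

score, _ = Bleu(4).compute_score(gts, res)  # returns [BLEU-1, ..., BLEU-4]
print(score)
```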
We provide the implementation of GPT-4V for our task in `GPT4V_baseline.py`.
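For orientation, a generic GPT-4V request through the OpenAI Python SDK looks like the sketch below. This is not the released baseline script; the prompt wording and model name are assumptions:

```python
# Generic GPT-4V request sketch (OpenAI Python SDK >= 1.0); not the released
# baseline -- prompt wording and model name are assumptions.
import base64
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

with open("images/example.jpg", "rb") as f:
    image_b64 = base64.b64encode(f.read()).decode()

response = client.chat.completions.create(
    model="gpt-4-vision-preview",  # model name is an assumption
    messages=[{
        "role": "user",
        "content": [
            {"type": "text",
             "text": "Answer the question and explain with visual evidence: "
                     "What is the person holding?"},
            {"type": "image_url",
             "image_url": {"url": f"data:image/jpeg;base64,{image_b64}"}},
        ],
    }],
)
print(response.choices[0].message.content)
```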