DiversityMetrics

This repository implements the self-CIDEr and LSA-based diversity metrics (Python 2.7 only). If you find it helpful for your work, please cite the paper: Qingzhong Wang and Antoni Chan. Describing like humans: on diversity in image captioning. CVPR, 2019.

Note

Computing the CIDEr score requires a TF-IDF file. In our paper, the TF-IDF weights are obtained from the MSCOCO training set. To compute diversity, multiple captions must be generated for each image, and the file format must match that of ./results/merge_results.json.
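Before running the evaluation, it can help to confirm that every image really has several captions. The sketch below is only illustrative: it assumes the results file is a COCO-style list of records with `image_id` and `caption` keys, which may not match the actual layout of merge_results.json, and the helper names (`load_records`, `captions_per_image`, `images_with_too_few`) are hypothetical and not part of this repository.

```python
from __future__ import print_function

import json
from collections import defaultdict


def load_records(path):
    """Load a JSON results file (assumed to hold a list of caption records)."""
    with open(path) as f:
        return json.load(f)


def captions_per_image(records):
    """Group caption strings by image id.

    Assumes each record looks like {"image_id": ..., "caption": ...};
    adjust the keys if merge_results.json uses a different layout.
    """
    grouped = defaultdict(list)
    for rec in records:
        grouped[rec["image_id"]].append(rec["caption"])
    return dict(grouped)


def images_with_too_few(grouped, minimum=2):
    """Return the sorted image ids that have fewer than `minimum` captions."""
    return sorted(k for k, caps in grouped.items() if len(caps) < minimum)
```

For example, `images_with_too_few(captions_per_image(load_records("./results/merge_results.json")))` would list the images that cannot contribute to a diversity score under these assumptions.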

Evaluation

  1. Generate multiple captions for each image, for example 10 per image.
  2. Put the JSON file in ./results and make sure its format matches that of merge_results.json.
  3. Download the TF-IDF file from this link and put it in ./data. Download the MSCOCO validation annotation file and put it in ./annotations.
  4. Fill in the information in params.json.
  5. Run accuracy_evalscript.py to obtain the accuracy, or diversity_evalscript.py to obtain the diversity.
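To give a feel for what an LSA-style diversity score measures, here is a rough sketch. It is not the code in diversity_evalscript.py. It assumes the definition in the cited paper is the negated log, base m, of the ratio between the largest singular value and the sum of all singular values of a bag-of-words matrix built from the m captions of one image. The function name `lsa_diversity` and the whitespace tokenization are assumptions made for illustration, and NumPy is assumed to be installed.

```python
from __future__ import division, print_function

import math

import numpy as np


def lsa_diversity(captions):
    """Rough LSA-style diversity for m >= 2 non-empty captions of one image.

    Returns a value near 0 when all captions are identical and near 1
    when the captions share no words (under the assumed definition).
    """
    m = len(captions)
    tokens = [c.lower().split() for c in captions]  # naive tokenization
    vocab = sorted({w for toks in tokens for w in toks})
    index = {w: i for i, w in enumerate(vocab)}

    # Bag-of-words count matrix: one row per word, one column per caption.
    M = np.zeros((len(vocab), m))
    for j, toks in enumerate(tokens):
        for w in toks:
            M[index[w], j] += 1.0

    # Singular values only; the singular vectors are not needed.
    s = np.linalg.svd(M, compute_uv=False)
    r = s.max() / s.sum()
    return -math.log(r) / math.log(m)
```

Under the same assumption, the self-CIDEr variant would replace the bag-of-words matrix with a kernel matrix of pairwise CIDEr scores and use the square roots of its eigenvalues in place of the singular values.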

References

Acknowledgement

  • Ramakrishna Vedantam (Virginia Tech)
  • MSCOCO Caption Evaluation Team (Xinlei Chen (CMU), Hao Fang (University of Washington), Tsung-Yi Lin (Cornell))

Citation

If this is helpful for your work, please cite our paper as:

@InProceedings{Wang_2019_CVPR,
  author = {Wang, Qingzhong and Chan, Antoni B.},
  title = {Describing Like Humans: On Diversity in Image Captioning},
  booktitle = {The IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
  month = {June},
  year = {2019}
  }