In-Image Learning

Code for the paper "All in an Aggregated Image for In-Image Learning".

Requirements

pip install -r requirements.txt

Download Dataset

The processed dataset and demonstration examples are available from this link. After downloading, unzip the file and place the dataset directory in the project root:

----IIL
    |----dataset
    |----src
    ...
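
As a quick sanity check on the layout above, the following is a minimal sketch (not part of the repository) that reports which of the two directories shown in the tree are missing. The helper name `missing_dirs` is hypothetical, and the sketch assumes it is run from the project root.

```python
from pathlib import Path

# Directories shown under the project root in the layout above.
EXPECTED_DIRS = ("dataset", "src")


def missing_dirs(project_root):
    """Return the expected sub-directories that are absent under project_root.

    Hypothetical helper for illustration; the repository does not ship it.
    """
    root = Path(project_root)
    return [name for name in EXPECTED_DIRS if not (root / name).is_dir()]


if __name__ == "__main__":
    # Assumes the current working directory is the project root.
    missing = missing_dirs(".")
    if missing:
        print("Missing directories:", ", ".join(missing))
    else:
        print("Layout looks as expected.")
```

If `dataset` is reported as missing, the archive was probably unzipped somewhere other than the project root.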

Run In-Image Learning and Baselines

In-Image Learning

python run_iil.py --exp_name exp_on_mv --dataset mathvista --lt few_shot

Visual-text interleaved in-context learning

python run_vticl.py --exp_name exp_on_mv --dataset mathvista --lt few_shot

Text-only in-context learning

python run_ticl.py --exp_name exp_on_mv --dataset mathvista --lt few_shot
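
To compare the three methods under the same settings, the commands above can be launched from one small driver. This is a sketch under assumptions: the script names and flags are copied from the commands above, while the helper names `build_command` and `run_all` are hypothetical, and the driver is assumed to be started from the directory that contains the three scripts. It defaults to a dry run that only prints the commands.

```python
import subprocess
import sys

# Script names taken from the commands above.
SCRIPTS = ("run_iil.py", "run_vticl.py", "run_ticl.py")


def build_command(script, exp_name="exp_on_mv", dataset="mathvista", lt="few_shot"):
    """Build the argument list for one run, using the flags shown above."""
    return [
        sys.executable, script,
        "--exp_name", exp_name,
        "--dataset", dataset,
        "--lt", lt,
    ]


def run_all(dry_run=True):
    """Print each command and, unless dry_run is set, execute it in turn."""
    for script in SCRIPTS:
        cmd = build_command(script)
        print(" ".join(cmd))
        if not dry_run:
            # check=True stops at the first script that exits with an error.
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    run_all(dry_run=True)
```

Call `run_all(dry_run=False)` to actually run the experiments. Whether the three scripts write to separate output locations when given the same `--exp_name` is not stated here, so check that before relying on a shared name.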

Cite

If you find In-Image Learning useful for your research or applications, please cite it using this BibTeX entry:

@misc{wang2024single,
      title={All in a Single Image: Large Multimodal Models are In-Image Learners}, 
      author={Lei Wang and Wanyu Xu and Zhiqiang Hu and Yihuai Lan and Shan Dong and Hao Wang and Roy Ka-Wei Lee and Ee-Peng Lim},
      year={2024},
      eprint={2402.17971},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}
