GPT4V_Captioner

A powerful script that allows you to caption a single image or batch process any number of images within a directory. This script utilizes OpenAI's GPT-4 model to generate precise tags for images to enhance the CLIP model's understanding.

Features

User Friendly: You can put or copy paste the Script anywhere you like where you want to caption the images.
Single and Batch Image Processing: Caption one image or multiple images in a directory.
Base64 Encoding: Converts images to base64 for API processing.(Must have Open AI's API key)
GPT-4 Integration: Uses OpenAI's GPT-4 model for generating image captions.
Adding your own Captions: After Vision done captioning your images, it will ask you to add your own custom Tags to add on top of the .txt file for all .txt files at once. (Optional)

Installation

Clone the repository:

git clone https://github.com/ababiya-worku/GPT4v-Auto_Captioner.git

cd GPT4V_Captioner

Install the required dependencies:
```
pip install -r requirements.txt
```
Set up your OpenAI API key: Obtain your API key from OpenAI and put the key when asked:

Before Running the script, Manually install this in CMD

pip install openai requests

pip install openai requests pillow

pip install --upgrade openai

For Banner Image creation install this:

pip install wordcloud matplotlib

pip install colorama openai pillow wordcloud matplotlib

Usage

Single Image Captioning:

python GPT4V_Captioner.py single /path/to/your/image.jpg

Batch Image Captioning:

python GPT4V_Captioner.py batch /path/to/your/directory

Use the .Bat File: Simply Put both the script & bat file in your images directory & double-click on the bat file, and let it do the Magic!
```
 GPT4V_Captioner.bat
```

Functions

is_image_file(filename): Checks if the provided file is a valid image.
encode_image(image_path): Encodes the image to a base64 string.
describe_image(image_path, api_key): Sends the image to OpenAI's GPT-4 model for captioning.

License

This project is licensed under the MIT License. See the LICENSE file for details.

Acknowledgements

OpenAI for the GPT-4 model.
Colorama for terminal text formatting.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

README.md

README.md

GPT4V_Captioner

Features

Installation

Before Running the script, Manually install this in CMD

For Banner Image creation install this:

Usage

Functions

License

Acknowledgements

Files

README.md

Latest commit

History

README.md

File metadata and controls

GPT4V_Captioner

Features

Installation

Before Running the script, Manually install this in CMD

For Banner Image creation install this:

Usage

Functions

License

Acknowledgements