A powerful script that captions a single image or batch-processes any number of images in a directory. It uses OpenAI's GPT-4 Vision (GPT-4V) model to generate precise tags for images that enhance the CLIP model's understanding.
- User Friendly: Place or copy the script into any directory where you want to caption images.
- Single and Batch Image Processing: Caption one image or multiple images in a directory.
- Base64 Encoding: Converts images to base64 for API processing (requires an OpenAI API key).
- GPT-4 Integration: Uses OpenAI's GPT-4 Vision model to generate image captions.
- Adding Your Own Captions: After GPT-4V has finished captioning your images, the script asks whether you want to append your own custom tags to all generated .txt files at once (optional; see the sketch below).
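A minimal sketch of how this optional tag-appending step could work, assuming the generated captions are plain .txt files; the directory and prompt wording here are illustrative, not the script's exact code:

```python
from pathlib import Path

def append_custom_tags(directory: str, custom_tags: str) -> None:
    """Append a comma-separated tag string to every caption .txt file in the directory."""
    for txt_file in Path(directory).glob("*.txt"):
        with txt_file.open("a", encoding="utf-8") as f:
            f.write(f", {custom_tags}")

# Ask once, then apply the same tags to every caption file at once.
tags = input("Custom tags to append to all captions (press Enter to skip): ").strip()
if tags:
    append_custom_tags(".", tags)
```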
- Clone the repository:
  git clone https://github.com/ababiya-worku/GPT4v-Auto_Captioner.git
  cd GPT4V_Captioner
- Install the required dependencies:
  pip install -r requirements.txt
- Set up your OpenAI API key: obtain an API key from OpenAI and enter it when the script prompts for it. If you prefer to install the dependencies manually instead of using requirements.txt, the packages used are:
  pip install colorama openai requests pillow wordcloud matplotlib
  pip install --upgrade openai
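The script prompts for the key at runtime; a common pattern (shown here as an assumption, not the script's actual implementation) is to read the OPENAI_API_KEY environment variable first and only prompt when it is missing:

```python
import os
from getpass import getpass

def get_api_key() -> str:
    """Use OPENAI_API_KEY if it is set; otherwise prompt without echoing the key."""
    key = os.environ.get("OPENAI_API_KEY", "").strip()
    if not key:
        key = getpass("Enter your OpenAI API key: ").strip()
    return key
```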
- Single Image Captioning:
  python GPT4V_Captioner.py single /path/to/your/image.jpg
- Batch Image Captioning:
  python GPT4V_Captioner.py batch /path/to/your/directory
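A rough sketch of how these two modes could be dispatched from the command line; the argument names mirror the commands above, and get_api_key, is_image_file, and describe_image refer to the helpers sketched elsewhere in this README (an illustration, not the script's exact code):

```python
import argparse
from pathlib import Path

def main() -> None:
    parser = argparse.ArgumentParser(description="Caption images with GPT-4 Vision")
    parser.add_argument("mode", choices=["single", "batch"],
                        help="caption one image or every image in a directory")
    parser.add_argument("path", help="path to an image (single) or a directory (batch)")
    args = parser.parse_args()

    api_key = get_api_key()  # hypothetical helper, see the API-key sketch above
    targets = ([Path(args.path)] if args.mode == "single"
               else [p for p in Path(args.path).iterdir() if is_image_file(p.name)])
    for image in targets:
        caption = describe_image(str(image), api_key)
        # Write the caption next to the image as a .txt file with the same stem.
        image.with_suffix(".txt").write_text(caption, encoding="utf-8")

if __name__ == "__main__":
    main()
```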
- Use the .bat File: simply place both the script and the .bat file in your images directory, double-click the .bat file, and let it do the magic:
  GPT4V_Captioner.bat
- is_image_file(filename): Checks whether the provided file is a valid image.
- encode_image(image_path): Encodes the image to a base64 string.
- describe_image(image_path, api_key): Sends the image to OpenAI's GPT-4 Vision model for captioning.
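A condensed sketch of what these three helpers might look like using the OpenAI Python SDK's vision-capable chat endpoint; the model name, prompt text, and extension list are assumptions, not the script's exact code:

```python
import base64
from pathlib import Path
from openai import OpenAI

IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png", ".webp", ".bmp", ".gif"}

def is_image_file(filename: str) -> bool:
    """Check whether the file has a recognized image extension."""
    return Path(filename).suffix.lower() in IMAGE_EXTENSIONS

def encode_image(image_path: str) -> str:
    """Read the image and return its contents as a base64 string."""
    return base64.b64encode(Path(image_path).read_bytes()).decode("utf-8")

def describe_image(image_path: str, api_key: str) -> str:
    """Send the base64-encoded image to a GPT-4 vision model and return the caption."""
    client = OpenAI(api_key=api_key)
    response = client.chat.completions.create(
        model="gpt-4o",  # assumption: any vision-capable model works here
        messages=[
            {
                "role": "user",
                "content": [
                    {"type": "text",
                     "text": "Describe this image as concise, comma-separated tags."},
                    {"type": "image_url",
                     "image_url": {"url": f"data:image/jpeg;base64,{encode_image(image_path)}"}},
                ],
            }
        ],
        max_tokens=300,
    )
    return response.choices[0].message.content.strip()
```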
This project is licensed under the MIT License. See the LICENSE file for details.