A powerful script that captions a single image or batch-processes any number of images in a directory. It uses OpenAI's GPT-4 Vision (GPT-4V) model to generate precise tags for images that enhance the CLIP model's understanding.
- User Friendly: Place or copy the script into any directory where you want to caption images.
- Single and Batch Image Processing: Caption one image or multiple images in a directory.
- Base64 Encoding: Converts images to base64 for API processing (requires an OpenAI API key).
- GPT-4 Integration: Uses OpenAI's GPT-4 Vision model to generate image captions.
- Adding Your Own Captions: After GPT-4V has finished captioning your images, the script asks whether you want to append your own custom tags to all generated .txt files at once (optional; see the sketch below).
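A minimal sketch of how this optional tag-appending step could work, assuming the generated captions are plain .txt files; the directory and prompt wording here are illustrative, not the script's exact code:

```python
from pathlib import Path

def append_custom_tags(directory: str, custom_tags: str) -> None:
    """Append a comma-separated tag string to every caption .txt file in the directory."""
    for txt_file in Path(directory).glob("*.txt"):
        with txt_file.open("a", encoding="utf-8") as f:
            f.write(f", {custom_tags}")

# Ask once, then apply the same tags to every caption file at once.
tags = input("Custom tags to append to all captions (press Enter to skip): ").strip()
if tags:
    append_custom_tags(".", tags)
```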
- Clone the repository:
  git clone https://github.com/ababiya-worku/GPT4v-Auto_Captioner.git
  cd GPT4V_Captioner
- Install the required dependencies:
  pip install -r requirements.txt
- Set up your OpenAI API key: obtain an API key from OpenAI and enter it when the script prompts for it. If you prefer to install the dependencies manually instead of using requirements.txt, the packages used are:
  pip install colorama openai requests pillow wordcloud matplotlib
  pip install --upgrade openai
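The script prompts for the key at runtime; a common pattern (shown here as an assumption, not the script's actual implementation) is to read the OPENAI_API_KEY environment variable first and only prompt when it is missing:

```python
import os
from getpass import getpass

def get_api_key() -> str:
    """Use OPENAI_API_KEY if it is set; otherwise prompt without echoing the key."""
    key = os.environ.get("OPENAI_API_KEY", "").strip()
    if not key:
        key = getpass("Enter your OpenAI API key: ").strip()
    return key
```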
- Single Image Captioning:
  python GPT4V_Captioner.py single /path/to/your/image.jpg
- Batch Image Captioning:
  python GPT4V_Captioner.py batch /path/to/your/directory
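A rough sketch of how these two modes could be dispatched from the command line; the argument names mirror the commands above, and get_api_key, is_image_file, and describe_image refer to the helpers sketched elsewhere in this README (an illustration, not the script's exact code):

```python
import argparse
from pathlib import Path

def main() -> None:
    parser = argparse.ArgumentParser(description="Caption images with GPT-4 Vision")
    parser.add_argument("mode", choices=["single", "batch"],
                        help="caption one image or every image in a directory")
    parser.add_argument("path", help="path to an image (single) or a directory (batch)")
    args = parser.parse_args()

    api_key = get_api_key()  # hypothetical helper, see the API-key sketch above
    targets = ([Path(args.path)] if args.mode == "single"
               else [p for p in Path(args.path).iterdir() if is_image_file(p.name)])
    for image in targets:
        caption = describe_image(str(image), api_key)
        # Write the caption next to the image as a .txt file with the same stem.
        image.with_suffix(".txt").write_text(caption, encoding="utf-8")

if __name__ == "__main__":
    main()
```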
- Use the .bat File: simply place both the script and the .bat file in your images directory, double-click the .bat file, and let it do the magic:
  GPT4V_Captioner.bat
- is_image_file(filename): Checks whether the provided file is a valid image.
- encode_image(image_path): Encodes the image to a base64 string.
- describe_image(image_path, api_key): Sends the image to OpenAI's GPT-4 Vision model for captioning.
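A condensed sketch of what these three helpers might look like using the OpenAI Python SDK's vision-capable chat endpoint; the model name, prompt text, and extension list are assumptions, not the script's exact code:

```python
import base64
from pathlib import Path
from openai import OpenAI

IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png", ".webp", ".bmp", ".gif"}

def is_image_file(filename: str) -> bool:
    """Check whether the file has a recognized image extension."""
    return Path(filename).suffix.lower() in IMAGE_EXTENSIONS

def encode_image(image_path: str) -> str:
    """Read the image and return its contents as a base64 string."""
    return base64.b64encode(Path(image_path).read_bytes()).decode("utf-8")

def describe_image(image_path: str, api_key: str) -> str:
    """Send the base64-encoded image to a GPT-4 vision model and return the caption."""
    client = OpenAI(api_key=api_key)
    response = client.chat.completions.create(
        model="gpt-4o",  # assumption: any vision-capable model works here
        messages=[
            {
                "role": "user",
                "content": [
                    {"type": "text",
                     "text": "Describe this image as concise, comma-separated tags."},
                    {"type": "image_url",
                     "image_url": {"url": f"data:image/jpeg;base64,{encode_image(image_path)}"}},
                ],
            }
        ],
        max_tokens=300,
    )
    return response.choices[0].message.content.strip()
```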
This project is licensed under the MIT License. See the LICENSE file for details.