This project provides a simple yet effective tool for identifying and managing duplicate image files within a directory. It's particularly useful for organizing large collections of images, freeing up disk space, and maintaining a clean file structure.
- Scans directories recursively for image files (supports .png, .jpg, .jpeg, .gif, .bmp)
- Uses MD5 hashing to identify duplicate images
- Moves duplicate files to a separate 'dupes' folder
- Provides a progress bar for visual feedback during processing
- The script walks through the specified directory and its subdirectories.
- It calculates an MD5 hash for each image file encountered.
- If a duplicate hash is found, the corresponding file is moved to a 'dupes' folder.
- The first file found with a given hash stays in its original location; only the later copies are moved.
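The steps above can be sketched roughly as follows. This is a minimal illustration, not the actual contents of dupes.py: it assumes the hash is computed over each file's raw bytes, leaves out the tqdm progress bar and Pillow, and the names `file_md5`, `move_duplicates`, and `IMAGE_EXTENSIONS` are hypothetical. Skipping the 'dupes' folder during the walk and renaming on name clashes are also assumptions about reasonable behavior rather than documented features.

```python
import hashlib
import os
import shutil

# Extensions listed in the feature list above.
IMAGE_EXTENSIONS = (".png", ".jpg", ".jpeg", ".gif", ".bmp")


def file_md5(path, chunk_size=65536):
    """Return the MD5 hex digest of a file's raw bytes, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def move_duplicates(root, dupes_name="dupes"):
    """Move later copies of byte-identical images into root/dupes_name.

    Returns the list of new paths of the files that were moved.
    """
    dupes_dir = os.path.join(root, dupes_name)
    seen = {}   # hash -> path of the first file seen with that hash
    moved = []
    for folder, subfolders, files in os.walk(root):
        # Assumption: never rescan files already moved into the dupes folder.
        subfolders[:] = sorted(
            d for d in subfolders if os.path.join(folder, d) != dupes_dir
        )
        for name in sorted(files):
            if not name.lower().endswith(IMAGE_EXTENSIONS):
                continue
            path = os.path.join(folder, name)
            digest = file_md5(path)
            if digest not in seen:
                seen[digest] = path  # first copy stays where it is
                continue
            os.makedirs(dupes_dir, exist_ok=True)
            # Assumption: add a numeric suffix instead of overwriting
            # a file of the same name that is already in the dupes folder.
            stem, ext = os.path.splitext(name)
            target = os.path.join(dupes_dir, name)
            counter = 1
            while os.path.exists(target):
                target = os.path.join(dupes_dir, f"{stem}_{counter}{ext}")
                counter += 1
            shutil.move(path, target)
            moved.append(target)
    return moved
```

Calling `move_duplicates(".")` would process the current directory in this way. Because the sketch compares raw bytes, two files that show the same picture but were saved in different formats or at different quality levels would not be treated as duplicates.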
- Place the script in the directory you want to scan.
- Run the script using Python: `python dupes.py`
- The script will process all images in the current directory and its subdirectories.
- Duplicate images will be moved to a 'dupes' folder within the current directory.
- Python 3.x
- Pillow library
- tqdm library
You can install the required libraries using pip: `pip install Pillow tqdm`
This is an early version, and I'm sure there's a lot of room for improvement. The project is open source, and I'm happy to accept contributions.