We want to figure out how to compare two different images and measure their similarity.
Instead of having Bjorn manually open pairs of images and give a "Bjorn Score" for their similarity, we want to automate this process by iterating through an entire list of image pairs and calculating a score for each. The scores are written to a results file so they are nicely aggregated.
In a nutshell, this Python script reads in a CSV file that contains (absolute) paths to the images to be compared. For each pair of images, it calculates the Structural Similarity Index (SSIM), a method that evaluates the pixels in given windows of the image. Information regarding the algorithm can be found here: https://en.wikipedia.org/wiki/Structural_similarity
I first looked into a few Python libraries that handle image comparison. Two methods stood out: MSE and SSIM.
MSE performs its calculation by comparing each pixel, thus measuring absolute errors.
We take the difference between the images by subtracting the pixel intensities. Next, we square these differences (hence mean squared error) and finally sum them up. To calculate the mean, we divide the sum of squares by the total number of pixels in the image.
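The MSE steps described above can be sketched in a few lines. This is a minimal illustration, not the code used in this project: the function name `mse` is hypothetical, and it assumes grayscale images given as plain lists of rows so that it runs without extra packages (a real script would more likely operate on NumPy arrays).

```python
def mse(image_a, image_b):
    """Mean squared error between two grayscale images (lists of pixel rows)."""
    same_shape = len(image_a) == len(image_b) and all(
        len(row_a) == len(row_b) for row_a, row_b in zip(image_a, image_b)
    )
    if not same_shape:
        raise ValueError("images must have the same dimensions")

    squared_sum = 0
    pixel_count = 0
    for row_a, row_b in zip(image_a, image_b):
        for pixel_a, pixel_b in zip(row_a, row_b):
            squared_sum += (pixel_a - pixel_b) ** 2  # squared intensity difference
            pixel_count += 1
    if pixel_count == 0:
        raise ValueError("images must contain at least one pixel")
    return squared_sum / pixel_count  # mean over all pixels


flat = [[10, 10], [10, 10]]
shifted = [[13, 13], [13, 13]]
print(mse(flat, flat))     # 0.0
print(mse(flat, shifted))  # 9.0
```

A score of 0 means the images are identical pixel for pixel; larger values mean larger differences, with no upper bound tied to how a person would perceive the change.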
SSIM is a method for predicting the perceived quality of digital images, videos, etc. It was designed to improve on traditional methods such as MSE. "SSIM is a perception-based model that considers image degradation as perceived change in structural information", while also incorporating both luminance masking and contrast masking terms.
More information can be found on the wiki: https://en.wikipedia.org/wiki/Structural_similarity
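The SSIM index is calculated on various windows of an image. The measure between two windows x and y of common size N×N is:

$$\mathrm{SSIM}(x, y) = \frac{(2\mu_x\mu_y + c_1)(2\sigma_{xy} + c_2)}{(\mu_x^2 + \mu_y^2 + c_1)(\sigma_x^2 + \sigma_y^2 + c_2)}$$

where:
$\mu_x$ and $\mu_y$ are the mean pixel values of x and y
$\sigma_x^2$ and $\sigma_y^2$ are the variances of x and y
$\sigma_{xy}$ is the covariance of x and y
$c_1 = (k_1 L)^2$ and $c_2 = (k_2 L)^2$ are constants that stabilize the division when the denominator is small
L is the dynamic range of the pixels (255 for 8-bit images)
$k_1 = 0.01$ and $k_2 = 0.03$ by default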
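To make the formula from the Wikipedia article concrete, here is a small sketch that scores a single pair of windows. It is an illustration under stated assumptions, not this project's implementation: the name `ssim_window` is hypothetical, the windows are plain lists of rows, and the variances and covariance divide by N (some library implementations divide by N - 1 instead). A full SSIM implementation slides such a window across the image and averages the scores.

```python
def ssim_window(x, y, dynamic_range=255, k1=0.01, k2=0.03):
    """SSIM of two equally sized windows, given as lists of pixel rows."""
    xs = [pixel for row in x for pixel in row]
    ys = [pixel for row in y for pixel in row]
    if not xs or len(xs) != len(ys):
        raise ValueError("windows must be non-empty and the same size")

    n = len(xs)
    mu_x = sum(xs) / n
    mu_y = sum(ys) / n
    # Population variance and covariance (divide by n): an assumption of this sketch.
    var_x = sum((p - mu_x) ** 2 for p in xs) / n
    var_y = sum((q - mu_y) ** 2 for q in ys) / n
    cov_xy = sum((p - mu_x) * (q - mu_y) for p, q in zip(xs, ys)) / n

    # Stabilizing constants derived from the dynamic range L.
    c1 = (k1 * dynamic_range) ** 2
    c2 = (k2 * dynamic_range) ** 2

    numerator = (2 * mu_x * mu_y + c1) * (2 * cov_xy + c2)
    denominator = (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2)
    return numerator / denominator


window = [[0, 255], [255, 0]]
inverted = [[255, 0], [0, 255]]
print(ssim_window(window, window))    # identical windows score 1 (up to rounding)
print(ssim_window(window, inverted))  # inverted structure gives a negative score
```

Unlike MSE, the score is bounded: identical windows score 1, and the value drops as luminance, contrast, or structure diverge.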
Once the method was chosen, I then had to break down the rest of the script.
Using the built-in csv library, reading in CSV files was straightforward.
Each time a row is read from the CSV file, the two images are passed into the comparison function, which computes their SSIM score. However, because SSIM requires the images to have the same dimensions, the images are first resized to a default of 640x480 unless another size is specified. Differing file types (tested with .jpg and .png) can be compared.
As each pair of images is compared, the similarity result and the elapsed time of the comparison are stored and written to a new CSV file, in the format required by the assignment.
The new csv file will have headers: image1, image2, similar, elapsed
Once all the images are compared, the new CSV file should have the same number of rows as the original CSV file listing the images to compare.
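The read, compare, and write loop described above might look roughly like the sketch below. It uses only the standard library. The function name `compare_pairs` and the stand-in scoring lambda are hypothetical, and the formatting of the `elapsed` value (seconds with three decimals) is an assumption rather than the format the assignment requires.

```python
import csv
import io
import time


def compare_pairs(input_file, output_file, compare):
    """Score every image1,image2 row and write image1,image2,similar,elapsed rows."""
    reader = csv.DictReader(input_file)
    writer = csv.DictWriter(
        output_file, fieldnames=["image1", "image2", "similar", "elapsed"]
    )
    writer.writeheader()
    for row in reader:
        start = time.perf_counter()
        score = compare(row["image1"], row["image2"])
        elapsed = time.perf_counter() - start
        writer.writerow(
            {
                "image1": row["image1"],
                "image2": row["image2"],
                "similar": score,
                "elapsed": f"{elapsed:.3f}",  # assumed format: seconds, 3 decimals
            }
        )


# In-memory demo with a stand-in scorer; a real run would pass open files
# and a function that loads both images and computes their SSIM.
source = io.StringIO("image1,image2\na.png,a.png\na.png,b.jpg\n")
target = io.StringIO()
compare_pairs(source, target, lambda a, b: 1.0 if a == b else 0.5)
rows = list(csv.DictReader(io.StringIO(target.getvalue())))
print(len(rows))  # 2, one output row per input row
```

Passing the scoring function in as a parameter keeps the CSV handling testable without any image files.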
PNG and JPEG file types were tested. Comparison between the differing file types works as intended. Images of different sizes are resized to the intended dimensions and can be properly compared. The output CSV file was checked to ensure it matches the order of the image pairs in the original CSV file.
Automated test cases have been written and cover the following scenarios: 1) images of different sizes, 2) images with different extensions, 3) the same image.
The expected_results folder has the expected result set for each scenario. The test cases compare only the similar column.
- Valid CSV file
- The headers, rows, and columns are valid
- The order of the columns is: image1, image2
- The paths of the images lead to an existing image
- Python is already installed (I'm using version 3.7 on macOS and 3.6.8 on Windows)
- pip is already installed
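The CSV-related assumptions in the list above (header order and existing image paths) could be checked up front with something like the following. This is a hypothetical helper, not part of the project: the name `validate_pairs_csv` and the exact header names `image1` and `image2` are assumptions based on the column order stated above.

```python
import csv
import os
import tempfile


def validate_pairs_csv(csv_path):
    """Return a list of problems found in the input CSV (empty if none)."""
    problems = []
    with open(csv_path, newline="") as handle:
        reader = csv.reader(handle)
        header = next(reader, None)
        if header is None or [name.strip() for name in header] != ["image1", "image2"]:
            return [f"unexpected header: {header}"]
        for line_number, row in enumerate(reader, start=2):
            if len(row) != 2:
                problems.append(f"line {line_number}: expected 2 columns, got {len(row)}")
                continue
            for path in row:
                if not os.path.isfile(path):
                    problems.append(f"line {line_number}: missing image {path}")
    return problems


# Demo in a temporary folder: one valid row and one row with a missing file.
with tempfile.TemporaryDirectory() as folder:
    image_path = os.path.join(folder, "a.png")
    open(image_path, "wb").close()  # empty placeholder; only existence is checked
    csv_path = os.path.join(folder, "pairs.csv")
    with open(csv_path, "w", newline="") as handle:
        writer = csv.writer(handle)
        writer.writerow(["image1", "image2"])
        writer.writerow([image_path, image_path])
        writer.writerow([image_path, os.path.join(folder, "missing.jpg")])
    problems = validate_pairs_csv(csv_path)

print(problems)  # one entry, reporting the missing image on line 3
```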
pip3 install opencv-python
pip install opencv-python==3.3.0.9 (for Windows)
pip3 install scikit-image
pip3 install colorama
pip3 install pathlib
- Set up Python on Windows
  - Go to the Python download page
  - Select version 3.6.8 <https://www.python.org/downloads/release/python-368/>
  - Download and install
- Set up Git Bash on Windows
  - Download Git from https://git-scm.com/download/win
  - Run the downloaded installer
git clone [email protected]:kuntalkumarbasu/image-comparison.git
- Set up the required libraries
pip install opencv-python==3.3.0.9 (for Windows)
pip install scikit-image
pip install colorama
pip install pathlib
- On Windows we used the Git Bash client; on Linux or macOS any terminal client will work
- Import all the necessary libraries.
- Pull the code using:
git clone [email protected]:kuntalkumarbasu/image-comparison.git
Make sure the image-comparison.csv file is at the same level as the main.py file, or provide an absolute or relative path to the input CSV file.
To run the script, type:
python3 main.py
in the command line (if image-comparison.csv is at the same level as the main.py file), or type:
python3 main.py <absolute/relative path to the input csv>
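The input-path behaviour described above (use the optional argument if given, otherwise fall back to image-comparison.csv next to the script) could be handled along these lines. This is only a sketch: the function name `resolve_input_csv` is hypothetical, and the actual main.py may resolve paths differently.

```python
import os

DEFAULT_CSV_NAME = "image-comparison.csv"


def resolve_input_csv(argv, script_dir):
    """Pick the CSV path from the command line, or default to one beside the script."""
    if len(argv) > 1:
        # Relative paths are resolved against the current working directory.
        return os.path.abspath(argv[1])
    return os.path.join(script_dir, DEFAULT_CSV_NAME)


print(resolve_input_csv(["main.py"], "/project"))
print(resolve_input_csv(["main.py", "sample.csv"], "/project"))
```

In a real script the arguments would be `sys.argv` and `os.path.dirname(os.path.abspath(__file__))`.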
To run the tests, type:
python3 test_main.py
in the command line.
Note: I have added a sample CSV that is compatible with the images in the image folder. Running the command below from the base directory of this project will run the code and generate a result.
python3 main.py sample.csv
I've included some sample images (original, contrasted, "photoshopped") in case you need some quick "similar" photos to test with.
If this project is passed on to someone else, I will ask that person to read this README and follow its steps. Unless more features or capabilities are needed, there should be no reason to modify the code.
If all the modules have been installed and imported, yet errors regarding the modules still occur, double-check that the library versions match the ones in this README. Differing library versions can interfere with the functionality. Uninstall and reinstall the proper versions of the libraries, or update them if they are outdated.
This project is based on "https://github.com/denzelkwan/image-comparison". I have already opened a PR to the original project with all my modifications and improvements.