Clustering scenes based on images
- opencv 2.4
- numpy
- pandas
- python 2.7
- Clone the repository
- mkdir data/ and put images in the directory
- mkdir out/
- cd into src/ directory
- Run `python run.py`
- The 'distance' measure between each pair of images is computed from the number of good matches obtained with SIFT features and a FLANN matcher (`utils.getDistanceM`). The higher the measure, the more closely the two images are related.
- An equalizer option is also provided (default = False) in `utils.getDistanceM`, which applies the CLAHE (Contrast Limited Adaptive Histogram Equalization) algorithm for histogram equalization. This adds a very large time overhead: for example, with 3 images, the execution time of `utils.getDistanceM` increases from 7.9s to 123.5s. When tried on a test case, it does improve clustering for a highly illuminated image, but it has no effect when the image is highly illuminated only in a region (e.g. the sun is visible in the image). In such a case, a better model for detection and matching would be useful, as mentioned in TODO.
- The produced distanceM is fed into `utils.isSimilar` to generate a dictionary of similar images, based on the average distance and a user-provided threshold (default = 5).
- Once the dictionary is obtained, a clustering operation is executed in `utils.cluster`, which groups the images into clusters and outputs a dictionary of cluster assignments.
- A tiebreaking operation is also executed in `utils.tieBreak` when an image belongs to two different clusters. The tie is decided by the average distance from the image to all the other images in each cluster; the bigger value wins the tie.
- The output is dumped as a .json file in out/
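The similarity, clustering and tie-breaking steps described above can be illustrated with a small, dependency-free sketch. This is not the code in `utils`: the function names (`similar_images`, `seed_clusters`, `break_ties`) are hypothetical, and two rules are assumptions made only for illustration. The first assumption is that image j counts as similar to image i when its match count is at least `threshold` above the average of image i's counts against all other images. The second is that clusters are seeded greedily, in index order, from each image that is not yet assigned. The `scores` matrix is made-up example data standing in for distanceM.

```python
# Hypothetical sketch of the post-matching steps; not the repository's utils code.

def similar_images(scores, threshold=5):
    """Assumed rule: j is similar to i if scores[i][j] >= row average + threshold."""
    n = len(scores)
    similar = {}
    for i in range(n):
        others = [scores[i][j] for j in range(n) if j != i]
        cutoff = sum(others) / float(len(others)) + threshold
        similar[i] = [j for j in range(n) if j != i and scores[i][j] >= cutoff]
    return similar


def seed_clusters(similar):
    """Assumed rule: every unassigned image seeds a cluster with its similar images."""
    clusters, assigned = {}, set()
    for i in sorted(similar):
        if i in assigned:
            continue
        members = set([i]) | set(similar[i])
        clusters[len(clusters)] = members
        assigned |= members
    return clusters


def break_ties(clusters, scores):
    """Keep a shared image only in the cluster with the highest average score."""
    def avg_score(img, members):
        others = [m for m in members if m != img]
        if not others:
            return 0.0
        return sum(scores[img][m] for m in others) / float(len(others))

    resolved = dict((cid, set(m)) for cid, m in clusters.items())
    for img in sorted(set().union(*resolved.values())):
        owners = sorted(cid for cid in resolved if img in resolved[cid])
        if len(owners) > 1:
            # The bigger average wins the tie, as described in the list above.
            best = max(owners, key=lambda cid: avg_score(img, resolved[cid]))
            for cid in owners:
                if cid != best:
                    resolved[cid].discard(img)
    return dict((cid, sorted(m)) for cid, m in resolved.items() if m)


# Made-up symmetric match counts for five images (higher = more closely related).
scores = [
    [0, 50, 30, 2, 1],
    [50, 0, 28, 3, 2],
    [30, 28, 0, 35, 20],
    [2, 3, 35, 0, 45],
    [1, 2, 20, 45, 0],
]

similar = similar_images(scores)
clusters = seed_clusters(similar)
assignments = break_ties(clusters, scores)
```

With this example data, image 2 first lands in both candidate clusters. Its average score is 29 against images 0 and 1, and 27.5 against images 3 and 4, so the tie-break keeps it with images 0 and 1.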
- Use a better model to detect region of interest
- Use deep features from the detected regions to do matching