diff --git a/README.rst b/README.rst
index 0dfc196..a4584c1 100644
--- a/README.rst
+++ b/README.rst
@@ -90,27 +90,35 @@
 image of my kitchen, only from a different angle and some persons in the
 foreground, but the information -- this is my kitchen -- is the same. This is
 a feature-detection task which relies on the ability to recognize *objects*
 within a scene, regardless of other scene parameters (like view angle, color,
-light, ...). It turns out that we need Neural Networks (you know: Tensor Flow
-etc) and some real machine learning for the generation of better *feature
-vectors*, e.g. a feature vector that always encodes the information "my
-kitchen" -- similar to DNNs which learn features automatically. The simple
-image hashing done here is rather limited in that respect. It only does a very
-pedestrian smoothing / low-pass filtering to reduce the noise and extract the
-"important" parts of the image. But this helps to find duplicates and
-almost-duplicates in a collection of photos. And we learn how to do clustering
-with scipy!
+light, ...). It turns out that we may need Neural Networks and some real
+machine learning for the generation of better *feature vectors*, e.g. a feature
+vector that always encodes the information "my kitchen" -- similar to DNNs
+which learn features automatically. The simple image hashing done here is
+rather limited in that respect. It only does a very pedestrian smoothing /
+low-pass filtering to reduce the noise and extract the "important" parts of the
+image. But this helps to find duplicates and almost-duplicates in a collection
+of photos.
 
 Note that we do not want to do the clustering with an NN, only the feature
 vector generation. The clustering shall be done in a classical fashion as used
 here (hierarchical/agglomerative clustering).
 
+NN-related keywords for doing it right:
+
+* distributed representation/embedding: the network has an abstract
+  representation of the object ("my kitchen"), distributed across the whole net
+  via the weights
+* unsupervised learning: learn that all images should produce the same net
+  output "my kitchen" w/o labeled training data -- i.e. find clusters
+  (structure in the data) automatically
+* neighborhood component analysis, wasabi, category learning
+
 image fingerprints: simple and fast
 -----------------------------------
 
 These methods basically squash down the image to something like 16x16,
 transform to gray scale and store that as a feature vector of length 16x16, for
-example -> fast. But the method is not invariant against rotation, only scaling
-along x and/or y.
+example -> fast. But the method is not invariant against rotation and the
+like.
 
 The idea is always to calculate a database of image fingerprints ("hashes",
 feature vectors) and then do searches in feature space (all fingerprints) using
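The fingerprint described in the README hunk above (squash to something like
16x16, convert to gray scale, store the pixels as a feature vector) can be
sketched in a few lines. This is a minimal illustration only, assuming Pillow
and numpy; the helper name ``fingerprint`` and its defaults are hypothetical
and not part of this diff::

    # Minimal sketch, not part of this diff: the "squash + gray scale"
    # fingerprint described in the README. Assumes Pillow and numpy; the
    # function name and defaults are hypothetical.
    import numpy as np
    from PIL import Image

    def fingerprint(filename, size=(16,16)):
        # convert('L') -> gray scale; resize() does the low-pass "squash"
        img = Image.open(filename).convert('L').resize(size, Image.LANCZOS)
        # flatten the 16x16 pixels to a feature vector of length 256
        return np.asarray(img, dtype=float).flatten()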
diff --git a/bin/00resize.py b/bin/00resize.py
index 0fa6512..dc1ba3d 100755
--- a/bin/00resize.py
+++ b/bin/00resize.py
@@ -69,6 +69,7 @@ def _worker(tup, fraction=None, tgtdir=None, nfiles=None, verbose=False):
     src = os.path.abspath(_src)
     # /home/foo -> _home_foo -> home_foo
     tgt = os.path.join(tgtdir, src.replace('/','_')[1:])
+    # -auto-orient: keep EXIF orientation information
     cmd = "convert -limit thread 1 -sample {}% -auto-orient {} {}".format(
         fraction*100, src, tgt)
     if verbose >= 1:
diff --git a/calc.py b/calc.py
index de45578..a5f6ddd 100644
--- a/calc.py
+++ b/calc.py
@@ -23,7 +23,7 @@
 # We tried the resample interpolation methods and measured the speed (ipython's
 # timeit) for resizing an image 3840x2160 -> 8x8. We also compared the resample
 # quality as difference to the best possible resample result, which we defined
-# to be the LANCZOS method (from visual inspection abd b/c it is
+# to be the LANCZOS method (from visual inspection and b/c it is
 # computationally the most elaborate).
 #
 # speed [ms]
@@ -51,9 +51,8 @@
 def img2arr(img, size=(8,8), dtype=INT, resample=2):
-    """
-    Convert PIL Image to gray scale and resample to numpy array of shape
-    ``(size,size)`` and `dtype`.
+    """Convert PIL Image to gray scale and resample to numpy array of shape
+    `size` and `dtype`.
 
     Parameters
     ----------
@@ -99,7 +98,9 @@ def dhash(img, size=(8,8)):
 
 def cluster(files, fps, sim=0.2, metric='hamming'):
-    """
+    """Hierarchical clustering of images `files` based on image fingerprints
+    `fps`.
+
     files : list of file names
     sim : float
         similarity (1=max. allowed similarity, all images are considered
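For reference, the classical hierarchical/agglomerative clustering that
``cluster()`` refers to can be sketched with scipy. This is a minimal sketch
under the assumption that `fps` is an (n_images, n_bits) 0/1 array (e.g. dhash
output); the function name and the way `sim` is turned into a dendrogram cut
are illustrative only, not the repository's actual implementation::

    # Minimal sketch, not part of this diff: agglomerative clustering of
    # binary fingerprints with scipy; names and threshold handling are
    # hypothetical.
    import numpy as np
    from scipy.spatial.distance import pdist
    from scipy.cluster.hierarchy import linkage, fcluster

    def cluster_sketch(files, fps, sim=0.2, metric='hamming'):
        dists = pdist(np.asarray(fps), metric=metric)  # condensed distances
        Z = linkage(dists, method='average')           # build the dendrogram
        # cut at distance `sim`: smaller sim -> only very similar images
        # end up in the same cluster
        labels = fcluster(Z, t=sim, criterion='distance')
        clusters = {}
        for fn, lab in zip(files, labels):
            clusters.setdefault(lab, []).append(fn)
        # keep only groups of duplicates / near-duplicates
        return [grp for grp in clusters.values() if len(grp) > 1]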