
DOC: misc updates and typo fixes
elcorto committed Aug 7, 2017
1 parent 533aabe commit 5aee1d9
Showing 3 changed files with 26 additions and 16 deletions.
30 changes: 19 additions & 11 deletions README.rst
@@ -90,27 +90,35 @@ image of my kitchen, only from a different angle and some persons in the
foreground, but the information -- this is my kitchen -- is the same. This is a
feature-detection task which relies on the ability to recognize *objects*
within a scene, regardless of other scene parameters (like view angle, color,
light, ...). It turns out that we need Neural Networks (you know: Tensor Flow
etc) and some real machine learning for the generation of better *feature
vectors*, e.g. a feature vector that always encodes the information "my
kitchen" -- similar to DNNs which learn features automatically. The simple
image hashing done here is rather limited in that respect. It only does a very
pedestrian smoothing / low-pass filtering to reduce the noise and extract the
"important" parts of the image. But this helps to find duplicates and
almost-duplicates in a collection of photos. And we learn how to do clustering
with scipy!
light, ...). It turns out that we may need Neural Networks and some real
machine learning for the generation of better *feature vectors*, e.g. a feature
vector that always encodes the information "my kitchen" -- similar to DNNs
which learn features automatically. The simple image hashing done here is
rather limited in that respect. It only does a very pedestrian smoothing /
low-pass filtering to reduce the noise and extract the "important" parts of the
image. But this helps to find duplicates and almost-duplicates in a collection
of photos.

Note that we do not want to do the clustering with an NN, only the feature
vector generation. The clustering shall be done in a classical fashion as used
here (hierarchical/agglomerative clustering).
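
For illustration, a minimal sketch of such classical hierarchical clustering with scipy (random 0/1 vectors stand in for real image fingerprints; the 0.2 cut-off is a made-up value in the spirit of the ``sim`` parameter in ``calc.py``)::

    import numpy as np
    from scipy.cluster.hierarchy import linkage, fcluster
    from scipy.spatial.distance import pdist

    # fake fingerprints: 100 images, each hashed to a 256-dim binary vector
    fps = np.random.randint(0, 2, size=(100, 256))

    # pairwise hamming distances and an average-linkage dendrogram
    dists = pdist(fps, metric='hamming')
    Z = linkage(dists, method='average')

    # flat clusters: merge everything whose cophenetic distance is <= 0.2
    labels = fcluster(Z, t=0.2, criterion='distance')

Images that end up with the same label would then be treated as duplicates or near-duplicates.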

NN-related keywords for doing it right:

* distributed representation/embedding: the network has an abstract
representation of the object ("my kitchen"), distributed across the whole net
via the weights
* unsupervised learning: learn that all images should produce the same net
output "my kitchen" w/o labeled training data -- i.e. find clusters
(structure in the data) automatically
* neighborhood component analysis, wasabi, category learning

image fingerprints: simple and fast
-----------------------------------
These methods basically squash down the image to something like 16x16,
transform to gray scale and store that as a feature vector of length 16x16, for
example -> fast. But the method is not invariant against rotation, only scaling
along x and/or y.
example -> fast. But the method is not invariant against rotation and the
like.

The idea is always to calculate a database of image fingerprints ("hashes",
feature vectors) and then do searches in feature space (all fingerprints) using
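A rough sketch of such a fingerprint, assuming a plain PIL/numpy pipeline (16x16 and the Lanczos filter are just example choices, not necessarily what ``calc.py`` uses)::

    import numpy as np
    from PIL import Image

    def fingerprint(filename, size=(16, 16)):
        # squash to a small gray scale image and flatten to a length-256 vector
        img = Image.open(filename).convert('L').resize(size, Image.LANCZOS)
        return np.asarray(img, dtype=float).flatten() / 255.0
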
1 change: 1 addition & 0 deletions bin/00resize.py
@@ -69,6 +69,7 @@ def _worker(tup, fraction=None, tgtdir=None, nfiles=None, verbose=False):
src = os.path.abspath(_src)
# /home/foo -> _home_foo -> home_foo
tgt = os.path.join(tgtdir, src.replace('/','_')[1:])
# -auto-orient: keep EXIF orientation information
cmd = "convert -limit thread 1 -sample {}% -auto-orient {} {}".format(
fraction*100, src, tgt)
if verbose >= 1:
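For context, this is roughly how the rendered command could be run on its own (the flags are the ones from the diff; the paths and the 25% fraction are made-up examples)::

    import subprocess

    src = "/home/foo/pics/img.jpg"              # hypothetical source file
    tgt = "/tmp/small/home_foo_pics_img.jpg"    # hypothetical target, '/' -> '_'
    fraction = 0.25                             # hypothetical scaling fraction

    # -sample does the cheap downscaling, -auto-orient keeps the EXIF orientation
    cmd = "convert -limit thread 1 -sample {}% -auto-orient {} {}".format(
        fraction * 100, src, tgt)
    subprocess.run(cmd, shell=True)
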
11 changes: 6 additions & 5 deletions calc.py
Expand Up @@ -23,7 +23,7 @@
# We tried the resample interpolation methods and measured the speed (ipython's
# timeit) for resizing an image 3840x2160 -> 8x8. We also compared the resample
# quality as difference to the best possible resample result, which we defined
# to be the LANCZOS method (from visual inspection abd b/c it is
# to be the LANCZOS method (from visual inspection and b/c it is
# computationally the most elaborate).
#
# speed [ms]
@@ -51,9 +51,8 @@


def img2arr(img, size=(8,8), dtype=INT, resample=2):
"""
Convert PIL Image to gray scale and resample to numpy array of shape
``(size,size)`` and `dtype`.
"""Convert PIL Image to gray scale and resample to numpy array of shape
`size` and `dtype`.
Parameters
----------
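
A minimal sketch of what such a conversion could look like with PIL and numpy (not necessarily the actual implementation; ``resample=2`` is PIL's integer code for bilinear resampling)::

    import numpy as np
    from PIL import Image

    def img2arr_sketch(img, size=(8, 8), dtype=int, resample=2):
        # gray scale ('L'), resample to e.g. 8x8, return as numpy array
        # (INT in calc.py is presumably an integer dtype; builtin int is used here)
        return np.array(img.convert('L').resize(size, resample), dtype=dtype)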
@@ -99,7 +98,9 @@ def dhash(img, size=(8,8)):


def cluster(files, fps, sim=0.2, metric='hamming'):
"""
"""Hierarchical clustering of images `files` based on image fingerprints
`fps`.
files : list of file names
sim : float
similarity (1=max. allowed similarity, all images are considered
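
As a usage sketch (hedged: the file names are made up, and it is an assumption that ``dhash`` accepts a PIL image and that its output is a valid entry of ``fps``)::

    from PIL import Image
    import calc

    files = ['pics/a.jpg', 'pics/b.jpg', 'pics/c.jpg']   # hypothetical image files
    # assumption: dhash() takes a PIL image, as its signature above suggests
    fps = [calc.dhash(Image.open(fn)) for fn in files]
    # sim=0.2 and metric='hamming' are simply the defaults from the signature above
    clusters = calc.cluster(files, fps, sim=0.2, metric='hamming')
    # (the structure of the return value is not shown in this diff)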
