
DOC: misc updates and typo fixes
elcorto committed Aug 7, 2017
1 parent 533aabe commit 5aee1d9
Showing 3 changed files with 26 additions and 16 deletions.
30 changes: 19 additions & 11 deletions README.rst
@@ -90,27 +90,35 @@ image of my kitchen, only from a different angle and some persons in the
foreground, but the information -- this is my kitchen -- is the same. This is a
feature-detection task which relies on the ability to recognize *objects*
within a scene, regardless of other scene parameters (like view angle, color,
light, ...). It turns out that we need Neural Networks (you know: Tensor Flow
etc) and some real machine learning for the generation of better *feature
vectors*, e.g. a feature vector that always encodes the information "my
kitchen" -- similar to DNNs which learn features automatically. The simple
image hashing done here is rather limited in that respect. It only does a very
pedestrian smoothing / low-pass filtering to reduce the noise and extract the
"important" parts of the image. But this helps to find duplicates and
almost-duplicates in a collection of photos. And we learn how to do clustering
with scipy!
light, ...). It turns out that we may need Neural Networks and some real
machine learning for the generation of better *feature vectors*, e.g. a feature
vector that always encodes the information "my kitchen" -- similar to DNNs
which learn features automatically. The simple image hashing done here is
rather limited in that respect. It only does a very pedestrian smoothing /
low-pass filtering to reduce the noise and extract the "important" parts of the
image. But this helps to find duplicates and almost-duplicates in a collection
of photos.

Note that we do not want to do the clustering with an NN, only the feature
vector generation. The clustering shall be done in a classical fashion as used
here (hierarchical/agglomerative clustering).
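
For illustration, a minimal sketch of such classical hierarchical clustering with scipy (random 0/1 vectors stand in for real image fingerprints; the 0.2 cut-off is a made-up value in the spirit of the ``sim`` parameter in ``calc.py``)::

    import numpy as np
    from scipy.cluster.hierarchy import linkage, fcluster
    from scipy.spatial.distance import pdist

    # fake fingerprints: 100 images, each hashed to a 256-dim binary vector
    fps = np.random.randint(0, 2, size=(100, 256))

    # pairwise hamming distances and an average-linkage dendrogram
    dists = pdist(fps, metric='hamming')
    Z = linkage(dists, method='average')

    # flat clusters: merge everything whose cophenetic distance is <= 0.2
    labels = fcluster(Z, t=0.2, criterion='distance')

Images that end up with the same label would then be treated as duplicates or near-duplicates.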

NN-related keywords for doing it right:

* distributed representation/embedding: the network has an abstract
representation of the object ("my kitchen"), distributed across the whole net
via the weights
* unsupervised learning: learn that all images should produce the same net
output "my kitchen" w/o labeled training data -- i.e. find clusters
(structure in the data) automatically
* neighborhood component analysis, wasabi, category learning

image fingerprints: simple and fast
-----------------------------------
These methods basically squash down the image to something like 16x16,
transform to gray scale and store that as a feature vector of length 16x16, for
example -> fast. But the method is not invariant against rotation, only scaling
along x and/or y.
example -> fast. But the method is not invariant against rotation and the
like.

The idea is always to calculate a database of image fingerprints ("hashes",
feature vectors) and then do searches in feature space (all fingerprints) using
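A rough sketch of such a fingerprint, assuming a plain PIL/numpy pipeline (16x16 and the Lanczos filter are just example choices, not necessarily what ``calc.py`` uses)::

    import numpy as np
    from PIL import Image

    def fingerprint(filename, size=(16, 16)):
        # squash to a small gray scale image and flatten to a length-256 vector
        img = Image.open(filename).convert('L').resize(size, Image.LANCZOS)
        return np.asarray(img, dtype=float).flatten() / 255.0
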
1 change: 1 addition & 0 deletions bin/00resize.py
@@ -69,6 +69,7 @@ def _worker(tup, fraction=None, tgtdir=None, nfiles=None, verbose=False):
src = os.path.abspath(_src)
# /home/foo -> _home_foo -> home_foo
tgt = os.path.join(tgtdir, src.replace('/','_')[1:])
# -auto-orient: keep EXIF orientation information
cmd = "convert -limit thread 1 -sample {}% -auto-orient {} {}".format(
fraction*100, src, tgt)
if verbose >= 1:
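For context, this is roughly how the rendered command could be run on its own (the flags are the ones from the diff; the paths and the 25% fraction are made-up examples)::

    import subprocess

    src = "/home/foo/pics/img.jpg"              # hypothetical source file
    tgt = "/tmp/small/home_foo_pics_img.jpg"    # hypothetical target, '/' -> '_'
    fraction = 0.25                             # hypothetical scaling fraction

    # -sample does the cheap downscaling, -auto-orient keeps the EXIF orientation
    cmd = "convert -limit thread 1 -sample {}% -auto-orient {} {}".format(
        fraction * 100, src, tgt)
    subprocess.run(cmd, shell=True)
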
11 changes: 6 additions & 5 deletions calc.py
Expand Up @@ -23,7 +23,7 @@
# We tried the resample interpolation methods and measured the speed (ipython's
# timeit) for resizing an image 3840x2160 -> 8x8. We also compared the resample
# quality as difference to the best possible resample result, which we defined
# to be the LANCZOS method (from visual inspection abd b/c it is
# to be the LANCZOS method (from visual inspection and b/c it is
# computationally the most elaborate).
#
# speed [ms]
@@ -51,9 +51,8 @@


def img2arr(img, size=(8,8), dtype=INT, resample=2):
"""
Convert PIL Image to gray scale and resample to numpy array of shape
``(size,size)`` and `dtype`.
"""Convert PIL Image to gray scale and resample to numpy array of shape
`size` and `dtype`.
Parameters
----------
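
A minimal sketch of what such a conversion could look like with PIL and numpy (not necessarily the actual implementation; ``resample=2`` is PIL's integer code for bilinear resampling)::

    import numpy as np
    from PIL import Image

    def img2arr_sketch(img, size=(8, 8), dtype=int, resample=2):
        # gray scale ('L'), resample to e.g. 8x8, return as numpy array
        # (INT in calc.py is presumably an integer dtype; builtin int is used here)
        return np.array(img.convert('L').resize(size, resample), dtype=dtype)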
@@ -99,7 +98,9 @@ def dhash(img, size=(8,8)):


def cluster(files, fps, sim=0.2, metric='hamming'):
"""
"""Hierarchical clustering of images `files` based on image fingerprints
`fps`.
files : list of file names
sim : float
similarity (1=max. allowed similarity, all images are considered
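
As a usage sketch (hedged: the file names are made up, and it is an assumption that ``dhash`` accepts a PIL image and that its output is a valid entry of ``fps``)::

    from PIL import Image
    import calc

    files = ['pics/a.jpg', 'pics/b.jpg', 'pics/c.jpg']   # hypothetical image files
    # assumption: dhash() takes a PIL image, as its signature above suggests
    fps = [calc.dhash(Image.open(fn)) for fn in files]
    # sim=0.2 and metric='hamming' are simply the defaults from the signature above
    clusters = calc.cluster(files, fps, sim=0.2, metric='hamming')
    # (the structure of the return value is not shown in this diff)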
