iqdb - Image Query Database system
Distributed under the terms of the GNU General Public License,
see the file COPYING for details.
1) Compiling
- install the development packages for either libGD + libpng + libjpeg, or
  ImageMagick
  (using GD, on Debian: libgd2-xpm-dev libpng12-dev libjpeg62-dev;
  using ImageMagick, on Debian: libmagick9-dev, on Fedora: ImageMagick-devel)
- check the top of the Makefile for settings
- run make
- if it fails, adjust the settings at the top of the Makefile and run make again
2) Running
The iqdb program has two operating modes: database maintenance and query server.
Maintenance mode is used to update image databases: add and remove images, set
image properties and retrieve database stats. The query server mode listens on
a TCP port for query commands and returns matching image IDs.
a) Maintenance
There are two main ways to invoke maintenance mode:
$ iqdb add foo.db
This mode expects a list of image IDs and filenames on stdin, in this format:
<ID> [<width> <height>]:<filename>
The ID is the image ID in hexadecimal. If width and height are not specified,
the dimensions of the image file are used instead. The filename can be any
image format supported by ImageMagick; the image will be resized to 128x128
and added to the database. Duplicate IDs are ignored. Note that iqdb does not
remember the filename associated with an image ID; you are responsible for
keeping track of that. It refers to an image exclusively by its ID.
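The input lines for this mode can be generated by a small script. The
following Python sketch is only an illustration: format_add_line is a
hypothetical helper, and it assumes that no space is needed before the colon
when width and height are omitted.

```python
def format_add_line(image_id, filename, width=None, height=None):
    """Return one stdin line for `iqdb add`: <ID> [<width> <height>]:<filename>."""
    if (width is None) != (height is None):
        raise ValueError("width and height must be given together")
    line = format(image_id, "x")  # the README asks for the ID in hexadecimal
    if width is not None:
        line += f" {width} {height}"
    # Assumption: the colon follows directly when no dimensions are given.
    return f"{line}:{filename}"


print(format_add_line(255, "thumbs/255.jpg"))              # ff:thumbs/255.jpg
print(format_add_line(4096, "thumbs/4096.jpg", 640, 480))  # 1000 640 480:thumbs/4096.jpg
```

The output of such a script can be piped into "iqdb add foo.db".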
A more complex mode allows adding, removing, updating and querying images in
multiple databases at once:
$ iqdb command foo.db bar.db baz.db
This mode expects commands on stdin. For a full list see do_commands() in
iqdb.cpp. The <dbid> argument refers to the database on the command line, with
the first argument being dbid=0. Commands of particular interest are:
add <dbid> <imgid> [<width> <height>]:<filename>
See above.
remove <dbid> <imgid>
Remove the given image from the DB.
rehash <dbid>
This command updates the coefficient bucket sizes.
It is normally not needed, and is inefficient because
it needs to read in the entire database.
(As a special feature, `iqdb rehash foo.db' can
be used to upgrade an old version of the DB.)
done
Quit and save all databases to their original file.
quit
Quit and save all databases to their original file.
The maintenance commands (see listen mode below) are also supported in
command mode, but are probably not very useful there.
Changes are automatically saved in the original file upon EOF on stdin,
or when sending the "done" or "quit" commands.
DO NOT interrupt the program with Ctrl-C or similar until the database
is closed, or the database may become corrupted.
b) Server mode
In query server mode, iqdb loads the databases into memory in read-only mode
to allow the fastest image queries. No database modifications are possible.
$ iqdb listen [IP:]port [-r] [-d=<debuglevel>] [-s<IP/host>...] foo.db bar.db baz.db
Listens on the given IP:port (default localhost if no IP given) for commands,
after loading the given databases. If -r is specified and the port is
currently in use by another instance of iqdb, the new instance loads the
databases first and only then terminates the old instance. This reduces the
downtime while the databases are loading, and leaves the old server running
in the rare case that the new one crashes while starting up. The debug level
may be set with -d.
The -s option allows limiting iqdb queries to the given source IPs. This
is useful if you cannot bind iqdb to localhost, for instance because the
query script and iqdb run on different hosts. Then you can limit iqdb
requests to the host where the script is running. Note that you should
probably also specify "-s<host-IP>" with the IP as given to the listen
argument, to allow requests from the local host, for instance to add images
to the database and to make the -r option work.
$ iqdb listen2 [IP:]port [options...] foo.db bar.db baz.db
Same as above, but listens on the given port and one port below it (i.e.
if port 5588 is specified, also listens on 5587). The lower port has higher
priority: all requests pending on it are serviced before any on the other port.
To end a request, send the "done" command.
Of particular interest are the following commands:
query <dbid> <flags> <numres> <filename>
Find numres images most similar to given filename. If the
filename starts with ':', use e.g. "./filename" to disambiguate
it from the literal image data query below.
Flags is a bitmask:
   0 = normal operation
   1 = file contains a sketch, use different weights
   2 = force grayscale match, discard color information
   8 = consider image width as set ID instead and return only the
       best match for each set
  16 = discard common coefficients (those present in at least 10% of
       all images); this finds a near-identical match much faster,
       but the non-matching images have less similarity
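Since the flags form a bitmask, values can be combined by OR-ing them. A
minimal Python sketch follows; the names in QueryFlag are made up for this
example and are not defined by iqdb, only the numeric values come from the
list above.

```python
from enum import IntFlag


class QueryFlag(IntFlag):
    """Hypothetical names for the flag bits listed above."""
    SKETCH = 1           # file contains a sketch
    GRAYSCALE = 2        # force grayscale match
    WIDTH_AS_SET = 8     # treat image width as set ID
    DISCARD_COMMON = 16  # discard common coefficients


flags = QueryFlag.GRAYSCALE | QueryFlag.DISCARD_COMMON
print(int(flags))  # 18
print(f"query 0 {int(flags)} 16 ./sample.jpg")
```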
query <dbid> <flags> <numres> <:size>
As above, but if the filename argument starts with ':', the
given number of bytes of literal image data starts on the next
line after the command. This allows querying without first
writing the image data to a file, and querying images that do
not exist locally on the host running the iqdb server.
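As a rough illustration of the literal-data form, the request bytes could be
assembled as below. build_literal_query is a hypothetical helper, and the
sketch assumes that the size after ':' is a decimal byte count, which this
README does not state.

```python
def build_literal_query(dbid, flags, numres, image_bytes):
    """Return the raw bytes of a query that carries the image data inline."""
    # Assumption: the size after ':' is written in decimal.
    header = f"query {dbid} {flags} {numres} :{len(image_bytes)}\n"
    # The image data starts on the line directly after the command.
    return header.encode("ascii") + image_bytes


# Placeholder payload, not a real image.
request = build_literal_query(0, 0, 16, b"not really image data")
print(request.split(b"\n", 1)[0])
```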
multi_query <dbid> <flags> <numres> [+ <dbid2> <flags2> <numres2> +...] <filename>
multi_query <dbid> <flags> <numres> [+ <dbid2> <flags2> <numres2> +...] <:size>
Merge query results from multiple databases. This adjusts the
individual scores to deal with varying similarity noise levels
in different databases (the noise level depends on the number
of images in the DB and how self-similar they are).
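The multi_query syntax above, with its '+' separated groups, could be
produced like this. build_multi_query is a hypothetical helper written for
this example only.

```python
def build_multi_query(parts, target):
    """Build a multi_query command line.

    parts:  list of (dbid, flags, numres) tuples, one per database
    target: a filename, or ':<size>' for literal image data
    """
    if not parts:
        raise ValueError("at least one database is required")
    groups = " + ".join(f"{dbid} {flags} {numres}" for dbid, flags, numres in parts)
    return f"multi_query {groups} {target}"


print(build_multi_query([(0, 0, 16), (1, 0, 16), (2, 2, 8)], "./sample.jpg"))
# multi_query 0 0 16 + 1 0 16 + 2 2 8 ./sample.jpg
```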
sim <dbid> <flags> <numres> <id>
Like the "query" function, but queries for images similar to
the image with the given ID. This only works if the database
has been loaded in "readonly" mode, e.g. using the "load"
command described below.
query_opt <option> <arguments...>
Set advanced options for the next query. Options are reset after
the next query or multi_query command. Available options are:
mask <and> <xor>
Consider image height as bitmask and only return images
for which ((bitmask & and) ^ xor) == 0. In other words,
for each bit that is set in "and", the image mask's bit
must match that of the "xor" mask. This can be used to
limit search results to given rating masks for example.
mindev <min.std.dev.>
Return only results that are the given standard deviation
above the noise level. Nothing is returned if there are
no relevant results at all.
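The mask condition can be checked by hand with a few lines of Python. The
rating bits used here (1 = safe, 2 = questionable) are invented for the
example; iqdb itself attaches no meaning to the bits.

```python
def passes_mask(bitmask, and_mask, xor_mask):
    """True if an image whose height field holds `bitmask` would be returned."""
    return ((bitmask & and_mask) ^ xor_mask) == 0


# Hypothetical rating bits stored in the height field.
SAFE, QUESTIONABLE = 1, 2

# "mask 3 1": SAFE must be set and QUESTIONABLE must be clear.
print(passes_mask(SAFE, 0b11, 0b01))                 # True
print(passes_mask(SAFE | QUESTIONABLE, 0b11, 0b01))  # False
print(passes_mask(SAFE | 4, 0b11, 0b01))             # True, bit 4 is not in "and"
```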
The "add" and "remove" commands are now supported as well. However, they
only modify the in-memory representation of the DB, which cannot be saved
back to disk later. They allow you to update the server without restarting
it, but require that you also update the DB file directly.
Additionally, since version 20081123 the listen mode also supports DB
maintenance commands. In two-port mode these will only be accepted on
the lower (high-priority) port.
load <dbid> <mode> <filename>
Loads the given DB file into the given dbid (which must not
be in use yet). The mode can be one of the following:
  simple    default listen mode, supports fast queries as well as
            adding/removing images
  readonly  requires more memory but supports image ID queries
            (needed for duplicate finding)
  alter     default command mode, supports fast updates of DB files
            but no queries
  normal    full functionality but requires much memory and has slow
            queries, mainly useful for upgrading old DB versions
drop <dbid>
Drops the given database. With drop and load commands in the
same connection, a database will be reloaded or replaced
atomically, however the server will be blocked until loading
is complete. When loading a large DB it is recommended to
start a new copy of the server with the -r option instead.
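For a small DB, the drop and load commands sent over one connection might be
put together as follows. This is only a sketch: build_reload_request is a
hypothetical helper, and newline-terminated commands are an assumption.

```python
LOAD_MODES = ("simple", "readonly", "alter", "normal")


def build_reload_request(dbid, filename, mode="simple"):
    """Return the commands that drop a dbid and load a file in its place."""
    if mode not in LOAD_MODES:
        raise ValueError(f"unknown load mode: {mode}")
    # Both commands go over the same connection, then "done" ends the request.
    return f"drop {dbid}\nload {dbid} {mode} {filename}\ndone\n"


print(build_reload_request(0, "foo.db"), end="")
```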
db_list
Lists all loaded databases with dbid and filename.
The server has the following possible responses:
000 iqdb ready
Displayed whenever iqdb is ready to
accept a command.
100 <text>
General informational message.
101 <key>=<value>
Specific parseable information.
102 <dbid> <filename>
Response to db_list command.
200 <imgid> <score> <width> <height>
Query result.
201 <dbid> <imgid> <score> <width> <height>
Multi-query result.
202 <original id>=<std.dev> <dupe1 id>:<sim1> [...]
Duplicate finder result.
300 <text>
General error message.
301 <exception> <description>
Non-fatal error message (e.g. invalid image ID, or
unreadable image file)
302 <exception> <description>
Fatal error message (e.g. corrupted database)
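A client might split these response lines by their numeric code. The Python
sketch below keeps every field as a string, because this README does not
specify the number formats; parse_response and is_error are hypothetical
names.

```python
def parse_response(line):
    """Split one server response line into (code, fields)."""
    code, _, rest = line.strip().partition(" ")
    if code == "200":  # query result
        imgid, score, width, height = rest.split()
        return code, {"imgid": imgid, "score": score,
                      "width": width, "height": height}
    if code == "201":  # multi-query result
        dbid, imgid, score, width, height = rest.split()
        return code, {"dbid": dbid, "imgid": imgid, "score": score,
                      "width": width, "height": height}
    if code == "101":  # <key>=<value>
        key, _, value = rest.partition("=")
        return code, {"key": key, "value": value}
    if code == "102":  # <dbid> <filename>
        dbid, _, filename = rest.partition(" ")
        return code, {"dbid": dbid, "filename": filename}
    return code, {"text": rest}  # 000, 100, 202, 3xx: left unparsed


def is_error(code):
    """The 3xx codes are the error responses."""
    return code.startswith("3")


print(parse_response("102 0 foo.db"))
```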
3) Querying
The file iqdb.php holds sample PHP code to connect to a running iqdb server
and query it for similar images.
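For readers who prefer Python, a client could look roughly like the sketch
below. It is not a translation of iqdb.php. It assumes that commands are
newline-terminated ASCII and that the server closes the connection after
"done"; send_request is a hypothetical helper name.

```python
import socket


def send_request(sock, command):
    """Send one command followed by "done" and return the reply lines."""
    sock.sendall(command.encode("ascii") + b"\ndone\n")
    reply = b""
    while True:
        chunk = sock.recv(4096)
        if not chunk:  # assumed: the server closes the connection after "done"
            break
        reply += chunk
    return reply.decode("ascii", errors="replace").splitlines()


# Usage against a server started with `iqdb listen 5588 foo.db`:
#   with socket.create_connection(("localhost", 5588)) as sock:
#       for line in send_request(sock, "query 0 0 16 ./sample.jpg"):
#           print(line)
```

Each returned line starts with one of the response codes listed in section 2.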
4) Converting
Since version 20090612, iqdb will automatically detect the integer sizes
used for writing a given image database. It will automatically convert them
to its internal format on reading. Writing a non-native image database is
not possible.
This version is also compiled in 64-bit mode by default, to make the image
database file platform-independent.
It is possible, but not recommended, to have iqdb convert an image
database of an older version into this new platform-independent format. It's
better to rebuild the database from scratch. If you do want to convert the
database, first make a backup of it, in case the conversion should fail.
Then run "iqdb rehash <db-file>".
If you haven't turned off debugging, this should output the following:
Loading db (converting data sizes)...
Database loaded from <db-file>, has <num> images.
After loading, it will save the database again in the platform-independent
format. Pay particular attention to the number of images it reports. If
this isn't the number you expect, the conversion probably didn't work. In
that case you will have to rebuild the database from scratch using the
new version.
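The image count check can be automated by scanning the captured debug output
for the message quoted above. This is a sketch only: loaded_image_count is a
hypothetical helper, it assumes the message text is exactly as shown, and the
sample numbers are made up.

```python
import re

# Matches the "Database loaded from <db-file>, has <num> images." line.
LOADED_RE = re.compile(r"Database loaded from (?P<path>.+), has (?P<count>\d+) images\.")


def loaded_image_count(output):
    """Return the image count reported in iqdb's debug output, or None."""
    match = LOADED_RE.search(output)
    return int(match.group("count")) if match else None


sample = ("Loading db (converting data sizes)...\n"
          "Database loaded from foo.db, has 1234 images.\n")
print(loaded_image_count(sample))  # 1234
```

Comparing the returned number with the count you expect tells you whether the
conversion is worth keeping or the backup should be restored.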