-
Notifications
You must be signed in to change notification settings - Fork 64
/
CHANGES.html
402 lines (303 loc) · 15 KB
/
CHANGES.html
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
<!DOCTYPE HTML>
<html>
<head>
<title>Myrrix Changes and Release Notes</title>
<style type="text/css">
body {background-color:#202020}
body,p,h1,h2,h3 {font-family:Gill Sans,Helvetica,sans-serif;font-weight:300;color:white}
h1,h2,h3,a {color:#CCFF66}
h1,h2,h3 {text-transform:uppercase}
hr {margin:20px 0 10px 0}
</style>
</head>
<body>
<h1>Version 1.0.2</h1>
<h2>Fixes</h2>
<ul>
<li>Issue 84: Issue 84: not setting property <code>batch.emr.workers.bid</code> should again
correctly choose ON_DEMAND worker instances whe using Amazon EMR</li>
</ul>
<h1>Version 1.0.1</h1>
<p>July 11 2013</p>
<h2>Fixes</h2>
<ul>
<li>Issue 79: fix a ClassCastException from SelfOrganizingMaps in the feature space map display.</li>
<li>Issue 81: don't require a port in fs.defaultFS, in Hadoop configuration.</li>
</ul>
<h1>Version 1.0.0</h1>
<p>June 20 2013</p>
<h2>Other</h2>
<ul>
<li>Remove license file requirement. Do not set <code>--licenseFile</code> on the command line.</li>
<li>Issue 78: If model.features is too low to build a model, it will be automatically lowered so that
a model can be built.</li>
</ul>
<h1>Version 1.0.0 beta 1</h1>
<p>May 13 2013</p>
<h2>Fixes</h2>
<ul>
<li>Issue 63: the REST API endpoint for the <code>setItemTag</code> has been corrected. See Upgrade Notes.</li>
<li>Issue 64: Values of λ that are too large should cause the model build to be rejected,
rather than cause runtime errors related to large user and item vectors during fold-in.</li>
<li>Issue 65: Evaluation framework can again handle files with tag data</li>
<li>Issue 66: Resolved an issue that generated an exception on adding new data, when few or no clusters
are present, but clustering is enabled</li>
<li>Issue 67: REST API endpoints correctly return a client error instead of exception when expecting a
URL path but none is presented</li>
<li>Issue 68: Fixed error in /estimateForAnonymous endpiont that returned a server error / exception
when all items in the request were unknown, instead of a simple client error</li>
<li>Issue 70: Mean average precision computation has been corrected</li>
<li>Issue 74: Computation Layer: Fixed a possible incorrect cluster assignment; can be incorrect when
input is small, cluster count is high</li>
<li>Issue 75: Fixed an important error in the Computation Layer (only) that can cause recommendations
or similar items, or in some cases steps in iteration, to be missed in some cases where
multi-threaded reducers are used.</li>
</ul>
<h2>Other</h2>
<ul>
<li>Issue 72: When number of feature changes, still bootstrap model state from previous generation's
Y matrix vectors</li>
<li>Issue 77: Standardize output of AllItemSimilarities, AllRecommendations and direct output to a file</li>
</ul>
<h1>Version 0.11</h1>
<p>March 31 2013</p>
<h2>New Features</h2>
<ul>
<li>New "tag" API which allows callers to record interactions between users/items and concepts, or labels,
which inform the model but are not returned in results. For example a user can be "female" and an item
"yellow".</li>
<li>New <code>estimateToAnonymous</code> method, which functions like <code>estimatePreference</code> for
"anonymous" users, like <code>recommendToAnonymous</code>.</li>
</ul>
<h2>Fixes</h2>
<ul>
<li>Minor fixes and improvements to model loading and persistence in Serving Layer.</li>
<li>Java client can now fail over to replicas in more error cases, like mid-request interruption</li>
</ul>
<h2>Other</h2>
<ul>
<li>Improved speed in Computation Layer.</li>
<li><code>IDRescorer</code> behavior has slightly changed for <code>recommendToMany</code>. See Upgrade Notes.</li>
<li>Default value of <code>model.als.alpha</code> and negative strength values has changed. This should result in
a small improvement in quality on many data sets. See Upgrade Notes.</li>
<li>Computation Layer <code>model.cluster.k</code> parameter has been split into <code>model.cluster.user.k</code>
and <code>model.cluster.item.k</code> to control k for user and item clustering, respectively.</li>
<li>The Computation Layer will now delete old generations, keeping 10 generations by default.
If a different value is desired, set <code>model.generations.maxToKeep</code>.</li>
</ul>
<h1>Version 0.10</h1>
<p>February 19 2013</p>
<h2>New Features</h2>
<ul>
<li>Computation Layer now supports <strong>clustering with kmeans++ / spectral clustering</strong></li>
<li>New <code>similarityToItem</code> method to retrieve individual item-item similarities</li>
<li>New <code>mostPopularItems</code> method to list items associated to the largest number of users</li>
<li>New <code>/user/clusters/*</code> and <code>/item/clusters/*</code> methods to query clusters</li>
<li>New "client thread" in the Serving Layer can be used to pull/poll for data from an external source
and interact directly with the Serving Layer</li>
<li>Optional basic DoS attack protection in the Serving Layer</li>
</ul>
<h2>Other</h2>
<ul>
<li><code>RescorerProvider</code> can now use the local recommender instance in its operation</li>
<li>More support for multiple <code>RescorerProvider</code> classes at once</li>
<li>Model building dynamically calculates number of iterations needed; default biases to more initial
iterations (and better accuracy) than previous default.</li>
</ul>
<h2>Fixes</h2>
<ul>
<li>Resolved an issue that could cause two Computation Layer computations sharing the same bucket to interfere.
Support files like <code>keystore.ks</code> and <code>rescorer.jar</code>, which are normally placed in
<code>sys/</code>, should now be placed in <em>instance</em><code>/sys/</code>.</li>
</ul>
<h1>Version 0.9</h1>
<p>December 31 2012</p>
<h2>New Features</h2>
<ul>
<li>Serving Layer can now run in read-only mode to serve a model but not accept updates</li>
<li><code>/ingest</code> endpoint can now accept web-browser-style file uploads</li>
<li>New unified license system is in place. Obtain a trial license from
<a href="http://myrrix.com">myrrix.com</a></li>
</ul>
<h2>Fixes</h2>
<ul>
<li>Standalone Serving Layer .war file deployment works again</li>
<li>HTTP DIGEST authentication is now made compatible with all clients and the latest Tomcat again</li>
<li>Choice of partition and replica is now deterministic in the client. This avoids problems where two
replicas presented slightly different results to the client on successive requests.</li>
<li>Removed "_LOCK" file mechanism in Computation Layer in favor of querying the live cluster state</li>
<li>Stand-alone serving layer avoids a possible exception when recomputing a new model on new items</li>
<li>Several fixes for problems in computing models over very small input</li>
<li>Repeated updates from new users and items should not be able to produce extreme values in the model
now even after many updates</li>
<li>Java client library now supports rescorer parameters</li>
<li>Added missing <code>recommendToMany</code> to client CLI, changed some APIs to be more consistent,
and fixed other small issues in the CLI.</li>
</ul>
<h2>Other</h2>
<ul>
<li>Faster display of self-organizing map visualization for moderate to large data sets.</li>
<li>Client command line has changed slightly, to take optional argument like "howMany" as a flag
(e.g. <code>--howMany 5</code> instead of stand-alone argument</li>
<li><code>mostSimilarItems</code> and <code>recommendToAnonymous</code> will return a result if
at least one argument item exists, rather than if all exist</li>
<li>Update to support Hadoop 1.1.1 and latest Amazon EMR releases</li>
</ul>
<h1>Version 0.8</h1>
<p>November 16 2012</p>
<h2>New Features</h2>
<ul>
<li>Serving Layer console now provides a visualization of users and items in feature space as a self-organizing map.</li>
<li>Partitions may be specified as <code>auto</code> in the Java client, to automatically discover partitions.</li>
<li>Serving Layers may automatically discover partition information, when run on Amazon EC2.</li>
<li>Instance IDs can now be any string, instead of strictly numeric.</li>
<li><code>recommendToAnonymous</code> method now supports rescoring.</li>
</ul>
<h2>Fixes</h2>
<ul>
<li>New users and items should no longer be able to become temporarily unavailable when a new model loads</li>
<li>Forced shutdown (with Ctrl-C, <code>kill</code>) should initiate orderly shutdown and sync
data to S3 / HDFS</li>
<li>Corrected output format from --recommend and --itemSimilarity in Computation Layer</li>
<li>Fixed a bug wherein the Computation Layer could drop a small number of users' data</li>
<li><code>AllItemSimilarities</code> now correctly uses item IDs from model</li>
<li>It is now possible to specify an SSL keystore file and JAR file containing a RescorerProvider to the
Serving Layer, when run on Amazon AWS.</li>
<li>Normalized Discounted Cumulative Gain evaluation metric uses a corrected formula</li>
<li>More efficient LSH implementation which allows performance-accuracy tradeoff at smaller (more reasonable)
data sizes.</li>
</ul>
<h2>Other</h2>
<ul>
<li>Optimizations in Computation Layer for lower Hadoop latency and overhead</li>
<li>Now can control the number of writes between rebuild in stand-alone Serving Layer mode with property
<code>model.local.writesBetweenRebuild</code></li>
<li><code>estimate</code> method now returns 0 for unknown users and items rather than an exception.</li>
<li>Computation Layer supports Hadoop 1.1.x</li>
<li>Computation Layer has minor changes now to operate with Hadoop 2.x, but still requires "MRv1" mode.</li>
</ul>
<h1>Version 0.7</h1>
<p>October 08 2012</p>
<h2>New Features</h2>
<ul>
<li>Computation Layer functionality from PeriodicRunner and GenerationRunner has been merged, and now
adds a web-based console</li>
<li>Experimental feature in Computation Layer to decay importance of old data over time.</li>
<li>New example shell scripts for more convenient execution of Serving Layer, Computation Layer</li>
</ul>
<h2>Fixed</h2>
<ul>
<li>Resolved an exception on startup on the stand-alone Serving Layer when no data is present.</li>
</ul>
<h2>Other</h2>
<ul>
<li>Serving Layer can reload new models incrementally, reducing peak heap size required.</li>
<li>ALS algorithm has been modified to handle negative strength values more reasonably at model building time,
though input should still generally be positive.</li>
<li>Fold-in less aggressively emphasizes new input's effect</li>
<li>Updated to Tomcat 7.0.30, SLF4J 1.7</li>
</ul>
<h1>Version 0.6</h1>
<p>September 3 2012</p>
<h2>New Features</h2>
<ul>
<li>Added <code>getAllUserIDs</code> and <code>getAllItemIDs</code> API methods</li>
<li>Serving Layer contains new command-line programs, <code>AllRecommendations</code> and
<code>AllItemSimilarities</code>, which will recommendations for all users / similar items for all items,
respectively, in bulk. This may be useful to create a simple batch recommendation process when that is
all that's needed.</li>
<li>Serving Layer console has simple online quality estimate: average estimation error</li>
<li>New optional parameter can suppress loading of known items, saving memory but meaning any item
may be recommended, even ones existing in the input already for a user</li>
<li>New <code>PeriodicRunner</code> in Computation Layer for continuous execution</li>
<li>Computation Layer has experimental feature to merge models from other instances</li>
</ul>
<h2>Other</h2>
<ul>
<li>Better parallelization and performance in Computation Layer recommend and similar item computations</li>
<li>Improved evaluation framework, including AUC and precision/recall test</li>
<li>Java client: <code>--translate</code> command line flag has become <code>--translateUser</code> and
<code>--translateItem</code></li>
<li>Update to Guava 13</li>
</ul>
<h1>Version 0.5</h1>
<p>July 9 2012</p>
<h2>New Features</h2>
<ul>
<li>Log output now available on Serving Layer console</li>
<li><code>/ready</code> endpoint in Serving Layer tells whether it is ready to answer requests</li>
<li><code>removePreference</code> now exposed in the REST API</li>
</ul>
<h2>Fixes</h2>
<ul>
<li>Java client can now upload files to ingest when HTTP DIGEST authentication is enabled</li>
<li>Java client and Serving Layer now use compression when receiving data to ingest</li>
<li><code>removePreference</code> behavior is now more correct (e.g. doesn't accept <code>NaN</code>,
recorded correctly by the Serving Layer) and rational (removed problematic 'value' parameter).</li>
</ul>
<h2>Other</h2>
<ul>
<li>Radically faster model building in Serving Layer standalone mode -- usually minutes instead of hours now.</li>
<li>Computation Layer iterations are ~40% faster as three MapReduce jobs have been merged into one</li>
<li>Computation Layer workers no longer have to load the feature matrix into memory, and can deal with
feature matrices that don't fit in memory (at a performance cost).</li>
<li>Update to Tomcat 7.0.29</li>
</ul>
<h1>Version 0.4</h1>
<p>June 10 2012</p>
<h2>New Features</h2>
<ul>
<li>Serving Layer may now be built as a WAR file for deployment in a servlet container.</li>
<li>Rescorer can now accept run-time arguments via URL parameter <code>rescorerParams=</code></li>
<li>Console page now supports ingest, setting pref, etc. -- all API methods</li>
</ul>
<h2>Other</h2>
<ul>
<li>Minor improvements to ALS convergence</li>
<li>Computation Layer is now available, separately</li>
<li>Serving Layer and Java Client are available from Subversion repository on Google Code</li>
</ul>
<h1>Version 0.3</h1>
<p>May 7 2012</p>
<h2>New Features</h2>
<ul>
<li>Now supports rescoring logic for recommend() and mostSimilarItems(). These are available in ServerRecommender
and also may be added to a Serving Layer instance by setting --rescorerProviderClass to an implementation
class which provides rescoring objects.</li>
<li>recommend() can now consider known items as candidates for recommendation, if desired</li>
</ul>
<h2>Fixes</h2>
<ul>
<li>Java client was not able to configure SSL key or auth credentials in some cases.</li>
</ul>
<h2>Other</h2>
<ul>
<li>Update to Guava 12.0</li>
</ul>
<h1>Version 0.2</h1>
<p>April 23 2012</p>
<h2>New Features</h2>
<ul>
<li>Alternating-least-squares computation is now seeded with output of previous run, if available,
for faster model convergence</li>
<li>Added new estimatePreference() method and REST API method that can compute multiple estimates
in one call.</li>
<li>Now contains Ant build script for building the distributed code.</li>
</ul>
<h2>Fixes</h2>
<ul>
<li>Java client result for <code>recommendToAnonymous</code> and <code>mostSimilarItems</code>
were swapped.</li>
<li>Several minor bug fixes.</li>
</ul>
<h2>Other</h2>
<ul>
<li>Update to Apache Tomcat 7.0.27</li>
<li>Update to Apache Commons Math 3.0</li>
</ul>
<hr/>
<h1>Version 0.1</h1>
<p>April 3 2012</p>
<p>Initial release.</p>
</body>
</html>