Fix the two piecewise-linear regression calculation #67
base: main
Conversation
Said, I'm hoping Johnny and/or Victor will review this. If they are unable, I will.
Force-pushed a9cd222 to b12b93d
Hey Said,
I've spent a chunk of time going over this change, and from my read so far it looks like a good fix for the issues you pointed out in the description of this PR. There's nothing really material to comment on, but I did leave a few comments pointing out minor issues for you to consider.
The bigger question I have is whether you already have a view of how this performs relative to the current code, i.e. I'm assuming you've pushed a number of run results through this and done a side-by-side comparison already? If so, seeing the results of that in this PR/issue would certainly be helpful as there's a decent amount of complexity in how this is all calculated.
Thanks!
MotionMark/resources/statistics.js
@@ -1,5 +1,5 @@
  /*
- * Copyright (C) 2015-2017 Apple Inc. All rights reserved.
+ * Copyright (C) 2015-2014 Apple Inc. All rights reserved.
Guessing this should be 2024 rather than 2014...
Fixed.
```js
_areEssentiallyEqual: function(n1, n2) {
    const epsilon = 0.0001;
    return Math.abs(n1 - n2) < epsilon;
}
```
Maybe add some rationale in a comment here for why 0.0001 is a reasonable epsilon? I'm assuming that was empirically determined, or is there more behind that choice?
I just wanted to avoid treating two essentially parallel lines as if they intersect. If the difference between segment2.t and segment1.t is too small, this calculation produces an intersection point that is very far away:
```js
_intersection: function(segment1, segment2)
{
    return (segment1.s - segment2.s) / (segment2.t - segment1.t);
}
```
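To make the failure mode concrete, here is a small standalone sketch (hypothetical numbers, plain functions instead of the statistics.js methods) showing how nearly parallel segments produce an intersection far outside any sampled range, and how the epsilon guard treats them as parallel instead:

```js
// Lines are y = t * x + s, matching the { t, s } shape used above.
const segment1 = { t: 1.0,     s: 10 };
const segment2 = { t: 1.00005, s: 12 };   // almost the same slope

const intersection = (s1, s2) => (s1.s - s2.s) / (s2.t - s1.t);
console.log(intersection(segment1, segment2));          // ≈ -40000, far outside any sampled complexity

const areEssentiallyEqual = (n1, n2) => Math.abs(n1 - n2) < 0.0001;
console.log(areEssentiallyEqual(segment1.t, segment2.t)); // true, so the pair is treated as parallel
```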
MotionMark/resources/statistics.js
```js
if (!this._areEssentiallyEqual(segment1.t, segment2.t)) {
    // If segment1 and segment2 are not parallel, then they have to meet
    // at complexity such that xp < complexity < x.
```
The code now uses xn instead of xp.
Yes, this is right. x and xn represent two consecutive samples, so this comment should include the inequality x < complexity < xn.
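Putting the pieces from this thread together, a sketch of the check being described (not copied from the patch; x and xn stand for the two consecutive sampled complexities) could look like:

```js
// Sketch only: accept a split between two consecutive sampled complexities
// x and xn when the two fitted segments actually meet between them.
if (!this._areEssentiallyEqual(segment1.t, segment2.t)) {
    const complexity = this._intersection(segment1, segment2);
    if (x < complexity && complexity < xn) {
        // The two lines meet between the two consecutive samples,
        // so this is a valid candidate for the change point.
    }
}
```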
MotionMark/resources/statistics.js
```js
let a1 = 0, b1 = 0, c1 = 0, d1 = 0, h1 = 0, k1 = 0;
let a2 = 0, b2 = 0, c2 = 0, d2 = 0, h2 = 0, k2 = 0;

for (var j = 0; j < sortedSamples.length; ++j) {
```
Maybe use "let" in this loop as well?
Fixed.
Could we add a unit test for the old scoring algorithm, which this PR would then update? I'd also like to see an independent implementation of the scoring (either in code, or in Numbers) to validate the algorithm.
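As a possible starting point for that kind of independent check, here is a self-contained sketch (not MotionMark code; all names here are made up) that brute-forces a two-piece linear fit by trying every split point and doing an ordinary least-squares fit on each side. It ignores the Kundu/Ubhaya constraint that the two pieces meet between the split samples, so it won't match the patched code exactly, but a large disagreement in the chosen break point or total error would be a red flag:

```js
// Independent, brute-force two-piece linear regression, for validation only.
// points: array of [x, y] pairs.
function leastSquaresLine(points) {
    const n = points.length;
    let sx = 0, sy = 0, sxx = 0, sxy = 0;
    for (const [x, y] of points) {
        sx += x; sy += y; sxx += x * x; sxy += x * y;
    }
    const denom = n * sxx - sx * sx;
    const t = denom ? (n * sxy - sx * sy) / denom : 0;  // slope
    const s = (sy - t * sx) / n;                        // intercept
    let error = 0;
    for (const [x, y] of points)
        error += (y - (t * x + s)) ** 2;
    return { t, s, error };
}

// samples: array of [complexity, sample value] pairs, sorted by complexity.
function bruteForceTwoPieceFit(samples) {
    let best = null;
    // Try every split that leaves at least two points on each side.
    for (let i = 2; i <= samples.length - 2; ++i) {
        const left = leastSquaresLine(samples.slice(0, i));
        const right = leastSquaresLine(samples.slice(i));
        const error = left.error + right.error;
        if (!best || error < best.error)
            best = { splitIndex: i, left, right, error };
    }
    return best;
}
```

Feeding the sorted samples from a recorded JSON run through this and comparing its break point and total error against what statistics.js reports would be one cheap sanity check.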
Force-pushed b12b93d to 40af19e
Hi Johnny, thanks for your review. I have not done extensive comparisons between the old and the new calculations; our A/B system can't compare MotionMark changes. But I did compare JSONs of some runs with and without this change. If you open the developer.html page and drag and drop a JSON file onto the "Drop results here" button, MotionMark will calculate the scores without running the tests. I am attaching two JSON files for two MotionMark runs on two different devices, along with screenshots of dropping the JSON files locally and dropping them in https://browserbench.org/MotionMark1.3.1/developer.html. As you can see from the screenshots, there is a slight difference between the two calculations.
Hey Said, first off, apologies it took a while to look into this again. I was able to reproduce the numbers you show in the screenshots. Did you have any thoughts on the Design test results in your first screenshot, where the 80% confidence interval is "0.00 - 547.00 -100.00% +17.20%"? The graph for that one does not look like the algorithm did a good job of fitting the lines to the data points, though the score is arguably close to correct. Here's what I see in the graph:
Hi Johnny, the large confidence interval with the new rewrite happens because the optimal complexity is no longer clipped as it was before. This part of the original code removes the extreme results in the histogram you posted above (the ones close to zero and the ones around 550).
Please note that options.shouldClip was always set to true. I can restore this clipping.
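Since the clipping code itself isn't quoted in this thread, the following is only a rough sketch (hypothetical variable names) of the kind of clamp being described: restricting the computed optimal complexity to the sampled range so that extreme intersections don't widen the confidence interval:

```js
// Sketch only: clamp the optimal complexity into the sampled range.
// minComplexity and maxComplexity are assumed to be the smallest and
// largest sampled complexities; options.shouldClip is the flag noted above.
let complexity = this._intersection(segment1, segment2);
if (options.shouldClip)
    complexity = Math.min(Math.max(complexity, minComplexity), maxComplexity);
```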
#66

The current implementation of the regression calculation has these flaws:

1. When processing (x[0], y[0]), L1 must be any line through (x[0], y[0]) which meets L2 at a point (x', y') where x[0] < x' < x[1]. L1 has no error.
2. When processing (x[n - 2], y[n - 2]), L2 must be any line through (x[n - 1], y[n - 1]) which meets L1 at a point (x', y') where x[n - 2] < x' < x[n - 1]. L2 has no error.
3. The lambda calculation is incorrect. It includes a term called H which is equal to C - I. Looking at the algorithm of Kundu/Ubhaya, this should be just C.
4. lambda should be used when calculating L1 and (1 - lambda) should be used when calculating L2. Currently (1 - lambda) is used in calculating both L1 and L2.
5. The current calculation has the condition `if (t1 != t2) continue;`. This condition is almost always true even when t1 and t2 are essentially equal.