-
Notifications
You must be signed in to change notification settings - Fork 257
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Optimize CollectionSpec checker #436
base: master
Are you sure you want to change the base?
Conversation
@@ -23,19 +23,18 @@ | |||
(let [_ (macros/assert! (= 2 (count e)) "remaining can have only one schema.") | |||
c (spec/sub-checker (second e) params)] | |||
#?(:clj (fn [^java.util.List res x] | |||
(doseq [i x] | |||
(.add res (c i))) | |||
(reduce (fn [res i] (doto res (.add (c i)))) res x) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is this expected to be faster? The List
can't be used safely concurrently so I would be surprised if you could do better than the doseq
. Have you profiled and seen a speedup?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
reduce
(and run!
) is faster and more memory efficient because as an internal iterator it can be fully specialized to the collection and usually does not need to allocate at all. I have profiled it sometime but by now it is just a habit like using (into [] (map ...) x)
instead of (into [] (map ... x))
. Since reduce
works on all seqs really the only reason to ever use doseq
is if :when
/:while
/:let
is super useful.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The weird thing about reduce
here is that it's threading through an extra argument that you don't care about because you're mutating the argument anyway. I think it makes the code slightly less clear and I would be a bit surprised if there was a performance benefit, but I'd be happy to believe it if there were accompanying measurements.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The weird thing about reduce here is that it's threading through an extra argument that you don't care about because you're mutating the argument anyway.
Might be worth comparing to (reduce #(.add res (c %2))) nil x)
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, it would probably beat the current code handily since I don't see a type hint on the inner res
.
EDIT: I've added a reflection check to the master build.
EDIT2: Of course I missed the doto
propagating the type hint, nevermind.
(then res nil)) | ||
:cljs (fn [res x] | ||
(swap! res into (map c x)) | ||
(reduce (fn [res i] (doto res (.push (c i)))) res x) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe that something mutable could be faster here, but I'm again unsure about the use of reduce
. Can you share profiling results?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
into
uses reduce
too! Here it is more important that the data structure is mutable. I'll do some profiling later.
@@ -58,7 +57,7 @@ | |||
|
|||
:cljs | |||
(defn- has-error? [l] | |||
(some utils/error? l))) | |||
(.some l utils/error?))) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not super familiar with cljs, can you explain the change here please?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Now that l
is a JS array it just uses Array#some.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks. Is that expected to be appreciably faster?
Thanks for the PR! Certainly happy to accept optimizations but always prefer if they come with measurements. And would like to learn more about the choice of |
I want to do some benchmarks but for now I am out of time. |
Thanks! I just want to make sure we are striking the right balance between performance and clarity, i.e. only making optimizations that make a measurable difference. My intuition is admittedly stale here, so I'm happy to take these with measurements, or happy if another maintainer wants to take them without. |
I don't think this adds much complexity; it is just using JS arrays just like it was already using While I am fairly confident it will also be faster it would be silly to optimize without benchmarking. |
doseq
withreduce
on JVMThe latter could improve #356 substantially; in the Malli seqex impl I had a similar situation with a persistent set in an atom on cljs and it was just unacceptably slow.