Optimize GeneralBindingModel #90
@pgrinaway @gregoryross @maxentile: If you have any advice on how to optimize the target function with minimal effort, I'd be very appreciative of suggestions!
Taking a look at this now.
Thanks! I realized that even just encoding the quantities as numpy arrays outside of the target function will likely lead to a big speedup, but I was wondering if there was some sort of numba or other tool that might trivially speed this up.
Ok, the most obvious way of optimizing this would be to implement it in Cython, but there are a few tricky points. It's not substantially easier with …, so it would require a bit of re-thinking how the function works (which might make the code faster even in pure Python/numpy). I can take a stab at this, if you want.
I don't think …
I think the key to speeding things up is to speed up just the target function that gets called many times by the root finder. It accepts a numpy array and returns two numpy arrays. I should be able to move some logic outside the target function and do something to speed up the computation inside the target. We can still have the binding model function accept and return dicts, but it can do some preprocessing so that the target function runs as fast as possible on numpy arrays.
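The "preprocess dicts once, close over arrays" idea above can be sketched roughly like this. The names, array layout, and the toy 1:1 binding example are all hypothetical (this is not assaytools' actual API), but the pattern is the point: all dict/list handling happens once in the factory, and the closure handed to `scipy.optimize.root` touches only preallocated numpy arrays.

```python
import numpy as np
from scipy.optimize import root

def make_target(equilibrium_log_K, stoichiometry, conservation, total_log_concs):
    """Build a fast target for the root finder.

    All slow input handling happens here, once; the returned closure
    works only on numpy arrays.  (Hypothetical layout, not assaytools'
    real interface.)
    """
    S = np.asarray(stoichiometry, dtype=float)    # (n_reactions, n_species)
    C = np.asarray(conservation, dtype=float)     # (n_components, n_species)
    log_K = np.asarray(equilibrium_log_K, dtype=float)
    log_total = np.asarray(total_log_concs, dtype=float)

    def target(log_c):
        # Equilibrium residuals: S @ log_c - log_K (one per reaction).
        eq = S.dot(log_c) - log_K
        # Conservation residuals in log space (one per conserved component).
        cons = np.log(C.dot(np.exp(log_c))) - log_total
        return np.concatenate([eq, cons])

    return target

# Toy 1:1 binding example: R + L <-> RL, species order [R, L, RL].
target = make_target(
    equilibrium_log_K=[np.log(1e6)],          # ln Ka, Ka = 1e6 / M
    stoichiometry=[[-1.0, -1.0, 1.0]],        # ln[RL] - ln[R] - ln[L] = ln Ka
    conservation=[[1.0, 0.0, 1.0],            # total receptor
                  [0.0, 1.0, 1.0]],           # total ligand
    total_log_concs=np.log([1e-6, 1e-6]),
)
solution = root(target, x0=np.log([1e-6, 1e-6, 1e-9]))
```

The factory runs once per model; only the cheap closure runs inside the sampler's inner loop.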
Also took a look! Yeah, numba doesn't know what to do with dicts (code using dictionaries will run in "object mode" / as slow as interpreted code), so those will need to be flattened out into arrays beforehand for numba to have any effect... @jchodera: Could you point to a few input instances to optimize for? On the input in the doctest, the PyCharm profiler says the most time-consuming functions are in blas, so I think loop-jitting might not produce a substantial speed-up on that instance. However, if there are a lot more …
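To illustrate the flattening point: once the dict data is packed into plain float64 arrays, a loop-style residual function becomes eligible for numba's nopython mode. The example below is a hedged sketch (hypothetical function and array names); it falls back to plain Python if numba isn't installed, so the same code runs either way.

```python
import numpy as np

try:
    from numba import njit          # compile to machine code if available
except ImportError:                 # otherwise run as ordinary Python
    def njit(func):
        return func

@njit
def residuals(log_c, S, log_K):
    """Equilibrium residuals on flat arrays.

    A dict-based version of this loop would force numba into object
    mode; with contiguous float64 arrays it can be loop-jitted.
    """
    n_reactions, n_species = S.shape
    out = np.empty(n_reactions)
    for k in range(n_reactions):
        acc = -log_K[k]
        for j in range(n_species):
            acc += S[k, j] * log_c[j]
        out[k] = acc
    return out

# Same toy stoichiometry as before: R + L <-> RL.
S = np.array([[-1.0, -1.0, 1.0]])
log_K = np.array([np.log(1e6)])
log_c = np.log(np.array([1e-6, 1e-6, 3.8e-7]))
r = residuals(log_c, S, log_K)
```

As noted above, if the hot spots are already inside BLAS calls, jitting loops like this won't buy much; it only pays off when Python-level looping dominates.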
Here is the description of the calculation in equations, for reference: http://assaytools.readthedocs.io/en/latest/theory.html#general-binding-model |
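Paraphrasing the linked theory page (so treat the notation here as an approximation, not a quote): the solver works in log concentrations $x_j = \ln c_j$, with stoichiometries $s_{kj}$ for reaction $k$ and counts $n_{ij}$ of component $i$ in species $j$, and finds the root of

```latex
\sum_j s_{kj}\, x_j - \ln K_k = 0
  \quad \text{(equilibrium, one equation per reaction } k\text{)}

\ln\!\Big(\sum_j n_{ij}\, e^{x_j}\Big) - \ln c_i^{\mathrm{tot}} = 0
  \quad \text{(conservation, one equation per component } i\text{)}
```

Working in log space keeps concentrations positive and makes the equilibrium equations linear in the unknowns; only the conservation equations are nonlinear.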
Whoops, let me post the branch name and info on how to run the example shortly. Sorry about that! I do realize we don't want to have any dict processing inside the target function. I can do preprocessing outside the target and feed numpy arrays in. But is that the best I can do?
Right, this is what I meant. If we deal with dicts and resizing lists inside the target function, it's going to be pretty difficult to optimize.
Once you've done that, you can give … a try. Same with Cython--you can see what I did in yank. I also gradually made the code more and more C-like to get maximum performance.
At first glance it might not work, but you might be able to use …
Thanks! This is super helpful!
No need to spend time with this, but since someone asked, you want to check out the … @maxentile may be totally right that the time-consuming part may not be the target function computation. I hadn't even considered that! I had tried adding an LRU cache to store the most recent 128 or 1024 evaluations, but that did not seem to improve performance.
Ok, I've profiled that script, and the top two functions that are causing slowness are: …
I will post the profiling results in full later, but this does suggest that …
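For anyone who wants to reproduce this kind of profiling with the standard library alone, here is a minimal sketch using `cProfile`/`pstats` sorted by cumulative time (the `slow_target` function is a stand-in for the real sampling script):

```python
import cProfile
import io
import pstats

def slow_target():
    # Stand-in for the real sampling workload.
    total = 0.0
    for i in range(100000):
        total += i ** 0.5
    return total

profiler = cProfile.Profile()
profiler.enable()
slow_target()
profiler.disable()

stream = io.StringIO()
stats = pstats.Stats(profiler, stream=stream)
stats.sort_stats("cumulative").print_stats(10)   # top 10 entries by cumulative time
report = stream.getvalue()
```

Sorting by cumulative time makes the expensive call chains (e.g. everything funnelling into the root solver) float to the top, rather than just the leaf functions.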
Ah, …
Sorry for the spam, but it seems to me like it is bits of PyMC that are quite slow, and that the …
That's really odd. The … I wish I knew more about how pymc works, since this suggests that we could speed things up by unwrapping the input pymc variables once and feeding them to a pre-constructed function that deals only in numpy floats to do the optimization with pre-computed matrices. In other news, trying to figure out how to recode this all for tensorflow or edward is also not proving easy, since tensorflow compute graphs can't have nodes that are iterative operations like function solvers.
It is called several times more than …
I suspect this is the case. It seems like some sort of …
Right, that's unfortunate. What about just coding it up in regular Python with …?
From some profiling tests, it looks like root solving in the GeneralBindingModel is the dominant cost of the new autoprotocol-based assaytools changes, with somewhere around 76% of total sampling time spent in calls to scipy.optimize.root here. I imagine we could greatly speed this up by rewriting the target function using some optimization scheme.