Numba for speed up #27

amoodie · 2020-05-07T18:40:31Z

Highlights

I was messing about and tried to see what kind of speed-up we could get with just using Numba.

I profiled the following loop on develop (i.e., the tests/time_water.py script):

delta = pyDeltaRCM()
start = time.time()
for _t in range(0, 10):
    print("timestep:", _t)
    delta.update()
end = time.time()

for an elapsed time = 1.2779 min.

With some really simple changes to the water routing routines (only changed one of three list(map(lambda(... calls) and change of random picking in a few places, I get an elapsed time 0.5052 min.

This is really promising for #1, I think.

Question

How much do we care about exact consistency, down to the random number sequences?

These changes use the Numba random number generator (rather than the numpy rng), and so they produce different results. For example, the tests/test_output_consistency.py test fails with the Numba implementation. The expected bed elevation at a small slice after one update():

what is returned with Numba:

array([[ 0.        , -1.        , -1.        , -1.        ,  0.        ],
       [-0.92157906, -0.9095978 , -0.8295175...99838   , -0.99919   ],
       [-1.        , -1.        , -1.        , -1.        , -1.        ]],

what is expected:

array([[ 0.        , -1.        , -1.        , -1.        ,  0.        ],
       [-0.8469125 , -0.82371545, -0.8295175...   , -0.9981775 , -0.9989875 , -0.9997975 ],
       [-1.        , -1.        , -1.        , -1.        , -1.        ]]))

This ultimately would be an issue for any C implementations too. My feeling is that it's fine to break exact consistency between versions, as long as we note it in a changelog. I think breaking behavior changes of the model are much more important, and require a higher-level need to allow.

Anyway, I don't want to merge this now, just looking for some disucssion. @elbeejay @ericbarefoot @ammilten

Also, sorry the commit history will shorten when #26 is merged.

amoodie · 2020-05-08T01:54:51Z

On second thought, reproducible runs are important, so I'll look at ways to make sure the runs are deterministic from a seed.

amoodie · 2020-05-08T02:23:42Z

okay, well turns out that was pretty easy, just have to make sure the rng seed is set in a jitted function, and then only and always call np.random from jitted functions.

elbeejay · 2020-05-14T18:16:26Z

pyDeltaRCM/utils.py

+import time
+import numba
+
+# utilities used in various places in the model and docs


Seems very similar in intent to the shared_tools.py file we have. Any reason they can't be combined? (I have no attachment to either name)

yeah this would eliminate the shared_tools file. I have this done in another commit, but haven't pushed it. I'll do some of the jitting again, on top of Eric's changes.

It turns out that numba doesn't support jitting classes in the way that we would like to use them (for example, they don't import inheritance) and so the way you've done it seems like the simplest way forward for us.

elbeejay · 2020-05-14T18:17:51Z

Idea is quite nice, that is a pretty significant speed up indeed.

@amoodie

Copy of what @amoodie did in DeltaRCM#27

@amoodie

This makes them available outside the object, which is important in some cases, and the class just gets them from here. feat: make shared_tools into a module, not a class. This is in preparation for jitting, since we want them to be standalone functions. feat: break out step that updates water parcel positions These functions now become helper utility functions in shared_tools, and are wrapped by update_Q, which handles them all, and passes them what they need. refactor: break out sed functions into smaller functions This modularizes the code and makes it easier to compile. feat: jit the functions apply jit to functions, and fix some type problems that came along with refactoring. refactor: move next step calculation to jitted functions The next slowest operation was getting the indexes of the next cells, so I moved that to a jitted function too. refactor: jit the calculation of weight array for the water fix: move random number generation into numba Copy of what @amoodie did in DeltaRCM#27 fix: some broken tests to test the shared tools portion. fix: add numba to requirements fix: move all inlet picking into the random_pick function There's essentially no difference between the random_pick function and the random_pick_inlet function other than there is no preference given for the inlet cells, so you can pick whichever one, so I just adjust the mechanism to use the existing random pick machinery. fix: water and sediment routing bugs that caused inconsistent behavior There were two issues with the refactoring. One was that water routing developed a flat flow field because I wasn't adding each passing water parcel to the Q field. The other was related to sediment parcel routing, where I had transcribed the i and j of stepping incorrectly in the jitted functions.

ericbarefoot · 2020-06-03T14:57:53Z

Since we merged #36, I'm going to go ahead and close this. Thanks for the great idea and all the good investigation and development that this PR motivated.

@amoodie

This makes them available outside the object, which is important in some cases, and the class just gets them from here. feat: make shared_tools into a module, not a class. This is in preparation for jitting, since we want them to be standalone functions. feat: break out step that updates water parcel positions These functions now become helper utility functions in shared_tools, and are wrapped by update_Q, which handles them all, and passes them what they need. refactor: break out sed functions into smaller functions This modularizes the code and makes it easier to compile. feat: jit the functions apply jit to functions, and fix some type problems that came along with refactoring. refactor: move next step calculation to jitted functions The next slowest operation was getting the indexes of the next cells, so I moved that to a jitted function too. refactor: jit the calculation of weight array for the water fix: move random number generation into numba Copy of what @amoodie did in DeltaRCM#27 fix: some broken tests to test the shared tools portion. fix: add numba to requirements fix: move all inlet picking into the random_pick function There's essentially no difference between the random_pick function and the random_pick_inlet function other than there is no preference given for the inlet cells, so you can pick whichever one, so I just adjust the mechanism to use the existing random pick machinery. fix: water and sediment routing bugs that caused inconsistent behavior There were two issues with the refactoring. One was that water routing developed a flat flow field because I wasn't adding each passing water parcel to the Q field. The other was related to sediment parcel routing, where I had transcribed the i and j of stepping incorrectly in the jitted functions.

amoodie marked this pull request as ready for review May 7, 2020 20:01

amoodie marked this pull request as draft May 7, 2020 20:01

amoodie added 8 commits May 9, 2020 14:56

shift random choices out to numba funcs

b35069a

use utils version

3261170

shift new_cells ops out to standalone jitted functions

7683df3

switch one random call in sed

3cdcc5d

add a consistency check

0de5452

add timer

4a463ce

add numba dep

01bfe2b

set rng in numba, and make all calls to random in numba jitted functions

9fda2c2

amoodie force-pushed the numba branch from 5c1450d to 9fda2c2 Compare May 9, 2020 19:57

addtl changes to defaults and plotting and timing

5edba66

ericbarefoot mentioned this pull request May 13, 2020

Change weight calculations for water and sediment routing #29

Merged

elbeejay reviewed May 14, 2020

View reviewed changes

ericbarefoot added a commit to ericbarefoot/pyDeltaRCM that referenced this pull request May 22, 2020

fix: move random number generation into numba

2726176

Copy of what @amoodie did in DeltaRCM#27

ericbarefoot mentioned this pull request May 22, 2020

BUMBANUMBA #36

Merged

disable jitting for coverage tests

1c38f66

ericbarefoot closed this Jun 3, 2020

amoodie deleted the numba branch February 25, 2021 17:54

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Numba for speed up #27

Numba for speed up #27

amoodie commented May 7, 2020

amoodie commented May 8, 2020

amoodie commented May 8, 2020

elbeejay May 14, 2020

amoodie May 14, 2020

ericbarefoot May 14, 2020

elbeejay commented May 14, 2020

ericbarefoot commented Jun 3, 2020

Numba for speed up #27

Numba for speed up #27

Conversation

amoodie commented May 7, 2020

Highlights

Question

amoodie commented May 8, 2020

amoodie commented May 8, 2020

elbeejay May 14, 2020

Choose a reason for hiding this comment

amoodie May 14, 2020

Choose a reason for hiding this comment

ericbarefoot May 14, 2020

Choose a reason for hiding this comment

elbeejay commented May 14, 2020

ericbarefoot commented Jun 3, 2020