At the moment, the segment and triangle partials are computed one at a time, which is probably pretty slow. It's worth profiling first (with something like line_profiler or py-spy) before investing effort to change this, but my guess is that it's something like 10x slower than passing all the necessary observation points and source triangles to cutde.disp_matrix in a single call. This is basically just the standard Python/MATLAB "vectorize your code" advice.
Concretely, passing only a single triangle at a time removes a lot of the potential for parallelization and for amortizing operations over many source triangles.
As a point of reference, I remember getting something like 5 million TDE source-observation point computations per second running on CPU with cutde. So, if you do a line_profiler run and see performance substantially worse than that, the problem is likely to be the individual calls.
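To make the "one call instead of many" point concrete, here is a minimal sketch. It does not call cutde: `toy_kernel` is a hypothetical stand-in (inverse distance to each triangle centroid) that only mimics the shape convention of "all observation points against all source triangles". The array sizes and names are assumptions for illustration.

```python
import numpy as np


def toy_kernel(obs_pts, tris):
    """Hypothetical stand-in for a TDE kernel (NOT cutde).

    obs_pts: (n_obs, 3), tris: (n_tri, 3, 3).
    Returns an (n_obs, n_tri) array of 1 / distance to each triangle centroid.
    """
    centroids = tris.mean(axis=1)                       # (n_tri, 3)
    diff = obs_pts[:, None, :] - centroids[None, :, :]  # (n_obs, n_tri, 3)
    return 1.0 / np.linalg.norm(diff, axis=2)


rng = np.random.default_rng(0)
n_obs, n_tri = 50, 20

obs_pts = rng.uniform(-1.0, 1.0, size=(n_obs, 3))
obs_pts[:, 2] = 0.0                  # observation points on the surface
tris = rng.uniform(-1.0, 1.0, size=(n_tri, 3, 3))
tris[:, :, 2] -= 2.0                 # keep every source triangle below the surface

# One triangle per call: n_tri separate kernel invocations.
looped = np.empty((n_obs, n_tri))
for j in range(n_tri):
    looped[:, j] = toy_kernel(obs_pts, tris[j : j + 1])[:, 0]

# All triangles in a single call: same numbers, one invocation.
batched = toy_kernel(obs_pts, tris)
```

Both paths give the same matrix. The batched call is the one that lets the library parallelize and amortize per-call overhead across source triangles.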
I think the challenge I've run into here is that I use a different map projection for each triangle. This essentially means that each triangle gets a whole new set of observation coordinates, because the stations go through that triangle's projection too. The projection isn't a linear transform, so I don't think I can just factor it out. Maybe it's worth packaging all of the projected TDEs and projected station coordinates together anyway, doing a lot of unused calculations, and then extracting just the subsets we need?
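One possible way around the per-triangle projection, sketched under assumptions: keep the projection loop, but flatten the results into (observation point, triangle) pairs and evaluate the kernel once over all pairs, so nothing unused is computed. `project` and `pair_kernel` below are hypothetical toy stand-ins, not celeri or cutde functions. Whether cutde exposes a pairwise entry point of this shape would need to be checked against its documentation.

```python
import numpy as np


def project(pts, origin):
    """Hypothetical nonlinear stand-in for a per-triangle map projection."""
    out = np.empty_like(pts)
    out[:, 0] = (pts[:, 0] - origin[0]) * np.cos(pts[:, 1])
    out[:, 1] = pts[:, 1] - origin[1]
    out[:, 2] = pts[:, 2]
    return out


def pair_kernel(obs_pts, tris):
    """Hypothetical pairwise kernel: row i of obs_pts is paired with tris[i].

    obs_pts: (n, 3), tris: (n, 3, 3). Returns an (n,) array.
    """
    return 1.0 / np.linalg.norm(obs_pts - tris.mean(axis=1), axis=1)


rng = np.random.default_rng(1)
n_obs, n_tri = 30, 8

stations = rng.uniform(-1.0, 1.0, size=(n_obs, 3))
stations[:, 2] = 0.0                 # stations on the surface
tris_geo = rng.uniform(-1.0, 1.0, size=(n_tri, 3, 3))
tris_geo[:, :, 2] -= 2.0             # triangles below the surface

# The projection still runs once per triangle; it is cheap next to the kernel.
obs_proj = np.empty((n_tri, n_obs, 3))
tri_proj = np.empty((n_tri, 3, 3))
for t in range(n_tri):
    origin = tris_geo[t].mean(axis=0)
    obs_proj[t] = project(stations, origin)
    tri_proj[t] = project(tris_geo[t], origin)

# Flatten to n_tri * n_obs pairs (triangle-major order) and evaluate once.
obs_pairs = obs_proj.reshape(-1, 3)
tri_pairs = np.repeat(tri_proj, n_obs, axis=0)
partials = pair_kernel(obs_pairs, tri_pairs).reshape(n_tri, n_obs).T

# Reference: one kernel call per triangle.
looped = np.empty((n_obs, n_tri))
for t in range(n_tri):
    looped[:, t] = pair_kernel(obs_proj[t], np.repeat(tri_proj[t : t + 1], n_obs, axis=0))
```

The alternative of computing the full matrix for every projected copy and discarding the off-diagonal blocks does n_tri times more kernel work, so it only pays off if the per-call overhead dominates by more than that factor.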
The one-at-a-time calls are in celeri/celeri/celeri.py, at line 1125 and at line 1851 (both in dde0941).