Reduce overhead of read_data and write_data #202

fsimonis · 2024-04-22T14:20:29Z

In this Discourse thread, I tracked the increased time spent in the "do-nothing solver" down to read_data and write_data.

The most logical explanation is the additional overhead in these calls:

Input vertex_ids and values are copied to a vector, even though passing np.reshape(X, -1) to the preCICE API suffices and avoids the copy.
Output values are allocated, passed to the API, and then allocated again to build an np.array.
We do a lot of additional error checking (which is good).
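
The first two points can be illustrated with plain NumPy. The sketch below is not the bindings' actual code: `fill_values` is a hypothetical stand-in for a native call that writes into a caller-provided buffer, and the function names and shapes are made up for illustration. It contrasts copying through intermediate containers with handing over a flat view of a preallocated result.

```python
import numpy as np

def fill_values(ids, out):
    """Hypothetical stand-in for a native call that writes into `out` in place."""
    out[:] = np.repeat(ids, out.size // ids.size).astype(out.dtype)

def read_with_copies(ids, dims):
    # Copying variant: the ids go through a Python list, and the values go
    # through a temporary buffer that is copied again to build the result.
    id_list = list(ids)
    tmp = np.empty(len(id_list) * dims)
    fill_values(np.asarray(id_list), tmp)
    return np.array(tmp).reshape(-1, dims)

def read_without_copies(ids, dims):
    # Lean variant: allocate the result once and let the call fill a flat view.
    out = np.empty((ids.size, dims))
    fill_values(ids, np.reshape(out, -1))  # view of `out`, no copy
    return out
```

Both variants return the same array. The second one performs a single allocation, so its overhead should grow more slowly with mesh size, though the actual gain would need to be measured.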

Example: rhoVW on solver2, which is vector data on a large mesh:

Time measured in preCICE: 7ms (note: this doesn't allocate)
Time measured in Python: 40ms (includes the overhead of enabling profiling in Python; this path needs to allocate, so the overhead scales with the data size)

Notes:

With some tweaking I can get this down to 30ms. This makes the function actually shorter, simpler, and easier to follow.
ndarray.flatten() always copies the input, while np.reshape returns a view whenever it can avoid a copy.
The majority of the generated code seems to be error handling, which could potentially be avoided by calling the C++ API directly for getDataDimensions and doing this in one place.
This overhead could be profiled with something like "Profiling of user-code via API" (precice#1647).
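
The difference between flatten and reshape noted above can be checked directly with `np.shares_memory`. This is a minimal sketch with a made-up 4 x 3 array standing in for vector data on four vertices:

```python
import numpy as np

# C-contiguous input, e.g. 4 vertices with 3 components each
values = np.arange(12, dtype=float).reshape(4, 3)

flat_copy = values.flatten()        # always allocates a new array
flat_view = np.reshape(values, -1)  # view on the same memory for contiguous input

copy_shares = np.shares_memory(values, flat_copy)  # False
view_shares = np.shares_memory(values, flat_view)  # True

# For non-contiguous input (here a transpose), reshape has to fall back to a copy.
transposed_flat = np.reshape(values.T, -1)
transposed_shares = np.shares_memory(values, transposed_flat)  # False
```

So np.reshape only avoids the copy when the caller passes contiguous data. For non-contiguous input the cost is the same as with flatten.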


fsimonis mentioned this issue Apr 23, 2024

Missing API Profiling precice/precice#2013

Closed
