[FEATURE REQUEST] add support to the "out=" argument #592
Comments
I see your point, and this could definitely be added. Since it's going to eat into flash space, I would probably implement this as an option. One issue I would like you to clarify is, how alignment of Also, I would like to ask you to specify a handful of methods where you want to see it first. Implementing this for everything is a relatively big undertaking, and I would like to have just a couple methods first, where we can sort out all issues. |
A good point. I think it's best to stick to numpy's behavior.
Looks like they try casting to the actual output type. I guess if that complicates the implementation, it's also OK to raise an exception if they don't match. As for methods - my current use case is mixing several audio channels (3 channels are read from files into a 3x4096 array and then mixed using np.mean() before being sent to I2S), but if I try to think more globally then this is probably my prioritization: |
I would second this as a good feature to have for embedded use. Being able to perform operations in-place would be a big bonus (especially if it follows how numpy does it). |
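For readers unfamiliar with how numpy does it, here is a minimal sketch of the channel-mixing use case mentioned above, run against CPython numpy rather than ulab. The tiny 3x4 array is a stand-in for the 3x4096 audio buffer, and the names `channels` and `mixed` are made up for illustration.

```python
import numpy as np

# Stand-in for three audio channels (the thread's case is 3 x 4096 samples).
channels = np.array([[0.0, 3.0, 6.0, 9.0],
                     [3.0, 3.0, 3.0, 3.0],
                     [6.0, 3.0, 0.0, 0.0]])

# Output buffer allocated once, with the shape and dtype of the result.
mixed = np.empty(channels.shape[1], dtype=channels.dtype)

# Average across channels; the result is written into `mixed`
# instead of being returned in a newly allocated array.
np.mean(channels, axis=0, out=mixed)

print(mixed)
```

In a streaming loop, the same `mixed` buffer could be reused for every block of samples, which is the behavior this request asks ulab to mirror.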
On the flash-space-cost point, a configuration option that makes this the only way to do it would be a great potential tradeoff -- i.e. it would be quite useful to have a "no allocations inside ulab" mode where I always have to use |
Casting is expensive, because either you unroll all possible combinations in-place for each method (requires more flash), or you implement that as a function, which then has the function call overhead, and is significantly slower. |
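As a point of reference for the casting discussion, the sketch below shows what CPython numpy does when the dtype of `out` differs from the computed result. It says nothing about how ulab would or should implement this; the array names are illustrative only.

```python
import numpy as np

a = np.array([1, 2, 3], dtype=np.int16)
b = np.array([10, 20, 30], dtype=np.int16)

# Widening is allowed: the int16 result is cast into the float32 buffer.
wide = np.empty(3, dtype=np.float32)
np.add(a, b, out=wide)

# Narrowing float64 -> int16 is rejected under the default
# casting rule ("same_kind"), so numpy raises a TypeError subclass.
x = np.array([1.5, 2.5, 3.5])
narrow = np.empty(3, dtype=np.int16)
try:
    np.add(x, x, out=narrow)
    rejected = False
except TypeError:
    rejected = True

# The caller can opt in to the lossy cast explicitly.
np.add(x, x, out=narrow, casting="unsafe")
```

So numpy itself casts in some cases and raises in others, which is why the choice between supporting casts and raising on mismatch matters for code size.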
Wouldn't this trip pretty much everyone? I see two options here:
But I think this would be a great addition. I'm not dragging my feet, I'm just trying to explore the possibilities here. |
I think to start with it would be OK to not support casting and require that the input array (passed to |
This would be relatively easy to implement, so if Damien's proposal is acceptable to everyone, I would just go ahead with that. |
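A rough sketch of the "no casting" variant, written in CPython numpy for illustration. It assumes the rule is that the array passed as `out` must already have the shape and dtype of the result, and that anything else raises. The helper name `add_into` is hypothetical and is not part of numpy or ulab.

```python
import numpy as np

def add_into(a, b, out):
    """Hypothetical strict variant: no broadcasting, no casting."""
    if a.shape != b.shape or out.shape != a.shape:
        raise ValueError("a, b and out must have the same shape")
    if a.dtype != b.dtype or out.dtype != a.dtype:
        raise TypeError("a, b and out must have the same dtype")
    # With identical dtypes no conversion is needed when writing to out.
    np.add(a, b, out=out)
    return out

a = np.ones(3, dtype=np.float32)
b = np.ones(3, dtype=np.float32)
buf = np.empty(3, dtype=np.float32)
add_into(a, b, buf)
```

The check is cheap and lives in one place, so under these assumptions each method would only need its existing same-type loop.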
Just wanted to add |
I second @jimmo 's suggestion of having a configuration option for "no !implicit! allocations inside This is very useful for real-time system programming. Otherwise, we have to tread carefully to avoid unintentional allocation that may trigger the garbage collector, which could be disastrous when there are short deadlines to meet. However, I would prefer this to be a runtime module attribute rather than a compile time configuration. To implement this without code size penalties, I suggest something of this sort:
Optionally there could be a second, separate attribute that also prohibits |
I leave the implementation details to the experts, but I'd like to reignite the discussion on introducing an

We seek performance when using ulab, and indeed, ulab's performance is impressive. However, when running in tight loops with sampled data, the continuous memory allocation leads to eventual garbage collections. This is problematic in itself, but it becomes a critical issue in real-time applications, particularly on platforms with significant amounts of PSRAM, where garbage collection can take a considerable amount of time.

A real-world example: an ESP32-S3 with 8 MB of PSRAM, which is not an unusual platform but rather an increasingly popular configuration, can take more than 400 ms to perform a garbage collection in a worst-case scenario. Other platforms might experience a smaller penalty, but at the cost of more frequent garbage collections. Under these conditions, you can negate all performance gains of ulab, rendering it almost unusable.

I'm implementing a Kalman filter for a project and planned to use ulab for the matrix operations (already utilizing numpy in a CPython implementation). But when porting it to MicroPython/ulab, I realized I can't use it due to unpredictable memory allocation. I can't afford to have a garbage collection in the middle of the filter loop. This is just my personal experience, but I'm sure many others face similar challenges.

By the way, before finding this thread, while considering writing a small native C module to implement the basic operations in-place, I stumbled upon your excellent work at https://micropython-usermod.readthedocs.io. So thank you @v923z, because, one way or another, I will be utilizing your amazing contributions. |
There is no question about the benefit of adding the The fact of the matter is, people request all kinds of features (some of them quite exotic, and sometimes I wonder, whether the request was genuine), but when I ask for at least a reasonable test script to go with the implementation, then they tend to disappear. Having said this, Here is a list from one of the comments above
Beyond these, what would you need for your filters? |
The easiest solution in this case is to add the function in the |
I have also noticed that there are some random requests here and there on this repo, so I'm glad to hear that this one is already in your plans!! To implement a standard Kalman filter you use matrix addition, subtraction, multiplication, transposition and inversion. There are several variants that may require more operations, but I don't remember them off the top of my head.
Oh! I read about this when I first found ulab, but I had forgotten about it. Thanks for the reminder! A Kalman filter is generic and valid for a lot of use cases, so I will try to implement it this way so you can check it out and see if it's worth adding it to ulab. |
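To make the list of Kalman filter operations concrete, here is a minimal sketch of the predict step only, using CPython numpy with caller-owned buffers. The function name `kalman_predict` and the buffer names are hypothetical. The update step, which needs the matrix inversion, is left out because `np.linalg.inv` has no `out` argument.

```python
import numpy as np

def kalman_predict(F, Q, x, P, x_out, tmp):
    """Predict step: x_out = F x and P = F P F^T + Q, written into existing buffers."""
    np.matmul(F, x, out=x_out)   # separate buffer: out must not alias a matmul input
    np.matmul(F, P, out=tmp)     # tmp = F @ P
    np.matmul(tmp, F.T, out=P)   # F.T is a view, so the transpose copies no data
    np.add(P, Q, out=P)          # elementwise add may write over its own input
    return x_out, P

F = np.array([[1.0, 1.0], [0.0, 1.0]])   # constant-velocity model, dt = 1
Q = 0.5 * np.eye(2)                      # process noise
x = np.array([[0.0], [1.0]])             # state as a column vector
P = np.eye(2)                            # state covariance
x_next = np.empty_like(x)                # buffers allocated once, outside the loop
tmp = np.empty_like(P)

kalman_predict(F, Q, x, P, x_next, tmp)
```

In a filter loop, `x` and `x_next` could be swapped after each call, so no new result arrays are created per iteration.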
What would take the |
Yes, something similar to |
In principle, it's doable: in micropython-ulab/code/ndarray_operators.c Lines 182 to 186 in 63dfbd1
results to the matrix that out points to.
However, you mentioned matrix inversion, which will produce a new matrix, no matter what. If you want to save on that, then we would need to add https://docs.scipy.org/doc/scipy/reference/generated/scipy.linalg.inv.html, which allows you to overwrite the data. |
I didn't notice the different approach in |
Many numpy functions support the "out=" argument, allowing you to specify an existing array that results will be written into. This saves time and memory allocations, especially if the calculation is used repeatedly.
I couldn't find support for this in ulab.numpy. Any chance to add this in a future release? I believe it could be extremely useful with microcontrollers.
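For context, the pattern being requested looks like this in CPython numpy. This is only an illustration of the existing numpy behavior, not of anything ulab currently provides; the variable names are made up.

```python
import numpy as np

samples = np.arange(8, dtype=np.float32)
gain = np.float32(0.5)

# Allocate the result buffer once, up front.
scaled = np.empty_like(samples)

for _ in range(3):
    # Each pass writes into the same buffer; the ufunc returns
    # that same array object rather than a new one.
    result = np.multiply(samples, gain, out=scaled)
```

Without `out=`, every pass through the loop would produce a fresh array that later has to be garbage collected.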