Improving aggregated data types for performance #15

Open · BridgeAR opened this issue Dec 20, 2018 · 2 comments

@BridgeAR commented Dec 20, 2018
For me, the most challenging type to implement for RESP in JS was arrays. The input arrives as events carrying partial data, with no way of knowing where the array ends. It would be great to know the total size of the aggregated data up front, similar to bulk strings, so it is possible to defer parsing until all necessary data has been received. That would require Redis to calculate the size up front, but the receiver could implement parsing more efficiently, especially for nested aggregated types. It is of course a trade-off, but I believe it could improve the overall performance for these types quite a bit (mainly in the case of many entries).

I am curious about your thoughts on this, @antirez.
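
To make the framing difference concrete, here is a minimal sketch of the RESP wire format as it stands, plus the kind of size-prefixed array header being proposed. The `*2;18` syntax is purely hypothetical, invented here to illustrate the idea:

```js
// RESP today: a bulk string declares its byte length up front, so a parser
// can immediately tell whether the complete payload has been received.
const bulkString = '$5\r\nhello\r\n';

// An array only declares its element count, not its size in bytes, so the
// parser must walk every (possibly nested) element to find out where it ends.
const array = '*2\r\n$3\r\nfoo\r\n$3\r\nbar\r\n';

// Hypothetical header that also carries a byte size (illustration only, not
// part of RESP): a client could buffer until all 18 payload bytes are
// available and only then start parsing.
const sizePrefixedArray = '*2;18\r\n$3\r\nfoo\r\n$3\r\nbar\r\n';
```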

@antirez (Owner) commented Dec 20, 2018

Hello @BridgeAR, once you have the length of the array, isn't that enough to easily have a stop condition in your loop? I guess you mean you would like to know the total size in bytes of the protocol containing the array: that would indeed be more complex on the server side, and yet I can't see how it would help the client. Anyway, reading-protocol-first parsing-later is not a good idea, and tracking the read bytes in the protocol implementation is even harder than just having a loop to process the N items. Is your problem related to implementing RESP in a non-blocking way, so that you need to keep state about how much you parsed so far in order to continue?
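
For reference, the loop described here might look roughly like the following minimal sketch, assuming a blocking `readLine()` helper that returns the next protocol line without its trailing `\r\n` (a hypothetical helper; Node.js has no such blocking socket API, which is exactly the problem discussed below). With blocking reads, the declared element count alone is a sufficient stop condition:

```js
// Minimal sketch under the blocking-read assumption: the "*N" header tells
// the loop when to stop, and each recursive call simply waits for its data.
function parseReply(readLine) {
  const line = readLine();                  // e.g. "*2", ":42", "+OK"
  switch (line[0]) {
    case '*': {                             // array: loop over its N items
      const count = Number(line.slice(1));
      const items = [];
      for (let i = 0; i < count; i++) items.push(parseReply(readLine));
      return items;
    }
    case ':': return Number(line.slice(1)); // integer
    default:  return line.slice(1);         // simplified: strings, errors, ...
  }
}
```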

@BridgeAR (Author) commented:

> Is your problem related to implementing RESP in a non-blocking way, so that you need to keep state about how much you parsed so far in order to continue?

That's exactly the issue. In JS there is no way to implement this in a blocking way, so I would have to parse everything again whenever a new chunk of data arrives.
To prevent that, I added a cache for the information that has already been parsed, and I resume from the last fully parsed entry of the deepest nested array. However, handling arrays like this (they are sparse at that point) is not efficient, and the state handling makes the implementation more complex.

You can look at my implementation here: https://github.com/NodeRedis/node-redis-parser/blob/master/lib/parser.js#L199-L276. https://github.com/NodeRedis/node-redis-parser/blob/master/lib/parser.js#L516-L522 is one of the entry points to get back to the correct entry to continue parsing.
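
A heavily simplified sketch of that resumable approach (an illustration of the idea only, not the actual node-redis-parser code linked above): a stack of partially filled arrays preserves progress across chunks, so the next chunk only has to supply the missing items.

```js
// Simplified illustration: track every unfinished (possibly nested) array on
// a stack so parsing can resume at the deepest incomplete entry.
class ArrayStack {
  constructor() { this.stack = []; }           // [{ items, remaining }, ...]
  open(count)   { this.stack.push({ items: [], remaining: count }); }
  // Attach a fully parsed value; returns the completed top-level reply,
  // or null while at least one enclosing array is still incomplete.
  add(value) {
    while (this.stack.length > 0) {
      const top = this.stack[this.stack.length - 1];
      top.items.push(value);
      if (--top.remaining > 0) return null;    // array still needs more items
      this.stack.pop();                        // array complete: bubble it up
      value = top.items;
    }
    return value;
  }
}
```

When a chunk ends mid-array, this stack (plus the offset into the current chunk) is all the state needed to continue; without it, every new chunk would force re-parsing the reply from the beginning.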

> tracking the read bytes in the protocol implementation is even harder than just having a loop to process the N items

It was not simple to implement that properly, but without it the parser would be very slow.

> reading-protocol-first parsing-later is not a good idea

That's not really what I meant. It's about being able to pause parsing until enough data has arrived, e.g., when the declared bulk string length is greater than the amount of data received so far. The very same principle could be applied to arrays.
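
For bulk strings that check looks roughly like the following sketch (assuming a Node.js Buffer and a caller that retries on the next chunk when `undefined` is returned); the proposal would make the same cheap early exit possible for arrays:

```js
// Sketch: bail out until the whole payload is buffered. The declared length
// makes this a cheap check; arrays currently offer no equivalent.
function parseBulkString(buffer, offset, length) {
  const end = offset + length + 2;             // payload plus trailing \r\n
  if (end > buffer.length) return undefined;   // not enough data yet: wait
  return buffer.toString('utf8', offset, offset + length);
}
```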
