Events on root propagate pre-encoded value #36

Open
martinheidegger opened this issue Jun 17, 2022 · 9 comments
Labels
documentation (Improvements or additions to documentation), pull request welcome (A pull request is welcome)

Comments

@martinheidegger

Events emitted on the root db carry the original, unencoded value when they are triggered by a put on the root itself, but an already-encoded value when they are triggered by a put on a sublevel.
Is this the correct behavior?

```javascript
const { Level } = require('level')

async function main () {
  const db = new Level('db')
  await db.open()
  db.on('put', (key, value) => {
    // Logged for db.put():  { key: 'hello', value: { world: 'foo' } }
    // Logged for sub.put(): { key: '!sub!hello', value: '[object Object]' }
    console.log({ key, value })
  })
  const sub = db.sublevel('sub')
  await db.put('hello', { world: 'foo' })
  console.log(await db.get('hello'))
  await sub.put('hello', { world: 'foo' })
  console.log(await sub.get('hello'))
}

main()
```

@vweevers
Member

If you were to listen for 'put' on the sublevel instance, you'd get pre-encoded values as well. That part is by design:

Any keys, values and range options in these events are the original arguments passed to the relevant operation that triggered the event, before having encoded them.
https://github.com/Level/abstract-level#events

And because sublevels have their own encodings, they encode data before passing it up to their parent db, which then (also) emits events. So in the above example, sub emits 'put' with an object value, encodes the value with 'utf8', calls db.put(key, '[object Object]', options), which then emits a string value.
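
The propagation described above can be modeled with a small, self-contained toy. This is only a sketch: `ToyDb` and `ToySublevel` are hypothetical classes built on Node's `EventEmitter`, not the abstract-level implementation, and the two encoders are simplified stand-ins for 'utf8' and 'json'.

```javascript
const { EventEmitter } = require('node:events')

// Simplified stand-ins for the 'utf8' and 'json' value encodings.
const encoders = {
  utf8: (value) => String(value),
  json: (value) => JSON.stringify(value)
}

// Hypothetical toy root db; not the real abstract-level class.
class ToyDb extends EventEmitter {
  constructor (valueEncoding = 'utf8') {
    super()
    this.encode = encoders[valueEncoding]
    this.store = new Map()
  }

  put (key, value) {
    // The event carries the argument as passed in, before encoding.
    this.emit('put', key, value)
    this.store.set(key, this.encode(value))
  }

  sublevel (name, valueEncoding = 'utf8') {
    return new ToySublevel(this, name, valueEncoding)
  }
}

// Hypothetical toy sublevel: encodes with its own encoding, then forwards.
class ToySublevel extends EventEmitter {
  constructor (parent, name, valueEncoding) {
    super()
    this.parent = parent
    this.prefix = `!${name}!`
    this.encode = encoders[valueEncoding]
  }

  put (key, value) {
    this.emit('put', key, value)
    // The parent only ever sees the prefixed key and the encoded value.
    this.parent.put(this.prefix + key, this.encode(value))
  }
}

const db = new ToyDb()
const sub = db.sublevel('sub')
const rootEvents = []
const subEvents = []
db.on('put', (key, value) => rootEvents.push({ key, value }))
sub.on('put', (key, value) => subEvents.push({ key, value }))

db.put('hello', { world: 'foo' })
sub.put('hello', { world: 'foo' })
console.log(rootEvents)
console.log(subEvents)
```

In this toy, the root listener sees the object for the direct put and the string `'[object Object]'` under the key `'!sub!hello'` for the sublevel put, while the sublevel listener sees the object.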

What behavior would you like to see?

@martinheidegger
Author

The expected behavior for me would be that no matter where I perform the put operation, the same encoding shows up in the event handler. Probably the more expected case would be to see the encoded value [object Object] in all cases. But following the documentation, I expected the value to be { world: 'foo' } in both cases, which is probably difficult to implement.

@vweevers
Member

The expected behavior for me would be that no matter where I perform the put operation, the same encoding shows up in the event handler.

The problem there is that context matters. If a parent db emits values originating from a sublevel, I want to guarantee that the parent db is able to work with that value (meaning to do additional operations like updating some index). If a sublevel uses 'json' encoding, but the parent db uses 'utf8' encoding, the parent can't work with object values.

@martinheidegger
Author

I understand the conundrum. The problem that I am having is that I would like to process the input the same way no matter if it comes through root or a sublevel 🤔 Maybe it would be best to set the root valueEncoding to 'view' and then use the transcoder to transcode the events?

A short correction, by the way: earlier I mentioned [object Object] as the expected input, but actually I would rather see a Uint8Array binary data, post encoding for both key and value.

@vweevers
Member

vweevers commented Jun 17, 2022

The problem that I am having is that I would like to process the input the same way no matter if it comes through root or a sublevel

Can you explain the use case?

I would rather see a Uint8Array binary data, post encoding for both key and value.

Another challenge here is that we don't necessarily encode to binary data (unlike similar db interfaces like the hypercore modules). On classic-level for example, we prefer passing data to LevelDB as strings when possible, because it's faster. This means that db.put(..., { valueEncoding: 'buffer' }) encodes the value to a buffer (a poor example, because that's a noop), but db.put(..., { valueEncoding: 'json' }) encodes to a string. This might not be a problem if you're consistent with encodings (i.e. don't set them per operation) and if you don't mind forgoing the optimization.
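
Because encoded data may be a string or a buffer depending on the encoding, a consumer that wants one consistent type has to normalize it itself. A minimal sketch, assuming a hypothetical helper named `toUint8` (not part of any Level module) and assuming that string data should be treated as UTF-8 text:

```javascript
// Hypothetical helper, not part of any Level module.
const textEncoder = new TextEncoder()

function toUint8 (value) {
  if (typeof value === 'string') {
    // Assumption: string data is UTF-8 text.
    return textEncoder.encode(value)
  }
  if (ArrayBuffer.isView(value)) {
    // Covers Buffer and other typed arrays; creates a view, copies nothing.
    return new Uint8Array(value.buffer, value.byteOffset, value.byteLength)
  }
  throw new TypeError('Expected a string or binary value')
}

console.log(toUint8('{"world":"foo"}'))
console.log(toUint8(Buffer.from('abc')))
```

This only unifies the type. It cannot recover a value that was never encoded, such as a plain object, which is why the helper throws for anything else.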

@vweevers
Member

On classic-level specifically, we could perhaps make this configurable, so that it'll always encode to the same type (but Uint8Array isn't supported natively yet, so your only choice would be buffers).

@martinheidegger
Author

Can you explain the use case?

I have a single db that contains the state and results of a long-term execution. This includes open tasks to execute. On start I want to continue with the open tasks; during the execution I want to be able to queue additional tasks.
On start I iterate over open tasks. During execution I have an event listener to start further tasks.

At the moment I have settled on using a sublevel db and making sure that this works, but it would be easy to forget about this limitation in future refactorings, which feels dangerous.
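
The workaround described above, doing everything through one sublevel so that iteration and events see the same kind of value, might look roughly like this. It is a sketch only: `resumeAndWatch` and `runTask` are hypothetical names, and `tasks` is assumed to be a sublevel whose `iterator()` is an async iterable of `[key, value]` pairs and which emits `'put'` with the key and value.

```javascript
// Hypothetical helper: `tasks` is assumed to be a sublevel (or any object
// with an async-iterable iterator() and a 'put' event).
async function resumeAndWatch (tasks, runTask) {
  // On start: continue with the tasks that are already stored.
  for await (const [key, value] of tasks.iterator()) {
    await runTask(key, value)
  }

  // During execution: pick up tasks as they are queued on the same sublevel.
  const onPut = (key, value) => {
    Promise.resolve(runTask(key, value)).catch(console.error)
  }
  tasks.on('put', onPut)

  // Calling the returned function stops watching.
  return () => tasks.off('put', onPut)
}
```

Note that this sketch does not handle tasks that are queued while the initial iteration is still running, and it only stays consistent as long as every writer goes through the same sublevel.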

On classic-level for example, we prefer passing data to LevelDB as strings when possible, because it's faster.

Huh, another surprise. 🤯 Well, I guess nothing is as easy as it seems 😅. It seems to me like this may be an aspect worth exposing in a more verbose fashion: on('put-binary', ...) / on('put-string', ...) events that have a consistent type for the event data? Maybe additionally with a warning about the implementation specifics of on('put', ...).

A simple way to resolve this may be to just add a note in the documentation about the behavior when using sublevel APIs, as it was really unexpected to me.

@martinheidegger
Author

After spending a little more time with this, I am also surprised that batch operations don't trigger put events, and I am even wondering if a util wouldn't be helpful:

```javascript
// listen(level, event, normalizedValueType, handler)
const unlisten = listen(db, 'put', 'uint8', (key, value) => {})
```

It would listen to both put and batch events and make sure that, no matter the platform, the value is always of the same type.
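
A rough sketch of what such a util could look like. Everything here is hypothetical: `listen` follows the signature proposed above, the `normalizers` table is an invented detail, and the sketch assumes that event values are already strings or binary data, so it unifies the type but does not encode objects. It also assumes the `'batch'` event passes an array of operations shaped like `{ type, key, value }`.

```javascript
const textEncoder = new TextEncoder()
const textDecoder = new TextDecoder()

// Hypothetical normalizers; values are assumed to be strings or binary.
const normalizers = {
  uint8: (value) => typeof value === 'string'
    ? textEncoder.encode(value)
    : new Uint8Array(value.buffer, value.byteOffset, value.byteLength),
  utf8: (value) => typeof value === 'string'
    ? value
    : textDecoder.decode(value)
}

// listen(level, event, normalizedValueType, handler), as proposed above.
// Only the 'put' event is sketched here.
function listen (db, event, normalizedValueType, handler) {
  if (event !== 'put') throw new Error('This sketch only supports "put"')
  const normalize = normalizers[normalizedValueType]
  if (!normalize) throw new Error(`Unknown value type: ${normalizedValueType}`)

  const onPut = (key, value) => handler(key, normalize(value))
  const onBatch = (operations) => {
    for (const op of operations) {
      // Batches can also contain 'del' operations; skip those.
      if (op.type === 'put') handler(op.key, normalize(op.value))
    }
  }

  db.on('put', onPut)
  db.on('batch', onBatch)

  // Calling the returned function removes both listeners.
  return () => {
    db.off('put', onPut)
    db.off('batch', onBatch)
  }
}
```

Keys are passed through untouched in this sketch; normalizing them would need the same treatment as values.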

@vweevers
Member

It seems to me like this may be an aspect worth exposing in a more verbose fashion

Currently, because the events don't use encoded values, I consider that to be an internal detail. And to some extent, directly using a db that has sublevels (as opposed to only interfacing with the sublevels) implies that you're willing to deal with the internals of sublevels (which includes prefixed keys, as well as this encoding story).

A simple way to resolve this may be to just add a note in the documentation about the behavior when using sublevel APIs, as it was really unexpected to me.

👍

if a util wouldn't be helpful

Let's discuss that in a new issue

@vweevers added the documentation and pull request welcome labels on Jun 17, 2022
@vweevers added this to the Level project on Feb 10, 2023
@vweevers moved this to Backlog in the Level project on Feb 10, 2023