Add async read interface #30
Here are io_uring benchmarks: #61
👋 Thanks for sharing your code! Async support is one of the first things I look for when new persistent trees pop up in the Rust ecosystem, and I'm a bit saddened to find this issue marked wontfix. Now, io_uring as an I/O interface is rarely going to be as fast as using mmap and letting the kernel synchronize memory regions directly, with no buffer management and little to no syscall overhead. But async support would be valuable in itself, in that it would allow redb to integrate cleanly with the rest of the async ecosystem. As it stands, when integrating redb into a larger async application you run the risk of spurious I/O blocking due to page faults, forcing you to reach for coping mechanisms like a blocking thread pool, which comes with its own overhead. Integrating async io_uring support comes with its own challenges, and I'm sympathetic to that. But I'm at least curious whether you could be convinced to reconsider the status of this issue?
For sure, hope you or someone else finds it useful! Ya, I think that's a good argument for reconsidering an async read interface. I'll take another look into this.
We're also interested in this. Not necessarily for the async nature of things, but for the promise of io_uring, given mmap is now optional (and not the default) in redb.

Coincidentally, I've stumbled on an interesting article regarding io_uring and its performance: https://itnext.io/modern-storage-is-plenty-fast-it-is-the-apis-that-are-bad-6a68319fbc1a

That article makes me think the only way to saturate I/O on modern NVMe drives is likely multi-threaded reads. I know next to nothing about these things; just thought I'd ping this issue to see what the current state of affairs is.
notes to self:
Hey @cberner,
The problem with `seek` and `read_seek` is that they take `&mut self`.
Also, they use …
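For context, positional reads sidestep the shared-cursor problem behind that `&mut self` receiver: on Unix, std's `FileExt::read_at` (pread) takes `&self`, so one shared handle can serve concurrent readers. A minimal sketch, with a made-up file path:

```rust
use std::fs::File;
use std::os::unix::fs::FileExt; // adds read_at (pread) to File
use std::sync::Arc;
use std::thread;

fn main() -> std::io::Result<()> {
    // Hypothetical data file; any readable file works.
    let file = Arc::new(File::open("/tmp/data.bin")?);

    // `read_at` takes `&self`, so no lock or fd duplication is needed
    // for the threads to read different offsets concurrently.
    let handles: Vec<_> = (0..4u64)
        .map(|i| {
            let file = Arc::clone(&file);
            thread::spawn(move || {
                let mut buf = vec![0u8; 4096];
                file.read_at(&mut buf, i * 4096)
            })
        })
        .collect();

    for h in handles {
        let n = h.join().expect("reader thread panicked")?;
        println!("read {n} bytes");
    }
    Ok(())
}
```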
In my opinion, the approach in this gist is worth a look: https://gist.github.com/SunDoge/2d361a289b75b7c06607c04e2230add9
When reading multiple files concurrently, io_uring can be even faster. I've implemented a tfrecord reader with io_uring and it performs really well, with 1.1 GiB/s throughput (1 thread) vs 500 MiB/s sync read (4 threads).
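For anyone who wants to experiment, here is a rough sketch of concurrent positional reads with the `tokio-uring` crate. This is not redb's API; the path and sizes are made up, and the crate's surface may have shifted since:

```rust
// Cargo.toml: tokio-uring = "0.5", tokio = { version = "1", features = ["macros"] }
use tokio_uring::fs::File;

fn main() -> std::io::Result<()> {
    // tokio-uring drives a current-thread runtime backed by io_uring.
    tokio_uring::start(async {
        let file = File::open("/tmp/data.bin").await?;

        // Buffers are passed by value: the kernel owns them while the
        // read is in flight, and they come back with the result.
        let read_a = file.read_at(vec![0u8; 4096], 0);
        let read_b = file.read_at(vec![0u8; 4096], 4096);

        // Polling both futures together submits both reads before
        // either completes, so they are serviced concurrently.
        let ((res_a, _buf_a), (res_b, _buf_b)) = tokio::join!(read_a, read_b);
        println!("read {} + {} bytes", res_a?, res_b?);

        file.close().await?;
        Ok(())
    })
}
```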
I wanted to add some color here. ordinals.com performance has been horrible, and we finally figured out why. The issue is that we're using an async web framework, so all of our endpoint functions are async. However, those functions then call into redb, which is not async. When those calls are slow, tokio kind of melts, since you have a bunch of async tasks which aren't yielding. If redb supported an async interface, then we could do everything in async, and it wouldn't be a problem. However, the fix was very simple: we just used `tokio::task::spawn_blocking`.
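A minimal sketch of that fix, assuming a redb 2.x-style API (the table definition and handler are hypothetical):

```rust
use redb::{Database, ReadableTable, TableDefinition};
use std::sync::Arc;

// Hypothetical table, purely for illustration.
const COUNTS: TableDefinition<&str, u64> = TableDefinition::new("counts");

// An async endpoint keeps tokio's worker threads responsive by
// pushing the blocking redb read onto the dedicated blocking pool.
async fn get_count(db: Arc<Database>, key: String) -> Result<Option<u64>, redb::Error> {
    tokio::task::spawn_blocking(move || {
        let txn = db.begin_read()?;
        let table = txn.open_table(COUNTS)?;
        Ok(table.get(key.as_str())?.map(|guard| guard.value()))
    })
    .await
    .expect("blocking task panicked")
}
```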
@casey sounds about right! Have you tried `block_in_place`?
@jeromegn Good suggestion! I didn't know about block_in_place, I'll give that a try. Could this cause issues if there are a bunch of concurrent tasks using `block_in_place`?
@casey the runtime will spawn more threads if the current thread blocks for too long (for some measure of long). For things that block only for up to 1ms, it's probably not worth it to use block_in_place or spawn_blocking. I assume your use case has longer execution times. The main benefit of block_in_place is not having to clone or use only `Send + 'static` types: you can pass references, since it executes the closure on the current thread.
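A small sketch of that difference; note that `block_in_place` requires the multi-threaded runtime (it panics on a current-thread runtime):

```rust
use tokio::task;

// block_in_place runs the closure right here on the current worker
// thread (the runtime shifts queued tasks to other workers), so it
// can borrow non-'static data: no Arc or clone required.
async fn count_nonzero(data: &[u8]) -> usize {
    task::block_in_place(|| data.iter().filter(|&&b| b != 0).count())
}

#[tokio::main(flavor = "multi_thread")]
async fn main() {
    let data = vec![1u8, 0, 2, 0, 3];
    // Works with a plain reference into the caller's stack.
    let n = count_nonzero(&data).await;
    println!("{n} non-zero bytes");
}
```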
Perhaps redb could offer an async-friendly API that wraps the sync API in `spawn_blocking`.
Is there anything redb would do that a separate crate couldn't?
I don't know. Probably not. For a second you got me excited, thinking there might be a redb-tokio crate already, but such a crate seems like a good way to do it.
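For the curious, a rough sketch of what such a wrapper crate could expose; every name here is hypothetical:

```rust
use redb::Database;
use std::sync::Arc;

/// Hypothetical handle a `redb-tokio`-style crate could provide.
#[derive(Clone)]
pub struct AsyncDatabase {
    inner: Arc<Database>,
}

impl AsyncDatabase {
    pub fn new(db: Database) -> Self {
        Self { inner: Arc::new(db) }
    }

    /// Run any synchronous redb operation on tokio's blocking pool.
    pub async fn with<F, R>(&self, f: F) -> R
    where
        F: FnOnce(&Database) -> R + Send + 'static,
        R: Send + 'static,
    {
        let db = Arc::clone(&self.inner);
        tokio::task::spawn_blocking(move || f(&db))
            .await
            .expect("blocking task panicked")
    }
}

// Usage: let result = db.with(|db| { /* sync redb calls */ }).await;
```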
Blocked on: