Our company maintains a custom database that stores up to 4 terabytes of data in memory. When the application is deployed, the data needs to be dumped to disk and later reloaded, and as you can imagine, loading 4 TB from disk takes some time. My goal is to make this as fast as possible.
The data is already categorised internally, so my thought was to dump each category's worth of data into a separate file, which could then be loaded (even separately, if desired).
Looking into this documentation already gave me some ideas on how to go about it:
Use liburing where possible. Create simple IORING_OP_READV and IORING_OP_WRITEV operations for loading and storing files, prepared with the io_uring_prep_readv() and io_uring_prep_writev() functions respectively (as I understand it, io_uring_prep_read() and io_uring_prep_write() map to the single-buffer IORING_OP_READ/IORING_OP_WRITE opcodes instead).
Have n threads, one for each file to be read, which poll the completion queue after submitting to the submission queue. Polling for completions apparently saves me some syscalls, but here I don't really understand how a syscall is avoided, since the polling call io_uring_enter() seems to be a syscall itself. From the documentation:
Instead the application must actively find and reap these events by calling io_uring_enter(2) with IORING_ENTER_GETEVENTS set and min_complete set to the desired number of events.
Could someone clarify how this is meant to work?
Since the set of files to load is known beforehand, I think I can also use the io_uring_register() syscall to preregister them and squeeze out some further performance. However, I don't completely understand how the interplay between liburing's higher-level API and the lower-level interface would play out. In particular, I have trouble understanding this part:
Once io_uring_register(2) completes successfully for a file-set registration, the application can use these files by assigning the index of the file descriptor in the array (instead of the actual file descriptor) to the sqe->fd field, and marking it as a file-set fd by setting IOSQE_FIXED_FILE in the sqe->flags field.
I guess I just don't have a good enough grasp of what this array actually is and how the kernel views it conceptually. It would be helpful if someone could elaborate on this a bit.
For testing purposes some of our devs will run the application on their own devices, some of which are Macs. I suppose they wouldn't be able to compile the binary, since the io_uring system calls aren't supported there, is that true? If so, my colleague made a reference implementation that uses the POSIX read and write calls; how could I use #ifdef to conditionally compile for the availability of these features and fall back to the standard interface where they're not available?
Our HPC cluster currently runs kernel version 5.14. Which of the above features can I use? Are there restrictions that apply, or extensions that aren't covered in the documentation above?
General feedback on my approach would be greatly appreciated! It's really an exciting new interface that I'd like to see in production in our HPC cluster :)