Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fold index_mm_node and index_node_f abstractions #161

Open
evelikov opened this issue Sep 29, 2024 · 6 comments
Open

Fold index_mm_node and index_node_f abstractions #161

evelikov opened this issue Sep 29, 2024 · 6 comments
Labels
enhancement New feature or request question Further information is requested

Comments

@evelikov
Copy link
Collaborator

Currently we have two nearly identical abstractions - index_mm_node and index_node_f. The difference is that one uses mmap'd file, while the other FILE*.

AFAICT they both operate on the exact same files with the mmap variant used exclusively (I think) for the kmod internal resources API.

Off the top of my head, we can exclusively use index_mm_node and remove the rest. This means @stoeckmann doesn't need to audit, bugfix and/or rewrite the file variants all together.

@lucasdemarchi do we need both abstractions, what am I missing?

@evelikov evelikov added enhancement New feature or request question Further information is requested labels Sep 29, 2024
@stoeckmann
Copy link
Contributor

The memory mapping is faster, although the comparison itself is a bit unfair because the FILE version lacks some optimizations which have been made for the memory mapped version.

I just want to point out here that memory mapping itself is rather unsafe. It can be done if it's guaranteed 100 % that the file content won't change and that the file size won't change. If the former does, it is undefined behavior what happens if the memory is accessed again, and the latter might lead to SIGBUS if file size is reduced.

Talking about files in /lib/modules, I would say that both can be almost guaranteed. At least if kmod is exclusively used, files are replaced and not modified. But this depends on the package manager of a distribution as well.

But having a different base directory through -b or -d option (depending on tool), it's a different story. If a user runs these tools and another can modify the files in question, the previously mentioned issues can occur.

So, I'm not against the mm version, as long as it's used in a well defined environment. It would benefit all tools (except modprobe) to have a faster FILE version.

Talking about it: You can see performance differences already if you fmemopen the mm area and perform the rest of the FILE operations. It's only a rather tiny fraction occurring through the FILE overhead. Especially because the mm version lacks boundary checks. Adding them might make the diff smaller (and mm safer, but see above when it comes to safety/security).

@evelikov
Copy link
Collaborator Author

Just to make it clear: the goal here it not about performance, but to highlight that we're operating on the exact same files via different abstraction models. Which means that apart from code duplication, we inherently get two semi-different sets of bugs.

Which one we pick depends on the target and potential attack vectors we consider relevant. In either case, adding some flock/lockf/fcntl might be a good idea IMHO.

Fwiw the distribution package managers are a non-issue... For the rest I'm not sure bth.

@stoeckmann
Copy link
Contributor

I fully agree: One solution would be awesome. If the FILE solution for files could become a "one size fits all" solution, I would be totally in for it. So, you already know my preference. ;)

Right now I try to inspect a way to do exactly that: Get rid of memory mapped algorithm which only makes sense from a performance point of view, which I highlighted the pros and cons of this one. So I try to improve FILE performance.

@lucasdemarchi
Copy link
Contributor

Just to make it clear: the goal here it not about performance,

The goal for having the mmap variant is about performance. And it's the one "backend" mostly used during boot by users of libkmod. For a one-off lookup with modprobe it doesn't matter much (and it's likely worse with mmap). It does matter when you are actually using the index for hundreds of queries. I don't see we replacing that any time soon by the FILE variant.... unless we parse the index and load it in memory in a hash table or copy the trie structure to memory. But I don't see a pressing need for it.

@evelikov
Copy link
Collaborator Author

Thanks for the background. So any work should benchmark both boot/libkmod users and modprobe.

As you said, it's not pressing but let's leave this task open for a rainy day.

@lucasdemarchi
Copy link
Contributor

libkmod is the most important one:

a/

  1. create a context
  2. load resources -> causes the mm backend to be used
  3. 2 thousand random lookups
  4. close context

b/

  1. create a context
  2. 2 thousand random lookups
  3. close context

If you need to load resources to transfer it to a separate in-memory structure then it may not be that compelling.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request question Further information is requested
Projects
None yet
Development

No branches or pull requests

3 participants