Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add Configurable Page Size Policy to MmapRegion #127

Open
wants to merge 3 commits into
base: main
Choose a base branch
from

Conversation

EmeraldShift
Copy link

@EmeraldShift EmeraldShift commented Dec 12, 2020

This PR adds a PageSizePolicy enum which can be used to configure the behavior of MmapRegion instances upon creation. The current options are:

  • BasePages, which uses the default mmap behavior
  • TransparentHugepages, which uses THP via madvise
  • ExplicitHugepages, which passes MAP_HUGETLB to mmap

The page size policy is at a per-mapping granularity, so a given VM can have different policies for different mappings, if desired. A helper function, from_ranges_with_policy, allows the VMM to configure mapping policies when constructing mappings from address ranges.

The behavior is stubbed out on the mmap_windows.rs implementation, but the arguments are received there to allow the file to compile.

The intention of this PR is to implement the functionality required for hugepage support in vm-memory (for #118), but without invasively modifying the external API, pending further discussion. This is mostly accomplished by adding extra constructors/functions which take a PageSizePolicy as an argument, but leaving existing functions alone.

These changes were experimentally tested for Firecracker by modifying Firecracker's local fork of vm-memory to include similar changes. Empirically, we were able to boot an Alpine microvm up to ~37% faster, and an Ubuntu microvm up to ~30% faster, reduce on-boot page faults by ~70×, all with negligible extra memory overhead. We would like to implement these changes here, so that other VMMs derived from vm-memory can hopefully reap the same benefit.

A few design questions we're not too sure about:

  1. What is the correct way to expose this configuration to VMMs? The way we've chosen is to add a special from_ranges function in mmap.rs that accepts a policy per tuple. However, as of recently, Firecracker is using a slightly modified local fork of vm-memory to implement dirty bit tracking, in addition to this repo. The local fork complicates how (and where) classes are used/defined (for example, it overrides mmap.rs), making it difficult to utilize these changes downstream. We're not sure how other downstreams of this repo use the APIs, either. Advice to improve the API is much appreciated.

  2. At which level of abstraction should this policy belong? We decided to place the enum in mmap.rs since paging policy is closely related to mmapped regions, and not so closely related to GuestMemory in general. However, by attaching a policy to all MmapRegions, we're requiring a Windows implementation (which, for now, is just a no-op). Would it make sense to implement an ExplicitHugepages policy for Windows, if the OS provides that kind of control?

Signed-off-by: Michael Jarrett <[email protected]>
@alexandruag
Copy link
Collaborator

Thanks for the PR and sorry for the lack of replies so far. I'll start having a better look sometime this week and will return with more comments.

@alexandruag
Copy link
Collaborator

Hi, after having a better look at this and thinking some more about it, it looks like a good opportunity to introduce a more forward looking and extensible Options (actual name TBD) abstraction that we'll be able to use as additional settings appear in the future. For example, how about introducing a struct (let's just call it Options for now) that holds both file_offset and page_size_policy (all the current non-mandatory memory region parameters, instead of just the page size policy). Then, we change the bounds of A from GuestMemoryMmap::from_ranges_with_options to be A: Borrow<(GuestAddress, usize, Options)>.

It now becomes simpler to implement more configuration possibilities by adding more fields to Options. We can further avoid breaking changes by, for example, keeping the current signature of build from mmap_unix and similarly introducing a new method that is more configurable. Not sure precisely what its signature should be so we can harmonize it with the existing build and build_raw with minimal additions, but we can def find something if we spend a bit more time here. What do you think about the high level approach?

@EmeraldShift
Copy link
Author

I like the high level idea of a compact Options struct to consolidate optional arguments when mapping guest memory. The approach feels much more easily extensible than having to add new functions with new options. A few thoughts:

  1. Backward-compatibility of the interface <=> backward-compatibility of the Options struct. Renaming fields will break user code, and users will have to be careful to instantiate options as Options { .file_offset = ..., ..Options::default() }; or some similar construct to ensure that adding new fields to Options does not break existing user code. Maybe we could write a macro constructor with default arguments.

  2. Questions arise about how high/low in the interface these Options should live. An mmap_unix-specific set of options would allow a lot more customizability to guest mappings on *nix hosts, likely similar for Windows. A generic mmap set of options would need to either miss some OS-specific mapping feature, or contain options for them anyway, bloating the struct. A similar story occurs between mmap and guest_memory. For example, Firecracker seems to exclusively use GuestMemoryMmap, ignoring the higher-level abstractions. I think Options' most likely home would be at the mmap level, containing the intersection of useful features from the different OSes (or maybe an OS-specific option when it's really useful).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants