[pull] main from firecracker-microvm:main #282

Open: wants to merge 2,362 commits into main

@pull pull bot commented May 11, 2023

See Commits and Changes for more details.


Created by pull[bot]

Can you help keep this open source service alive? 💖 Please sponsor : )

roypat and others added 28 commits September 25, 2024 15:58
Add a benchmark that attempts to capture the latency of a single page
fault in a guest memory configuration that we also use in production.
This should catch regressions such as the 1.6 memfd/vhost post
snapshot-restore latency regression, and does indeed do so:

MAP_ANONYMOUS:
     Running benches/memory_access.rs
page_fault    time:   [2.8611 µs 2.9327 µs 3.0170 µs]
Found 9 outliers among 100 measurements (9.00%)
  4 (4.00%) high mild
  5 (5.00%) high severe

memfd_create:
     Running benches/memory_access.rs
page_fault    time:   [5.7449 µs 5.8779 µs 6.0450 µs]
              change: [+85.649% +92.231% +98.074%] (p = 0.00 < 0.05)
              Performance has regressed.
Found 8 outliers among 100 measurements (8.00%)
  5 (5.00%) low mild
  2 (2.00%) high mild
  1 (1.00%) high severe

Add the same benchmark for huge pages, just in case. Funnily enough,
there is no vhost-tax to be paid for huge pages.
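
As a rough illustration of what is being measured (this is not the Criterion benchmark in benches/memory_access.rs), the Python sketch below times the first write to each page of a fresh anonymous mapping. The function name and page count are made up for this example, and interpreter overhead is included in the result, so the figure is not comparable to the numbers above.

```python
import mmap
import time


def first_touch_latency_ns(num_pages=4096):
    """Average time (ns) of the first write to each page of a fresh mapping.

    Hypothetical helper for illustration only; includes Python loop overhead.
    """
    page_size = mmap.PAGESIZE
    # A fresh private anonymous mapping: no page is backed until first touch.
    region = mmap.mmap(-1, num_pages * page_size,
                       flags=mmap.MAP_PRIVATE | mmap.MAP_ANONYMOUS)
    try:
        start = time.perf_counter_ns()
        for offset in range(0, num_pages * page_size, page_size):
            region[offset] = 1  # first write to the page triggers a page fault
        elapsed = time.perf_counter_ns() - start
    finally:
        region.close()
    return elapsed / num_pages
```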

Signed-off-by: Patrick Roy <[email protected]>
We're observing performance instability on m6i/5.10 which is causing A/B
performance test failures. To suppress false positives, pin AMI to the
last known good one only for m6i/5.10.

Signed-off-by: Takahiro Itazuri <[email protected]>
Bumps the firecracker group with 12 updates:

| Package | From | To |
| --- | --- | --- |
| [syn](https://github.com/dtolnay/syn) | `2.0.77` | `2.0.79` |
| [libc](https://github.com/rust-lang/libc) | `0.2.158` | `0.2.159` |
| [cargo_toml](https://gitlab.com/lib.rs/cargo_toml) | `0.20.4` | `0.20.5` |
| [regex](https://github.com/rust-lang/regex) | `1.10.6` | `1.11.0` |
| [autocfg](https://github.com/cuviper/autocfg) | `1.3.0` | `1.4.0` |
| [cc](https://github.com/rust-lang/cc-rs) | `1.1.21` | `1.1.23` |
| [once_cell](https://github.com/matklad/once_cell) | `1.19.0` | `1.20.1` |
| [regex-automata](https://github.com/rust-lang/regex) | `0.4.7` | `0.4.8` |
| [regex-syntax](https://github.com/rust-lang/regex) | `0.8.4` | `0.8.5` |
| [serde_spanned](https://github.com/toml-rs/toml) | `0.6.7` | `0.6.8` |
| [toml_edit](https://github.com/toml-rs/toml) | `0.22.21` | `0.22.22` |
| [winnow](https://github.com/winnow-rs/winnow) | `0.6.18` | `0.6.20` |


Updates `syn` from 2.0.77 to 2.0.79
- [Release notes](https://github.com/dtolnay/syn/releases)
- [Commits](dtolnay/syn@2.0.77...2.0.79)

Updates `libc` from 0.2.158 to 0.2.159
- [Release notes](https://github.com/rust-lang/libc/releases)
- [Changelog](https://github.com/rust-lang/libc/blob/0.2.159/CHANGELOG.md)
- [Commits](rust-lang/libc@0.2.158...0.2.159)

Updates `cargo_toml` from 0.20.4 to 0.20.5
- [Commits](https://gitlab.com/lib.rs/cargo_toml/compare/v0.20.4...v0.20.5)

Updates `regex` from 1.10.6 to 1.11.0
- [Release notes](https://github.com/rust-lang/regex/releases)
- [Changelog](https://github.com/rust-lang/regex/blob/master/CHANGELOG.md)
- [Commits](rust-lang/regex@1.10.6...1.11.0)

Updates `autocfg` from 1.3.0 to 1.4.0
- [Commits](cuviper/autocfg@1.3.0...1.4.0)

Updates `cc` from 1.1.21 to 1.1.23
- [Release notes](https://github.com/rust-lang/cc-rs/releases)
- [Changelog](https://github.com/rust-lang/cc-rs/blob/main/CHANGELOG.md)
- [Commits](rust-lang/cc-rs@cc-v1.1.21...cc-v1.1.23)

Updates `once_cell` from 1.19.0 to 1.20.1
- [Changelog](https://github.com/matklad/once_cell/blob/master/CHANGELOG.md)
- [Commits](matklad/once_cell@v1.19.0...v1.20.1)

Updates `regex-automata` from 0.4.7 to 0.4.8
- [Release notes](https://github.com/rust-lang/regex/releases)
- [Changelog](https://github.com/rust-lang/regex/blob/master/CHANGELOG.md)
- [Commits](https://github.com/rust-lang/regex/commits)

Updates `regex-syntax` from 0.8.4 to 0.8.5
- [Release notes](https://github.com/rust-lang/regex/releases)
- [Changelog](https://github.com/rust-lang/regex/blob/master/CHANGELOG.md)
- [Commits](https://github.com/rust-lang/regex/commits)

Updates `serde_spanned` from 0.6.7 to 0.6.8
- [Commits](toml-rs/toml@serde_spanned-v0.6.7...serde_spanned-v0.6.8)

Updates `toml_edit` from 0.22.21 to 0.22.22
- [Commits](toml-rs/toml@v0.22.21...v0.22.22)

Updates `winnow` from 0.6.18 to 0.6.20
- [Changelog](https://github.com/winnow-rs/winnow/blob/main/CHANGELOG.md)
- [Commits](winnow-rs/winnow@v0.6.18...v0.6.20)

---
updated-dependencies:
- dependency-name: syn
  dependency-type: direct:production
  update-type: version-update:semver-patch
  dependency-group: firecracker
- dependency-name: libc
  dependency-type: direct:production
  update-type: version-update:semver-patch
  dependency-group: firecracker
- dependency-name: cargo_toml
  dependency-type: direct:production
  update-type: version-update:semver-patch
  dependency-group: firecracker
- dependency-name: regex
  dependency-type: direct:production
  update-type: version-update:semver-minor
  dependency-group: firecracker
- dependency-name: autocfg
  dependency-type: indirect
  update-type: version-update:semver-minor
  dependency-group: firecracker
- dependency-name: cc
  dependency-type: indirect
  update-type: version-update:semver-patch
  dependency-group: firecracker
- dependency-name: once_cell
  dependency-type: indirect
  update-type: version-update:semver-minor
  dependency-group: firecracker
- dependency-name: regex-automata
  dependency-type: indirect
  update-type: version-update:semver-patch
  dependency-group: firecracker
- dependency-name: regex-syntax
  dependency-type: indirect
  update-type: version-update:semver-patch
  dependency-group: firecracker
- dependency-name: serde_spanned
  dependency-type: indirect
  update-type: version-update:semver-patch
  dependency-group: firecracker
- dependency-name: toml_edit
  dependency-type: indirect
  update-type: version-update:semver-patch
  dependency-group: firecracker
- dependency-name: winnow
  dependency-type: indirect
  update-type: version-update:semver-patch
  dependency-group: firecracker
...

Signed-off-by: dependabot[bot] <[email protected]>
Tap offload features configuration was moved from the device creation
time to the device activation time by the following commit:

commit 1e5d3db
Author: Nikita Zakirov <[email protected]>
Date:   Fri Jan 19 15:48:21 2024 +0000

    fix(net): Apply only supported TAP offloading features

Since device activation code is only called on the boot path, the
features were not automatically configured on the restore path.
This change configures them on the restore path as well.

The change does not include a unit test as we do not have a mockable
interface for the tap device.
The change does not include an integration test as we have not yet found
a way to reproduce the issue using the existing test framework.

Signed-off-by: Nikita Kalyazin <[email protected]>
Update maintainer and codeowner lists.

Signed-off-by: Nikita Kalyazin <[email protected]>
Sort maintainer list.

Signed-off-by: Nikita Kalyazin <[email protected]>
The benchmark tries to write to guest physical address 0, but on ARM
guest memory doesn't start at 0, so when resolving the address we
panicked.

Fix this by instead simply using whatever address guest memory starts
with.

Signed-off-by: Patrick Roy <[email protected]>
Now that we have more benchmarks, we started hitting the 600s pytest timeout:
https://buildkite.com/firecracker/firecracker-pr-optional/builds/9105.
Increase it to 900s.

Signed-off-by: Patrick Roy <[email protected]>
These tests do not assert anything, and are only used for us to collect
metrics. As such, it suffices to run them once at night (especially
because we do not ever actually look at the metrics emitted by the PR
runs).

Signed-off-by: Patrick Roy <[email protected]>
Mention v1.9.1 as the latest patch in the release status.

Signed-off-by: Nikita Kalyazin <[email protected]>
Add a ring buffer type that is tailored for holding `struct iovec`
objects that point to guest memory for IO. The `struct iovec` objects
represent the memory that the guest passed to us as `Descriptors` in a
VirtIO queue for performing some I/O operation.

We plan to use this type to describe the guest memory we have available
for doing network RX. This should help us optimize the reception of
data from the TAP device using `readv`, thus avoiding a memory copy.

Co-authored-by: Egor Lazarchuk <[email protected]>
Signed-off-by: Babis Chalios <[email protected]>
Allow IoVecBufferMut objects to store multiple DescriptorChain objects,
so that we can describe guest memory meant to be used for receiving data
(for example memory used for network RX) as a single (sparse) memory
region.

This will allow us to always keep track of all the memory we have
available for performing RX and to use `readv` to copy data from the TAP
device directly into guest memory, avoiding the extra copy. In the future, it will also
facilitate the implementation of mergeable buffers for the RX path of
the network device.

Co-authored-by: Egor Lazarchuk <[email protected]>
Signed-off-by: Babis Chalios <[email protected]>
Right now, we are performing two copies for writing a frame from the TAP
device into guest memory. We first read the frame into an array held by
the Net device and then copy that array into a DescriptorChain.

In order to avoid the double copy, use the `readv` system call to read
directly from the TAP device into the buffers described by
DescriptorChain.

The main challenge with this is that DescriptorChain objects describe
memory that is at least 65562 bytes long when guest TSO4, TSO6 or UFO
are enabled, or 1526 bytes otherwise, and parsing the chain incurs
overhead which we pay even if the frame we are receiving is much smaller
than these sizes.

PR #4748 reduced
the overheads involved with parsing DescriptorChain objects. To further
avoid this overhead, move the parsing of DescriptorChain objects out of
the hot path of process_rx() where we are actually receiving a frame
into process_rx_queue_event() where we get the notification that the
guest added new buffers for network RX.
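
The scatter read itself can be sketched in a few lines of Python using `os.readv`. This is only an illustration of the system call, not the Rust code in the net device: a pipe stands in for the TAP device, two `bytearray` objects stand in for guest buffers, and the function names are hypothetical.

```python
import os


def read_frame_scattered(fd, buffers):
    """Read one chunk of data from fd directly into a list of writable buffers."""
    return os.readv(fd, buffers)


def demo():
    # A pipe stands in for the TAP device in this sketch.
    read_end, write_end = os.pipe()
    try:
        os.write(write_end, b"0123456789")
        # Two separate buffers stand in for the guest-provided descriptors.
        first, second = bytearray(4), bytearray(16)
        count = read_frame_scattered(read_end, [first, second])
        return count, bytes(first), bytes(second[: count - 4])
    finally:
        os.close(read_end)
        os.close(write_end)
```

The data lands in the buffers in order, with no intermediate array to copy from.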

Signed-off-by: Babis Chalios <[email protected]>
Now that we pre-process the buffers that the guest provides for
performing RX, we need to save them in the VM state snapshot file for
networking to work correctly post snapshot resume.

Implement Persist for RxBuffers and plug them into the
(de)serialization logic of the network device.

Co-authored-by: Egor Lazarchuk <[email protected]>
Signed-off-by: Babis Chalios <[email protected]>
The IoVecBufferMut type now uses IovDeque as its backing memory. IovDeque
performs a custom memory allocation, using memfd_create() and a
combination of mmap() calls in order to provide a memory layout where
the iovec objects stored in the IovDeque will always be in consecutive
memory.

kani doesn't really get along with these system calls, which breaks our
proof for IoVecBufferMut::write_volatile_at. Substitute memory
allocation and deallocation with plain calls to std::alloc::(de)alloc
when we run kani proofs. Also provide a stub for IovDeque::push_back to
provide the same memory layout invariants.

Signed-off-by: Babis Chalios <[email protected]>
This is a regression test for the following fix:

commit a9e5f13
Author: Nikita Kalyazin <[email protected]>
Date:   Mon Sep 30 13:05:28 2024 +0000

    fix(net): set tap offload features on restore

The test verifies that tap offload features are configured for both
booted and restored VMs.

Signed-off-by: Nikita Kalyazin <[email protected]>
This updates firecracker.yaml to reflect the implementation.

cpu-config field was added by:

commit 9da9619
Author: Takahiro Itazuri <[email protected]>
Date:   Wed Apr 26 13:55:21 2023 +0000

    feat: Add "cpu-config" to config file

entropy field was added by:

commit 250036d
Author: Babis Chalios <[email protected]>
Date:   Wed Nov 23 09:52:28 2022 +0000

    feat(entropy): plug in with API

Signed-off-by: Nikita Kalyazin <[email protected]>
This reverts commit bc0ba43.

Signed-off-by: Babis Chalios <[email protected]>
This reverts commit 667aba4.

Signed-off-by: Babis Chalios <[email protected]>
… objects"

This reverts commit 14e6e33.

Signed-off-by: Babis Chalios <[email protected]>
Add rx_packet and tx_packet members to the Vsock
type and reuse them for RX and TX processing. This removes the
overhead of creating new packets for each request.

As an additional improvement, we split VsockPacket
type into 2 types: VsockPacketRx and VsockPacketTx
which allows us to separate logic for RX and TX
processing.

Signed-off-by: Egor Lazarchuk <[email protected]>
As we don't create IoVecBuffer(Mut) types
at runtime, but reuse existing ones in both
virtio-net and virtio-vsock, we don't need to
use SmallVec type anymore.
With this we remove the type alias.

Signed-off-by: Egor Lazarchuk <[email protected]>
It doesn't end up much shorter but at least it is more readable.

Signed-off-by: Pablo Barbáchano <[email protected]>
Rewrite the only test that needs pandas.

pandas is a heavy dependency that weighs in around 50MB. It also
recently started printing a warning that in the future it will require
pyarrow, which is another 40 MB.

Pandas is great, however our usage of it is minimal.

Size of devctr does not change much, but we avoid the warning and will
not have to deal with pyarrow in the future.

Example output:
```
MSR removed 0x13 before=0x0
MSR changed 0x179 before=0x20ffff after=0x20
MSR added 0x17 after=0x0
MSR added 0x11 after=0x25ba008
```
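
A dictionary-based comparison is enough to produce output in this shape. The sketch below is an assumption about how such a diff could be written without pandas, not the test's actual code; `diff_msrs` and its input format (`{msr_address: value}`) are hypothetical.

```python
def diff_msrs(before, after):
    """Describe the differences between two {msr_address: value} dicts."""
    lines = []
    for addr in sorted(before.keys() - after.keys()):
        lines.append(f"MSR removed {addr:#x} before={before[addr]:#x}")
    for addr in sorted(before.keys() & after.keys()):
        if before[addr] != after[addr]:
            lines.append(
                f"MSR changed {addr:#x} "
                f"before={before[addr]:#x} after={after[addr]:#x}"
            )
    for addr in sorted(after.keys() - before.keys()):
        lines.append(f"MSR added {addr:#x} after={after[addr]:#x}")
    return lines
```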

Signed-off-by: Pablo Barbáchano <[email protected]>
Capture stdout in those tests. There is no reason why we should run
those tests differently than the others.

Also add worksteal option to xdist.

Signed-off-by: Pablo Barbáchano <[email protected]>
This was reverted in 157b739, but since
then we have added more Rust toolchains, so it adds up. Removing reduces
the image size quite a bit:

Uncompressed: 4.57 GiB -> 3.31 GiB (-28%)
Compressed: 1213 MiB -> 1058 MiB (-13%)

Signed-off-by: Pablo Barbáchano <[email protected]>
pb8o and others added 30 commits December 11, 2024 19:34
seccompiler --basic filters are deprecated

Signed-off-by: Pablo Barbáchano <[email protected]>
Add a test to validate that a seccomp filter works as defined in the
JSON description.

To do this we use a simple C program that just loads a given seccomp
filter and calls a syscall also given in the arguments.

Signed-off-by: Pablo Barbáchano <[email protected]>
My python version does not accept the presence of double quotes inside a
double-quoted f-string. Use single quotes instead.
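
For illustration (the dictionary and message are made up), the portable form looks like this. Reusing the outer quote character inside the braces only parses on Python 3.12 and newer, where PEP 701 relaxed the f-string grammar.

```python
config = {"name": "microvm"}

# Portable across Python versions: inner quotes differ from the outer ones.
message = f"booting {config['name']}"

# Only parses on Python 3.12 and newer (PEP 701):
#   message = f"booting {config["name"]}"
```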

Signed-off-by: Patrick Roy <[email protected]>
If a command times out in utils.run_cmd, we kill the subprocess and try
to get whatever partial output it wrote so far to generate an error
message. We do this using `communicate`, which waits for stdout/stderr
to close.

Some processes pass their stdout and stderr to their children. In this
case, killing the parent won't close the stdout/stderr, and so the
proc.communicate() call in run_cmd will wait indefinitely (or until
whatever is holding on to the other end of the pipe exits). Avoid this
by explicitly closing our end of the stdout/stderr pipes.
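
A minimal sketch of the pattern, assuming a simplified `run_cmd` that returns a tuple rather than the real helper's result type:

```python
import subprocess


def run_cmd(cmd, timeout):
    """Run cmd; on timeout, kill it and return whatever output was captured.

    Returns (timed_out, stdout, stderr) with the outputs as bytes.
    Simplified stand-in for utils.run_cmd, for illustration only.
    """
    proc = subprocess.Popen(cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    try:
        stdout, stderr = proc.communicate(timeout=timeout)
        return False, stdout, stderr
    except subprocess.TimeoutExpired:
        proc.kill()
        # A child of cmd may still hold the write ends of these pipes, so
        # close our read ends; otherwise communicate() would block on them.
        proc.stdout.close()
        proc.stderr.close()
        # Returns the partial output that was buffered before the timeout.
        stdout, stderr = proc.communicate()
        return True, stdout, stderr
```

With a command such as `sh -c 'sleep 20 & echo started; wait'`, the backgrounded `sleep` inherits the pipes, which is exactly the case where the second `communicate()` would otherwise hang.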

Signed-off-by: Patrick Roy <[email protected]>
We asserted twice that the guest is not responsive anymore after
pausing, and zero times that it is responsive again after resuming. Fix
this by doing each once.

Signed-off-by: Patrick Roy <[email protected]>
It's a fairly tight timeout, and I was hitting it quite frequently while
running functional tests with high parallelism. I don't really see any
reason why we should be concerned about this specific command simply
hitting the pytest timeout when things go wrong, so let's just remove
the timeout.

Signed-off-by: Patrick Roy <[email protected]>
Opportunistically set a 100s timeout. We don't have anything that should
run this long, and by manually setting a timeout we avoid hitting the
pytest timeout of 300s in case something _does_ go wrong (and hitting
the pytest timeout gives significantly less debug information, so a
manual timeout is preferred).

Signed-off-by: Patrick Roy <[email protected]>
Remove the logging from this function, because SSHConnection._exec
already does this.

Signed-off-by: Patrick Roy <[email protected]>
Instead of creating new SSH connections every time we want to run a
command inside the microvm, open a single daemonized ssh connection in
the constructor of `SSHConnection`, and reuse it until we kill the
microvm.

Realize this using openssh's ControlMaster/ControlPersist functionality,
which allows us to persist the first connection opened to a microvm and
multiplex all subsequent commands over this one connection.
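
As a sketch of what such an invocation could look like, the helper below builds an `ssh` argument list with the relevant OpenSSH options. The function name, the default user, and the exact option values are assumptions for illustration; the options actually passed by `SSHConnection` may differ.

```python
def ssh_base_command(host, control_path, user="root"):
    """Build an ssh argv that shares one persistent master connection."""
    return [
        "ssh",
        "-o", "ControlMaster=auto",           # first invocation becomes the master
        "-o", "ControlPersist=yes",           # master stays alive in the background
        "-o", f"ControlPath={control_path}",  # socket used to multiplex sessions
        f"{user}@{host}",
    ]
```

Every later command that uses the same `ControlPath` is multiplexed over the existing connection instead of opening a new one.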

We need some slight changes in test_pause_restore.py and test_net.py. In
the former, since we no longer reconnect to the VM on every ssh command,
we will now observe a timeout instead of a connection failure. For the
latter, it turns out that @lru_cache treats .ssh_iface() and
.ssh_iface(0) as distinct invocations, despite the iface_idx argument of
ssh_iface defaulting to 0 (meaning they are actually the same function
call). This caused a re-initialization of the SSHConnection object, which
then triggered the assertion in _init_connection about the socket file
not existing.
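
The `lru_cache` behaviour is easy to reproduce in isolation. In this sketch `ssh_iface` is a free function returning a placeholder object, and `ssh_iface_normalized` shows one possible way to avoid the duplicate entry; neither is claimed to match the actual change.

```python
from functools import lru_cache

created = []


@lru_cache(maxsize=None)
def ssh_iface(iface_idx=0):
    """Stand-in for the cached method; records every real construction."""
    conn = object()
    created.append(conn)
    return conn


def ssh_iface_normalized(iface_idx=0):
    """Always pass the index explicitly so both spellings share a cache key."""
    return ssh_iface(iface_idx)
```

The cache key is built from the arguments as written at the call site, so default values are not filled in before the lookup.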

Signed-off-by: Patrick Roy <[email protected]>
With the ControlMaster/ControlPersist approach, these are not needed
anymore. I'm not reverting the commit that added them because that also
contained an update of poetry.lock. I'm also not rebuilding the devctr
for this because the dependencies will simply be dropped the next time
we're rebuilding it.

Signed-off-by: Patrick Roy <[email protected]>
Building the A/B binaries happens in a tmpdir, which we try to delete at
the end. But we create files in the tmpdir from a privileged docker
container, so to delete this tmpdir we also need elevated privileges.

Signed-off-by: Patrick Roy <[email protected]>
These run in a nightly pipeline and apparently the read/write MSR
scripts take longer than the 100s default timeout added in b99abe1.

Fixes: 36448e9 ("test: set default timeout of 100s for ssh commands")
Signed-off-by: Patrick Roy <[email protected]>
Since these repeat commit titles, they can get longer than the
configured body line length. So just ignore them.

Signed-off-by: Patrick Roy <[email protected]>
In addition do a few more cleanups to the test:

- Use `atol` to read longs

- musl makes an ioctl that is not permitted in seccomp. We didn't use
  the return code anyway, so remove it.

Signed-off-by: Pablo Barbáchano <[email protected]>
Use a try-finally block inside check_guest_connection to make sure that
socat gets killed even if something inside check_guest_connection fails
and raises an exception.
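
The shape of the fix, sketched with a `sleep` process standing in for socat and a hypothetical `check` callback in place of the real connection checks:

```python
import subprocess


def check_with_helper(check, helper_cmd=("sleep", "60")):
    """Run check(helper) and always kill the helper process afterwards."""
    helper = subprocess.Popen(list(helper_cmd))  # stand-in for socat
    try:
        check(helper)
    finally:
        # Runs whether check() returned normally or raised.
        helper.kill()
        helper.wait()
```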

Signed-off-by: Patrick Roy <[email protected]>
Now that the assertion on the `run` return code is paired with the .run
call itself, combine them into .check_output.

Also use check_output further up.

Signed-off-by: Patrick Roy <[email protected]>
Use `tenacity.Retrying` instead of the home-grown retry loop. This has
the advantage that we don't simply give up after 3 attempts, but instead
raise a meaningful error message (whereas before I guess we would just
fail further down the function when trying to do something that assumes
the file exists).

Signed-off-by: Patrick Roy <[email protected]>
Add a `nightly` flag to the `cargo` function, which causes the requested
command to be executed using the nightly toolchain.

We have a nightly toolchain in our container due to kani. But since the
toolchain version is prescribed by kani, we need to dynamically
determine its name from `rustup toolchain list`, and cannot just use
`+nightly`.
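
One way to do the lookup, assuming the output of `rustup toolchain list` is already available as a string (the helper names and the sample toolchain names in the usage note are hypothetical):

```python
def find_nightly_toolchain(toolchain_list_output):
    """Return the first toolchain name starting with 'nightly', or None."""
    for line in toolchain_list_output.splitlines():
        fields = line.split()
        # The name is the first field; markers such as "(default)" follow it.
        if fields and fields[0].startswith("nightly"):
            return fields[0]
    return None


def cargo_command(subcommand, toolchain=None):
    """Build a cargo argv, selecting a toolchain with the +name syntax."""
    prefix = ["cargo"] + ([f"+{toolchain}"] if toolchain else [])
    return prefix + [subcommand]
```

For example, with an installed toolchain named `nightly-2024-07-01-x86_64-unknown-linux-gnu`, `cargo_command("build", toolchain)` yields `cargo +nightly-2024-07-01-x86_64-unknown-linux-gnu build`.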

Signed-off-by: Patrick Roy <[email protected]>
From: Jonathan Woollett-Light <[email protected]>

Adds test for unused cargo dependencies.

Signed-off-by: Jonathan Woollett-Light <[email protected]>
Co-authored-by: Patrick Roy <[email protected]>
Signed-off-by: Patrick Roy <[email protected]>
The new test caught these. yay!

Signed-off-by: Patrick Roy <[email protected]>
Since Firecracker uses the musl target, it is statically compiled
against all of its dependencies. This means that all syscalls that
Firecracker can possibly do, are represented by `syscall`/`svc #0x0`
instructions inside its binary. By doing some primitive static analysis,
it is possible to determine, for each of these instructions, what the
actual syscall number is (the idea being: it will probably be a constant
loaded into a specific register, determined by Linux's syscall
convention, very near the instruction itself). On a best-effort basis,
we can also try to determine the values of all registers that might
hold syscall arguments (for things like ioctls, this works very well,
because the ioctl number is almost always a constant loaded very closely
to the syscall instruction).

The static_analysis.py file implements the logic for this analysis. We
essentially start at each syscall instruction, and then work through the
preceding instructions in reverse order, trying to symbolically execute
the program in reverse until we figure out the value of the register
that holds the syscall number. Thankfully, because of the
above-mentioned locality, this means we only need to really understand a
handful of instructions (based on the assumption that we can ignore
instructions that do not explicitly refer to registers that we care
about).

For example, on x86_64, the syscall number is stored in the eax
register, so if we encounter the following instruction sequence

xor %ebx,%ebx
mov %ebx,%eax
syscall

we first symbolically execute the "mov", and determine that to figure
out the syscall number, we actually need to determine the value of the
ebx register. So we go back one more instruction, and find out that %ebx
is explicitly set to 0. Thus, forward propagating this 0 to the syscall
instruction again, we find that we're doing a `read` syscall.

Things get a bit more complicated on ARM, where we have to care not
only about register-to-register transfers, but also things like `add`
instructions. For these, we must also keep a "log" of manipulations to
apply during forward propagation. For example, on ARM the syscall number
is passed in the w8 register, so in the following sequence

mov w19, #0x6
add w8, w19, #0x1
svc #0x0

we determine that to compute the syscall number, we must first determine
the value of the w19 register, but also we must remember that when we
forward propagate the value 0x6 from the w19 register, we must add 1 to
it when propagating into the w8 register.
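
Both walkthroughs can be reproduced with a toy model. The sketch below is not taken from static_analysis.py: it assumes instructions have already been decoded into simple tuples, supports only the four operations listed in the docstring, and uses made-up names throughout.

```python
def resolve_register(instructions, syscall_index, register):
    """Walk backwards from a syscall instruction to find a register's value.

    Instructions are simplified tuples (an assumption of this sketch):
      ("mov_imm", dst, imm)       dst = imm
      ("mov_reg", dst, src)       dst = src
      ("add_imm", dst, src, imm)  dst = src + imm
      ("xor", dst, src)           dst ^= src (zeroing idiom when dst == src)
    Returns the resolved integer, or None if it cannot be determined.
    """
    wanted = register
    pending_add = 0  # the "log" of additions to replay when propagating forward
    for op, dst, *rest in reversed(instructions[:syscall_index]):
        if dst != wanted:
            continue  # instruction does not write the register we care about
        if op == "mov_imm":
            return rest[0] + pending_add
        if op == "xor" and rest[0] == wanted:
            return pending_add  # xor reg,reg sets reg to 0
        if op == "mov_reg":
            wanted = rest[0]
        elif op == "add_imm":
            wanted = rest[0]
            pending_add += rest[1]
        else:
            return None  # unsupported write to the register
    return None
```

Applied to the x86_64 sequence above this resolves `eax` to 0, and applied to the ARM sequence it resolves `w8` to 7.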

Sometimes, we additionally have to deal with wrapper functions around
syscalls (e.g. `libc::syscall`). For these, we need to translate the
list of syscall number and syscall arguments that the wrapper received
according to the architecture's C calling convention (cdecl on
x86_64, w0-w7 for arguments on ARM) into what the wrapper will
eventually pass to the syscall instruction. On x86_64 this comes with
the complication that cdecl only allows passing 6 arguments in
registers, but syscall nr + all arguments are up to 7 values. Thus, when
a syscall is invoked via libc::syscall, we can only determine up to 5
syscall arguments (the final one would be on the stack, and stack
analysis would significantly complicate this script).
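
The register shuffle for the x86_64 case can be sketched as follows. The register order is that of the x86_64 System V calling convention (what the text above calls cdecl); the function name and the dictionary-based input are assumptions of this example, not the script's interface.

```python
# Integer argument registers of the x86_64 System V calling convention, in order.
X86_64_CALL_ARG_REGS = ["rdi", "rsi", "rdx", "rcx", "r8", "r9"]


def translate_syscall_wrapper_call(call_site_registers):
    """Map registers at a libc::syscall-style call site to (nr, args).

    call_site_registers maps register name -> resolved value; registers
    whose value could not be determined are missing or None.
    """
    values = [call_site_registers.get(reg) for reg in X86_64_CALL_ARG_REGS]
    # The wrapper's first argument is the syscall number, the rest are the
    # syscall arguments. Only 5 fit in registers; a 6th would be on the stack.
    return values[0], values[1:]
```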

Lastly, we sometimes also have wrappers for specific syscalls (e.g.
ioctl). Here, we do something very similar to the above, however since
now the syscall number is "implicit" (i.e. not an argument to the
wrapper function, but rather hard-coded into the wrapper), we do not
actually run into problems with argument counts on x86_64. We implement
this logic because it allows us to determine more arguments (for
example, if we did not explicitly look at callsites of the ioctl
wrapper, we would not be able to determine which ioctl numbers are
actually invoked, greatly reducing the usefulness, as our seccomp
filters are largely ioctls).

After determining all the syscalls, we then compare them to our seccomp
filters. We look at all our allowed rules, and compute the set of rules
that does not match _any_ of the actual syscalls invoked by the binary.
If any such are found, we fail inside a new integration test.
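
The comparison amounts to a filter over the allow list. In this sketch a rule is a hypothetical `(name, {arg_index: value})` pair and an observed syscall is `(name, args)` with `None` for arguments the analysis could not determine; the real filter format is richer than this.

```python
def rule_matches(rule, syscall):
    """A rule matches if the name agrees and every constrained argument agrees."""
    name, constraints = rule
    seen_name, seen_args = syscall
    if name != seen_name:
        return False
    for index, expected in constraints.items():
        actual = seen_args[index] if index < len(seen_args) else None
        # An undetermined argument (None) could be anything, so it may match.
        if actual is not None and actual != expected:
            return False
    return True


def unused_rules(allowed_rules, observed_syscalls):
    """Return the allowed rules that match none of the observed syscalls."""
    return [
        rule
        for rule in allowed_rules
        if not any(rule_matches(rule, sc) for sc in observed_syscalls)
    ]
```

Treating undetermined arguments as possible matches keeps the check conservative: a rule is only reported when nothing in the binary could plausibly need it.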

There are fairly big limitations to this approach. Because we're doing a
global analysis of the firecracker binary, we cannot differentiate
between syscalls on different threads, and also not between syscalls
done before and after actual application of seccomp filters. For this,
we would need some sort of control flow graph analysis, which is made
significantly harder due to the presence of dynamic dispatch.
Nevertheless, this simple approach already showed 4 redundant syscall
rules (and playing around with applying the script to reduced
compilation units showed a lot more unneeded rules for the vcpu thread).

For all of this to work, we need to compile Firecracker as a
non-position independent executable (to get the compiler to generate
direct calls with absolute addresses, instead of instruction pointer
relative ones). Since all dependencies also need to be compiled this
way, we have to use cargo's unstable `-Zbuild-std` feature, which thus
means we need a nightly toolchain.

Signed-off-by: Patrick Roy <[email protected]>
The nightly toolchain in our devctr is too old to successfully compile
non-PIE, statically linked aarch64 executables. I have manually verified
newer nightlies to work (2024-11-16 and newer), however our nightly
toolchain is dictated by kani, so we need to wait for kani to release a
new version that updates past this toolchain.

Therefore, temporarily disable the test on aarch64.

Signed-off-by: Patrick Roy <[email protected]>
These have been determined by static analysis of a Firecracker binary
(see also follow up commits): The removed seccomp rules here trigger
syscalls that are either not present at all in the entire Firecracker
binary, or are not reachable from the entry point of a specific
Firecracker thread (this analysis has only been done for the vcpu thread
for now, due to being fairly tricky).

Some explanations for why some of these entries are no longer needed can
be found below:

- The `uname` syscall was used back when we supported 4.14, to query the
  host kernel version and disable specific Firecracker features that
  were not supported pre-5.10 (io_uring and hugepages for memfd). With
  4.14 support dropped, there are no such checks anymore.
- Various KVM_SET_* ioctls do not need to be allowed on the vcpu thread,
  because they are all called _before_ the vcpu seccomp filters are
  installed (as they are only used during snapshot restore / when
  preparing KVM state for boot).
- On aarch64, we can additionally remove KVM_{SET,GET}_VCPU_EVENTS, as we
  never call these ioctls on arm (only on x86)

Signed-off-by: Patrick Roy <[email protected]>
Bumps the firecracker group with 7 updates:

| Package | From | To |
| --- | --- | --- |
| [thiserror](https://github.com/dtolnay/thiserror) | `2.0.6` | `2.0.7` |
| [serde](https://github.com/serde-rs/serde) | `1.0.215` | `1.0.216` |
| [serde_derive](https://github.com/serde-rs/serde) | `1.0.215` | `1.0.216` |
| [semver](https://github.com/dtolnay/semver) | `1.0.23` | `1.0.24` |
| [proptest](https://github.com/proptest-rs/proptest) | `1.5.0` | `1.6.0` |
| [cc](https://github.com/rust-lang/cc-rs) | `1.2.3` | `1.2.4` |
| [home](https://github.com/rust-lang/cargo) | `0.5.9` | `0.5.11` |


Updates `thiserror` from 2.0.6 to 2.0.7
- [Release notes](https://github.com/dtolnay/thiserror/releases)
- [Commits](dtolnay/thiserror@2.0.6...2.0.7)

Updates `serde` from 1.0.215 to 1.0.216
- [Release notes](https://github.com/serde-rs/serde/releases)
- [Commits](serde-rs/serde@v1.0.215...v1.0.216)

Updates `serde_derive` from 1.0.215 to 1.0.216
- [Release notes](https://github.com/serde-rs/serde/releases)
- [Commits](serde-rs/serde@v1.0.215...v1.0.216)

Updates `semver` from 1.0.23 to 1.0.24
- [Release notes](https://github.com/dtolnay/semver/releases)
- [Commits](dtolnay/semver@1.0.23...1.0.24)

Updates `proptest` from 1.5.0 to 1.6.0
- [Release notes](https://github.com/proptest-rs/proptest/releases)
- [Changelog](https://github.com/proptest-rs/proptest/blob/main/CHANGELOG.md)
- [Commits](https://github.com/proptest-rs/proptest/commits)

Updates `cc` from 1.2.3 to 1.2.4
- [Release notes](https://github.com/rust-lang/cc-rs/releases)
- [Changelog](https://github.com/rust-lang/cc-rs/blob/main/CHANGELOG.md)
- [Commits](rust-lang/cc-rs@cc-v1.2.3...cc-v1.2.4)

Updates `home` from 0.5.9 to 0.5.11
- [Changelog](https://github.com/rust-lang/cargo/blob/master/CHANGELOG.md)
- [Commits](https://github.com/rust-lang/cargo/commits)

---
updated-dependencies:
- dependency-name: thiserror
  dependency-type: direct:production
  update-type: version-update:semver-patch
  dependency-group: firecracker
- dependency-name: serde
  dependency-type: direct:production
  update-type: version-update:semver-patch
  dependency-group: firecracker
- dependency-name: serde_derive
  dependency-type: direct:production
  update-type: version-update:semver-patch
  dependency-group: firecracker
- dependency-name: semver
  dependency-type: direct:production
  update-type: version-update:semver-patch
  dependency-group: firecracker
- dependency-name: proptest
  dependency-type: direct:production
  update-type: version-update:semver-minor
  dependency-group: firecracker
- dependency-name: cc
  dependency-type: indirect
  update-type: version-update:semver-patch
  dependency-group: firecracker
- dependency-name: home
  dependency-type: indirect
  update-type: version-update:semver-patch
  dependency-group: firecracker
...

Signed-off-by: dependabot[bot] <[email protected]>
In test_drive_rate_limiter, use `fio` for measuring latencies of writing
a fixed number of bytes to a block device, instead of `dd` (which the
test itself notes is unreliable). Should fix intermittent failures we've
been seeing in this test, hopefully.

Signed-off-by: Patrick Roy <[email protected]>
In #4955 the executable to check the ssh connection liveness was
changed from `true` to `/usr/bin/true`, but that is not its path in all
rootfs, causing failures in the `test-popular-containers` suite.

Also, since the error is retried but the daemon is not cleaned
up, subsequent retries would fail on the assertion.

This change fixes both issues by using the binary name `true` and
stopping the daemon on error before the next retry.

Fixes: 3b2c2d4 ("test: use single SSH connection for lifetime of microvm")
Signed-off-by: Riccardo Mancini <[email protected]>
It seems that pytest_runtest_logreport is triggered both in the worker
where the test runs and in the main process.

So do not emit metrics unless we are in the main process.

Signed-off-by: Pablo Barbáchano <[email protected]>
In commit 9364472 we started to ignore
tx_read_fails/tx_write_fails, but should also have included
tx_flush_fails.

Add it here.

Fixes: 9364472

Signed-off-by: Pablo Barbáchano <[email protected]>
In some cases we may want to pass a complex value to a step. For
example:

  .buildkite/pipeline_pr.py \
    --step-param 'retry/automatic=[{"exit_status": "*", "limit": 2}]'
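
A sketch of how such a parameter could be parsed, assuming the value is tried as JSON first and kept as a plain string otherwise. The function names and the fallback behaviour are assumptions for illustration, not the pipeline script's actual implementation.

```python
import json


def parse_step_param(param):
    """Split 'key/subkey=value' into (["key", "subkey"], parsed_value)."""
    path, _, raw = param.partition("=")
    try:
        value = json.loads(raw)  # lists, dicts, numbers, booleans
    except json.JSONDecodeError:
        value = raw  # fall back to the plain string
    return path.split("/"), value


def apply_step_param(step, param):
    """Set the parsed value at the nested key path inside the step dict."""
    keys, value = parse_step_param(param)
    target = step
    for key in keys[:-1]:
        target = target.setdefault(key, {})
    target[keys[-1]] = value
    return step
```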

Signed-off-by: Pablo Barbáchano <[email protected]>
microcode_ctl only covers Intel firmware files. Capture AMD and any
other possible ones.

Signed-off-by: Pablo Barbáchano <[email protected]>