This issue was moved to a discussion.

macOS vm virtiofs concurrency issue #23061

Closed
cynecx opened this issue Jun 20, 2024 · 10 comments
Labels: kind/bug, machine, macos, remote, stale-issue

Comments

cynecx commented Jun 20, 2024

Issue Description

It seems like there is a virtiofs issue somewhere (probably a bug in Virtualization.framework's virtiofs implementation) that causes file operations in the virtiofs mount to fail with "No such file or directory" errors.

Docker-for-mac had this same issue: docker/for-mac#7059. They've apparently fixed this somehow.

Steps to reproduce the issue

  1. Install the Rust toolchain on your host
  2. `cargo new hello && cd hello`
  3. `podman image pull rust:1-bookworm`
  4. `podman run --rm -it -v $(pwd):/app -w /app rust:1-bookworm`
  5. (inside the container) `cargo clean && cargo build`

Describe the results you received

root@ac6bdfdf5032:/app# ls
Cargo.lock  Cargo.toml	Cross.toml  ok	ok.go  ok2.go  src  target
root@ac6bdfdf5032:/app# cargo clean
     Removed 1 file, 356B total
root@ac6bdfdf5032:/app# cargo build
   Compiling test-cross v0.1.0 (/app)
error: linking with `cc` failed: exit status: 1
  |
  = note: LC_ALL="C" PATH="/usr/local/rustup/toolchains/1.79.0-aarch64-unknown-linux-gnu/lib/rustlib/aarch64-unknown-linux-gnu/bin:/usr/local/cargo/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin" VSLANG="1033" "cc" "/tmp/rustcGycfuq/symbols.o" "/app/target/debug/deps/test_cross-cf78f0f12436d1cb.1oxqyngo104fea0j.rcgu.o" "/app/target/debug/deps/test_cross-cf78f0f12436d1cb.254lozn1433bgkt8.rcgu.o" "/app/target/debug/deps/test_cross-cf78f0f12436d1cb.2vigihdxvt1iz5nz.rcgu.o" "/app/target/debug/deps/test_cross-cf78f0f12436d1cb.36du1bcdi6arfzvn.rcgu.o" "/app/target/debug/deps/test_cross-cf78f0f12436d1cb.3rk7zzedbxatx9yo.rcgu.o" "/app/target/debug/deps/test_cross-cf78f0f12436d1cb.74co9m6suyktnzj.rcgu.o" "/app/target/debug/deps/test_cross-cf78f0f12436d1cb.117wt34ou4acjtak.rcgu.o" "-Wl,--as-needed" "-L" "/app/target/debug/deps" "-L" "/usr/local/rustup/toolchains/1.79.0-aarch64-unknown-linux-gnu/lib/rustlib/aarch64-unknown-linux-gnu/lib" "-Wl,-Bstatic" "/usr/local/rustup/toolchains/1.79.0-aarch64-unknown-linux-gnu/lib/rustlib/aarch64-unknown-linux-gnu/lib/libstd-9db51037f7732c7f.rlib" "/usr/local/rustup/toolchains/1.79.0-aarch64-unknown-linux-gnu/lib/rustlib/aarch64-unknown-linux-gnu/lib/libpanic_unwind-43f7084971578043.rlib" "/usr/local/rustup/toolchains/1.79.0-aarch64-unknown-linux-gnu/lib/rustlib/aarch64-unknown-linux-gnu/lib/libobject-55e3c3e99d7ea57c.rlib" "/usr/local/rustup/toolchains/1.79.0-aarch64-unknown-linux-gnu/lib/rustlib/aarch64-unknown-linux-gnu/lib/libmemchr-fb53010b8d947b31.rlib" "/usr/local/rustup/toolchains/1.79.0-aarch64-unknown-linux-gnu/lib/rustlib/aarch64-unknown-linux-gnu/lib/libaddr2line-6f437829797b59f9.rlib" "/usr/local/rustup/toolchains/1.79.0-aarch64-unknown-linux-gnu/lib/rustlib/aarch64-unknown-linux-gnu/lib/libgimli-c465d68cd448aa2e.rlib" "/usr/local/rustup/toolchains/1.79.0-aarch64-unknown-linux-gnu/lib/rustlib/aarch64-unknown-linux-gnu/lib/librustc_demangle-12979ddb857b6856.rlib" 
"/usr/local/rustup/toolchains/1.79.0-aarch64-unknown-linux-gnu/lib/rustlib/aarch64-unknown-linux-gnu/lib/libstd_detect-ecc5f92a35a5dcae.rlib" "/usr/local/rustup/toolchains/1.79.0-aarch64-unknown-linux-gnu/lib/rustlib/aarch64-unknown-linux-gnu/lib/libhashbrown-09b6240c5d3892f5.rlib" "/usr/local/rustup/toolchains/1.79.0-aarch64-unknown-linux-gnu/lib/rustlib/aarch64-unknown-linux-gnu/lib/librustc_std_workspace_alloc-e35646347e036948.rlib" "/usr/local/rustup/toolchains/1.79.0-aarch64-unknown-linux-gnu/lib/rustlib/aarch64-unknown-linux-gnu/lib/libminiz_oxide-893df93494354b60.rlib" "/usr/local/rustup/toolchains/1.79.0-aarch64-unknown-linux-gnu/lib/rustlib/aarch64-unknown-linux-gnu/lib/libadler-297a87c8b999e355.rlib" "/usr/local/rustup/toolchains/1.79.0-aarch64-unknown-linux-gnu/lib/rustlib/aarch64-unknown-linux-gnu/lib/libunwind-eb6321afc60f0508.rlib" "/usr/local/rustup/toolchains/1.79.0-aarch64-unknown-linux-gnu/lib/rustlib/aarch64-unknown-linux-gnu/lib/libcfg_if-e65240dca34fcbd0.rlib" "/usr/local/rustup/toolchains/1.79.0-aarch64-unknown-linux-gnu/lib/rustlib/aarch64-unknown-linux-gnu/lib/liblibc-4b653d72a90009a1.rlib" "/usr/local/rustup/toolchains/1.79.0-aarch64-unknown-linux-gnu/lib/rustlib/aarch64-unknown-linux-gnu/lib/liballoc-58548d24e1f3d56c.rlib" "/usr/local/rustup/toolchains/1.79.0-aarch64-unknown-linux-gnu/lib/rustlib/aarch64-unknown-linux-gnu/lib/librustc_std_workspace_core-84c459117cd1fdc9.rlib" "/usr/local/rustup/toolchains/1.79.0-aarch64-unknown-linux-gnu/lib/rustlib/aarch64-unknown-linux-gnu/lib/libcore-d6e05faaecef4023.rlib" "/usr/local/rustup/toolchains/1.79.0-aarch64-unknown-linux-gnu/lib/rustlib/aarch64-unknown-linux-gnu/lib/libcompiler_builtins-c35031b3bb3289ef.rlib" "-Wl,-Bdynamic" "-lgcc_s" "-lutil" "-lrt" "-lpthread" "-lm" "-ldl" "-lc" "-Wl,--eh-frame-hdr" "-Wl,-z,noexecstack" "-L" "/usr/local/rustup/toolchains/1.79.0-aarch64-unknown-linux-gnu/lib/rustlib/aarch64-unknown-linux-gnu/lib" "-o" "/app/target/debug/deps/test_cross-cf78f0f12436d1cb" 
"-Wl,--gc-sections" "-pie" "-Wl,-z,relro,-z,now" "-nodefaultlibs"
  = note: /usr/bin/ld: cannot find /app/target/debug/deps/test_cross-cf78f0f12436d1cb.1oxqyngo104fea0j.rcgu.o: No such file or directory
          /usr/bin/ld: cannot find /app/target/debug/deps/test_cross-cf78f0f12436d1cb.254lozn1433bgkt8.rcgu.o: No such file or directory
          /usr/bin/ld: cannot find /app/target/debug/deps/test_cross-cf78f0f12436d1cb.2vigihdxvt1iz5nz.rcgu.o: No such file or directory
          /usr/bin/ld: cannot find /app/target/debug/deps/test_cross-cf78f0f12436d1cb.36du1bcdi6arfzvn.rcgu.o: No such file or directory
          /usr/bin/ld: cannot find /app/target/debug/deps/test_cross-cf78f0f12436d1cb.3rk7zzedbxatx9yo.rcgu.o: No such file or directory
          /usr/bin/ld: cannot find /app/target/debug/deps/test_cross-cf78f0f12436d1cb.74co9m6suyktnzj.rcgu.o: No such file or directory
          collect2: error: ld returned 1 exit status


error: could not compile `test-cross` (bin "test-cross") due to 1 previous error

Describe the results you expected

The build commands should all run successfully. The build artifacts should be properly "visible".

podman info output

Client:       Podman Engine
Version:      5.1.1
API Version:  5.1.1
Go Version:   go1.22.3
Git Commit:   bda6eb03dcbcf12a5b7ae004c1240e38dd056d24
Built:        Tue Jun  4 21:54:07 2024
OS/Arch:      darwin/arm64

Server:       Podman Engine
Version:      5.1.1
API Version:  5.1.1
Go Version:   go1.22.3
Built:        Tue Jun  4 02:00:00 2024
OS/Arch:      linux/arm64
host:
  arch: arm64
  buildahVersion: 1.36.0
  cgroupControllers:
  - cpu
  - io
  - memory
  - pids
  cgroupManager: systemd
  cgroupVersion: v2
  conmon:
    package: conmon-2.1.10-1.fc40.aarch64
    path: /usr/bin/conmon
    version: 'conmon version 2.1.10, commit: '
  cpuUtilization:
    idlePercent: 98.61
    systemPercent: 0.38
    userPercent: 1
  cpus: 1
  databaseBackend: sqlite
  distribution:
    distribution: fedora
    variant: coreos
    version: "40"
  eventLogger: journald
  freeLocks: 2046
  hostname: localhost.localdomain
  idMappings:
    gidmap:
    - container_id: 0
      host_id: 1000
      size: 1
    - container_id: 1
      host_id: 100000
      size: 1000000
    uidmap:
    - container_id: 0
      host_id: 501
      size: 1
    - container_id: 1
      host_id: 100000
      size: 1000000
  kernel: 6.8.11-300.fc40.aarch64
  linkmode: dynamic
  logDriver: journald
  memFree: 109060096
  memTotal: 993263616
  networkBackend: netavark
  networkBackendInfo:
    backend: netavark
    dns:
      package: aardvark-dns-1.11.0-1.20240531102943328308.main.4.g6838c50.fc40.aarch64
      path: /usr/libexec/podman/aardvark-dns
      version: aardvark-dns 1.12.0-dev
    package: netavark-1.11.0-1.20240606174759319307.main.8.gfebe31a.fc40.aarch64
    path: /usr/libexec/podman/netavark
    version: netavark 1.12.0-dev
  ociRuntime:
    name: crun
    package: crun-1.15-1.20240607090105650503.main.32.gea54402.fc40.aarch64
    path: /usr/bin/crun
    version: |-
      crun version UNKNOWN
      commit: 7cfd0aeb40e4605b6b0ee0afd9cfca80f9c5f68a
      rundir: /run/user/501/crun
      spec: 1.0.0
      +SYSTEMD +SELINUX +APPARMOR +CAP +SECCOMP +EBPF +CRIU +LIBKRUN +WASM:wasmedge +YAJL
  os: linux
  pasta:
    executable: /usr/bin/pasta
    package: passt-0^20240510.g7288448-1.fc40.aarch64
    version: |
      pasta 0^20240510.g7288448-1.fc40.aarch64-pasta
      Copyright Red Hat
      GNU General Public License, version 2 or later
        <https://www.gnu.org/licenses/old-licenses/gpl-2.0.html>
      This is free software: you are free to change and redistribute it.
      There is NO WARRANTY, to the extent permitted by law.
  remoteSocket:
    exists: true
    path: /run/user/501/podman/podman.sock
  rootlessNetworkCmd: pasta
  security:
    apparmorEnabled: false
    capabilities: CAP_CHOWN,CAP_DAC_OVERRIDE,CAP_FOWNER,CAP_FSETID,CAP_KILL,CAP_NET_BIND_SERVICE,CAP_SETFCAP,CAP_SETGID,CAP_SETPCAP,CAP_SETUID,CAP_SYS_CHROOT
    rootless: true
    seccompEnabled: true
    seccompProfilePath: /usr/share/containers/seccomp.json
    selinuxEnabled: true
  serviceIsRemote: true
  slirp4netns:
    executable: /usr/bin/slirp4netns
    package: slirp4netns-1.2.2-2.fc40.aarch64
    version: |-
      slirp4netns version 1.2.2
      commit: 0ee2d87523e906518d34a6b423271e4826f71faf
      libslirp: 4.7.0
      SLIRP_CONFIG_VERSION_MAX: 4
      libseccomp: 2.5.3
  swapFree: 0
  swapTotal: 0
  uptime: 3h 56m 53.00s (Approximately 0.12 days)
  variant: v8
plugins:
  authorization: null
  log:
  - k8s-file
  - none
  - passthrough
  - journald
  network:
  - bridge
  - macvlan
  - ipvlan
  volume:
  - local
registries:
  search:
  - docker.io
store:
  configFile: /var/home/core/.config/containers/storage.conf
  containerStore:
    number: 2
    paused: 0
    running: 1
    stopped: 1
  graphDriverName: overlay
  graphOptions: {}
  graphRoot: /var/home/core/.local/share/containers/storage
  graphRootAllocated: 36975915008
  graphRootUsed: 6269661184
  graphStatus:
    Backing Filesystem: xfs
    Native Overlay Diff: "true"
    Supports d_type: "true"
    Supports shifting: "false"
    Supports volatile: "true"
    Using metacopy: "false"
  imageCopyTmpDir: /var/tmp
  imageStore:
    number: 24
  runRoot: /run/user/501/containers
  transientStore: false
  volumePath: /var/home/core/.local/share/containers/storage/volumes
version:
  APIVersion: 5.1.1
  Built: 1717459200
  BuiltTime: Tue Jun  4 02:00:00 2024
  GitCommit: ""
  GoVersion: go1.22.3
  Os: linux
  OsArch: linux/arm64
  Version: 5.1.1


### Podman in a container

No

### Privileged Or Rootless

Rootless

### Upstream Latest Release

Yes

@cynecx cynecx added the kind/bug Categorizes issue or PR as related to a bug. label Jun 20, 2024
@github-actions github-actions bot added macos MacOS (OSX) related remote Problem is in podman-remote labels Jun 20, 2024
@Luap99 Luap99 added the machine label Jun 21, 2024
Member

Luap99 commented Jun 21, 2024

Has anyone reported this to Apple?

Member

baude commented Jun 21, 2024

It has been reported and evidently fixed in macOS 15? See rust-lang/docker-rust#161.

Author

cynecx commented Jun 21, 2024

Okay, after a bit of digging I found out what the Docker devs did (huge thank you 🙏).

It's basically this kernel patch:

[PATCH 9/9] hardlinks: drop the cache of the existing entry.
From 85589340b82880f5e52bee98f96ef11ae412dbb5 Mon Sep 17 00:00:00 2001
From: Docker <[email protected]>
Date: Sun, 11 Feb 2024 09:06:12 +0000
Subject: [PATCH 9/9] hardlinks: drop the cache of the existing entry.

Under virtualization.framework's virtiofs, the inode numbers are synthetic.
When the hardlink is created, the inode of the original file sometimes
changes too which means cached lookups will fail.

Work around by dropping the cache of the existing entry.

This doesn't affect grpcfuse as the host inode numbers are exposed in the VM.

Tested with a reproducer like this:
```
touch a
mkdir b

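cat > main.c << EOT
/* The #include lines are absent from the patch text as posted; these are the
 * headers the code below needs to compile on Linux (glibc). */
#include <fcntl.h>
#include <stdio.h>
#include <sys/sendfile.h>
#include <sys/stat.h>
#include <time.h>
#include <unistd.h>
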
int one(){
    int ret;
    struct stat st;
    ret = stat("b/b", &st);
    if (ret == 0) {
        unlink("b/b");
    }
    if (!stat("b/b", &st)){
        fprintf(stderr, "b/b exists\n");
        return 0;
    }
    if (stat("a", &st)){
        fprintf(stderr, "a does not exist\n");
        return 0;
    }
    struct timespec begin, end;
    clock_gettime(CLOCK_MONOTONIC_RAW, &begin);
    ret = link("a", "b/b");
    if(ret == -1){
        perror("link");
        if (stat("a", &st)){
            fprintf(stderr, "a does not exist\n");
        } else {
            fprintf(stderr, "a exists\n");
        }
        if (stat("b", &st)){
            fprintf(stderr, "b does not exist\n");
        } else {
            fprintf(stderr, "b exists\n");
        }
        if (stat("b/b", &st)){
            fprintf(stderr, "b/b does not exist\n");
        } else {
            fprintf(stderr, "b/b exists\n");
        }
        int retries = 0;
        do {
            ret = link("a", "b/b");
            if (ret == 0) {
                break;
            }
            retries++;
        } while (1);
        clock_gettime(CLOCK_MONOTONIC_RAW, &end);
        double time_spent = (end.tv_nsec - begin.tv_nsec) / 1000000000.0 + (end.tv_sec  - begin.tv_sec);
        fprintf(stderr, "link succeeded after %d retries in %f seconds\n", retries, time_spent);
        return 0;
    }
    int fd = open("b/b", O_RDONLY);
    if(fd == -1){
        perror("open");
        return 0;
    }
    sendfile(1, fd, NULL, 16777216);
    close(fd);

    ret = unlink("b/b");
    if(ret == -1){
        perror("unlink");
        return 0;
    }
    return 1;
}

void main(){
    for (int i = 0;; i++){
        if (!one()){
            fprintf(stdout, "\nFailed on iteration %d\n", i);
            break;
        };
        fprintf(stdout, ".");
        fflush(stdout);
    }
}
EOT
```

Signed-off-by: Docker <[email protected]>
---
 fs/fuse/dir.c     | 20 +++++++++++++++++---
 fs/fuse/readdir.c |  3 +++
 2 files changed, 20 insertions(+), 3 deletions(-)

diff --git a/fs/fuse/dir.c b/fs/fuse/dir.c
index d707e6987..05fd0fa27 100644
--- a/fs/fuse/dir.c
+++ b/fs/fuse/dir.c
@@ -258,6 +258,10 @@ static int fuse_dentry_revalidate(struct dentry *entry, unsigned int flags)
 		fuse_change_attributes(inode, &outarg.attr, NULL,
 				       ATTR_TIMEOUT(&outarg),
 				       attr_version);
+		if ((inode->i_nlink > 1) && (!S_ISDIR(inode->i_mode))){
+			/* This case happens a lot when using hardlinks */
+			outarg.entry_valid = 0;
+		}
 		fuse_change_entry_timeout(entry, &outarg);
 	} else if (inode) {
 		fi = get_fuse_inode(inode);
@@ -442,9 +446,12 @@ static struct dentry *fuse_lookup(struct inode *dir, struct dentry *entry,
 		goto out_err;

 	entry = newent ? newent : entry;
-	if (outarg_valid)
+	if (outarg_valid) {
+		if (inode && (inode->i_nlink > 1) && (!S_ISDIR(inode->i_mode))){
+			outarg.entry_valid = 0;
+		}
 		fuse_change_entry_timeout(entry, &outarg);
-	else
+	} else
 		fuse_invalidate_entry_cache(entry);

 	if (inode)
@@ -690,6 +697,9 @@ static int fuse_create_open(struct inode *dir, struct dentry *entry,
 	}
 	kfree(forget);
 	d_instantiate(entry, inode);
+	if ((inode->i_nlink > 1) && (!S_ISDIR(inode->i_mode))){
+		outentry.entry_valid = 0;
+	}
 	fuse_change_entry_timeout(entry, &outentry);
 	fuse_dir_changed(dir);
 	err = finish_open(file, entry, generic_file_open);
@@ -819,7 +829,9 @@ static int create_new_entry(struct fuse_mount *fm, struct fuse_args *args,
 	d = d_splice_alias(inode, entry);
 	if (IS_ERR(d))
 		return PTR_ERR(d);
-
+	if (args->opcode == FUSE_LINK){
+		outarg.entry_valid = 0;
+	}
 	if (d) {
 		fuse_change_entry_timeout(d, &outarg);
 		dput(d);
@@ -1101,6 +1113,7 @@ static int fuse_link(struct dentry *entry, struct inode *newdir,
 	struct fuse_link_in inarg;
 	struct inode *inode = d_inode(entry);
 	struct fuse_mount *fm = get_fuse_mount(inode);
+	struct fuse_entry_out not_valid = {0,0};
 	FUSE_ARGS(args);

 	memset(&inarg, 0, sizeof(inarg));
@@ -1111,6 +1124,7 @@ static int fuse_link(struct dentry *entry, struct inode *newdir,
 	args.in_args[0].value = &inarg;
 	args.in_args[1].size = newent->d_name.len + 1;
 	args.in_args[1].value = newent->d_name.name;
+	fuse_change_entry_timeout(entry, &not_valid);
 	err = create_new_entry(fm, &args, newdir, newent, inode->i_mode);
 	if (!err)
 		fuse_update_ctime_in_cache(inode);
diff --git a/fs/fuse/readdir.c b/fs/fuse/readdir.c
index 9e6d587b3..ab282724f 100644
--- a/fs/fuse/readdir.c
+++ b/fs/fuse/readdir.c
@@ -256,6 +256,9 @@ static int fuse_direntplus_link(struct file *file,
 	}
 	if (fc->readdirplus_auto)
 		set_bit(FUSE_I_INIT_RDPLUS, &get_fuse_inode(inode)->state);
+	if ((inode->i_nlink > 1) && (!S_ISDIR(inode->i_mode))) {
+		o->entry_valid = 0;
+	}
 	fuse_change_entry_timeout(dentry, o);

 	dput(dentry);
--
2.43.0
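
The patch description above says that, under Virtualization.framework's virtiofs, the inode number of the original file sometimes changes when a hardlink to it is created. A minimal sketch for checking this from a shell inside the container follows. The helper names `inode_of` and `check_link_inode` are made up for illustration, and it is an assumption that the change is observable from userspace this way; the patch only says it happens "sometimes", so a single run proves little.

```shell
#!/bin/sh
# Hypothetical helpers, not part of podman or the Docker patch.

# Print the inode number of a file (first field of POSIX `ls -i`).
inode_of() {
    ls -i "$1" | awk '{ print $1 }'
}

# Create a file in DIR, hardlink it, and report whether the inode number
# of the original file is the same before and after the link.
# Usage: check_link_inode DIR
check_link_inode() {
    dir=$1
    rm -f "$dir/a" "$dir/a.link"
    : > "$dir/a"
    before=$(inode_of "$dir/a")
    ln "$dir/a" "$dir/a.link"
    after=$(inode_of "$dir/a")
    rm -f "$dir/a.link" "$dir/a"
    if [ "$before" = "$after" ]; then
        echo "stable $before"
    else
        echo "changed $before -> $after"
    fi
}
```

Calling `check_link_inode /app` on the virtiofs mount and `check_link_inode /tmp` on a container-local directory gives a side-by-side comparison. On an ordinary local filesystem the inode number is expected to stay the same.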

Author

cynecx commented Jun 22, 2024

Quick update: I've built a local kernel with that patch applied (applies cleanly on 6.9.6) and it seems that it works as a temporary workaround.

kasperk81 commented

@cynecx Do you have a link to that git commit or mailing list entry? I can't find it anywhere.

Author

cynecx commented Jun 24, 2024

@kasperk81 You can find all the patches embedded in the Docker VM itself under /usr/src :)

Contributor

cfergeau commented

> Steps to reproduce the issue
>
> 1. Install the rust toolchain on your host
> 2. `cargo new hello && cd hello`
> 3. `podman image pull rust:1-bookworm`
> 4. `podman run --rm -it -v $(pwd):/app -w /app rust:1-bookworm`
> 5. (inside the container) `cargo clean && cargo build`

For what it's worth, I was not able to reproduce this with these steps on my Mac, but I also did not get a cross build as in your logs (`Compiling test-cross v0.1.0 (/app)`). I guess I should try with a bigger Rust project.

Author

cynecx commented Jun 25, 2024

@cfergeau Oh yeah, sorry about that. I was being lazy and pasted the build output of another, closely related test case. But it's still really just a hello world with a single `println!("hello world")`. You might just have to run it (`cargo clean && cargo build`) in a loop or so...
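
Since the failure is intermittent, one way to run the build "in a loop" is a small helper that repeats a command until it fails. This is only a sketch: `repeat_until_failure` is a hypothetical name, and the iteration count in the usage note is arbitrary.

```shell
#!/bin/sh
# Hypothetical helper: run a command repeatedly until it fails
# or MAX attempts have passed.
# Usage: repeat_until_failure MAX CMD [ARGS...]
repeat_until_failure() {
    max=$1
    shift
    i=1
    while [ "$i" -le "$max" ]; do
        # Stop at the first non-zero exit status and report the iteration.
        if ! "$@"; then
            echo "failed on iteration $i"
            return 1
        fi
        i=$((i + 1))
    done
    echo "no failure in $max iterations"
    return 0
}
```

Inside the container, from `/app`, this could be invoked as `repeat_until_failure 50 sh -c 'cargo clean && cargo build'`. A run that completes without failure does not show the bug is absent, only that it did not trigger within that many attempts.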

A friendly reminder that this issue had no activity for 30 days.

Member

Luap99 commented Sep 6, 2024

I am moving this to a discussion, as this is not a Podman bug. We just ship the default Fedora kernel, so we have no way of patching the kernel.

@containers containers locked and limited conversation to collaborators Sep 6, 2024
@Luap99 Luap99 converted this issue into discussion #23886 Sep 6, 2024
