Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix locktuil on windows #2729

Merged
merged 2 commits into from
Oct 13, 2024
Merged

Conversation

nirs
Copy link
Contributor

@nirs nirs commented Oct 13, 2024

Add the missing test, revealing that the windows implementation does not match the unix one. It allows multiple threads to acquire the lock at the same time.

Fix the lock flags to create a blocking exclusive lock, and improve the code to prevent future errors.

Tested using the unit tests.

@AkihiroSuda Looks like this code was copied from nerdctl and it needs the same test and fix:
https://github.com/containerd/nerdctl/tree/v0.13.0/pkg/lockutil

Verify that the lock works for threads in the same process.

The test fails on windows:

    === RUN   TestWithDirLock
        lockutil_test.go:54: Expected one writer, got [writer 19 writer 0]

(Keeping as separate commit to make it easy to reproduce)

Signed-off-by: Nir Soffer <[email protected]>
We use LOCKFILE_FAIL_IMMEDIATELY, creating a shared lock that fail
immediately if the lock cannot be acquired. This does not match the
behavior of the unix version and does not make sense.

Using now LOCKFILE_EXCLUSIVE_LOCK to acquire exclusive lock blocking
until the lock can be acquired.

Add constants for the flags and add comments for the argument names to
prevent future errors.

Signed-off-by: Nir Soffer <[email protected]>
@nirs nirs mentioned this pull request Oct 13, 2024
const (
// see https://msdn.microsoft.com/en-us/library/windows/desktop/aa365203(v=vs.85).aspx
LOCKFILE_EXCLUSIVE_LOCK = 0x00000002
LOCKFILE_FAIL_IMMEDIATELY = 0x00000001
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I used the same constant names used in windows docs to make it easy to find the flag in the docs or using search.

I see that the original code form bolt use nicer go style:
https://github.com/boltdb/bolt/blob/2f1ce7a837dcb8da3ec595b1dac9d0632f0f99e8/bolt_windows.go#L18

We can use this style.

nirs added a commit to nirs/lima that referenced this pull request Oct 13, 2024
Ad sub tests for parallel downloads with "with cache" and "cahing-only
mode". The test run 10 concurrent downloads with the current test http
server and ensure that downloads succeeded.

To make it possible to test using the same process, we include now a
counter in the perProcessTempfile. The file name is now:

    {filename}.pid.{count}

The test revealed an issue with clonefile, not seen when testing 3
parallel `limactl create`. When the data file is replaced during a
clonefile syscall, it may fail with:

    clonefile failed: no such file or directory

This smells like a darwin bug, but we can avoid this issue by using the
first download result. When a download finishes, we check if data file
exists, and return success if it does. We take a lock for the very
short time needed to check and rename the temporary data file to the
target file.

The parallel tests failed on windows - it seems that os.Rename() does
not work on windows if the target exist. Replacing atomicWrite with a
simpler version that takes a lock and write the file fixes the issue.

Tested using:

    % go test ./pkg/downloader -run 'TestDownloadRemote/with_cache/parallel' -count=1000
    ok  	github.com/lima-vm/lima/pkg/downloader	116.025s

    % go test ./pkg/downloader -run 'TestDownloadRemote/caching-only_mode/parallel' -count=1000
    ok  	github.com/lima-vm/lima/pkg/downloader	98.535s

Notes:
- Depends on lima-vm#2729, fixing the lockutil windows implementation.

Signed-off-by: Nir Soffer <[email protected]>
@AkihiroSuda AkihiroSuda added this to the v1.0 milestone Oct 13, 2024
Copy link
Member

@AkihiroSuda AkihiroSuda left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks

@AkihiroSuda AkihiroSuda merged commit 713e1cd into lima-vm:master Oct 13, 2024
26 checks passed
@nirs nirs deleted the fix-locktuil-windows branch October 13, 2024 17:15
nirs added a commit to nirs/lima that referenced this pull request Oct 13, 2024
Ad sub tests for parallel downloads with "with cache" and "cahing-only
mode". The test run 10 concurrent downloads with the current test http
server and ensure that downloads succeeded.

To make it possible to test using the same process, we include now a
counter in the perProcessTempfile. The file name is now:

    {filename}.pid.{count}

The test revealed an issue with clonefile, not seen when testing 3
parallel `limactl create`. When the data file is replaced during a
clonefile syscall, it may fail with:

    clonefile failed: no such file or directory

This smells like a darwin bug, but we can avoid this issue by using the
first download result. When a download finishes, we check if data file
exists, and return success if it does. We take a lock for the very
short time needed to check and rename the temporary data file to the
target file.

The parallel tests failed on windows - it seems that os.Rename() does
not work on windows if the target exist. Replacing atomicWrite with a
simpler version that takes a lock and write the file fixes the issue.

Tested using:

    % go test ./pkg/downloader -run 'TestDownloadRemote/with_cache/parallel' -count=1000
    ok  	github.com/lima-vm/lima/pkg/downloader	116.025s

    % go test ./pkg/downloader -run 'TestDownloadRemote/caching-only_mode/parallel' -count=1000
    ok  	github.com/lima-vm/lima/pkg/downloader	98.535s

Notes:
- Depends on lima-vm#2729, fixing the lockutil windows implementation.

Signed-off-by: Nir Soffer <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants