Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Global compose lock #3543

Merged
merged 1 commit into from
Oct 17, 2024
Merged

Global compose lock #3543

merged 1 commit into from
Oct 17, 2024

Conversation

apostasie
Copy link
Contributor

@apostasie apostasie commented Oct 15, 2024

Our compose implementation is inherently not safe to use concurrently.

The root issue is that we do shell out multiple times to nerdctl to manipulate a bunch of resources, while assuming a state of the world that has possibly changed meanwhile.
Because of that, we cannot leverage API locking mechanisms (eg: as implemented in VolStore, etc) across multiple distincts invocations of nerdctl.

This PR suggests that we put in place a global filesystem lock for compose.

This is not pretty - and it will definitely prevent running multiple nerdctl compose commands in parallel (they will be queued).
Also, this is not going to solve all concurrency / racy issues with compose - as normal nerdctl operations may still run in parallel with a compose invocation.

But at least, it will prevent some of the most egregious problems - until we fix the core issues.

Let me know your thoughts.

@apostasie apostasie force-pushed the lock-compose branch 2 times, most recently from 2c3deb8 to 554c2f8 Compare October 15, 2024 02:22
@apostasie apostasie marked this pull request as ready for review October 15, 2024 03:02
@AkihiroSuda AkihiroSuda added this to the v2.0.0 milestone Oct 15, 2024
if err != nil {
return nil, err
}

Copy link
Member

@AkihiroSuda AkihiroSuda Oct 15, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Where is the unlock?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When nerdctl compose exits, it will release the lock - we do not release explicitly.

The reason I did no do it explicitly is because (I think) we would have to have that code inside all compose commands, which would make for a large/more intrusive patch.
This makes here for a much simpler patch.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you add code comments to explain that

Copy link
Member

@AkihiroSuda AkihiroSuda Oct 15, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also wondering if we can just add *Composer.Close().

The caller of compose.New should have defer c.Close()

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also wondering if we can just add *Composer.Close().
The caller of compose.New should have defer c.Close()

Sure.

Is it worth the effort and extra code though?

"github.com/containerd/nerdctl/v2/pkg/netutil"
"github.com/containerd/nerdctl/v2/pkg/referenceutil"
"github.com/containerd/nerdctl/v2/pkg/signutil"
"github.com/containerd/nerdctl/v2/pkg/strutil"
)

//nolint:unused
var locked *os.File
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why global var?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Otherwise the lock will get released once we exit the scope of New().
We could add the lock to the Composer struct, but I am not sure that would bring much value?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you add code comments?

Also, how is the file lock associated with the lifetime of the Go variable? I don’t think *os.File has a destructor method to unlock the file?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you add code comments?

Sure.

Also, how is the file lock associated with the lifetime of the Go variable? I don’t think *os.File has a destructor method to unlock the file?

I was surprised too.

My (vague) explanation is that garbage collection does close the file descriptor - and yes, flock() will release the lock when the fd gets closed.

Signed-off-by: apostasie <[email protected]>
@AkihiroSuda
Copy link
Member

AkihiroSuda commented Oct 17, 2024

Do you plan to update this for Unlock()?

@apostasie
Copy link
Contributor Author

Do you plan to update this for Unlock()?

I would rather not - unless you think it would really be better? Just lmk.

As far as I can think about this, the meaning of this lock is to stay for the entire life of the command execution. Unlocking it manually does not seem to serve much purpose. This here overall is a ugly hack to alleviate a hard to fix problem - but should not stay in the long term.

Copy link
Member

@AkihiroSuda AkihiroSuda left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, let’s merge

@AkihiroSuda AkihiroSuda merged commit 1122644 into containerd:main Oct 17, 2024
22 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants