This repository has been archived by the owner on Dec 3, 2024. It is now read-only.

When multiple BucketClaims are created in parallel, the associated buckets are later not deleted along with the BucketClaims #139

Open
vegullah opened this issue Oct 24, 2024 · 4 comments

Comments

@vegullah

Bug Report

What happened:
I created 5 BucketClaims in parallel; all of the BucketClaims and their associated buckets were created.
When I deleted those BucketClaims, all of them were deleted, but some of the buckets were not.

What you expected to happen:
When a BucketClaim is deleted, its associated bucket should be deleted as well.

How to reproduce this bug (as minimally and precisely as possible):

  1. Create 5-6 BucketClaims in parallel (with the help of any script and threading package).
  2. Delete the BucketClaims together in a single command.
  3. Repeat steps 1 and 2 a few times.
  4. Create 5-6 BucketClaims in parallel, with the same names as in step 1.
    Verify whether this error appears in the object controller pod logs:
I1023 11:56:15.533581       1 bucketclaim.go:32] "Add BucketClaim" name="bclc" ns="default" bucketClass="bc1"
E1023 11:56:15.545584       1 bucketclaim.go:197] "Failed to update status of BucketClaim" err="Operation cannot be fulfilled on bucketclaims.objectstorage.k8s.io \"bclc\": the object has been modified; please apply your changes to the latest version and try again" name=""
E1023 11:56:15.545618       1 bucketclaim.go:53] "name" err="Operation cannot be fulfilled on bucketclaims.objectstorage.k8s.io \"bclc\": the object has been modified; please apply your changes to the latest version and try again" bclc="ns" default="err" Operation cannot be fulfilled on bucketclaims.objectstorage.k8s.io "bclc": the object has been modified; please apply your changes to the latest version and try again="(MISSING)"
  5. Delete the BucketClaims together in a single command, or one after another.
    -> The BucketClaims appear as deleted, but a few buckets remain.
    -> No delete request for the remaining buckets is seen in the sidecar.
    -> Something makes the controller delete the BucketClaim without waiting for the bucket to be deleted.
    cosi-provisioner-sidecar.log
    objectstorage-controller.log
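
The parallel-creation step in the reproduction above could be scripted roughly as follows. This is a minimal sketch, not the reporter's actual script: the names `bcl1`..`bcl5`, the `default` namespace, and the `bc1` bucket class come from this thread, while the `v1alpha1` field names, the `S3` protocol, the helper functions, and the `--apply` flag are assumptions made for illustration. It shells out to `kubectl apply -f -` from one goroutine per claim.

```go
package main

import (
	"fmt"
	"os"
	"os/exec"
	"strings"
	"sync"
)

// claimNames returns bcl1..bclN, the naming pattern used in this thread.
func claimNames(n int) []string {
	names := make([]string, 0, n)
	for i := 1; i <= n; i++ {
		names = append(names, fmt.Sprintf("bcl%d", i))
	}
	return names
}

// claimManifest renders a BucketClaim manifest. The v1alpha1 field names and
// the S3 protocol are assumptions; adjust them to match the installed CRDs.
func claimManifest(name, namespace, bucketClass string) string {
	return fmt.Sprintf(`apiVersion: objectstorage.k8s.io/v1alpha1
kind: BucketClaim
metadata:
  name: %s
  namespace: %s
spec:
  bucketClassName: %s
  protocols:
    - S3
`, name, namespace, bucketClass)
}

// applyAll pipes each manifest to `kubectl apply -f -` from its own goroutine
// and returns one error slot per manifest (nil on success).
func applyAll(manifests []string) []error {
	errs := make([]error, len(manifests))
	var wg sync.WaitGroup
	for i, m := range manifests {
		wg.Add(1)
		go func(i int, m string) {
			defer wg.Done()
			cmd := exec.Command("kubectl", "apply", "-f", "-")
			cmd.Stdin = strings.NewReader(m)
			if out, err := cmd.CombinedOutput(); err != nil {
				errs[i] = fmt.Errorf("%v: %s", err, out)
			}
		}(i, m)
	}
	wg.Wait()
	return errs
}

func main() {
	var manifests []string
	for _, n := range claimNames(5) {
		manifests = append(manifests, claimManifest(n, "default", "bc1"))
	}
	// Without --apply (a flag invented for this sketch), only print the
	// manifests so that nothing touches a cluster.
	if len(os.Args) < 2 || os.Args[1] != "--apply" {
		fmt.Print(strings.Join(manifests, "---\n"))
		return
	}
	for i, err := range applyAll(manifests) {
		if err != nil {
			fmt.Fprintf(os.Stderr, "claim %d: %v\n", i+1, err)
		}
	}
}
```

Running it with `--apply`, deleting the claims, and repeating the cycle a few times would approximate steps 1 to 4; whether that is enough to trigger the leftover buckets depends on timing.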

Anything else relevant for this bug report?:

Environment:

  • Kubernetes version (use kubectl version), please list client and server:
    Client Version: v1.30.3
    Kustomize Version: v5.0.4-0.20230601165947-6ce0bf390ce3
    Server Version: v1.30.0
  • Controller version (provide the release tag or commit hash):
    gcr.io/k8s-staging-sig-storage/objectstorage-controller:v20221027-v0.1.1-8-g300019f
  • Provisioner name and version (provide the release tag or commit hash):
    gcr.io/k8s-staging-sig-storage/objectstorage-sidecar:latest
  • Cloud provider or hardware configuration:
  • OS (e.g: cat /etc/os-release):
    PRETTY_NAME="Ubuntu 22.04.4 LTS"
    NAME="Ubuntu"
    VERSION_ID="22.04"
    VERSION="22.04.4 LTS (Jammy Jellyfish)"
    VERSION_CODENAME=jammy
    ID=ubuntu
    ID_LIKE=debian
    HOME_URL="https://www.ubuntu.com/"
    SUPPORT_URL="https://help.ubuntu.com/"
    BUG_REPORT_URL="https://bugs.launchpad.net/ubuntu/"
    PRIVACY_POLICY_URL="https://www.ubuntu.com/legal/terms-and-policies/privacy-policy"
    UBUNTU_CODENAME=jammy
  • Kernel (e.g. uname -a):
    Linux tnh-cosi-3 5.15.0-46-generic #49-Ubuntu SMP Thu Aug 4 18:03:25 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux
  • Install tools:
  • Network plugin and version (if this is a network-related bug):
  • Others:
@narayviv

Hi @BlaineEXE ,

While going through the code of the controller and the sidecar, I see that both of them update the status of the BucketClaim (within the scope of the method BucketClaimListener#provisionBucketClaimOperation).
https://github.com/kubernetes-sigs/container-object-storage-interface-provisioner-sidecar/blob/80979e8992a6a2b2166f3ff1e7d39b4ab03f045c/pkg/bucket/bucket_controller.go#L163
https://github.com/kubernetes-sigs/container-object-storage-interface-controller/blob/38b4915c1bbc6b63144fa81351a72d228184a34c/pkg/bucketclaim/bucketclaim.go#L204

In this scenario, the BucketClaim object held by the controller's provisionBucketClaimOperation method becomes stale once the sidecar updates the BucketClaim CR's status.

Our suggestion is to follow the sidecar's approach in the controller as well.
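
To make the stale-object failure concrete, here is a minimal in-memory sketch of optimistic concurrency and of a retry that re-reads before writing. It is not the controller's or the sidecar's actual code: the `claim` and `store` types, the `updateWithRetry` helper, and the choice of which component writes which field are assumptions for illustration. In a real controller, `retry.RetryOnConflict` from `k8s.io/client-go/util/retry` is the usual helper for this re-read-and-retry pattern.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// errConflict stands in for the API server's "object has been modified" error.
var errConflict = errors.New("the object has been modified")

// claim is a hypothetical, pared-down BucketClaim with only the fields needed here.
type claim struct {
	ResourceVersion int
	BucketReady     bool   // written by the "sidecar" in this sketch
	BucketName      string // written by the "controller" in this sketch
}

// store mimics optimistic concurrency: an update is accepted only if it was
// based on the currently stored resource version.
type store struct {
	mu  sync.Mutex
	obj claim
}

func (s *store) get() claim {
	s.mu.Lock()
	defer s.mu.Unlock()
	return s.obj
}

func (s *store) update(c claim) error {
	s.mu.Lock()
	defer s.mu.Unlock()
	if c.ResourceVersion != s.obj.ResourceVersion {
		return errConflict
	}
	c.ResourceVersion++
	s.obj = c
	return nil
}

// updateWithRetry re-reads the latest object before every attempt, so a
// conflict caused by another writer can succeed on the next try.
func updateWithRetry(s *store, attempts int, mutate func(*claim)) error {
	for i := 0; i < attempts; i++ {
		latest := s.get()
		mutate(&latest)
		err := s.update(latest)
		if err == nil {
			return nil
		}
		if !errors.Is(err, errConflict) {
			return err
		}
	}
	return errConflict
}

func main() {
	s := &store{}

	// The controller reads the claim and keeps working on this copy.
	stale := s.get()

	// Meanwhile the sidecar updates the status, bumping the resource version.
	_ = updateWithRetry(s, 3, func(c *claim) { c.BucketReady = true })

	// Writing back the stale copy is rejected with a conflict.
	stale.BucketName = "bucket-1"
	fmt.Println("stale update:", s.update(stale))

	// Re-reading before writing keeps the sidecar's change and succeeds.
	err := updateWithRetry(s, 3, func(c *claim) { c.BucketName = "bucket-1" })
	fmt.Println("retried update:", err)
}
```

The key design point is that the mutation is applied to a freshly read object on each attempt, so neither writer overwrites the other's field.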

Note:
We didn't notice this issue when BucketClaims were created sequentially with an adequate time gap. We noticed it when BucketClaims were created in parallel (and sequentially without an adequate time gap, through scripts or YAML files).

Let us know if we can help further.
CC: @vegullah

@BlaineEXE
Contributor

Thanks, @narayviv. This sounds like something we should address in the v1alpha2 API updates as well. I created a new issue and started tracking it via the COSI kanban board.

@BlaineEXE
Contributor

BlaineEXE commented Oct 28, 2024

@vegullah it's not clear to me from reading your description if this is a permanent or temporary issue. Could you clarify?

If the issue is temporary, this is an issue that can sometimes happen with any controller. This error is how Kubernetes helps prevent multiple readers/writers from colliding. As long as the issue resolves itself eventually, I don't see a need to fix this urgently.

As @narayviv has mentioned, this might be due to controller and sidecar both editing the resource. We will look into this and see if we can make the error reported here less frequent at a minimum. While the error doesn't seem concerning based on my assumption that it's not preventing reconciliation, we also don't want to have this happen every time, spamming the logs.

@vegullah
Author

@BlaineEXE, you don't see the issue for the first 2 to 3 tries.
Let's say you create 5 BucketClaims (bcl1, bcl2, bcl3, bcl4, and bcl5). You'll be able to delete those 5 BucketClaims and their associated buckets successfully. Usually this create-and-delete cycle (with the same BucketClaim names as in the previous cycle) works the 2nd time as well.
Then, on the 3rd or 4th cycle, create the same 5 BucketClaims (bcl1, bcl2, bcl3, bcl4, and bcl5) and try to delete them: some are deleted successfully along with their buckets, while for the others the buckets remain.

The issue doesn't resolve by itself unless you delete the controller and sidecar pods.

3 participants