When multiple BucketClaims are created in parallel, the associated buckets are later not deleted along with the BucketClaims #139

vegullah · 2024-10-24T08:29:02Z

Bug Report

What happened:
I created 5 BucketClaims in parallel; all of the BucketClaims and their associated buckets were created.
When I tried to delete those BucketClaims, all of them were deleted, but some of the buckets were not.

What you expected to happen:
When a BucketClaim is deleted, its associated bucket should be deleted as well.

How to reproduce this bug (as minimally and precisely as possible):

1. Create 5-6 BucketClaims in parallel (using any script and threading package).
2. Delete the BucketClaims together in a single command.
3. Repeat steps 1 and 2 a few times.
4. Create 5-6 BucketClaims in parallel, with the same names as in step 1.
5. Verify whether this error is seen in the object controller pod logs.

I1023 11:56:15.533581       1 bucketclaim.go:32] "Add BucketClaim" name="bclc" ns="default" bucketClass="bc1"
E1023 11:56:15.545584       1 bucketclaim.go:197] "Failed to update status of BucketClaim" err="Operation cannot be fulfilled on bucketclaims.objectstorage.k8s.io \"bclc\": the object has been modified; please apply your changes to the latest version and try again" name=""
E1023 11:56:15.545618       1 bucketclaim.go:53] "name" err="Operation cannot be fulfilled on bucketclaims.objectstorage.k8s.io \"bclc\": the object has been modified; please apply your changes to the latest version and try again" bclc="ns" default="err" Operation cannot be fulfilled on bucketclaims.objectstorage.k8s.io "bclc": the object has been modified; please apply your changes to the latest version and try again="(MISSING)"

6. Delete the BucketClaims together in a single command, or one after another.
   -> The BucketClaims are shown as deleted, but a few buckets remain.
   -> No delete request for the remaining buckets is seen in the sidecar.
   -> Something makes the controller delete the BucketClaim without waiting for the bucket to be deleted.
cosi-provisioner-sidecar.log
objectstorage-controller.log
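
The parallel creation in the reproduction steps above could be driven by something like the following minimal Go sketch. The `createFunc` hook and `createInParallel` helper are hypothetical names introduced here for illustration; they are not part of COSI, and the stand-in in `main` only prints instead of calling the Kubernetes API.

```go
package main

import (
	"fmt"
	"sync"
)

// createFunc is a hypothetical hook that creates one BucketClaim by name.
// In a real reproduction it would call the Kubernetes API.
type createFunc func(name string) error

// createInParallel starts one goroutine per name, waits for all of them,
// and returns the error (nil on success) recorded for each name.
func createInParallel(names []string, create createFunc) map[string]error {
	var (
		wg      sync.WaitGroup
		mu      sync.Mutex
		results = make(map[string]error, len(names))
	)
	for _, name := range names {
		wg.Add(1)
		go func(n string) {
			defer wg.Done()
			err := create(n)
			mu.Lock()
			results[n] = err
			mu.Unlock()
		}(name)
	}
	wg.Wait()
	return results
}

func main() {
	names := []string{"bcl1", "bcl2", "bcl3", "bcl4", "bcl5"}
	// Stand-in that only prints; replace with a real API call to reproduce.
	results := createInParallel(names, func(name string) error {
		fmt.Println("creating BucketClaim", name)
		return nil
	})
	fmt.Println("requests issued:", len(results))
}
```

Running the same helper again with the same names after deleting the claims corresponds to steps 3 and 4.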

Anything else relevant for this bug report?:

Environment:

Kubernetes version (use kubectl version), please list client and server:
Client Version: v1.30.3
Kustomize Version: v5.0.4-0.20230601165947-6ce0bf390ce3
Server Version: v1.30.0
Controller version (provide the release tag or commit hash):
gcr.io/k8s-staging-sig-storage/objectstorage-controller:v20221027-v0.1.1-8-g300019f
Provisioner name and version (provide the release tag or commit hash):
gcr.io/k8s-staging-sig-storage/objectstorage-sidecar:latest
Cloud provider or hardware configuration:
OS (e.g: cat /etc/os-release):
PRETTY_NAME="Ubuntu 22.04.4 LTS"
NAME="Ubuntu"
VERSION_ID="22.04"
VERSION="22.04.4 LTS (Jammy Jellyfish)"
VERSION_CODENAME=jammy
ID=ubuntu
ID_LIKE=debian
HOME_URL="https://www.ubuntu.com/"
SUPPORT_URL="https://help.ubuntu.com/"
BUG_REPORT_URL="https://bugs.launchpad.net/ubuntu/"
PRIVACY_POLICY_URL="https://www.ubuntu.com/legal/terms-and-policies/privacy-policy"
UBUNTU_CODENAME=jammy
Kernel (e.g. uname -a):
Linux tnh-cosi-3 5.15.0-46-generic #49-Ubuntu SMP Thu Aug 4 18:03:25 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux
Install tools:
Network plugin and version (if this is a network-related bug):
Others:

narayviv · 2024-10-24T10:05:45Z

Hi @BlaineEXE ,

While going through the code of the controller and the sidecar, I see that both the sidecar and the controller update the status of the BucketClaim (in the controller, within the scope of the method BucketClaimListener#provisionBucketClaimOperation).
https://github.com/kubernetes-sigs/container-object-storage-interface-provisioner-sidecar/blob/80979e8992a6a2b2166f3ff1e7d39b4ab03f045c/pkg/bucket/bucket_controller.go#L163
https://github.com/kubernetes-sigs/container-object-storage-interface-controller/blob/38b4915c1bbc6b63144fa81351a72d228184a34c/pkg/bucketclaim/bucketclaim.go#L204

In this scenario, the BucketClaim object held by the controller's provisionBucketClaimOperation method becomes outdated once the sidecar updates the BucketClaim CR's status.

Our suggestion is to follow the sidecar's approach in the controller as well.

Before updating the BucketClaim status in the given method (BucketClaimListener#provisionBucketClaimOperation),
https://github.com/kubernetes-sigs/container-object-storage-interface-controller/blob/38b4915c1bbc6b63144fa81351a72d228184a34c/pkg/bucketclaim/bucketclaim.go#L204
fetch the latest BucketClaim, set the fields 'bucketClaim.Status.BucketName' and 'bucketClaim.Status.BucketReady' on that latest object, and then use it for UpdateStatus.
https://github.com/kubernetes-sigs/container-object-storage-interface-provisioner-sidecar/blob/80979e8992a6a2b2166f3ff1e7d39b4ab03f045c/pkg/bucket/bucket_controller.go#L154
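
To illustrate the suggestion, here is a minimal, self-contained Go sketch of the "fetch latest, then update" pattern. It does not use the real COSI or client-go types: `claim`, `store`, `errConflict`, and `updateWithRefetch` are hypothetical stand-ins that only mimic the API server's resourceVersion check. In a real controller, the `retry.RetryOnConflict` helper from `k8s.io/client-go/util/retry` is commonly used for the same purpose.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// errConflict stands in for the API server's "object has been modified" error.
var errConflict = errors.New("conflict: the object has been modified")

// claim is a simplified, hypothetical stand-in for a BucketClaim.
type claim struct {
	ResourceVersion int
	BucketName      string
	BucketReady     bool
}

// store mimics the API server's optimistic concurrency check.
type store struct {
	mu  sync.Mutex
	obj claim
}

// Get returns a copy of the currently stored object.
func (s *store) Get() claim {
	s.mu.Lock()
	defer s.mu.Unlock()
	return s.obj
}

// UpdateStatus rejects a write whose ResourceVersion is stale.
func (s *store) UpdateStatus(c claim) error {
	s.mu.Lock()
	defer s.mu.Unlock()
	if c.ResourceVersion != s.obj.ResourceVersion {
		return errConflict
	}
	c.ResourceVersion++
	s.obj = c
	return nil
}

// updateWithRefetch re-reads the latest object before every attempt,
// applies the mutation to that copy, and retries only on conflicts.
func updateWithRefetch(s *store, attempts int, mutate func(*claim)) error {
	for i := 0; i < attempts; i++ {
		latest := s.Get()
		mutate(&latest)
		err := s.UpdateStatus(latest)
		if err == nil {
			return nil
		}
		if !errors.Is(err, errConflict) {
			return err
		}
	}
	return errConflict
}

func main() {
	s := &store{}
	stale := s.Get() // copy held by one writer (the controller in this thread)

	// Another writer (the sidecar in this thread) updates the status first.
	_ = updateWithRefetch(s, 3, func(c *claim) { c.BucketReady = true })

	// Writing from the stale copy is rejected.
	stale.BucketName = "bucket-1"
	fmt.Println("stale write:", s.UpdateStatus(stale))

	// Re-fetching before the write succeeds and keeps the other writer's change.
	err := updateWithRefetch(s, 3, func(c *claim) { c.BucketName = "bucket-1" })
	fmt.Println("refetch write:", err, s.Get())
}
```

The point of the sketch is that the mutation is applied to a freshly read object on every attempt, so a status change made by the other component in the meantime is neither lost nor a cause of a conflict error.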

Note:
We didn't notice this issue when BucketClaims were created sequentially with an adequate time gap. We noticed it when BucketClaims were created in parallel (and sequentially without an adequate time gap, through scripts/YAML files).

Let us know if we can help further.
CC: @vegullah

BlaineEXE · 2024-10-24T19:57:08Z

Thanks @narayviv, this sounds like something we should address in the v1alpha2 API updates as well. I created a new issue and started tracking it via the COSI kanban board.

BlaineEXE · 2024-10-28T17:02:39Z

@vegullah it's not clear to me from reading your description if this is a permanent or temporary issue. Could you clarify?

If the issue is temporary, this is an issue that can sometimes happen with any controller. This error is how Kubernetes helps prevent multiple readers/writers from colliding. As long as the issue resolves itself eventually, I don't see a need to fix this urgently.

As @narayviv has mentioned, this might be due to controller and sidecar both editing the resource. We will look into this and see if we can make the error reported here less frequent at a minimum. While the error doesn't seem concerning based on my assumption that it's not preventing reconciliation, we also don't want to have this happen every time, spamming the logs.

vegullah · 2024-10-29T03:52:38Z

@BlaineEXE, you don't see the issue for the first 2 to 3 tries.
Let's say you create 5 BucketClaims (bcl1, bcl2, bcl3, bcl4 and bcl5). You'll be able to delete those 5 BucketClaims and their associated buckets successfully. This cycle of creation and deletion (with the same BucketClaim names as in the previous cycle) will mostly work the 2nd time as well.
Then, when you do the same thing for the 3rd or 4th time, creating the 5 BucketClaims (bcl1, bcl2, bcl3, bcl4 and bcl5) and then trying to delete them, some are deleted successfully, while for others the buckets remain.

The issue doesn't resolve by itself unless you delete the controller and sidecar pods.

BlaineEXE mentioned this issue Oct 24, 2024

both sidecar & controller update BucketClaim status kubernetes-retired/container-object-storage-interface-api#101

Open

BlaineEXE added this to Container Object Storage Interface Oct 29, 2024

BlaineEXE moved this to To do for v1alpha2 in Container Object Storage Interface Oct 29, 2024
