update kep-3926, how we address client cache inconsistency issue

kubernetes · Nov 5, 2024 · 3c01a2e · 3c01a2e
1 parent f431a1a
commit 3c01a2e
Showing 1 changed file with 58 additions and 1 deletion.
diff --git a/keps/sig-auth/3926-handling-undecryptable-resources/README.md b/keps/sig-auth/3926-handling-undecryptable-resources/README.md
@@ -93,6 +93,7 @@ tags, and then generate with `hack/update-toc.sh`.
     - [Retrieving a failing resource](#retrieving-a-failing-resource)
     - [Deleting a failing resource](#deleting-a-failing-resource)
     - [Protecting unconditional deletion](#protecting-unconditional-deletion)
+    - [Propagating with WATCH](#propagating-with-watch)
   - [Test Plan](#test-plan)
       - [Prerequisite testing updates](#prerequisite-testing-updates)
       - [Unit tests](#unit-tests)
@@ -405,7 +406,63 @@ deletions should therefore have their own extra admission.
 
 The unconditional deletion admission:
 1. checks if a "delete" request contains the `IgnoreStoreReadErrorWithClusterBreakingPotential` option
-2. if it does, it checks the RBAC of the request's user for the `delete-ignore-read-errors` verb of the given resource
+2. if it does, it checks the RBAC of the request's user for the `unsafe-delete-ignore-read-errors` verb of the given resource
+
+
+
+#### Propagating with WATCH
+When a corrupt object is deleted with `unsafe-delete-ignore-read-errors`, a
+watcher of this resource will not be able to transform its old data from the
+storage, or decode it into an object [1]. This will cause the watch to emit
+a `watch.ERROR` event.
+A client backed by an informer already has the object in its cache. Because the
+client never receives a `watch.DELETED` event, the object remains in the lister
+cache. This creates an inconsistency: retrieving the object from the cache
+yields the object, but reading it from the storage returns a `corrupt object`
+error.
+
+To resolve this issue, we will change the watch to emit a `watch.ERROR` with
+a partial object of the type being watched that has the following information:
+- Namespace/Name: identifies the object
+- ResourceVersion: the revision of the key-value store after the delete operation
+- Annotation: `"unsafe-delete-ignore-read-errors" : "true"` to indicate that the
+  object associated with the DELETED event from the storage was corrupt.
+
+
+We will change the reflector to interpret this error and do the following:
+- delete the object from its store so clients are in sync with the current state
+- set the `LastSyncResourceVersion` to the `ResourceVersion` from the partial
+  object; this advances the reflector to the storage revision after the delete
+
+This is what it would look like on the wire:
+```
+{
+  "type": "ERROR",
+  "object": {
+    "ObjectMeta": {
+      "Name": "foo",
+      "Namespace": "test-ns",
+      "ResourceVersion": "3",
+      "Annotations": {
+        "kubernetes.io/unsafe-delete-ignore-read-errors": "true"
+      }
+    }
+  }
+}
+```
+
+We will add an integration test to show that an informer is consistent with
+the state in the storage after the corrupt resource is unsafe-deleted.
+
+There might be other alternatives:
+- a) add a new type of watch event to address this
+- b) send a `watch.DELETED` event with a partial object
+
+Question: what about clients that have not been updated yet?
+
+[1] https://github.com/kubernetes/kubernetes/blob/2bb886ce2a6ae5a83031efd225e8a17dd7ecba03/staging/src/k8s.io/apiserver/pkg/storage/etcd3/watcher.go#L681-L697
+[2] https://github.com/kubernetes/kubernetes/pull/127513/commits/46c3fdc4eeda8d3ffb1839fff196c5fece75066d#diff-9ccdf713e010f73dbebd01e936cb0077fc63e4f5ab941d865ded42da219d84ec
+
 
 ### Test Plan