Way to specify posix userid for efs access points #393
Comments
@hussainsaify No, there isn't a way in dynamic provisioning to make access points be provisioned using a specific posix Id. |
@kbasv Thanks for the response. If it makes sense to add a feature to specify the user ID while creating access points, that would be great, because in our current use case each of our microservices accesses a common folder structure and we build the uid/gid into our Docker images, which may make dynamic provisioning non-viable for us at this point.
We have the same use case as @hussainsaify mentioned, and the POSIX gid is also incremented with each dynamically provisioned EFS access point. It would be good to have a feature to set a fixed gid for each dynamically provisioned EFS access point.
I too would like to have multiple efs access points with the same posix ids. In our scenario, we have a stateful set where each pod has 2 efs mounts. One is shared amongst all pods in the stateful set (Shared directory) and the other is unique for each pod in the stateful set. As all pods run as the same user, it would make sense that all of those mounts have the same uid/gid. We were really hoping that this dynamic provisioning would allow us to do that without having to connect to each efs access point separately to update the permissions. Is there a technical reason this is not supported, or is it just a 'not yet' thing? |
I noticed #434 after I wrote my above comment, so I see that this is being worked on. However, I have been digging further and did see some comments that the range of allowed uids should indeed be part of the SC, but specifying the uid for any given PV should be done (optionally) at the PVC level. With the proposed changes in the above PR, any dynamically created volume created with that SC would have the same uid/gid. This means that you would need a different SC for each application that could potentially be running as a different user. If this uid were taken from the PVC spec, then each pod could specify its own uid/gid. This uid/gid would then be applied to the new Access Point and the correct permissions would be applied to the created directory. As I have not dug deeply into the controller code yet, I'm not sure if it has access to the spec or annotations from the PVC when creating the PV, so I don't know if this is possible...
Thanks for the request here. I'm interested in hearing more about your use cases and other things you've tried. The intention of dynamic provisioning is to create new, empty, private volumes for applications to use. Logically, each PV/PVC represents a single volume or data set. When two pods need to share a volume, rather than having two PVs with the same configuration (path and identity), the best practice is to reference the PVC in both pod configurations. Further, if you already have a file system with specific data in a path, and you want to access that data with a particular identity, the best practice would be to create a static PV that references the filesystemid/accesspointid that has the appropriate path/identity, then your application can claim it. Could you help me understand if there are use cases that the above approaches do not satisfy? |
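The static approach described above might look roughly like the following. This is a minimal sketch, not a tested manifest: the file system ID, access point ID, object names, and storage size are placeholders chosen for illustration, and the access point is assumed to already exist with the desired path and POSIX identity.

```yaml
# Sketch: a statically provisioned PV that points at an existing EFS access point.
# fs-0123456789abcdef0 and fsap-0123456789abcdef0 are placeholder IDs.
apiVersion: v1
kind: PersistentVolume
metadata:
  name: efs-shared-pv            # hypothetical name
spec:
  capacity:
    storage: 5Gi                 # required by the API; EFS itself is elastic
  volumeMode: Filesystem
  accessModes:
    - ReadWriteMany
  persistentVolumeReclaimPolicy: Retain
  storageClassName: ""           # empty string: no dynamic provisioning
  csi:
    driver: efs.csi.aws.com
    # Format: <file system ID>::<access point ID>
    volumeHandle: fs-0123456789abcdef0::fsap-0123456789abcdef0
---
# The claim that an application (or several pods) can reference.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: efs-shared-claim         # hypothetical name
spec:
  accessModes:
    - ReadWriteMany
  storageClassName: ""
  volumeName: efs-shared-pv      # bind explicitly to the PV above
  resources:
    requests:
      storage: 5Gi
```

The uid/gid enforced on the volume comes from the access point definition itself, so this route still requires someone to create the access point ahead of time.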
Hi Will, thanks for your interest in possible use-cases. I'm still working my way through some of this so it's entirely possible I have some things wrong, please correct me if this is the case or if there are preferred methods to get what I need. This could get long :) There are a few scenarios that I alluded to above, so I will attempt to split them up further down. First an overview of what we are currently doing and trying to do.

What are we currently doing?

Right now, we are using Static provisioning, either path based or Access Point based.

What are we trying to achieve?

We were hoping that with Dynamic provisioning we would be able to avoid the steps of having to manually provision the PVs. That's the intent right? This way, there is reduced overhead on the K8s administration team to do these extra steps before new applications are deployed.

Scenario 1 - Per-pod volumes.

In this scenario, consider a single pod that needs persistent storage. This pod may be from some upstream vendor where we have no control over what the pod does. It may be that this pod decides to check the uid/gid of the persistent volume and then run chown commands on them to keep them in check. When the new uid of the chown command does not match the uid in the Access Point, it fails and can stop the pod from starting. This is apparently the case in some other issues I have read recently, particularly here: #300

If we were able to specify the uid/gid, then the chown commands have no effect (as the before/after uid is the same now) and they do not fail. Of course for applications that do not care about the ownership on these persistent volumes, there is no problem with the dynamically allocated uid. Some have suggested hacks like creating another user in the pod (based on the uid of the PV) and running the app as that, but it sounds hacky.

Scenario 2 - Pods with multiple volumes

In this scenario, there may be a pod with multiple persistent volumes backed by multiple dynamically provisioned Access Points. One may be a private volume just for that pod, another may be a shared volume that is used by multiple pods (We will move on to that in a moment). In this scenario, each of the mounted volumes will have a different uid. We run into the same problems here as we do in Scenario 1, except now you cannot even use the hack like running as a new user with the same uid as the PV.

Scenario 3 - Multiple pods using a single PV as a shared volume.

This is not really any different to scenario 1, as it still just presents as a mismatch between the PV uid and the container uid. The reason I am including it is to clarify that earlier, I was not talking about sharing volumes between pods by having separate PVs mapping back to the same EFS AP. I was referring to pods that reference the same PVC that uses ReadWriteMany. The same uid mismatch problems will still occur though.

Scenario 4 - Multiple applications using dynamically provisioned PVs on a single EFS FS

I have seen reference to the use case where the K8s administrators will provision a single EFS FS and then use Dynamic Provisioning to create APs & PVs for multiple applications.

A bit long, but I hope that helped shed some light on my thinking here. |
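To make Scenario 3 concrete, here is a rough sketch of two pods referencing the same ReadWriteMany claim. Everything named here is an assumption for illustration: the claim `shared-data` is presumed to exist already, and the image, commands, and uid 1000 are arbitrary. If the access point behind the claim enforces a different uid, the ownership seen on `/data` will not match the uid the containers run as.

```yaml
# Sketch: two pods sharing one PVC (assumed to exist as "shared-data", ReadWriteMany).
apiVersion: v1
kind: Pod
metadata:
  name: writer                   # hypothetical name
spec:
  securityContext:
    runAsUser: 1000              # uid baked into the image in this example
    runAsGroup: 1000
  containers:
    - name: app
      image: busybox:1.36
      command: ["sh", "-c", "id; ls -ln /data; sleep 3600"]
      volumeMounts:
        - name: shared
          mountPath: /data
  volumes:
    - name: shared
      persistentVolumeClaim:
        claimName: shared-data   # same claim in both pods
---
apiVersion: v1
kind: Pod
metadata:
  name: reader                   # hypothetical name
spec:
  securityContext:
    runAsUser: 1000
    runAsGroup: 1000
  containers:
    - name: app
      image: busybox:1.36
      command: ["sh", "-c", "id; ls -ln /data; sleep 3600"]
      volumeMounts:
        - name: shared
          mountPath: /data
  volumes:
    - name: shared
      persistentVolumeClaim:
        claimName: shared-data
```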
The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs. This bot triages issues and PRs according to the following rules:
You can:
Please send feedback to sig-contributor-experience at kubernetes/community. /lifecycle stale |
I too would like to have efs access points and the ability to specify the uid/gid for each. My current use case is that we'd like to backup our cluster using Velero and some of our applications are using EFS. |
Thanks all for the comments on what you're trying to do. I have an alternate proposal on how to solve this: instead of hardcoding the UID/GID of each PV to a value specified at the storage class level, we can add a 'trustPodIdentity' option to the storage class. When enabled, instead of creating access points with a set UID/GID, the driver omits these values in the access point, causing EFS to trust the UID/GID sent over the wire from the client. In cases where the container is opinionated about its identity (which seems to be the case here), that identity will be used for any read/write/chown/etc operations, allowing these applications to work out of the box.

The benefit of this approach compared to the existing PR is 1/ the SC owner/creator doesn't have to know the application's UID/GID ahead of time and 2/ the SC can support multiple applications that want to assume different UID/GIDs.

Now, in order for this to work we'll also have to create the access point's root directory with open (777) permissions because we won't know the UID/GID of the container ahead of time, but any container that does a chown/chmod on startup will be able to constrain this. This model also doesn't have the same level of security isolation as the default model with a unique UID/GID per PV. Snippet from the EFS documentation:

Security Model for Access Point Root Directories

In the NFS protocol, servers generate file handles that are used by clients as unique references when accessing files. EFS securely generates file handles that are unpredictable and specific to an EFS file system. When a root directory override is in place, EFS doesn't disclose file handles for files outside the specified root directory. However, in some cases a user might get a file handle for a file outside of their access point by using an out-of-band mechanism. For example, they might do so if they have access to a second access point. If they do this, they can perform read and write operations on the file.

File ownership and access permissions are always enforced, for access to files within and outside of a user's access point root directory. |
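As an illustration only, the proposal above could surface in a storage class something like this. `trustPodIdentity` is the option name suggested in the comment; it is a proposal, not a parameter known to exist in the driver, so this manifest should be read as a sketch of the idea rather than something to apply. The file system ID and class name are placeholders.

```yaml
# Sketch of the PROPOSED behaviour; trustPodIdentity is hypothetical.
kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: efs-sc-trust-pod         # hypothetical name
provisioner: efs.csi.aws.com
parameters:
  provisioningMode: efs-ap
  fileSystemId: fs-0123456789abcdef0   # placeholder ID
  # Proposed: open permissions on the access point root directory, because the
  # container's uid/gid is not known when the access point is created.
  directoryPerms: "777"
  # Proposed, hypothetical option: create the access point without a fixed
  # POSIX user so EFS trusts the uid/gid sent by the client.
  trustPodIdentity: "true"
```

Under this model the pod's own `securityContext` (for example `runAsUser`) would determine file ownership, which is also why the isolation between volumes is weaker than with a unique uid/gid per access point.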
The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs. This bot triages issues and PRs according to the following rules:
You can:
Please send feedback to sig-contributor-experience at kubernetes/community. /lifecycle rotten |
The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs. This bot triages issues and PRs according to the following rules:
You can:
Please send feedback to sig-contributor-experience at kubernetes/community. /close |
@k8s-triage-robot: Closing this issue. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
I'm surprised you can't specify a uid/gid for each EFS volume in Kubernetes. It seems a glaring omission when best practice is to run containers as non-root. Each app is unlikely to share a uid/gid. I agree that it should be set in the PVC, not the SC. Pinning the uid/gid to a static, predictable value so that chown works is an improvement, but it would be best to do away with init containers and chown altogether and set the right permissions on volumes in the first place.
So it seems that we can specify uid and gid in the StorageClass now. Thanks @nicolas-geniteau et al. for the improvement. Is it not technically possible to specify uid and gid in the PersistentVolumeClaim instead of the StorageClass? |
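For readers landing here, a storage class that pins the identity might look roughly like the sketch below. It assumes a driver version that supports the `uid` and `gid` parameters mentioned above; the file system ID, names, and the value 1000 are placeholders, so check the parameter list for your driver version before relying on it.

```yaml
# Sketch: fixed POSIX identity for every access point created from this class.
kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: efs-sc-app-a             # hypothetical: one class per application identity
provisioner: efs.csi.aws.com
parameters:
  provisioningMode: efs-ap
  fileSystemId: fs-0123456789abcdef0   # placeholder ID
  directoryPerms: "700"
  uid: "1000"                    # assumed to match the uid in the container image
  gid: "1000"
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: app-a-data               # hypothetical name
spec:
  accessModes:
    - ReadWriteMany
  storageClassName: efs-sc-app-a
  resources:
    requests:
      storage: 5Gi               # required by the API; not enforced by EFS
```

Because the identity lives in the storage class, applications running as different users each need their own class, which is the limitation discussed earlier in the thread.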
@tavin Sadly no, it isn't. If you look at the API documentation for PersistentVolumeClaims, they don't carry a huge amount of information (https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.24/#persistentvolumeclaim-v1-core). StorageClasses have the nice property that the |
@jonathanrainer Could the driver be programmed to look at some custom annotations on the PVC? |
@tavin So unfortunately not, while it's a very good idea in principle it would require changes to the So in the short term the storageClass is really the only mechanism for passing that information through |
@kbasv I have installed the efs-csi-driver for dynamically creating volumes and access points. However, each time a PVC is created, it creates an access point with a different POSIX user ID, allocated incrementally from the gid range specified in the storage class. Is there a way to specify the POSIX user ID so that the access point is always created with that ID?
K8s version: 1.19 (EKS)
efs-csi-driver: 1.2.0
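For context, a dynamic provisioning storage class of the kind described in this question typically looks something like the following. This is a sketch with placeholder values (file system ID, range, base path), not the reporter's actual configuration.

```yaml
# Sketch: dynamic provisioning with a gid range; each new access point
# receives the next unused ID from the range.
kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: efs-sc                   # hypothetical name
provisioner: efs.csi.aws.com
parameters:
  provisioningMode: efs-ap       # provision volumes as EFS access points
  fileSystemId: fs-0123456789abcdef0   # placeholder ID
  directoryPerms: "700"
  gidRangeStart: "1000"          # placeholder range
  gidRangeEnd: "2000"
  basePath: "/dynamic_provisioning"    # placeholder path
```

With this configuration, two PVCs created from the same class end up with different POSIX IDs, which is the behaviour the question asks how to override.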