From 5b71603b822e6b8385f47481a8ff4a1a4bcdf92a Mon Sep 17 00:00:00 2001 From: Eric Weber Date: Thu, 16 Dec 2021 11:06:14 -0600 Subject: [PATCH 1/5] Clarify orphan mounts troubleshooting --- docs/troubleshooting.md | 156 ++++++++++++++++++++++++++++------------ 1 file changed, 111 insertions(+), 45 deletions(-) diff --git a/docs/troubleshooting.md b/docs/troubleshooting.md index e58a575a..ec019153 100644 --- a/docs/troubleshooting.md +++ b/docs/troubleshooting.md @@ -1,7 +1,7 @@ # Troubleshooting Guide -## Contents +## Contents * [Overview](#overview) * [Kubernetes](#kubernetes) @@ -9,18 +9,19 @@ for a PVC](#k8s-determining-the-beegfs-client-conf-for-a-pvc) * [Orphaned BeeGFS Mounts Remain on Nodes](#orphan-mounts) -## Overview +## Overview + This section provides guidance and tips around troubleshooting issues that come up using the driver. For anything not covered here, please [submit an issue](https://github.com/NetApp/beegfs-csi-driver/issues) using the label "question". Suspected bugs should be submitted with the label "bug". -## Kubernetes +## Kubernetes -### Determining the BeeGFS Client Configuration for a PVC +### Determining the BeeGFS Client Configuration for a PVC BeeGFS Client configuration is specified in a Kubernetes ConfigMap, that is parsed out to generate the client configuration that applies to a PVC for a @@ -75,73 +76,138 @@ user@ictm1625h12:~$ sudo cat /var/lib/kubelet/plugins/kubernetes.io/csi/pv/pvc-3 quotaEnabled = true ``` + ### Orphaned Mounts Remain on Nodes - There are a number of circumstances that can cause orphaned mounts to remain on the nodes of a container orchestrator after a BeeGFS volume is deleted. These are largely outside the control of the BeeGFS CSI driver and occur due to how a particular container orchestrator interacts with CSI drivers in general. 
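Checking a node for orphaned mounts comes down to comparing the BeeGFS entries in `mount` output against the Persistent Volumes that still exist. A minimal sketch of that comparison follows; the helper name is hypothetical, and it assumes volume names follow the `pvc-<id>` pattern shown in the examples in this guide:

```shell
# find_orphan_pvcs: list pvc-* names that appear in `mount` output but not
# in `kubectl get pv` output. Both inputs are passed as saved files so the
# logic can be checked offline; on a live cluster you would capture them with:
#   mount | grep beegfs   > /tmp/mounts.txt   (on the worker node)
#   kubectl get pv -o name > /tmp/pvs.txt     (on a workstation with kubectl)
find_orphan_pvcs() {
  # $1: saved mount output, $2: saved PV list
  grep -oE 'pvc-[0-9a-f-]+' "$1" | sort -u > /tmp/mounted-pvcs
  grep -oE 'pvc-[0-9a-f-]+' "$2" | sort -u > /tmp/existing-pvcs
  comm -23 /tmp/mounted-pvcs /tmp/existing-pvcs   # mounted, but no matching PV
}
```

Any name this prints is a candidate orphan and warrants the investigation described in the sections below.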
Starting in v1.2.1 the BeeGFS CSI driver introduced functionality that can
-mitigate this behavior in most circumstances, but administrators should be aware
-of the potential, and the measures taken by the driver to mitigate it.
+mitigate common causes of this behavior in Kubernetes, but administrators
+should be aware of the potential, and the measures taken by the driver to
+mitigate it.
+

#### General Symptoms

-The driver controller service logs indicate that the controller service waited
-for the maximum allowed time before deleting a BeeGFS directory. This
-indicates that Kubernetes called DeleteVolume before NodeUnpublish and
-NodeUnstageVolume completed on at least one node and these operations never
-completed before the timeout.
+There are BeeGFS mounts on a worker node that are no longer associated
+with existing Persistent Volumes.

```bash
--> kubectl logs csi-beegfs-controller-0 | grep 380e4ac6
-...
-I1105 17:54:00.421874 1 server.go:189] "msg"="GRPC call" "reqID"="001d" "method"="/csi.v1.Controller/DeleteVolume" "request"="{\"volume_id\":\"beegfs://10.113.4.46/e2e-test/dynamic/pvc-380e4ac6\"}"
-I1105 17:54:00.422079 1 beegfs_util.go:62] "msg"="Writing client files" "reqID"="001d" "path"="/var/lib/kubelet/plugins/beegfs.csi.netapp.com/10.113.4.46_e2e-test_dynamic_pvc-380e4ac6" "volumeID"="beegfs://10.113.4.46/e2e-test/dynamic/pvc-380e4ac6"
-I1105 17:54:00.423025 1 beegfs_util.go:208] "msg"="Mounting volume to path" "reqID"="001d" "mountOptions"=["rw","relatime","cfgFile=/var/lib/kubelet/plugins/beegfs.csi.netapp.com/10.113.4.46_e2e-test_dynamic_pvc-380e4ac6/beegfs-client.conf"] "path"="/var/lib/kubelet/plugins/beegfs.csi.netapp.com/10.113.4.46_e2e-test_dynamic_pvc-380e4ac6/mount" "volumeID"="beegfs://10.113.4.46/e2e-test/dynamic/pvc-380e4ac6"
-I1105 17:54:00.445937 1 controllerserver.go:441] "msg"="Waiting for volume to unstage from all nodes" "reqID"="001d" "volumeID"="beegfs://10.113.4.46/e2e-test/dynamic/pvc-380e4ac6"
-I1105 17:54:02.450102 1 
controllerserver.go:441] "msg"="Waiting for volume to unstage from all nodes" "reqID"="001d" "volumeID"="beegfs://10.113.4.46/e2e-test/dynamic/pvc-380e4ac6" -... -I1105 17:54:58.559323 1 controllerserver.go:441] "msg"="Waiting for volume to unstage from all nodes" "reqID"="001d" "volumeID"="beegfs://10.113.4.46/e2e-test/dynamic/pvc-380e4ac6" -I1105 17:55:00.562626 1 controllerserver.go:446] "msg"="Deleting BeeGFS directory" "reqID"="001d" "path"="/e2e-test/dynamic/.csi/volumes/pvc-380e4ac6" "volumeID"="beegfs://10.113.4.46/e2e-test/dynamic/pvc-380e4ac6" -I1105 17:55:00.565337 1 controllerserver.go:461] "msg"="Deleting BeeGFS directory" "reqID"="001d" "path"="/e2e-test/dynamic/pvc-380e4ac6" "volumeID"="beegfs://10.113.4.46/e2e-test/dynamic/pvc-380e4ac6" -I1105 17:55:00.570070 1 beegfs_util.go:270] "msg"="Unmounting volume from path" "reqID"="001d" "path"="/var/lib/kubelet/plugins/beegfs.csi.netapp.com/10.113.4.46_e2e-test_dynamic_pvc-380e4ac6/mount" "volumeID"="beegfs://10.113.4.46/e2e-test/dynamic/pvc-380e4ac6" -I1105 17:55:00.570734 1 mount_helper_common.go:71] "/var/lib/kubelet/plugins/beegfs.csi.netapp.com/10.113.4.46_e2e-test_dynamic_pvc-380e4ac6/mount" is a mountpoint, unmounting -I1105 17:55:01.871999 1 mount_helper_common.go:85] "/var/lib/kubelet/plugins/beegfs.csi.netapp.com/10.113.4.46_e2e-test_dynamic_pvc-380e4ac6/mount" is unmounted, deleting the directory -I1105 17:55:01.872095 1 beegfs_util.go:283] "msg"="Cleaning up path" "reqID"="001d" "path"="/var/lib/kubelet/plugins/beegfs.csi.netapp.com/10.113.4.46_e2e-test_dynamic_pvc-380e4ac6" "volumeID"="beegfs://10.113.4.46/e2e-test/dynamic/pvc-380e4ac6" -... 
+# On a workstation with kubectl access to the cluster: +-> kubectl get pv +No resources found + +# On a worker node: +-> mount | grep beegfs +tmpfs on /var/lib/kubelet/pods/d8acdcaf-38ab-46c1-ab46-bbec0ca67e0b/volumes/kubernetes.io~secret/csi-beegfs-node-sa-token-j6msh type tmpfs (rw,relatime,seclabel) +beegfs_nodev on /var/lib/kubelet/plugins/kubernetes.io/csi/pv/pvc-12ff9a7a/globalmount/mount type beegfs (rw,relatime,cfgFile=/var/lib/kubelet/plugins/kubernetes.io/csi/pv/pvc-12ff9a7a/globalmount/beegfs-client.conf) +beegfs_nodev on /var/lib/kubelet/pods/72d39e7f-3685-4f17-9dd1-bd9796d92b75/volumes/kubernetes.io~csi/pvc-12ff9a7a/mount type beegfs (rw,relatime,cfgFile=/var/lib/kubelet/plugins/kubernetes.io/csi/pv/pvc-12ff9a7a/globalmount/beegfs-client.conf) +beegfs_nodev on /var/lib/kubelet/plugins/kubernetes.io/csi/pv/pvc-1b4d5347/globalmount/mount type beegfs (rw,relatime,cfgFile=/var/lib/kubelet/plugins/kubernetes.io/csi/pv/pvc-1b4d5347/globalmount/beegfs-client.conf) +beegfs_nodev on /var/lib/kubelet/pods/92c15f5b-7c13-4f52-87a1-b4400603e990/volumes/kubernetes.io~csi/pvc-1b4d5347/mount type beegfs (rw,relatime,cfgFile=/var/lib/kubelet/plugins/kubernetes.io/csi/pv/pvc-1b4d5347/globalmount/beegfs-client.conf) +beegfs_nodev on /var/lib/kubelet/plugins/kubernetes.io/csi/pv/pvc-5c645794/globalmount/mount type beegfs (rw,relatime,cfgFile=/var/lib/kubelet/plugins/kubernetes.io/csi/pv/pvc-5c645794/globalmount/beegfs-client.conf) +beegfs_nodev on /var/lib/kubelet/pods/92c15f5b-7c13-4f52-87a1-b4400603e990/volumes/kubernetes.io~csi/pvc-5c645794/mount type beegfs (rw,relatime,cfgFile=/var/lib/kubelet/plugins/kubernetes.io/csi/pv/pvc-5c645794/globalmount/beegfs-client.conf) +beegfs_nodev on /var/lib/kubelet/plugins/kubernetes.io/csi/pv/pvc-af832ede/globalmount/mount type beegfs (rw,relatime,cfgFile=/var/lib/kubelet/plugins/kubernetes.io/csi/pv/pvc-af832ede/globalmount/beegfs-client.conf) +beegfs_nodev on 
/var/lib/kubelet/pods/c8cb9229-ea8d-4430-88c3-418281db59bf/volumes/kubernetes.io~csi/pvc-af832ede/mount type beegfs (rw,relatime,cfgFile=/var/lib/kubelet/plugins/kubernetes.io/csi/pv/pvc-af832ede/globalmount/beegfs-client.conf) ``` -BeeGFS mounts still exist on a node indefinitely. +The kubelet on the worker node reports errors while attempting to clean up +associated directories in `/var/lib/kubelet/pods`. Specific error messages vary +depending on the root cause of the issue. ```bash --> ssh root@some.node mount | grep beegfs | grep pvc-380e4ac6 -beegfs_nodev on /var/lib/kubelet/plugins/kubernetes.io/csi/pv/pvc-380e4ac6/globalmount/mount type beegfs (rw,relatime,cfgFile=/var/lib/kubelet/plugins/kubernetes.io/csi/pv/pvc-380e4ac6/globalmount/beegfs-client.conf) -beegfs_nodev on /var/lib/kubelet/pods/06f28bf6-b4a2-4d4e-8104-e2d60a0682b8/volumes/kubernetes.io~csi/pvc-380e4ac6/mount type beegfs (rw,relatime,cfgFile=/var/lib/kubelet/plugins/kubernetes.io/csi/pv/pvc-380e4ac6/globalmount/beegfs-client.conf) +-> journalctl -xe -t hyperkube +Sep 27 10:49:44 openshift-beegfs-rhel-worker-1 hyperkube: I0927 10:49:44.056774 2120 reconciler.go:196] "operationExecutor.UnmountVolume started for volume \"test-volume\" (UniqueName: \"kubernetes.io/csi/beegfs.csi.netapp.com^beegfs://scspa2058537001.rtp.openenglab.netapp.com/e2e-test/dynamic/pvc-5e90c0c8\") pod \"22e1f3ae-81b4-45e2-8010-95ccc3a2e9ad\" (UID: \"22e1f3ae-81b4-45e2-8010-95ccc3a2e9ad\") " +Sep 27 10:49:44 openshift-beegfs-rhel-worker-1 hyperkube: W0927 10:49:44.168265 2120 mount_helper_common.go:34] Warning: Unmount skipped because path does not exist: /var/lib/kubelet/pods/22e1f3ae-81b4-45e2-8010-95ccc3a2e9ad/volume-subpaths/pvc-5e90c0c8/test-container-subpath-dynamicpv-ccs7/0 +Sep 27 10:49:44 openshift-beegfs-rhel-worker-1 hyperkube: E0927 10:49:44.168436 2120 nestedpendingoperations.go:301] Operation for 
"{volumeName:kubernetes.io/csi/beegfs.csi.netapp.com^beegfs://scspa2058537001.rtp.openenglab.netapp.com/e2e-test/dynamic/pvc-5e90c0c8 podName:22e1f3ae-81b4-45e2-8010-95ccc3a2e9ad nodeName:}" ``` -#### Outdated Driver or --node-unstage-timeout=0 + +#### Outdated Driver or Node Unstage Timeout Exceeded + +Prior to driver version 1.2.1, the primary cause of orphan mounts was the +possibility for Kubernetes to call DeleteVolume on the controller service before +calling NodeUnpublishVolume on all relevant node services. In this scenario, +if DeleteVolume succeeds, NodeUnpublishVolume becomes impossible, leaving orphan +mounts. + +Though the Kubernetes maintainers have been working to ensure that this +disruption in the order of CSI operations cannot occur in future Kubernetes +versions, we chose to add code to mitigate its effects in driver version 1.2.1. +This code is enabled by setting --node-unstage-timeout to something other than +0 (the deployment manifests do this automatically). The --node-unstage-timeout +flag causes a +`volDirBasePath/.csi/volumes/volumeName/nodes` directory to be created on the +BeeGFS file system for every new volume. When a node mounts a volume, its name +is added to this directory, and when a node unmounts a volume, its names is +removed from this directory. The controller service refuses to delete a volume +until either this directory is empty or --node-unstage-timeout is exceeded. + +On an older version of the driver or when the driver is deployed into Kubernetes +with --node-unstage-timeout=0, no attempt is made to wait when DeleteVolume is +called. The controller service does not log any issue, as it has no way of +knowing whether or not a BeeGFS volume is still staged on some node. 
This can
+result in the [general symptoms](#orphan-mounts-general-symptoms)
+described above and kubelet logs indicating the failure to clean up a
+non-existent directory (DeleteVolume has already successfully deleted a BeeGFS
+directory that is bind mounted into a Kubernetes container). Note that the
+failing directory might be a primary mount directory
+(e.g. `/var/lib/kubelet/pods/pod/volumes/...`) or a subpath directory (e.g.
+`/var/lib/kubelet/pods/pod/volume-subpaths/...`).

-Code to mitigate the primary cause of orphan mounts was added in BeeGFS CSI
-driver v1.2.1. This code is enabled by setting --node-unstage-timeout to
-something other than 0 (the deployment manifests do this automatically). On an
-older version of the driver or when the driver is deployed into Kubernetes with
---node-unstage-timeout=0, it is possible for Kubernetes to call DeleteVolume
-before all nodes have called NodeUnpublishVolume. If the DeleteVolume succeeds,
-NodeUnpublishVolume becomes impossible, leaving orphan mounts. Under these
-circumstances, the controller service will NOT log anything about waiting for
-the node service. 
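The tracking directory that the new code maintains can be inspected directly from any host with the BeeGFS file system mounted, which is useful when deciding whether a DeleteVolume is being held up by a straggling node. A sketch, where the mount point and volume path are examples only:

```shell
# staged_nodes: print one entry per node that still has the volume staged,
# given the volume's .csi tracking directory on BeeGFS. Prints nothing once
# every node has unstaged (or if the directory does not exist).
staged_nodes() {
  ls -A "$1" 2>/dev/null || true
}

# Example invocation, assuming the file system is mounted at /mnt/beegfs
# and the Storage Class's volDirBasePath is k8s/name/dyn:
staged_nodes /mnt/beegfs/k8s/name/dyn/.csi/volumes/pvc-15ba5493/nodes
```

If a node name lingers here long after its pods are gone, that node is the place to look for stuck unmount operations.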
+```bash +-> journalctl -xe -t hyperkube +Sep 27 10:49:44 openshift-beegfs-rhel-worker-1 hyperkube: I0927 10:49:44.056774 2120 reconciler.go:196] "operationExecutor.UnmountVolume started for volume \"test-volume\" (UniqueName: \"kubernetes.io/csi/beegfs.csi.netapp.com^beegfs://scspa2058537001.rtp.openenglab.netapp.com/e2e-test/dynamic/pvc-5e90c0c8\") pod \"22e1f3ae-81b4-45e2-8010-95ccc3a2e9ad\" (UID: \"22e1f3ae-81b4-45e2-8010-95ccc3a2e9ad\") " +Sep 27 10:49:44 openshift-beegfs-rhel-worker-1 hyperkube: W0927 10:49:44.168265 2120 mount_helper_common.go:34] Warning: Unmount skipped because path does not exist: /var/lib/kubelet/pods/22e1f3ae-81b4-45e2-8010-95ccc3a2e9ad/volume-subpaths/pvc-5e90c0c8/test-container-subpath-dynamicpv-ccs7/0 +Sep 27 10:49:44 openshift-beegfs-rhel-worker-1 hyperkube: E0927 10:49:44.168436 2120 nestedpendingoperations.go:301] Operation for "{volumeName:kubernetes.io/csi/beegfs.csi.netapp.com^beegfs://scspa2058537001.rtp.openenglab.netapp.com/e2e-test/dynamic/pvc-5e90c0c8 podName:22e1f3ae-81b4-45e2-8010-95ccc3a2e9ad nodeName:}" +``` Follow the [cleanup](#orphan-mounts-cleanup) instructions to clean up. Then upgrade the driver or set --node-unstage-timeout to ensure the issue doesn't occur again. +Even the latest driver with a properly configured --node-unstage-timeout can +produce this issue in extreme circumstances. For example, it can occur if +Kubernetes calls DeleteVolume prematurely and some unforeseen issue on a node +delays unpublishing for the entirety of the timeout interval. In this case +DeleteVolume will successfully delete the BeeGFS directory, making further +NodeUnpublishVolume attempts impossible. Note that this is intentional behavior. +Without it, if a cluster node was permanently lost, a DeleteVolume call could +never succeed because the node would never have the opportunity to remove its +name from the BeeGFS `.csi` directory. Fortunately, the v1.2.1 changes make this +situation easy to identify. 
+ +```bash +-> kubectl logs csi-beegfs-controller-0 +I1216 16:41:04.893012 1 server.go:192] "msg"="GRPC call" "reqID"="0006" "method"="/csi.v1.Controller/DeleteVolume" "request"="{\"volume_id\":\"beegfs://10.113.4.46/k8s/name/dyn/pvc-15ba5493\"}" +I1216 16:41:04.893633 1 beegfs_util.go:62] "msg"="Writing client files" "reqID"="0006" "path"="/var/lib/kubelet/plugins/beegfs.csi.netapp.com/10.113.4.46_k8s_name_dyn_pvc-15ba5493" "volumeID"="beegfs://10.113.4.46/k8s/name/dyn/pvc-15ba5493" +I1216 16:41:04.894929 1 beegfs_util.go:208] "msg"="Mounting volume to path" "reqID"="0006" "mountOptions"=["rw","relatime","cfgFile=/var/lib/kubelet/plugins/beegfs.csi.netapp.com/10.113.4.46_k8s_name_dyn_pvc-15ba5493/beegfs-client.conf"] "path"="/var/lib/kubelet/plugins/beegfs.csi.netapp.com/10.113.4.46_k8s_name_dyn_pvc-15ba5493/mount" "volumeID"="beegfs://10.113.4.46/k8s/name/dyn/pvc-15ba5493" +I1216 16:41:04.896100 1 mount_linux.go:146] Mounting cmd (mount) with arguments (-t beegfs -o rw,relatime,cfgFile=/var/lib/kubelet/plugins/beegfs.csi.netapp.com/10.113.4.46_k8s_name_dyn_pvc-15ba5493/beegfs-client.conf beegfs_nodev /var/lib/kubelet/plugins/beegfs.csi.netapp.com/10.113.4.46_k8s_name_dyn_pvc-15ba5493/mount) +I1216 16:41:06.557851 1 controllerserver.go:452] "msg"="Waiting for volume to unstage from all nodes" "reqID"="0006" "secondsLeft"=59 "volumeID"="beegfs://10.113.4.46/k8s/name/dyn/pvc-15ba5493" +... 
+I1216 16:42:03.549724 1 controllerserver.go:452] "msg"="Waiting for volume to unstage from all nodes" "reqID"="0006" "secondsLeft"=2 "volumeID"="beegfs://10.113.4.46/k8s/name/dyn/pvc-15ba5493" +I1216 16:42:06.131629 1 controllerserver.go:464] "msg"="Volume did not unstage on all nodes; orphan mounts may remain" "reqID"="0006" "remainingNodes"=["some-node"] "volumeID"="beegfs://10.113.4.46/k8s/name/dyn/pvc-15ba5493" +I1216 16:42:06.131693 1 controllerserver.go:469] "msg"="Deleting BeeGFS directory" "reqID"="0006" "path"="/k8s/name/dyn/.csi/volumes/pvc-15ba5493" "volumeID"="beegfs://10.113.4.46/k8s/name/dyn/pvc-15ba5493" +I1216 16:42:06.556387 1 controllerserver.go:482] "msg"="Deleting BeeGFS directory" "reqID"="0006" "path"="/k8s/name/dyn/pvc-15ba5493" "volumeID"="beegfs://10.113.4.46/k8s/name/dyn/pvc-15ba5493" +I1216 16:42:07.039244 1 beegfs_util.go:270] "msg"="Unmounting volume from path" "reqID"="0006" "path"="/var/lib/kubelet/plugins/beegfs.csi.netapp.com/10.113.4.46_k8s_name_dyn_pvc-15ba5493/mount" "volumeID"="beegfs://10.113.4.46/k8s/name/dyn/pvc-15ba5493" +I1216 16:42:07.145542 1 mount_helper_common.go:71] "/var/lib/kubelet/plugins/beegfs.csi.netapp.com/10.113.4.46_k8s_name_dyn_pvc-15ba5493/mount" is a mountpoint, unmounting +I1216 16:42:07.145638 1 mount_linux.go:238] Unmounting /var/lib/kubelet/plugins/beegfs.csi.netapp.com/10.113.4.46_k8s_name_dyn_pvc-15ba5493/mount +I1216 16:42:08.977916 1 mount_helper_common.go:85] "/var/lib/kubelet/plugins/beegfs.csi.netapp.com/10.113.4.46_k8s_name_dyn_pvc-15ba5493/mount" is unmounted, deleting the directory +I1216 16:42:08.978110 1 beegfs_util.go:283] "msg"="Cleaning up path" "reqID"="0006" "path"="/var/lib/kubelet/plugins/beegfs.csi.netapp.com/10.113.4.46_k8s_name_dyn_pvc-15ba5493" "volumeID"="beegfs://10.113.4.46/k8s/name/dyn/pvc-15ba5493" +I1216 16:42:08.978688 1 server.go:202] "msg"="GRPC response" "reqID"="0006" "method"="/csi.v1.Controller/DeleteVolume" "response"="{}" +``` + +First, take steps to determine why 
--node-unstage-timeout was exceeded. Then, +follow the [cleanup](#orphan-mounts-cleanup) instructions to clean up. + #### Missing vol_data.json -Along with the general symptoms, the journal on an affected node indicates that -UnmountVolume is continuously failing due to a missing vol_data.json file. +Orphan mounts can also occur in older versions of Kubernetes for a reason that +the driver cannot mitigate. Along with the general symptoms, the kubelet logs on +an affected node indicate that UnmountVolume is continuously failing due to a +missing vol_data.json file. ```bash -> ssh root@some.node @@ -167,8 +233,8 @@ This bug is fixed in the following Kubernetes versions: Follow the [cleanup](#orphan-mounts-cleanup) instructions to clean up. Then upgrade Kubernetes to ensure the issue doesn't occur again. + #### Cleanup - On each node with orphan mounts, identify and unmount them. From 13f346ab18d7f4c2fa16a84e4f3e1e54ccaa29a9 Mon Sep 17 00:00:00 2001 From: Eric Weber Date: Thu, 16 Dec 2021 11:18:00 -0600 Subject: [PATCH 2/5] Move anchors in markdown files --- README.md | 39 ++++++++++++++++++++----------- deploy/k8s/README.md | 6 ++--- docs/deployment.md | 53 ++++++++++++++++++++++++++---------------- docs/developer-docs.md | 6 ++--- docs/quotas.md | 22 +++++++++++------- docs/usage.md | 35 ++++++++++++++-------------- 6 files changed, 96 insertions(+), 65 deletions(-) diff --git a/README.md b/README.md index b54ad01b..c6af9a32 100644 --- a/README.md +++ b/README.md @@ -4,8 +4,8 @@ [![Docker pulls](https://img.shields.io/docker/pulls/netapp/beegfs-csi-driver)](https://hub.docker.com/r/netapp/beegfs-csi-driver) [![Go report card](https://goreportcard.com/badge/github.com/netapp/beegfs-csi-driver)](https://goreportcard.com/report/github.com/netapp/beegfs-csi-driver) -## Contents +## Contents * [Overview](#overview) * [Getting Started](#getting-started) @@ -15,8 +15,9 @@ * [License](#license) * [Maintainers](#maintainers) -## Overview +## Overview + The BeeGFS Container 
Storage Interface (CSI) driver provides high performing and scalable storage for workloads running in container orchestrators like Kubernetes. This driver allows containers to access existing datasets or request @@ -29,8 +30,9 @@ configure) and day-2 (reconfigure/update) tasks for the driver. This especially discovery and installation from Operator Lifecycle Manger (OLM) enabled clusters like Red Hat OpenShift. -### Notable Features +### Notable Features + * Integration of [Storage Classes in Kubernetes](docs/usage.md#create-a-storage-class) with [storage pools](https://doc.beegfs.io/latest/advanced_topics/storage_pools.html) in BeeGFS, allowing different tiers of storage within the same file system to be @@ -51,8 +53,8 @@ OpenShift. share access to the same working directories and enable multi-user/application access to common datasets. -### Interoperability and CSI Feature Matrix +### Interoperability and CSI Feature Matrix | beegfs.csi.netapp.com | K8s Versions | Red Hat OpenShift Versions | BeeGFS Versions | CSI Version | | ---------------------- | -------------------------------- | -------------------------- | --------------- | ------------ | @@ -147,11 +149,12 @@ the use of the BeeGFS CSI driver with HashiCorp Nomad, we may not test with Nomad for every driver release and do not currently consider Nomad to be a supported container orchestrator. -## Getting Started +## Getting Started -### Prerequisite(s) +### Prerequisite(s) + * Deploying the driver requires access to a terminal with kubectl. * The [BeeGFS DKMS client](https://doc.beegfs.io/latest/advanced_topics/client_dkms.html) must be @@ -169,8 +172,9 @@ supported container orchestrator. nodes over a TCP/IP and/or RDMA (InfiniBand/RoCE) capable network (not required to deploy the driver). -### Quick Start +### Quick Start + The steps in this section allow you to get the driver up and running quickly. 
For production use cases or air-gapped environments it is recommended to read through the full [kubectl deployment guide](docs/deployment.md) or [operator @@ -193,8 +197,9 @@ pods -n kube-system | grep csi-beegfs` Provided all Pods are running the driver is now ready for use. See the following sections for how to get started using the driver. -## Basic Use +## Basic Use + This section provides a quick summary of basic driver use and functionality. Please see the full [usage documentation](docs/usage.md) for a complete overview of all available functionality. The driver was designed to support @@ -206,28 +211,32 @@ sections for how to get started using the driver. isolation between multiple applications and users using the same BeeGFS file system when desired. -### Dynamic Storage Provisioning: +### Dynamic Storage Provisioning: + Administrators create a Storage Class in Kubernetes referencing at minimum a specific BeeGFS file system and parent directory within that file system. Users can then submit PVCs against the Storage Class, and are provided isolated access to new directories under the parent specified in the Storage Class. -### Static Provisioning: +### Static Provisioning: + Administrators create a PV and PVC representing an existing directory in a BeeGFS file system. This is useful for exposing some existing dataset or shared directory to Kubernetes users and applications. -### Examples +### Examples + [Example Kubernetes manifests](examples/k8s/README.md) of how to use the driver are provided. These are meant to be repurposed to simplify creating objects related to the driver including Storage Classes, Persistent Volumes, and Persistent Volume Claims in your environment. -## Requesting Enhancements and Reporting Issues +## Requesting Enhancements and Reporting Issues + If you have any questions, feature requests, or would like to report an issue please submit them at https://github.com/NetApp/beegfs-csi-driver/issues. 
@@ -236,12 +245,14 @@ The BeeGFS CSI Driver maintainers welcome improvements from the BeeGFS and open source community! Please see [CONTRIBUTING.md](CONTRIBUTING.md) for how to get started. -## License +## License + Apache License 2.0 -## Maintainers +## Maintainers + * Eric Weber (@ejweber). * Jason Eastburn * Joe McCormick (@iamjoemccormick). diff --git a/deploy/k8s/README.md b/deploy/k8s/README.md index b9954210..cb09af01 100644 --- a/deploy/k8s/README.md +++ b/deploy/k8s/README.md @@ -1,13 +1,13 @@ # Kustomize Specific Deployment Details -## Contents +## Contents * [Basics](#basics) * [Upgrading to v1.2.0](#upgrade-1.2.0-kubernetes-deployment) -## Basics +## Basics The BeeGFS CSI driver uses [Kustomize](https://kustomize.io/) as its default deployment mechanism. While Kustomize CAN be downloaded, installed, and run as @@ -45,8 +45,8 @@ by the development team to the base manifests or version patches will be picked up when you pull a new version of the project and your custom modifications will continue to work unless otherwise noted. -### Upgrading to v1.2.0 +### Upgrading to v1.2.0 v1.2.0 includes changes to the structure of the deployment manifests. 
To upgrade from v1.1.0, follow these steps: diff --git a/docs/deployment.md b/docs/deployment.md index 2749ff61..df1ce90e 100644 --- a/docs/deployment.md +++ b/docs/deployment.md @@ -1,7 +1,7 @@ # BeeGFS CSI Driver Deployment -## Contents +## Contents * [Deploying to Kubernetes](#deploying-to-kubernetes) * [Kubernetes Node Preparation](#kubernetes-node-preparation) @@ -18,11 +18,12 @@ * [Security and Networking Considerations](#security-considerations) * [Removing the Driver from Kubernetes](#removing-the-driver-from-kubernetes) -## Deploying to Kubernetes +## Deploying to Kubernetes -### Kubernetes Node Preparation +### Kubernetes Node Preparation + The following MUST be completed on any Kubernetes node (master OR worker) that runs a component of the driver: * Download the [BeeGFS repository @@ -40,8 +41,9 @@ IMPORTANT: By default the driver uses the beegfs-client.conf file at */etc/beegfs/beegfs-client.conf* for base configuration. Modifying the location of this file is not currently supported without changing kustomization files. -### Kubernetes Deployment +### Kubernetes Deployment + Deployment manifests are provided in this repository under *deploy/k8s/* along with a default BeeGFS Client ConfigMap. The driver is deployed using `kubectl apply -k` (kustomize). For more detailed information on how the manifests are organized or @@ -113,8 +115,9 @@ Next Steps: * For a comprehensive introduction see the [BeeGFS CSI Driver Usage](usage.md) documentation. -### Air-Gapped Kubernetes Deployment +### Air-Gapped Kubernetes Deployment + This section provides guidance on deploying the BeeGFS CSI driver in environments where Kubernetes nodes do not have internet access. @@ -136,8 +139,9 @@ to them. Adjust the `images[].newTag` fields as necessary to ensure they either match images that exist on the Kubernetes nodes or reference your internal registry. Then follow the above commands for Kubernetes deployment. 
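The air-gapped image handling above can be sketched as a small helper that computes the internal-registry reference for each upstream image; the registry name is a placeholder, and the docker commands in the comment are the usual mirror workflow rather than anything specific to this driver:

```shell
# internal_ref: compute the internal-registry reference for an upstream
# image. On an internet-connected machine you would then mirror it with:
#   docker pull "$img"
#   docker tag  "$img" "$(internal_ref "$img")"
#   docker push "$(internal_ref "$img")"
internal_ref() {
  # $1: upstream reference, e.g. netapp/beegfs-csi-driver:v1.2.1
  echo "registry.example.com/${1##*/}"   # keep name:tag, swap the registry
}
```

The resulting references are what the `images[].newTag` (and related) fields in the kustomization files must be adjusted to match.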
-### Deployment to Kubernetes Clusters With Mixed Nodes +### Deployment to Kubernetes Clusters With Mixed Nodes + In some Kubernetes clusters, not all nodes are capable of running the BeeGFS CSI driver (or it may not be desirable for all nodes to do so). For example: * A cluster may be shared by multiple departments and one department may not @@ -176,8 +180,8 @@ StatefulSets, Deployments) that depend on BeeGFS MUST be deployed with the same nodeAffinity assigned to the driver node service. Provide your users with the labels or nodes they must run their workloads on. -### Deployment to Kubernetes Using the Operator +### Deployment to Kubernetes Using the Operator An [operator](https://operatorframework.io/what/) can be used to deploy the BeeGFS CSI driver to a cluster and manage its configuration/state within that @@ -192,8 +196,9 @@ operator to install the driver directly from this repository. See the [BeeGFS CSI Driver Operator](../operator/README.md) documentation for details. -## Example Application Deployment +## Example Application Deployment + Verify that a BeeGFS file system is accessible from the Kubernetes nodes. Minimally, verify that the BeeGFS servers are up and listening from a workstation that can access them: @@ -349,8 +354,8 @@ Next Steps: documentation. * For additional examples see *examples/k8s/README.md*. -## Managing BeeGFS Client Configuration +## Managing BeeGFS Client Configuration Currently the only tested and supported container orchestrator (CO) for the BeeGFS CSI driver is Kubernetes. Notes in the General Configuration section @@ -358,8 +363,8 @@ below would apply to other COs if supported. For Kubernetes the preferred method to apply desired BeeGFS Client configuration is using a Kubernetes ConfigMap and Secret, as described in [Kubernetes Configuration](#kubernetes-configuration). 
-### General Configuration +### General Configuration The driver is ready to be used right out of the box, but many environments may either require or benefit from additional configuration. @@ -462,8 +467,9 @@ nodeSpecificConfigs: # OPTIONAL fileSystemSpecificConfigs: # as above ``` -#### ConnAuth Configuration +#### ConnAuth Configuration + For security purposes, the contents of BeeGFS connAuthFiles are stored in a separate file. This file is optional, and should only be used if the connAuthFile configuration option is used on a file system's other services. @@ -483,8 +489,9 @@ NOTE: Unlike general configuration, connAuth configuration is only applied at a per file system level. There is no default connAuth and the concept of a node specific connAuth doesn't make sense. -### Kubernetes Configuration +### Kubernetes Configuration + When deployed into Kubernetes, a single Kubernetes ConfigMap contains the configuration for all Kubernetes nodes. When the driver starts up on a node, it uses the node's name to filter the global ConfigMap down to a node-specific @@ -516,16 +523,18 @@ all nodes so that it picks up the latest changes. NOTE: To validate the BeeGFS Client configuration file used for a specific PVC, see the [Troubleshooting Guide](troubleshooting.md#k8s-determining-the-beegfs-client-conf-for-a-pvc) -### BeeGFS Client Parameters (beegfsClientConf) +### BeeGFS Client Parameters (beegfsClientConf) + The following beegfs-client.conf parameters appear in the BeeGFS v7.2 [beegfs-client.conf file](https://git.beegfs.io/pub/v7/-/blob/7.2/client_module/build/dist/etc/beegfs-client.conf). Other parameters may exist for newer or older BeeGFS versions. The list a parameter falls under determines its level of support in the driver. -#### No Effect +#### No Effect + These parameters are specified elsewhere (a Kubernetes StorageClass, etc.) or are determined dynamically and have no effect when specified in the `beeGFSClientConf` configuration section. 
@@ -542,8 +551,9 @@ are determined dynamically and have no effect when specified in the nodes.) * `connPortShift` -#### Unsupported +#### Unsupported + These parameters are specified elsewhere and may exhibit undocumented behavior if specified here. @@ -555,14 +565,16 @@ if specified here. * `connTcpOnlyFilterFile` - Overridden by lists in the driver configuration file. -### Tested +### Tested + These parameters have been tested and verified to have the desired effect. * `quotaEnabled` - Documented in [Quotas](quotas.md). -#### Untested +#### Untested + These parameters SHOULD result in the desired effect but have not been tested. * `connHelperdPortTCP` @@ -596,11 +608,11 @@ These parameters SHOULD result in the desired effect but have not been tested. * `tuneUseGlobalFileLocks` * `sysACLsEnabled` -## Notes for Kubernetes Administrators +## Notes for Kubernetes Administrators -### Security and Networking Considerations +### Security and Networking Considerations **The driver must be allowed to mount and unmount file systems.** @@ -634,8 +646,9 @@ Inbound UDP traffic from nodes serving up BeeGFS file systems to arbitrary ports on all BeeGFS clients must be allowed. Each volume requires its own port and it is not currently possible to configure an allowed port range. -## Removing the Driver from Kubernetes +## Removing the Driver from Kubernetes + If you're experiencing any issues, find functionality lacking, or our documentation is unclear, we'd appreciate if you let us know: https://github.com/NetApp/beegfs-csi-driver/issues. 
diff --git a/docs/developer-docs.md b/docs/developer-docs.md index 2c5bb91a..a169d023 100644 --- a/docs/developer-docs.md +++ b/docs/developer-docs.md @@ -93,11 +93,11 @@ csi-beegfs-node-ntcpc 3/3 Running 0 2m27s csi-beegfs-socat-0 0/1 Pending 0 17h ``` -## Style Guidelines +## Style Guidelines -### YAML Files +### YAML Files YAML files in this project are formatted according to the following rules/restrictions: @@ -113,8 +113,8 @@ rules/restrictions: where a boolean is expected. * If it is necessary to quote a string, double quotes `"` are preferred. -## Frequently Asked Questions +## Frequently Asked Questions ### Why do we use sigs.k8s.io/yaml instead of gopkg.in/yaml? diff --git a/docs/quotas.md b/docs/quotas.md index 48f6442d..b66c34cf 100644 --- a/docs/quotas.md +++ b/docs/quotas.md @@ -1,15 +1,16 @@ # Using BeeGFS Quotas with the CSI Driver -## Contents +## Contents * [Overview](#overview) * [Prerequisites](#prerequisites) * [Enabling Quotas](#enabling-quotas) * [Tracking BeeGFS Consumption by Storage Class](#tracking-beegfs-consumption-by-sc) -## Overview +## Overview + To provide administrators visibility and control over file system utilization, BeeGFS supports both [quota tracking](https://doc.beegfs.io/latest/advanced_topics/quota.html#quota-tracking) @@ -24,8 +25,9 @@ configuration for the BeeGFS clients (i.e., Kubernetes nodes). This document will also demonstrate how to leverage quotas to track BeeGFS consumption with an example that can be extended to any number of use cases. -## Prerequisites +## Prerequisites + * The BeeGFS server nodes (Management, Metadata, and Storage) must already have quota tracking and optionally quota enforcement enabled. See the [BeeGFS documentation](https://doc.beegfs.io/latest/advanced_topics/quota.html) if you @@ -36,8 +38,9 @@ example that can be extended to any number of use cases. NetApp](https://blog.netapp.com/solution-support-for-beegfs-and-e-series/) for use in production. 
-## Enabling Quotas +## Enabling Quotas + In addition to the prerequisite steps for BeeGFS server nodes, to enable quotas each client must set `quotaEnabled = true` in the configuration file corresponding with the mount point for that file system. For container @@ -74,11 +77,12 @@ when existing volumes are remounted, for example if a pod moves between nodes. For any volumes that don't have this setting enabled, all I/O will continue to affect the quota consumption of the root user, instead of the actual caller. -## Tracking BeeGFS Consumption by Storage Class +## Tracking BeeGFS Consumption by Storage Class -### Introduction +### Introduction + If your containers are running with a meaningful user ID from a quota perspective, the above configuration is all that is needed to take advantage of BeeGFS quotas. However in some environments the user in a container may vary, @@ -113,8 +117,9 @@ documentation on [Pod Security Standards](https://kubernetes.io/docs/concepts/security/pod-security-standards/) for ways to prevent users from running containers as root if desired. -### Linux User/Group IDs and Containers +### Linux User/Group IDs and Containers + Typically BeeGFS [recommends](https://doc.beegfs.io/latest/advanced_topics/quota.html#requirements-and-general-notes) ensuring the local systems of all nodes are configured correctly to query passwd @@ -133,8 +138,9 @@ storage nodes along with any client nodes used for administrative purposes to query/set BeeGFS quota information using `beegfs-ctl`, should have a synchronized view of this mapping to avoid confusion. -### Example Steps to setup a Storage Class that use setgid to set a specific group +### Example Steps to Set Up a Storage Class That Uses setgid to Set a Specific Group + Note: Any steps specific to creating groups or querying quota information should take place on a node with a synchronized view of the user and group IDs as described in the last section.
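Before wiring setgid into a storage class, it helps to see the underlying Linux mechanism in isolation. A minimal local sketch (assumption: any Linux filesystem stands in here for a BeeGFS directory; no BeeGFS mount is needed to demonstrate the bit itself):

```bash
# Files created inside a setgid directory inherit the directory's group,
# which is what lets a storage class funnel all I/O under one group's quota.
dir=$(mktemp -d)
chmod 2770 "$dir"          # leading 2 = setgid bit on the directory

touch "$dir/newfile"       # create a file as whatever user/group we are

dir_gid=$(stat -c '%g' "$dir")
file_gid=$(stat -c '%g' "$dir/newfile")
echo "directory gid: ${dir_gid}, new file gid: ${file_gid}"
# On Linux the two gids match: the file inherits the directory's group.
```

In the storage-class workflow described below, the directory's group would be a group created for quota tracking (`chgrp` on the `volDirBasePath`), so every file written by any container lands under that group's quota.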
diff --git a/docs/usage.md b/docs/usage.md index 9007dc1a..9556a68d 100644 --- a/docs/usage.md +++ b/docs/usage.md @@ -1,7 +1,7 @@ # BeeGFS CSI Driver Usage -## Contents +## Contents * [Important Concepts](#important-concepts) * [Dynamic Provisioning Workflow](#dynamic-provisioning-workflow) @@ -10,11 +10,11 @@ * [Notes for BeeGFS Administrators](#notes-for-beegfs-administrators) * [Limitations and Known Issues](#limitations-and-known-issues) -## Important Concepts +## Important Concepts -### Definition of a "Volume" +### Definition of a "Volume" Within the context of this driver, a "volume" is simply a directory within a BeeGFS filesystem. When a volume is mounted by a Kubernetes Pod, only files @@ -24,8 +24,8 @@ in the static provisioning workflow) or a single subdirectory many levels deep can be a volume (e.g. by specifying */a/very/deep/directory* as the *volDirBasePath* in the dynamic provisioning workflow). -### Capacity +### Capacity In this version, the driver ignores the capacity requested for a Kubernetes Persistent Volume. Consider the definition of a "volume" above. While an entire @@ -35,11 +35,11 @@ any POSIX) filesystem. The driver does provide integration with BeeGFS permissions and quotas which provides ways to limit the capacity consumed by containers. For more details refer to the documentation on [Quotas](quotas.md). -### Static vs Dynamic Provisioning +### Static vs Dynamic Provisioning -#### Dynamic Provisioning Use Case +#### Dynamic Provisioning Use Case As a user, I want a volume to use as high-performance scratch space or semi-temporary storage for my workload. I want the volume to be empty when my @@ -53,8 +53,8 @@ the driver automatically creates a new subdirectory underneath that parent directory and binds it to the PVC. To the user and/or workload, the subdirectory is the entire volume. It exists as long as the PVC exists.
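The dynamic provisioning use case above can be sketched as a storage class manifest. This is illustrative only: the name `beegfs-scratch` and the address are made up, and `sysMgmtdHost`/`volDirBasePath` are shown as the driver's storage-class parameters on the assumption that they match the deployment guide, which remains the authoritative reference:

```bash
# Write an example storage class; every PVC bound to it would get its own
# subdirectory under volDirBasePath, as described above.
cat <<'EOF' > beegfs-sc.yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: beegfs-scratch                   # hypothetical name
provisioner: beegfs.csi.netapp.com
parameters:
  sysMgmtdHost: 10.113.4.46              # example BeeGFS management host
  volDirBasePath: a/very/deep/directory  # parent directory for per-PVC subdirectories
reclaimPolicy: Delete                    # the subdirectory lives as long as the PVC
EOF
```

Applying it with `kubectl apply -f beegfs-sc.yaml` would let users request scratch volumes without ever seeing the rest of the BeeGFS file system.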
-#### Static Provisioning Use Case +#### Static Provisioning Use Case As an administrator, I want to make a directory within an existing BeeGFS file system available to be mounted by multiple users and/or workloads. This @@ -69,8 +69,8 @@ creates a PV and PVC representing an existing BeeGFS file system directory. Multiple users and/or workloads can mount that PVC and consume the data the directory contains. -### BeeGFS Version Compatibility +### BeeGFS Version Compatibility This version of the driver is ONLY tested for compatibility with BeeGFS v7.1.5 and v7.2. The BeeGFS filesystem services and the BeeGFS clients running on the @@ -85,8 +85,8 @@ versions before v7.1.4 do not include the beegfs-client-dkms package, which the driver uses to build the BeeGFS client kernel module and mount BeeGFS file systems. -### Client Configuration and Tuning +### Client Configuration and Tuning Depending on your topology, different nodes within your cluster or different BeeGFS file systems accessible by your cluster may need different client @@ -96,8 +96,8 @@ Managing BeeGFS Client Configuration in the [deployment guide](deployment.md) for detailed instructions on how to prepare your cluster to mount various BeeGFS file systems. -## Dynamic Provisioning Workflow +## Dynamic Provisioning Workflow ### Assumptions @@ -219,8 +219,8 @@ Who: A Kubernetes user Follow standard Kubernetes practices to deploy a Pod that consumes the newly created Kubernetes Persistent Volume Claim. -## Static Provisioning Workflow +## Static Provisioning Workflow ### Assumptions @@ -294,8 +294,8 @@ Who: A Kubernetes user Follow standard Kubernetes practices to deploy a Pod that consumes the newly created Kubernetes Persistent Volume Claim. -## Best Practices +## Best Practices * While multiple Kubernetes clusters can use the same BeeGFS file system, it is not recommended to have more than one cluster use the same `volDirBasePath` @@ -306,8 +306,8 @@ created Kubernetes Persistent Volume Claim. 
permissions, especially on static directories containing shared datasets (more details [below](#read-only-and-access-modes-in-kubernetes)). -## Notes for BeeGFS Administrators +## Notes for BeeGFS Administrators ### General @@ -316,8 +316,8 @@ created Kubernetes Persistent Volume Claim. location of this file is not currently supported without changing kustomization files. -### BeeGFS Mount Options +### BeeGFS Mount Options Except for `cfgFile` (which has to be set by the driver) mount options supported by BeeGFS can be specified on a [persistent @@ -341,8 +341,9 @@ is much more likely to be an issue on BeeGFS storage and metadata servers than the Kubernetes nodes themselves (since multiple clients connect to each server). Administrators are advised to spec out BeeGFS servers accordingly. -### Permissions +### Permissions + Note: See the section on [Creating a Storage Class](#create-a-storage-class) for how to set permissions using the CSI driver. @@ -391,8 +392,8 @@ parameters to allow access: * spec.container.securityContext.runAsUser * spec.container.securityContext.runAsGroup -#### fsGroup Behavior +#### fsGroup Behavior Some CSI drivers support a recursive operation in which the permissions and ownership of all files and directories in a provisioned volume are changed to @@ -415,8 +416,8 @@ required), the `csiDriver.spec.fsGroupPolicy` parameter can be used to disable this behavior for all volumes. The beegfs-csi-driver deploys with this parameter set to "None" in case it is deployed to a cluster that supports it. 
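The `securityContext` parameters listed above can be combined in a pod spec along these lines. A hedged sketch: the pod name, image, and PVC name are hypothetical, and the UID/GID chosen must actually have permission on the underlying BeeGFS directory:

```bash
# Write an example pod that consumes a BeeGFS PVC as a specific user/group.
cat <<'EOF' > beegfs-consumer-pod.yaml
apiVersion: v1
kind: Pod
metadata:
  name: beegfs-consumer              # hypothetical name
spec:
  securityContext:
    runAsUser: 1000                  # must be permitted on the BeeGFS directory
    runAsGroup: 1000
  containers:
    - name: app
      image: alpine:3                # illustrative image
      command: ["sleep", "3600"]
      volumeMounts:
        - name: data
          mountPath: /mnt/beegfs
  volumes:
    - name: data
      persistentVolumeClaim:
        claimName: my-beegfs-pvc     # hypothetical existing PVC
EOF
```

No `fsGroup` is set here: as noted above, the driver deploys with `fsGroupPolicy` set to "None", so group access should come from the directory's own ownership and permissions rather than a recursive re-chown.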
-## Limitations and Known Issues +## Limitations and Known Issues ### General From 9a46439bcd1c0caa1fc6f025364af0fa31d2cab2 Mon Sep 17 00:00:00 2001 From: Eric Weber Date: Fri, 17 Dec 2021 09:14:01 -0600 Subject: [PATCH 3/5] Fix CHANGELOG link typo --- CHANGELOG.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/CHANGELOG.md b/CHANGELOG.md index 9a139432..27ad4a79 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -14,7 +14,7 @@ Notable changes to the BeeGFS CSI driver will be documented in this file. These are being provided as an example for others who might want to experiment with using BeeGFS and Nomad, in particular anyone interested in [contributing](CONTRIBUTING.md) to any future efforts around Nomad. -- [Documentation](docs/deployment.md##mixed-kubernetes-deployment) on how to +- [Documentation](docs/deployment.md#mixed-kubernetes-deployment) on how to deploy the driver to Kubernetes clusters where some nodes can access BeeGFS volumes, and some cannot. - Support for BeeGFS v7.2.4, Kubernetes v1.21, and RedHat OpenShift v4.8. From 159f7bc83ff34caab68934a0b35c6ab6bc514a08 Mon Sep 17 00:00:00 2001 From: Joe McCormick Date: Sun, 19 Dec 2021 19:06:05 -0500 Subject: [PATCH 4/5] Add note for BeeGFS admins regarding .csi directories --- docs/troubleshooting.md | 2 +- docs/usage.md | 7 +++++++ 2 files changed, 8 insertions(+), 1 deletion(-) diff --git a/docs/troubleshooting.md b/docs/troubleshooting.md index ec019153..849d9324 100644 --- a/docs/troubleshooting.md +++ b/docs/troubleshooting.md @@ -83,7 +83,7 @@ There are a number of circumstances that can cause orphaned mounts to remain on the nodes of a container orchestrator after a BeeGFS volume is deleted. These are largely outside the control of the BeeGFS CSI driver and occur due to how a particular container orchestrator interacts with CSI drivers in general. 
-Starting in v1.2.1 the BeeGFS CSI driver introduced functionality that can +Starting in v1.2.1 the BeeGFS CSI driver introduced functionality that mitigates common causes of this behavior in Kubernetes, but administrators should be aware of the potential, and the measures taken by the driver to mitigate it. diff --git a/docs/usage.md b/docs/usage.md index 9556a68d..65f5b280 100644 --- a/docs/usage.md +++ b/docs/usage.md @@ -315,6 +315,13 @@ created Kubernetes Persistent Volume Claim. */etc/beegfs/beegfs-client.conf* for base configuration. Modifying the location of this file is not currently supported without changing kustomization files. +* When using dynamic provisioning, if `--node-unstage-timeout` is set to a nonzero value + (default: 60) the driver will create a directory structure at `volDirBasePath/.csi/` + (in the BeeGFS filesystem) and use it to persist any state used by the driver, + [for example to prevent orphaned mounts](troubleshooting.md#orphan-mounts). + This behavior can optionally be disabled, but is strongly recommended for the driver + to function optimally. + ### BeeGFS Mount Options From 102a5429895de788ce60ae43e4d551db37d960d5 Mon Sep 17 00:00:00 2001 From: Joe McCormick Date: Sun, 19 Dec 2021 19:43:35 -0500 Subject: [PATCH 5/5] Update the changelog for v1.2.1 --- CHANGELOG.md | 22 ++++++++++++++++++++++ 1 file changed, 22 insertions(+) diff --git a/CHANGELOG.md b/CHANGELOG.md index 27ad4a79..69ec16b5 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -1,6 +1,28 @@ # Changelog Notable changes to the BeeGFS CSI driver will be documented in this file. +[1.2.1] - 2021-12-20 +-------------------- +### Added +- Support for BeeGFS v7.2.5, Kubernetes v1.22, and RedHat OpenShift v4.9. +- The ability to persist state in BeeGFS using a .csi/ directory structure that + exists alongside dynamically provisioned volumes in their `volDirBasePath`. 
+ This is automatically enabled by default but can be [optionally + disabled](docs/usage.md#notes-for-beegfs-administrators). + +### Fixed +- Common causes of [orphaned BeeGFS + mounts](docs/troubleshooting.md#orphan-mounts) being left on Kubernetes nodes + (listed as a known issue in v1.2.0) by maintaining a record of nodes with + active BeeGFS mounts for each volume in the new .csi/ directory and falling + back on a newly added timeout (`--node-unstage-timeout`) when needed. + +### Security +Note: The BeeGFS CSI driver is written in Golang and does not import or +implement any functionality that makes it susceptible to the recent Log4j +vulnerability threat. For more details please refer to [NetApp's official +response](https://www.netapp.com/newsroom/netapp-apache-log4j-response/). + [1.2.0] - 2021-10-11 -------------------- ### Added