Add how-to for upgrading to Ceph-CSI v3.9.0 #139

Merged 2 commits on Nov 14, 2023
78 changes: 78 additions & 0 deletions docs/modules/ROOT/pages/how-tos/upgrade-ceph-csi-v3.9.adoc
@@ -0,0 +1,78 @@
= Upgrading to Ceph-CSI v3.9.0

Starting from component version v6.0.0, the component deploys Ceph-CSI v3.9.0 by default.

Ceph-CSI has been updated to pass any mount options configured for CephFS volumes to the `NodeStageVolume` calls that create bind mounts for existing volumes.

We previously configured the mount option `discard` for CephFS, which isn't a supported option for bind mounts.
However, the option is unnecessary for CephFS anyway, so component version v6.0.0 removes it completely from the generated CephFS storage class.
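
To confirm the change on a cluster that already runs component version v6.0.0, you can check whether the generated CephFS storage class still carries any mount options. This is a quick sketch; the storage class name matches the one used in the examples below and may differ on your cluster.

[source,bash]
----
# Print the mount options of the generated CephFS storage class.
# After the upgrade this should print `null` or an empty list.
kubectl get storageclass cephfs-fspool-cluster -ojson | jq '.mountOptions'
----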

This how-to is intended for users who are upgrading an existing Rook-Ceph setup from a previous component version to component version v6.0.0.

== Prerequisites

* `cluster-admin` access to the cluster
* Access to Project Syn configuration for the cluster, including a method to compile the catalog
* `kubectl`
* `jq`
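
To quickly verify the tooling and access prerequisites, you can run something like the following sketch (the impersonation check assumes the `--as=cluster-admin` pattern used by the commands in this how-to):

[source,bash]
----
# Check that the required CLI tools are available and that you have cluster-admin access.
kubectl version --client
jq --version
kubectl auth can-i '*' '*' --all-namespaces --as=cluster-admin
----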


== Steps

. Check the mount options for all CephFS volumes.
If this command shows custom mount options (anything other than the component-managed `discard`) for any volume, handle those volumes separately.
+
[source,bash]
----
kubectl get pv -ojson | \
jq -r '.items[] | select(.spec.storageClassName=="cephfs-fspool-cluster") | "\(.metadata.name) \(.spec.mountOptions)" '
----
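+
To list only the volumes that need special handling, a variation of the query above (an illustrative sketch) filters out volumes that have no mount options or only `discard`:
+
[source,bash]
----
kubectl get pv -ojson | \
jq -r '.items[] | select(.spec.storageClassName=="cephfs-fspool-cluster" and (.spec.mountOptions//[]) != [] and (.spec.mountOptions//[]) != ["discard"]) | "\(.metadata.name) \(.spec.mountOptions)"'
----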

. Remove mount option `discard` from all existing CephFS volumes.
+
[source,bash]
----
for pv in $(kubectl get pv -ojson |\
jq -r '.items[] | select(.spec.storageClassName=="cephfs-fspool-cluster" and (.spec.mountOptions//[]) == ["discard"]) | .metadata.name');
do
kubectl patch --as=cluster-admin pv $pv --type=json \
-p '[{"op": "replace", "path": "/spec/mountOptions", "value": [] }]'
done
----
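+
To spot-check the result for a single volume, you can inspect its mount options directly (a quick sketch; substitute a real PV name for `<pv-name>`):
+
[source,bash]
----
# Should print `[]` after the patch.
kubectl get pv <pv-name> -ojson | jq '.spec.mountOptions'
----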

. Upgrade the component to v6.0.0.
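+
To verify that the upgrade rolled out the new Ceph-CSI release, you can check the images of the CSI plugin pods. This sketch assumes the plugin pods run in the `syn-rook-ceph-operator` namespace (as used elsewhere in this how-to) and carry the usual `app=csi-rbdplugin` and `app=csi-cephfsplugin` labels; adjust as needed for your cluster:
+
[source,bash]
----
# The cephcsi images listed here should be tagged v3.9.0 after the upgrade.
kubectl -n syn-rook-ceph-operator get pods \
  -l 'app in (csi-rbdplugin,csi-cephfsplugin)' \
  -o jsonpath='{range .items[*]}{range .spec.containers[*]}{.image}{"\n"}{end}{end}' | \
  grep cephcsi | sort -u
----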

. Check for any CephFS volumes that were provisioned between step 2 and the upgrade, and remove the mount option `discard` from those volumes.
+
[source,bash]
----
kubectl get pv -ojson | \
jq -r '.items[] | select(.spec.storageClassName=="cephfs-fspool-cluster" and (.spec.mountOptions//[]) == ["discard"]) | "\(.metadata.name) \(.spec.mountOptions)" '
----
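+
If this check finds any volumes, remove the option with the same patch loop as in step 2 (repeated here for convenience):
+
[source,bash]
----
for pv in $(kubectl get pv -ojson |\
jq -r '.items[] | select(.spec.storageClassName=="cephfs-fspool-cluster" and (.spec.mountOptions//[]) == ["discard"]) | .metadata.name');
do
kubectl patch --as=cluster-admin pv $pv --type=json \
-p '[{"op": "replace", "path": "/spec/mountOptions", "value": [] }]'
done
----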

. Finally, replace the existing CSI driver `holder` pods (if they're present on your cluster) with updated pods to avoid spurious `DaemonSetRolloutStuck` alerts.
+
IMPORTANT: Delete the holder pods on each node only after draining that node, to ensure that no Ceph-CSI mounts are active on the node.
+
[source,bash]
----
node_selector="node-role.kubernetes.io/worker" <1>
timeout=300s <2>
for node in $(kubectl get node -o name -l $node_selector); do
  echo "Draining $node"
  if ! kubectl drain --ignore-daemonsets --delete-emptydir-data --timeout=$timeout $node
  then
    echo "Drain of $node failed... exiting"
    break
  fi
  echo "Deleting holder pods for $node"
  kubectl -n syn-rook-ceph-operator delete pods \
    --field-selector spec.nodeName=${node//node\/} -l app=csi-cephfsplugin-holder
  kubectl -n syn-rook-ceph-operator delete pods \
    --field-selector spec.nodeName=${node//node\/} -l app=csi-rbdplugin-holder
  echo "Uncordoning $node"
  kubectl uncordon $node
done
----
<1> Adjust the node selector to match the set of nodes you want to drain.
<2> Adjust the timeout if you expect node drains to take more or less than 5 minutes.
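+
Once all nodes have been processed, you can verify that the holder pods were recreated from the updated DaemonSets. This is a quick sketch using the namespace and pod labels from the commands above; all listed pods should have a recent age:
+
[source,bash]
----
kubectl -n syn-rook-ceph-operator get pods \
  -l 'app in (csi-cephfsplugin-holder,csi-rbdplugin-holder)' -o wide
----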
1 change: 1 addition & 0 deletions docs/modules/ROOT/partials/nav.adoc
@@ -10,6 +10,7 @@
* xref:how-tos/setup-cluster.adoc[Setup a PVC-based Ceph cluster]
* xref:how-tos/scale-cluster.adoc[Scale a PVC-based Ceph cluster]
* xref:how-tos/configure-ceph.adoc[Configuring and tuning Ceph]
* xref:how-tos/upgrade-ceph-csi-v3.9.adoc[]

.Alert runbooks

2 changes: 1 addition & 1 deletion tests/golden/cephfs/rook-ceph/rook-ceph/99_cleanup.yaml
@@ -86,7 +86,7 @@ spec:
env:
- name: HOME
value: /home
-image: docker.io/bitnami/kubectl:1.28.3@sha256:1364cda0798b2c44f327265397fbd34a32e66d80328d6e50a2d10377d7e2ff6d
+image: docker.io/bitnami/kubectl:1.28.3@sha256:0defec793112fa610a850a991ed4ad849c853c54fb2136b95bcdf41ff6f96c38
imagePullPolicy: IfNotPresent
name: cleanup-alertrules
ports: []