HPCC-30306 Allow arbitrary script based plane validation #17785

jakesmith · 2023-09-18T11:18:42Z

This can be used to catch problems with plane mounts.

Type of change:

This change is a bug fix (non-breaking change which fixes an issue).
This change is a new feature (non-breaking change which adds functionality).
This change improves the code (refactor or other change that does not change the functionality)
This change fixes warnings (the fix does not alter the functionality or the generated code)
This change is a breaking change (fix or feature that will cause existing behavior to change).
This change alters the query API (existing queries will have to be recompiled)

Checklist:

Smoketest:

Send notifications about my Pull Request position in Smoketest queue.
Test my draft Pull Request.

Testing:

github-actions · 2023-09-18T11:19:03Z

https://track.hpccsystems.com/browse/HPCC-30306
Jira updated

ghalliday

@jakesmith a few comments/questions.

ghalliday · 2023-09-19T08:40:18Z

helm/hpcc/values.schema.json

@@ -544,6 +544,9 @@
            "waitForMount": {
              "type": "boolean"
            },
+            "validatePlaneScript": {
+              "type": "array"


Should include a description about how this is used. Also is it worth restricting the element type:
"items": { "type": "string" }

Does it need an associated documentation Jira? (And similarly for exception hooks: are they documented?)

exceptionHandler are documented in helm/hpcc/docs/expert.md - probably enough?

validatePlaneScript isn't - and it feels like an expert setting too, perhaps it should go under a new expert: level under the plane to make it clear it is of that ilk - and then be documented in expert.md ..

Should include a description about how this is used. Also is it worth restricting the element type:
"items": { "type": "string" }

will change and add a description

ghalliday · 2023-09-19T08:44:08Z

helm/hpcc/templates/_helpers.tpl

+  args:
+  - |
+{{- range $cmd := .cmds }}
+    {{ $cmd }}


Should there be a ; after each command?

no, it's because the block scalar passes it as a script (complete with line breaks), which bash interprets as individual commands (as it would in a script).
e.g. it's the equivalent of:

/bin/bash -c "
echo 'command 1'
echo 'command 2'
echo 'command 3'
"

The example in the JIRA is generated as:

initContainers:
- args:
  - |
    echo "Hello, World!"
    ls -l
    df -h
    end_time=$((SECONDS + 1200)); while [ $SECONDS -lt $end_time ]; do if mountpoint -q '/var/lib/HPCCSystems/hpcc-data'; then device=$(df /var/lib/HPCCSystems/hpcc-data | awk 'NR==2 {print $1}'); if [[ "$device" == "/dev/md0" ]]; then echo "correct mount device"; exit 0; else echo "wrong mount device \"$device\", will look again!"; fi; fi; echo waiting for mount point; sleep 5; done; echo failed; exit 1
  command:
  - /bin/bash
  - -c

.. and executes as expected.
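For anyone wanting to check the line-break behaviour locally, here is a minimal sketch. It assumes only that /bin/bash is available; the three echo commands are stand-ins for whatever the template renders from the cmds list.

```shell
# Stand-in for the rendered block scalar: one command per line, no ';' separators.
script="echo 'command 1'
echo 'command 2'
echo 'command 3'"

# bash -c receives the whole multi-line string as a single argument and
# treats each line as a separate command, exactly as it would in a script file.
output=$(/bin/bash -c "$script")
printf '%s\n' "$output"
```

Each echo runs in turn, so no trailing ; is needed in the template.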

ghalliday · 2023-09-19T08:46:45Z

helm/hpcc/templates/_helpers.tpl

+Pass in dict with volumeName, volumePath and cmds
+*/}}
+{{- define "hpcc.validatePlaneScript" -}}
+- name: {{ printf "validate-plane-script-container-%s" .volumeName }}


Would it be worth combining this into the wait for mount container if possible? What is the extra overhead starting a new container?

well both are a kludge, but I wouldn't see both being used in tandem.
I was talking to the person behind the TF module that provisions the raided nvme on the nodes, and he thinks that it may require a new CSI behind the persistent volume to correctly wait for the mount to be fully ready.
But in the short term, in the next release, he is adding a marker file in the mount this mechanism could look for.

All less than ideal, once k8s pods see the pvc, that should be it - we shouldn't have to wait or check like this, but I think the provisioning of these raided nvme's is unusual and not fully supported by AKS out of the box.
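As a rough sketch of how a validation script could look for such a marker file: the function name, the marker file name and the timeout values below are all hypothetical, since the actual marker has not been defined yet.

```shell
# wait_for_marker DIR MARKER TIMEOUT_SECS INTERVAL_SECS
# Polls for DIR/MARKER until it appears or roughly TIMEOUT_SECS have elapsed.
# Returns 0 if the marker was found, 1 otherwise.
wait_for_marker() {
  local dir=$1 marker=$2 timeout=$3 interval=$4
  local waited=0
  while [ "$waited" -le "$timeout" ]; do
    if [ -f "$dir/$marker" ]; then
      echo "marker found"
      return 0
    fi
    echo "waiting for marker"
    sleep "$interval"
    waited=$((waited + interval))
  done
  echo "failed"
  return 1
}

# Hypothetical use in a validation script (".mount-ready" is an assumed name):
#   wait_for_marker /var/lib/HPCCSystems/hpcc-data .mount-ready 1200 5 || exit 1
```

A non-zero exit from the script would make the init container fail, so the pod would not start against a mount that is not ready.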

ghalliday · 2023-09-19T08:47:52Z

helm/hpcc/templates/_helpers.tpl

+{{- define "hpcc.validatePlaneScript" -}}
+- name: {{ printf "validate-plane-script-container-%s" .volumeName }}
+  {{- include "hpcc.addImageAttrs" . | nindent 2 }}
+  command: ["/bin/bash", "-c"]


minor: Slightly strange inconsistency between this and the wait for mount container - sh v bash and the way -c is passed.

agree, but the use of /bin/sh vs /bin/bash is split roughly 50/50 across _helpers.tpl.
This is the only place that has "-c" as part of the command list though, so I'll move it to the 1st arg to be more consistent.

bash is richer, so I think it's okay to use here. Perhaps we should unify on it elsewhere?
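To illustrate why bash matters here: the JIRA example relies on [[ ... ]] and $SECONDS, both of which are bash features that POSIX sh does not guarantee. A small sketch, with a hard-coded device value standing in for the df lookup:

```shell
# Runs a snippet under bash that uses two bash-only constructs from the
# JIRA example: the [[ ]] conditional and the SECONDS timer variable.
result=$(/bin/bash -c '
device="/dev/md0"            # hypothetical value; the real script derives it from df
end_time=$((SECONDS + 5))    # SECONDS counts seconds since the shell started
if [[ "$device" == "/dev/md0" && $SECONDS -lt $end_time ]]; then
  echo "correct mount device"
fi
')
echo "$result"
```

Under a strict POSIX shell the same snippet may fail on [[, so scripts supplied by users are safer run with bash.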

jakesmith · 2023-09-19T12:43:23Z

@ghalliday - please see responses and new commit.

ghalliday

@jakesmith please squash

jakesmith requested a review from ghalliday September 18, 2023 15:49

ghalliday reviewed Sep 19, 2023

jakesmith requested a review from ghalliday September 19, 2023 12:42

ghalliday approved these changes Sep 21, 2023

HPCC-30306 Allow arbitrary script based plane validation

8f00aa5

This can be used catch problems with plane mounts

Signed-off-by: Jake Smith <[email protected]>

jakesmith force-pushed the HPCC-30306-plane-validation branch from 4ec4ad2 to 8f00aa5 Compare September 21, 2023 16:23

ghalliday merged commit 83fe505 into hpcc-systems:candidate-9.2.x Sep 21, 2023
7 checks passed
