Added docs for HPC on K8s. #17
A few thoughts/tweaks that ought to be made before this can be merged.
It is also worth considering locating these docs within an existing section (perhaps a section within HPC Workflows titled "HPC with Kubernetes"?).
docs/docs/hpc-on-k8s/genome.md
Outdated
spec:
  containers:
    - name: dummy-job
      image: shubhamdang/fastqc_python_image:latest
Is this still the image repository or has the openflight one been created now?
Updated; the image is now present in the openflighthpc Docker Hub account.
docs/docs/hpc-on-k8s/genome.md
Outdated
- name: MINIO_AKEY
  value: "Mq6wmeNk0NOc0vD9Efut"
- name: MINIO_SKEY
  value: "Z3ETBqC3GuIiU9PomjBbmmC5h8I5I7WgN1wNWlCG"
Not sure if these are sensitive secret keys or example ones but may be worth noting in the documentation what these keys should be set to (and maybe where to find them?)
Added a doc in the MinIO page on creating an access key, and added the MinIO link to the workload notes.
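One way to keep the keys out of the workload manifests altogether is to read them from a Kubernetes Secret. The fragment below is only a sketch: the Secret name `minio-credentials` and its key names `accessKey` and `secretKey` are hypothetical, and the docs in this PR may well keep plain placeholder values instead.

```yaml
# Hypothetical Secret holding the MinIO credentials (name and keys are assumptions)
apiVersion: v1
kind: Secret
metadata:
  name: minio-credentials
type: Opaque
stringData:
  accessKey: "<minio-access-key>"
  secretKey: "<minio-secret-key>"
---
# Fragment of a container spec (not a complete resource):
# the env entries read their values from the Secret above
env:
  - name: MINIO_AKEY
    valueFrom:
      secretKeyRef:
        name: minio-credentials
        key: accessKey
  - name: MINIO_SKEY
    valueFrom:
      secretKeyRef:
        name: minio-credentials
        key: secretKey
```

With this layout the job manifests can be shared without containing any credentials; only the Secret differs per cluster.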
docs/docs/hpc-on-k8s/tensorflow.md
Outdated
spec:
  containers:
    - name: dummy-job
      image: shubhamdang/custom_tensorflow:1.15.5
Is this still the image repository or has the openflight one been created now?
Updated; the image is now present in the openflighthpc Docker Hub account.
docs/docs/hpc-on-k8s/hadoop.md
Outdated
- name: MINIO_AKEY
  value: "Mq6wmeNk0NOc0vD9Efut"
- name: MINIO_SKEY
  value: "Z3ETBqC3GuIiU9PomjBbmmC5h8I5I7WgN1wNWlCG"
Not sure if these are sensitive secret keys or example ones but may be worth noting in the documentation what these keys should be set to (and maybe where to find them?)
Added a doc in the MinIO page on creating an access key, and added the MinIO link to the workload notes.
docs/docs/hpc-on-k8s/pytorch.md
Outdated
- name: MINIO_AKEY
  value: "Mq6wmeNk0NOc0vD9Efut"
- name: MINIO_SKEY
  value: "Z3ETBqC3GuIiU9PomjBbmmC5h8I5I7WgN1wNWlCG"
Not sure if these are sensitive secret keys or example ones but may be worth noting in the documentation what these keys should be set to (and maybe where to find them?)
Added a doc in the MinIO page on creating an access key, and added the MinIO link to the workload notes.
docs/docs/hpc-on-k8s/simpy.md
Outdated
- name: MINIO_AKEY
  value: "Mq6wmeNk0NOc0vD9Efut"
- name: MINIO_SKEY
  value: "Z3ETBqC3GuIiU9PomjBbmmC5h8I5I7WgN1wNWlCG"
Not sure if these are sensitive secret keys or example ones but may be worth noting in the documentation what these keys should be set to (and maybe where to find them?)
Added a doc in the MinIO page on creating an access key, and added the MinIO link to the workload notes.
docs/docs/hpc-on-k8s/tensorflow.md
Outdated
- name: MINIO_AKEY
  value: "Mq6wmeNk0NOc0vD9Efut"
- name: MINIO_SKEY
  value: "Z3ETBqC3GuIiU9PomjBbmmC5h8I5I7WgN1wNWlCG"
Not sure if these are sensitive secret keys or example ones but may be worth noting in the documentation what these keys should be set to (and maybe where to find them?)
Added a doc in the MinIO page on creating an access key, and added the MinIO link to the workload notes.
Updated |
Some more feedback on the docs with things to improve. I am yet to retest the Hadoop workflow and will take a look at PyTorch, SimPy and TensorFlow shortly, but this gives some things to address in the meantime.
docs/docs/hpc-on-k8s/minio.md
Outdated
- storageClass:
- Minio rootUser
- Minio rootPassword
- Minio ServiceType
- Minio API port
- Minio console port
I think it's worth doing something with this section, either:
- Explaining what these configurations are
- Removing it entirely (as I don't see why we need to mention them here?)
Regardless, using the CLI arg names (e.g. `rootUser`) is what flags spellcheck errors; however, these can be alleviated by wrapping them in backticks to make them inline code. This will also address removing these from codespell.
updated thanks
```bash
helm repo add bitnami https://charts.bitnami.com/bitnami

helm install -n default --set global.storageClass=longhorn --set auth.rootUser=admin --set auth.rootPassword=test123456 --set service.type=NodePort --set service.nodePorts.api=31100 --set service.nodePorts.console=31101 --set persistence.size=2Gi my-minio bitnami/minio --version 12.9.4
```
I know that there were some issues in testing where the previous 8Gi persistence size was a problem, but just changing it to 2Gi could also bring up issues with file storage.
Perhaps there should be a note or some information about considerations and constraints for the storage size.
updated thanks
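To make the storage size easier to call out and adjust, the same settings could live in a values file rather than a long chain of `--set` flags. This is a sketch only: the file name `values.yaml` is an assumption, and the keys simply mirror the `--set` paths in the command above.

```yaml
# values.yaml (hypothetical file name); keys mirror the --set flags in the helm command
global:
  storageClass: longhorn
auth:
  rootUser: admin
  rootPassword: test123456
service:
  type: NodePort
  nodePorts:
    api: 31100
    console: 31101
persistence:
  # Size of the MinIO persistent volume; adjust to the data the workloads will store
  # and to the capacity available in the storage class
  size: 2Gi
```

It would then be passed with `helm install -n default -f values.yaml my-minio bitnami/minio --version 12.9.4`.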
@@ -0,0 +1,129 @@
# Steps to create bio-user in bio namespace
I think this file needs some explanation of:
- What the users are for
- What the users have access to
updated thanks
docs/docs/hpc-on-k8s/user.md
Outdated
```bash
kubectl config use-context bio-context
```

## Steps to create physics-user in bio namespace
This step being an H2 puts it at the same level as the rest of the docs. I tripped up trying to add this user because I followed the doc in order, switched to the `bio-context`, and then received errors.
Fixing the formatting and noting the differences between the contexts (as mentioned in the comment earlier in this file) should clear this up.
updated thanks
docs/docs/hpc-on-k8s/genome.md
Outdated
kind: Job
metadata:
  generateName: fastqc-
  namespace: default
As discussed IRL, I think it's worth this job using the `bio-context`; otherwise the users section isn't being put to any use.
updated thanks
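For illustration, the job metadata after that change might look like the fragment below. It assumes the namespace behind `bio-context` is named `bio` (taken from the "bio namespace" heading in `user.md`); the rest of the Job spec is omitted.

```yaml
# Fragment only: Job metadata targeting the bio namespace instead of default
apiVersion: batch/v1
kind: Job
metadata:
  generateName: fastqc-
  namespace: bio   # assumed namespace name, matching the bio-user setup
```

Because the manifest uses `generateName`, it needs to be submitted with `kubectl create -f` rather than `kubectl apply -f`.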
docs/docs/hpc-on-k8s/hadoop.md
Outdated
  value: <minio ip like "10.151.15.78">
- name: MINIO_PORT
  value: <minio port like "31100">
# Minio Access Key
- name: MINIO_AKEY
  value: <minio-access-key>
# Minio Secret Key
- name: MINIO_SKEY
  value: <minio-secret-key>
- name: BUCKET_NAME
  value: <minio bucket name like "hadoop">
We hit some issues IRL with unquoted strings; perhaps putting the entire `<>` section in quotes would make this a little clearer (especially as the access key and secret key placeholders give no indication that quoting is needed).
updated thanks
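A sketch of what the quoted form could look like, with each whole placeholder wrapped in quotes. The placeholder wording here is illustrative rather than the text that landed in the docs.

```yaml
# Fragment of a container spec: every env value is a quoted string
env:
  - name: MINIO_IP
    value: "<minio ip, e.g. 10.151.15.78>"
  - name: MINIO_PORT
    # Kubernetes env values must be strings; an unquoted 31100 is parsed as an
    # integer by YAML and the manifest is rejected
    value: "<minio port, e.g. 31100>"
  - name: MINIO_AKEY
    value: "<minio-access-key>"
  - name: MINIO_SKEY
    value: "<minio-secret-key>"
  - name: BUCKET_NAME
    value: "<minio bucket name, e.g. hadoop>"
```

Quoting the whole placeholder means a reader who replaces only the text between the quotes always ends up with a valid string value.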
- Docker must be installed on all nodes of the Kubernetes cluster.
- Password-less SSH access must be set up from the rke node to all Kubernetes nodes.
I experienced issues bringing up `rke` before the firewall was disabled (or appropriate rules were set up for internal network communication); these were only fixed after correcting the firewall and then restarting Docker.
It seems sensible, then, to make one of the prerequisites firewall related.
updated thanks
docs/docs/hpc-on-k8s/genome.md
Outdated
- name: MINIO_IP
  value: <minio ip like "10.151.15.78">
- name: MINIO_PORT
  value: <minio port like "31100">
# Minio Access Key
- name: MINIO_AKEY
  value: <minio-access-key>
# Minio Secret Key
- name: MINIO_SKEY
  value: <minio-secret-key>
- name: BUCKET_NAME
  value: <minio bucket name like "genome">
We hit some issues IRL with unquoted strings; perhaps putting the entire `<>` section in quotes would make this a little clearer (especially as the access key and secret key placeholders give no indication that quoting is needed).
updated thanks
4a87c80 to 82e6f68