From 06d08e2b518c3a5b32a2fd509a24b4ee0d2cd58b Mon Sep 17 00:00:00 2001 From: XaverStiensmeier <36056823+XaverStiensmeier@users.noreply.github.com> Date: Thu, 26 Sep 2024 13:55:32 +0200 Subject: [PATCH] Dev quick fixes (#536) * fixed rule setting for security groups * fixed bugs caused by multiple networks now being a list. * trying to figure out why route applying only works once. * Added more echoes for better debugging. * updated most tests * fixed validate_configuration.py tests. * Updated tests for startup.py * fixed bug in terminate that caused assume_yes to work as assume_no * updated terminate_cluster tests. * fixed formatting, improved pylint * adapted tests * updated return threading test * updated provider_handler * tests not finished yet * Fixed server regex issue * test list clusters updated * fixed too open cluster_id regex * added missing "to" * fixed id_generation tests * renamed configuration handler to please linter * removed unnecessary tests and updated remaining * fixed remaining "subnet list gets handled as a single subnet" bug and finalized multiple routes handling. * updated tests not finished yet * improved code style * fixed tests further. One to fix left. * fixed additional tests * fixed all tests for ansible configurator * fixed comment * fixed multiple tests * fixed a few tests * Fixed create * fixed some issues regarding * fixing test_provider.py * removed infrastructure_cloud.yml * minor fixes * fixed all tests * removed print * changed prints to log * removed log * fixed None bug where [] is expected when no sshPublicKeyFile is given. * removed master from compute if useMasterAsCompute is false * restructured role additional in order to make it easier to include. Added quotes for consistency. * Updated all tests (#448) * updated most tests * fixed validate_configuration.py tests. * Updated tests for startup.py * fixed bug in terminate that caused assume_yes to work as assume_no * updated terminate_cluster tests. * fixed formatting, improved pylint * adapted tests * updated return threading test * updated provider_handler * tests not finished yet * Fixed server regex issue * test list clusters updated * fixed too open cluster_id regex * added missing "to" * fixed id_generation tests * renamed configuration handler to please linter * removed unnecessary tests and updated remaining * updated tests not finished yet * improved code style * fixed tests further. One to fix left. * fixed additional tests * fixed all tests for ansible configurator * fixed comment * fixed multiple tests * fixed a few tests * Fixed create * fixed some issues regarding * fixing test_provider.py * removed infrastructure_cloud.yml * minor fixes * fixed all tests * removed print * changed prints to log * removed log * Introduced yaml lock (#464) * removed unnecessary close * simplified update_hosts * updated logging to separate folder and file based on creation date * many small changes and introduced locks * restructured log files again. Removed outdated key warnings from bibigrid.yml * added a few logs * further improved logging hierarchy * Added specific folder places for temporary job storage. This might solve the "SlurmSpoolDir full" bug. * Improved logging * Tried to fix temps and tried updating to 23.11, but it has errors, so that part is commented out * added initial space * added deletion of existing workers on worker startup, since no worker would have been started if Slurm had known about the existing worker. This is not the best solution. 
(#468) * made waitForServices a cloud-specific key (#465) * Improved log messages in validate_configuration.py to make fixing your configuration easier when using a hybrid-/multi-cloud setup (#466) * removed unnecessary line in provider.py and added cloud information to every log in validate_configuration.py for easier fixing. * track resources for providers separately to make quota checking precise * switched from low-level cinder to high-level block_storage.get_limits() * added keyword for ssh_timeout and improved argument passing for ssh. * Update issue templates * fixed a missing LOG * removed overwritten variable instantiation * Update bug_report.md * removed trailing whitespace * added comment about sshTimeout key * Create dependabot.yml (#479) * Code cleanup and minor improvement (#482) * fixed :param and :return to @param and @return * many spelling mistakes fixed * added bibigrid_version to common configuration * added timeout to common_configuration * removed debug verbosity and improved log message wording * fixed is_active structure * fixed pip dependabot.yml * added documentation. Changed timeout to 2**(2+attempts) to decrease the number of attempts that are unlikely to work * 474 allow non-on-demand (permanent) workers (#487) * added worker server start without anything else * added host entry for permanent workers * added state unknown for permanent nodes * added on_demand key for groups and instances for ansible templating * fixed wording * temporary solution for custom execute list * added documentation for onDemand * added ansible.cfg replacement * fixed path. Added ansible.cfg to the gitignore * updated default creation and gitignore. Fixed non-vital bug that didn't reset hosts for new cluster start. * Code cleanup (#490) * fixed :param and :return to @param and @return * many spelling mistakes fixed * added bibigrid_version to common configuration * attempted zabbix linting fix. Needs testing. * fixed double import * Slurm upgrade fixes (#473) * removed slurm errors * added bibilog to show the output log of the most recent worker start. Tried fixing the slurm 23.11 bug. * fixed a few vpnwkr -> vpngtw remnants. Excluded vpngtw from slurm setup * improved comments regarding changes and versions * removed cgroupautomount as it is defunct * Moved explicit slurm start to avoid errors caused by resume and suspend programs not being copied to their final location yet * added word for clarification * Fixed non-fatal bug that led to non-zero exits on runs without any error. 
* changed slurm apt package to slurm-bibigrid * set version to 23.11.* * added a few more checks to make sure everything is set up before installing packages * Added configuration pinning * changed ignore_error to failed_when false * fixed or ignored lint fatals * Update tests (#493) * updated tests * removed print * updated tests * updated tests * fixed too loose condition * updated tests * added cloudScheduling and userRoles in bibigrid.yml * added userRoles in documentation * added varsFiles and comments * added folder path in documentation * fixed naming * added that vars are optional * polished userRoles documentation * 439 additional ansible roles (#495) * added roles structure * updated roles_path * fixed upper/lower case * improved customRole implementation * minor fixes regarding role_paths * improved variable naming of user_roles * added documentation for other configurations * added new feature keys * fixed template files not being j2 * added helpful comments and removed no longer used roles/additional/ * userRoles crashes if no role is set * fixed ansible.cfg path '"' * implemented partition system * added keys customAnsibleCfg and customSlurmConf as keys that stop the automatic copying * improved spacing * added logging * updated documentation * updated tests. Improved formatting * fix for service being too fast for startup * fixed remote src * changed RESUME to POWER_DOWN and removed the delete call, which is now handled via Slurm calling terminate.sh (#503) * Update check (#499) * updated validate_configuration.py in order to provide schema validation. Moved cloud_identifier setting even closer to program start in order to be able to log better when performing actions other than create. * small log change and fix of schema key vpnInstance * updated tests * removed no longer relevant test * added schema validation tests * fixed ftype. Errors with multiple volumes. * made automount bound to defined mountPoints and therefore customizable * added empty line and updated bibigrid.yml * fixed nfsshare regex error and updated check to fit the new name mountpoint pattern * hotfix: folder creation now happens before accessing hosts.yml * fixed tests * moved dnsmasq installation in front of /etc/resolv removal * fixed tests * fixed nfs exports by removing unnecessary "/" at the beginning * fixed master running slurmd but not being listed in slurm.conf. Now set to drained. * improved logging * increased timeout. Corrected comment in slurm.j2 * updated info regarding timeouts (changed from 4 to 5). * added SuspendTimeout as optional to elastic_scheduling * updated documentation * permission fix * fixes #394 * fixes #394 (also for hybrid cluster) * increased ResumeTimeout by 5 minutes. yml to yaml * changed all yml to yaml (as preferred by yaml) * updated timeouts. updated tests * fixes #394 - remove host from zabbix when terminated * zabbix api no longer used when not set in configuration * pleased linting by using false instead of no * added logging of traceroute, even if the debug flag is not set, when the error is not known. Added a few other logs * Update action 515 (#516) * configuration update possible 515 * added experimental * fixed indentation * fixed missing newline at EOF. Summarized restarts. * added check for running workers * fixed multiple workers due to faulty update * updated tests and removed done todos * updated documentation * removed print * Added apt-reactivate-auto-update to reactivate updates at the end of the playbook run (#518) * changed theia to 900. 
Added apt-reactivate-auto-update as new 999. * added new line at end of file * changed list representation * added multiple configuration keys for boot volume handling * updated documentation * updated documentation for new volumes and for usually ignored keys * updated and added tests * Pleasing Dependabot * Linting now uses python 3.10 * added early termination when configuration file not found * added dontUploadCredentials documentation * fixed broken links * added dontUploadCredentials to schema valiation * fixed dontUploadCredential ansible start bug * prevented BiBiGrid from looking for other keys if created key doesn't work to spot key issues earlier * prevented BiBiGrid from looking for other keys if created key doesn't work to spot key issues earlier * updated requirements.txt * restricted clouds.yaml access * moved openstack credentials permission change to server only * added '' to 3.10 * converted implicit to explicit octet notation * added "" and fixed a few more implicit octets * added "" * added missing " * added allow_agent=False to further prevent BiBiGrid from looking for keys * removed hardcoded /vol/ * updated versions * removed unnecessary comments and commented out Workflow execution --------- Co-authored-by: Jan Krueger --- .github/workflows/linting.yml | 4 +-- bibigrid/core/actions/create.py | 8 ++--- .../utility/handler/configuration_handler.py | 3 ++ bibigrid/core/utility/handler/ssh_handler.py | 2 +- bibigrid/core/utility/validate_schema.py | 3 +- .../markdown/features/configuration.md | 23 +++++++----- requirements-dev.txt | 2 +- requirements-rest.txt | 14 ++++---- requirements.txt | 14 ++++---- .../bibigrid/tasks/000-add-ip-routes.yaml | 8 ++--- .../tasks/000-playbook-rights-server.yaml | 2 +- .../roles/bibigrid/tasks/001-apt.yaml | 2 +- .../bibigrid/tasks/002-wireguard-vpn.yaml | 4 +-- .../roles/bibigrid/tasks/003-dns.yaml | 6 ++-- .../roles/bibigrid/tasks/010-bin-server.yaml | 4 +-- .../bibigrid/tasks/011-zabbix-agent.yaml | 8 ++--- .../bibigrid/tasks/011-zabbix-server.yaml | 10 +++--- .../tasks/020-disk-server-automount.yaml | 4 +-- .../roles/bibigrid/tasks/020-disk-worker.yaml | 2 +- .../roles/bibigrid/tasks/020-disk.yaml | 8 ++--- .../roles/bibigrid/tasks/025-nfs-server.yaml | 2 +- .../roles/bibigrid/tasks/025-nfs-worker.yaml | 2 +- .../roles/bibigrid/tasks/030-docker.yaml | 2 +- .../bibigrid/tasks/042-slurm-server.yaml | 36 +++++++++++-------- .../roles/bibigrid/tasks/042-slurm.yaml | 16 ++++----- .../roles/bibigrid/tasks/900-theia.yaml | 10 +++--- .../tasks/999-apt-reactivate-auto-update.yaml | 2 +- .../resistance_nextflow/tasks/main.yml | 14 ++++---- 28 files changed, 116 insertions(+), 99 deletions(-) diff --git a/.github/workflows/linting.yml b/.github/workflows/linting.yml index 5035d0d0..ab0c8837 100644 --- a/.github/workflows/linting.yml +++ b/.github/workflows/linting.yml @@ -5,10 +5,10 @@ jobs: runs-on: ubuntu-latest steps: - uses: actions/checkout@v3 - - name: Set up Python 3.8 + - name: Set up Python 3.10 uses: actions/setup-python@v4 with: - python-version: 3.8 + python-version: '3.10' - name: Install dependencies run: | python -m pip install --upgrade pip diff --git a/bibigrid/core/actions/create.py b/bibigrid/core/actions/create.py index 374a375d..dea3d45b 100644 --- a/bibigrid/core/actions/create.py +++ b/bibigrid/core/actions/create.py @@ -368,12 +368,12 @@ def upload_data(self, private_key, clean_playbook=False): ansible_configurator.configure_ansible_yaml(providers=self.providers, configurations=self.configurations, 
cluster_id=self.cluster_id, log=self.log) + ansible_start = ssh_handler.ANSIBLE_START + ansible_start[-1] = (ansible_start[-1][0].format(",".join(self.permanents)), ansible_start[-1][1]) + self.log.debug(f"Starting playbook with {ansible_start}.") if self.configurations[0].get("dontUploadCredentials"): - commands = ssh_handler.ANSIBLE_START + commands = ansible_start else: - ansible_start = ssh_handler.ANSIBLE_START - ansible_start[-1] = (ansible_start[-1][0].format(",".join(self.permanents)), ansible_start[-1][1]) - self.log.debug(f"Starting playbook with {ansible_start}.") commands = [ssh_handler.get_ac_command(self.providers, AC_NAME.format( cluster_id=self.cluster_id))] + ssh_handler.ANSIBLE_START if clean_playbook: diff --git a/bibigrid/core/utility/handler/configuration_handler.py b/bibigrid/core/utility/handler/configuration_handler.py index 608c8e42..1a73cc9a 100644 --- a/bibigrid/core/utility/handler/configuration_handler.py +++ b/bibigrid/core/utility/handler/configuration_handler.py @@ -3,6 +3,7 @@ """ import os +import sys import mergedeep import yaml @@ -31,8 +32,10 @@ def read_configuration(log, path, configuration_list=True): configuration = yaml.safe_load(stream) except yaml.YAMLError as exc: log.warning("Couldn't read configuration %s: %s", path, exc) + sys.exit(1) else: log.warning("No such configuration file %s.", path) + sys.exit(1) if configuration_list and not isinstance(configuration, list): log.warning("Configuration should be list. Attempting to rescue by assuming a single configuration.") return [configuration] diff --git a/bibigrid/core/utility/handler/ssh_handler.py b/bibigrid/core/utility/handler/ssh_handler.py index 0a742318..62fa312f 100644 --- a/bibigrid/core/utility/handler/ssh_handler.py +++ b/bibigrid/core/utility/handler/ssh_handler.py @@ -111,7 +111,7 @@ def is_active(client, paramiko_key, ssh_data, log): log.info(f"Attempt {attempts}/{ssh_data['timeout']}. 
Connecting to {ssh_data['floating_ip']}") client.connect(hostname=ssh_data['gateway'].get("ip") or ssh_data['floating_ip'], username=ssh_data['username'], pkey=paramiko_key, timeout=7, - auth_timeout=ssh_data['timeout'], port=port) + auth_timeout=ssh_data['timeout'], port=port, look_for_keys=False, allow_agent=False) establishing_connection = False log.info(f"Successfully connected to {ssh_data['floating_ip']}.") except paramiko.ssh_exception.NoValidConnectionsError as exc: diff --git a/bibigrid/core/utility/validate_schema.py b/bibigrid/core/utility/validate_schema.py index eb4c84b3..4b2e3295 100644 --- a/bibigrid/core/utility/validate_schema.py +++ b/bibigrid/core/utility/validate_schema.py @@ -33,7 +33,8 @@ Optional('zabbix'): bool, Optional('nfs'): bool, Optional('ide'): bool, Optional('useMasterAsCompute'): bool, Optional('useMasterWithPublicIp'): bool, Optional('waitForServices'): [str], Optional('bootVolume'): str, Optional('bootFromVolume'): bool, Optional('terminateBootVolume'): bool, Optional('volumeSize'): int, - Optional('gateway'): {'ip': str, 'portFunction': str}, Optional('fallbackOnOtherImage'): bool, + Optional('gateway'): {'ip': str, 'portFunction': str}, Optional('dontUploadCredentials'): bool, + Optional('fallbackOnOtherImage'): bool, Optional('localDNSLookup'): bool, Optional('features'): [str], 'workerInstances': [ WORKER], 'masterInstance': MASTER, diff --git a/documentation/markdown/features/configuration.md b/documentation/markdown/features/configuration.md index 37a644a5..34ba43da 100644 --- a/documentation/markdown/features/configuration.md +++ b/documentation/markdown/features/configuration.md @@ -107,9 +107,9 @@ nfsshares: `nfsShares` expects a list of folder paths to share over the network using nfs. In every case, `/vol/spool/` is always an nfsShare. -This key is only relevant if the [nfs key](#nfs-optional) is set `True`. +This key is only relevant if the [nfs key](#nfs-optionalfalse) is set `True`. -If you would like to share a [masterMount](#mastermounts-optional), take a look [here](../software/nfs.md#mount-volume-into-share). +If you would like to share a [masterMount](#mastermounts-optionalfalse), take a look [here](../software/nfs.md#mount-volume-into-share).
@@ -191,7 +191,7 @@ If `True`, [nfs](../software/nfs.md) is set up. Default is `False`. #### ide (optional:False) If `True`, [Theia Web IDE](../software/theia_ide.md) is installed. -After creation connection information is [printed](../features/create.md#prints-cluster-information). Default is `False`. +After creation connection information is [printed](../features/create.md). Default is `False`. #### useMasterAsCompute (optional:True) @@ -217,7 +217,15 @@ gateway: portFunction: 30000 + oct4 # variables are called: oct1.oct2.oct3.oct4 ``` -Using gateway also automatically sets [useMasterWithPublicIp](#usemasterwithpublicip-optional) to `False`. +Using gateway also automatically sets [useMasterWithPublicIp](#usemasterwithpublicip-optionaltrue) to `False`. + +#### dontUploadCredentials (optional:True) +Usually, BiBiGrid will upload your credentials to the cluster. This is necessary for on demand scheduling. +However, if all your nodes are permanent (i.e. not on demand), you do not need to upload your credentials. +In such cases you can set `dontUploadCredentials: True`. + +This also allows for external node schedulers by using the Slurm REST API to decide whether a new node should be started or not. +[SimpleVM](https://cloud.denbi.de/about/project-types/simplevm/) is scheduling that way. ### Local @@ -288,7 +296,7 @@ most recent release is active, you can use `Ubuntu 22.04 LTS \(.*\)` so it alway This regex will also be used when starting worker instances on demand and is therefore mandatory to automatically resolve image updates of the described kind while running a cluster. -There's also a [Fallback Option](#fallbackonotherimage-optional). +There's also a [Fallback Option](#fallbackonotherimage-optionalfalse). ##### Find your active `type`s `flavor` is just the OpenStack terminology for `type`. @@ -364,14 +372,14 @@ openstack subnet list --os-cloud=openstack #### features (optional) -Cloud-wide slurm [features](#whats-a-feature) that are attached to every node in the cloud described by the configuration. +Cloud-wide slurm features that are attached to every node in the cloud described by the configuration. If both [worker group](#workerinstances) or [master features](#masterInstance) and configuration features are defined, they are merged. If you only have a single cloud and therefore a single configuration, this key is not helpful as a feature that is present at all nodes can be omitted as it can't influence the scheduling. #### bootFromVolume (optional:False) If True, the instance will boot from a volume created for this purpose. Keep in mind that on demand scheduling can lead -to multiple boots of the same configurated node. If you don't make use of [terminateBootVolume](#terminatebootvolume-optionalfalse) +to multiple boots of the same configurated node. If you don't make use of [terminateBootVolume](#terminatebootvolume-optionaltrue) this will lead to many created volumes. #### volumeSize (optional:50) @@ -380,4 +388,3 @@ The created volume's size if you use [bootFromVolume](#bootfromvolume-optionalfa #### terminateBootVolume (optional:True) If True, once the instance is shut down, boot volume is destroyed. This does not affect other attached volumes. Only the boot volume is affected. 
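The `dontUploadCredentials` documentation added above describes the key only in prose. As a minimal sketch (not part of this patch), a purely permanent cluster configuration might look like the following; the surrounding master/worker keys (`type`, `image`, `count`, `onDemand`) are illustrative placeholders taken from the rest of the BiBiGrid configuration documentation, and only `dontUploadCredentials` is the key introduced here:

```yaml
# Configuration excerpt -- illustrative sketch, all values are placeholders.
- infrastructure: openstack
  cloud: openstack
  # All nodes are permanent, so no on-demand scheduling happens on the master
  # and the OpenStack credentials do not need to be uploaded to the cluster.
  dontUploadCredentials: True
  masterInstance:
    type: de.NBI tiny
    image: 'Ubuntu 22.04 LTS \(.*\)'
  workerInstances:
    - type: de.NBI tiny
      image: 'Ubuntu 22.04 LTS \(.*\)'
      count: 2
      onDemand: False  # assumed per-group key for permanent workers (see #487)
```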
- diff --git a/requirements-dev.txt b/requirements-dev.txt index ef9bc405..0aee6fe8 100644 --- a/requirements-dev.txt +++ b/requirements-dev.txt @@ -1,2 +1,2 @@ -ansible_lint==6.8.0 +ansible_lint==24.9.0 pylint==2.14.5 diff --git a/requirements-rest.txt b/requirements-rest.txt index 57bc62e8..5c48187c 100644 --- a/requirements-rest.txt +++ b/requirements-rest.txt @@ -1,14 +1,14 @@ openstacksdk==0.62 -mergedeep -paramiko +mergedeep~=1.3.4 +paramiko~=3.4.0 python-cinderclient python-keystoneclient python-novaclient python-openstackclient==6.0.0 -PyYAML -shortuuid -sshtunnel -fastapi +PyYAML~=6.0 +shortuuid~=1.0.13 +sshtunnel~=0.4.0 +fastapi~=0.113.0 python-multipart -uvicorn +uvicorn~=0.23.2 httpx \ No newline at end of file diff --git a/requirements.txt b/requirements.txt index f5c73d10..acb744f8 100644 --- a/requirements.txt +++ b/requirements.txt @@ -1,19 +1,19 @@ openstacksdk==0.62 mergedeep~=1.3.4 -paramiko~=2.12.0 +paramiko~=3.5.0 python-cinderclient python-keystoneclient python-novaclient python-openstackclient==6.0.0 PyYAML~=6.0 -shortuuid~=1.0.11 +shortuuid~=1.0.13 sshtunnel~=0.4.0 sympy~=1.12 -seedir~=0.4.2 -cryptography~=38.0.4 -uvicorn~=0.23.2 +seedir~=0.5.0 +cryptography~=43.0.1 +uvicorn~=0.30.6 fastapi~=0.101.0 -pydantic~=2.1.1 +pydantic~=2.9.2 keystoneauth1~=5.1.0 -filelock~=3.13.1 +filelock~=3.16.1 schema~=0.7.7 \ No newline at end of file diff --git a/resources/playbook/roles/bibigrid/tasks/000-add-ip-routes.yaml b/resources/playbook/roles/bibigrid/tasks/000-add-ip-routes.yaml index e193ccc6..7c10866c 100644 --- a/resources/playbook/roles/bibigrid/tasks/000-add-ip-routes.yaml +++ b/resources/playbook/roles/bibigrid/tasks/000-add-ip-routes.yaml @@ -15,7 +15,7 @@ dest: "{{ item.path }}.disabled" owner: root group: root - mode: 0644 + mode: "0o644" remote_src: true with_items: "{{ collected_files.files }}" - name: Remove collected files @@ -30,7 +30,7 @@ line: "network: {config: disabled}" owner: root group: root - mode: 0644 + mode: "0o644" create: true - name: Generate location specific worker userdata @@ -39,7 +39,7 @@ dest: "/etc/systemd/network/bibigrid_ens3.network" owner: root group: systemd-network - mode: 0640 + mode: "0o640" become: true notify: - systemd-networkd restart @@ -50,7 +50,7 @@ dest: "/etc/systemd/network/bibigrid_ens3.link" owner: root group: systemd-network - mode: 0640 + mode: "0o640" become: true notify: - systemd-networkd restart diff --git a/resources/playbook/roles/bibigrid/tasks/000-playbook-rights-server.yaml b/resources/playbook/roles/bibigrid/tasks/000-playbook-rights-server.yaml index 438e17ac..57ae7161 100644 --- a/resources/playbook/roles/bibigrid/tasks/000-playbook-rights-server.yaml +++ b/resources/playbook/roles/bibigrid/tasks/000-playbook-rights-server.yaml @@ -9,4 +9,4 @@ path: /opt/playbook/ state: directory recurse: true - mode: "0770" + mode: "0o770" diff --git a/resources/playbook/roles/bibigrid/tasks/001-apt.yaml b/resources/playbook/roles/bibigrid/tasks/001-apt.yaml index 0ca1c17b..2fda459f 100644 --- a/resources/playbook/roles/bibigrid/tasks/001-apt.yaml +++ b/resources/playbook/roles/bibigrid/tasks/001-apt.yaml @@ -8,7 +8,7 @@ dest: /etc/apt/apt.conf.d/20auto-upgrades owner: root group: root - mode: 0644 + mode: "0o644" - name: Wait for cloud-init / user-data to finish command: cloud-init status --wait diff --git a/resources/playbook/roles/bibigrid/tasks/002-wireguard-vpn.yaml b/resources/playbook/roles/bibigrid/tasks/002-wireguard-vpn.yaml index d70f10a5..3807c137 100644 --- 
a/resources/playbook/roles/bibigrid/tasks/002-wireguard-vpn.yaml +++ b/resources/playbook/roles/bibigrid/tasks/002-wireguard-vpn.yaml @@ -9,7 +9,7 @@ dest: /etc/systemd/network/wg0.netdev owner: root group: systemd-network - mode: 0640 + mode: "0o640" become: true notify: - systemd-networkd restart @@ -20,7 +20,7 @@ dest: /etc/systemd/network/wg0.network owner: root group: systemd-network - mode: 0640 + mode: "0o640" become: true notify: systemd-networkd restart diff --git a/resources/playbook/roles/bibigrid/tasks/003-dns.yaml b/resources/playbook/roles/bibigrid/tasks/003-dns.yaml index 3515f86e..f739a0f0 100644 --- a/resources/playbook/roles/bibigrid/tasks/003-dns.yaml +++ b/resources/playbook/roles/bibigrid/tasks/003-dns.yaml @@ -18,21 +18,21 @@ template: src: dns/hosts.j2 dest: /etc/dnsmasq.hosts - mode: '0644' + mode: "0o644" notify: dnsmasq - name: Adjust dnsmasq.resolv.conf template: src: dns/resolv.conf.j2 dest: /etc/dnsmasq.resolv.conf - mode: '0644' + mode: "0o644" notify: dnsmasq - name: Adjust dnsmasq conf template: src: dns/dnsmasq.conf.j2 dest: /etc/dnsmasq.conf - mode: '0644' + mode: "0o644" notify: dnsmasq - name: Flush handlers diff --git a/resources/playbook/roles/bibigrid/tasks/010-bin-server.yaml b/resources/playbook/roles/bibigrid/tasks/010-bin-server.yaml index 49256c1a..9712adeb 100644 --- a/resources/playbook/roles/bibigrid/tasks/010-bin-server.yaml +++ b/resources/playbook/roles/bibigrid/tasks/010-bin-server.yaml @@ -14,7 +14,7 @@ copy: src: ~/bin dest: /usr/local - mode: '0775' + mode: "0o775" - name: Delete origin folder file: path: ~{{ ansible_facts.env.SUDO_USER }}/bin @@ -24,4 +24,4 @@ template: src: "bin/bibiname.j2" dest: "/usr/local/bin/bibiname" - mode: '0775' + mode: "0o775" diff --git a/resources/playbook/roles/bibigrid/tasks/011-zabbix-agent.yaml b/resources/playbook/roles/bibigrid/tasks/011-zabbix-agent.yaml index 9a658689..c8aebe50 100644 --- a/resources/playbook/roles/bibigrid/tasks/011-zabbix-agent.yaml +++ b/resources/playbook/roles/bibigrid/tasks/011-zabbix-agent.yaml @@ -11,7 +11,7 @@ file: path: /etc/zabbix/zabbix_agentd.d/ state: directory - mode: 0755 + mode: "0o755" - name: Create zabbix_agent log directory file: @@ -19,20 +19,20 @@ state: directory owner: zabbix group: zabbix - mode: 0755 + mode: "0o755" - name: Adjust zabbix agent configuration template: src: zabbix/zabbix_agentd.conf.j2 dest: /etc/zabbix/zabbix_agentd.conf - mode: 0644 + mode: "0o644" notify: zabbix-agent - name: Copy Zabbix Host delete script copy: src: zabbix/zabbix_host_delete.py dest: /usr/local/bin/zabbix_host_delete.py - mode: 0755 + mode: "0o755" - name: Install zabbix python-api pip: diff --git a/resources/playbook/roles/bibigrid/tasks/011-zabbix-server.yaml b/resources/playbook/roles/bibigrid/tasks/011-zabbix-server.yaml index df2b5def..1a39ffd8 100644 --- a/resources/playbook/roles/bibigrid/tasks/011-zabbix-server.yaml +++ b/resources/playbook/roles/bibigrid/tasks/011-zabbix-server.yaml @@ -69,7 +69,7 @@ template: src: zabbix/zabbix_server.conf.j2 dest: /etc/zabbix/zabbix_server.conf - mode: 0644 + mode: "0o644" notify: zabbix-server - name: Start and Enable zabbix-server @@ -121,7 +121,7 @@ state: directory owner: root group: root - mode: '0755' + mode: "0o755" - name: Adjust zabbix web frontend configuration notify: apache2 @@ -130,12 +130,12 @@ template: src: zabbix/apache.conf.j2 dest: /etc/zabbix/apache.conf - mode: 0644 + mode: "0o644" - name: Adjust zabbix.conf template: src: zabbix/zabbix.conf.php.j2 dest: /etc/zabbix/web/zabbix.conf.php - mode: 0644 + 
mode: "0o644" - name: Start and enable apache web server systemd: @@ -147,7 +147,7 @@ copy: src: zabbix/index.html dest: /var/www/html/index.html - mode: 0644 + mode: "0o644" - name: Force all notified handlers to run at this point meta: flush_handlers diff --git a/resources/playbook/roles/bibigrid/tasks/020-disk-server-automount.yaml b/resources/playbook/roles/bibigrid/tasks/020-disk-server-automount.yaml index 8e4b5f49..fd8c4619 100644 --- a/resources/playbook/roles/bibigrid/tasks/020-disk-server-automount.yaml +++ b/resources/playbook/roles/bibigrid/tasks/020-disk-server-automount.yaml @@ -19,9 +19,9 @@ - name: Create mount folders if they don't exist file: - path: "/vol/{{ item.mount_point }}" + path: "{{ item.mount_point }}" state: directory - mode: '0755' + mode: "0o755" owner: root group: '{{ ansible_distribution | lower }}' diff --git a/resources/playbook/roles/bibigrid/tasks/020-disk-worker.yaml b/resources/playbook/roles/bibigrid/tasks/020-disk-worker.yaml index 4acf3b56..f33fcb6b 100644 --- a/resources/playbook/roles/bibigrid/tasks/020-disk-worker.yaml +++ b/resources/playbook/roles/bibigrid/tasks/020-disk-worker.yaml @@ -11,4 +11,4 @@ file: path: /vol/scratch state: directory - mode: 0777 + mode: "0o777" diff --git a/resources/playbook/roles/bibigrid/tasks/020-disk.yaml b/resources/playbook/roles/bibigrid/tasks/020-disk.yaml index 08b4b980..9948ff3c 100644 --- a/resources/playbook/roles/bibigrid/tasks/020-disk.yaml +++ b/resources/playbook/roles/bibigrid/tasks/020-disk.yaml @@ -4,7 +4,7 @@ state: directory owner: root group: '{{ ansible_distribution | lower }}' - mode: 0775 + mode: "0o775" - name: Create /vol/ directory with rights 0775 owned by root file: @@ -12,13 +12,13 @@ state: directory owner: root group: '{{ ansible_distribution | lower }}' - mode: 0775 + mode: "0o775" - name: Create /vol/spool/ directory with rights 0777 file: path: /vol/spool/ state: directory - mode: 0777 + mode: "0o777" - name: Change rights of /opt directory to 0775 and set group to ansible_distribution file: @@ -26,7 +26,7 @@ state: directory owner: root group: '{{ ansible_distribution | lower }}' - mode: 0775 + mode: "0o775" - name: Create link in '{{ ansible_distribution | lower }}' home file: diff --git a/resources/playbook/roles/bibigrid/tasks/025-nfs-server.yaml b/resources/playbook/roles/bibigrid/tasks/025-nfs-server.yaml index c7030971..03bce9ca 100644 --- a/resources/playbook/roles/bibigrid/tasks/025-nfs-server.yaml +++ b/resources/playbook/roles/bibigrid/tasks/025-nfs-server.yaml @@ -9,7 +9,7 @@ state: directory owner: root group: root - mode: 0777 + mode: "0o777" with_items: - "{{ nfs_mounts }}" diff --git a/resources/playbook/roles/bibigrid/tasks/025-nfs-worker.yaml b/resources/playbook/roles/bibigrid/tasks/025-nfs-worker.yaml index 71217302..c45c3e9a 100644 --- a/resources/playbook/roles/bibigrid/tasks/025-nfs-worker.yaml +++ b/resources/playbook/roles/bibigrid/tasks/025-nfs-worker.yaml @@ -16,7 +16,7 @@ state: directory owner: root group: root - mode: 0777 + mode: "0o777" with_items: - "{{ nfs_mounts }}" diff --git a/resources/playbook/roles/bibigrid/tasks/030-docker.yaml b/resources/playbook/roles/bibigrid/tasks/030-docker.yaml index 07830b2b..b4aad03f 100644 --- a/resources/playbook/roles/bibigrid/tasks/030-docker.yaml +++ b/resources/playbook/roles/bibigrid/tasks/030-docker.yaml @@ -13,7 +13,7 @@ dest: /etc/docker/daemon.json owner: root group: root - mode: 0644 + mode: "0o644" notify: docker diff --git a/resources/playbook/roles/bibigrid/tasks/042-slurm-server.yaml 
b/resources/playbook/roles/bibigrid/tasks/042-slurm-server.yaml index e3a28ac6..059e11ef 100644 --- a/resources/playbook/roles/bibigrid/tasks/042-slurm-server.yaml +++ b/resources/playbook/roles/bibigrid/tasks/042-slurm-server.yaml @@ -1,3 +1,9 @@ +- name: Change group ownership of OpenStack credentials file to slurm + file: + path: /etc/openstack/clouds.yaml + group: slurm + mode: "0o640" # (owner can read/write, group can read, others have no access) + - name: Create slurm db mysql_db: name: "{{ slurm_conf.db }}" @@ -18,7 +24,7 @@ dest: /etc/slurm/slurmdbd.conf owner: slurm group: root - mode: "0600" + mode: "0o600" - name: Generate random JWT Secret command: @@ -30,7 +36,7 @@ path: /etc/slurm/jwt-secret.key owner: slurm group: slurm - mode: "0600" + mode: "0o600" - name: Copy env file for configuration of slurmrestd copy: @@ -38,14 +44,14 @@ dest: /etc/default/slurmrestd owner: root group: root - mode: "0644" + mode: "0o644" - name: Create system overrides directories (slurmdbdm slurmrestd) file: path: "/etc/systemd/system/{{ item }}.service.d" group: root owner: root - mode: "0755" + mode: "0o755" state: directory with_items: - slurmdbd @@ -55,7 +61,7 @@ copy: src: "slurm/systemd/{{ item }}.override" dest: "/etc/systemd/system/{{ item }}.service.d/override.conf" - mode: "0644" + mode: "0o644" owner: root group: root with_items: @@ -75,13 +81,13 @@ group: ansible path: /opt/slurm/ state: directory - mode: "0770" + mode: "0o770" - name: Ensures /etc/slurm dir exists file: path: /etc/slurm/ state: directory - mode: 0755 + mode: "0o755" - name: Ensures /opt/slurm/.ssh/ dir exists file: @@ -89,7 +95,7 @@ group: slurm owner: slurm state: directory - mode: 0700 + mode: "0o700" - name: Copy private key (openstack keypair) copy: @@ -97,7 +103,7 @@ dest: /opt/slurm/.ssh/id_ecdsa owner: slurm group: slurm - mode: "0600" + mode: "0o600" - name: Copy create program script (power) copy: @@ -105,7 +111,7 @@ dest: /opt/slurm/create.sh owner: slurm group: ansible - mode: "0550" + mode: "0o550" - name: Copy terminate program script (power) copy: @@ -113,7 +119,7 @@ dest: /opt/slurm/terminate.sh owner: slurm group: ansible - mode: "0550" + mode: "0o550" - name: Copy fail program script (power) copy: @@ -121,7 +127,7 @@ dest: /opt/slurm/fail.sh owner: slurm group: ansible - mode: "0550" + mode: "0o550" - name: Copy "create_server.py" script copy: @@ -129,7 +135,7 @@ dest: /usr/local/bin/create_server.py owner: slurm group: ansible - mode: "0750" + mode: "0o750" - name: Copy "delete_server.py" script copy: @@ -137,7 +143,7 @@ dest: /usr/local/bin/delete_server.py owner: slurm group: ansible - mode: "0750" + mode: "0o750" - name: Install python dependencies pip: @@ -163,7 +169,7 @@ dest: "/opt/slurm/userdata_{{ hostvars[item].cloud_identifier }}.txt" owner: slurm group: ansible - mode: "0640" + mode: "0o640" with_items: "{{ groups.vpngtw + groups.master }}" - name: Enable slurmdbd and slurmrestd services diff --git a/resources/playbook/roles/bibigrid/tasks/042-slurm.yaml b/resources/playbook/roles/bibigrid/tasks/042-slurm.yaml index d80fb67c..c141009a 100644 --- a/resources/playbook/roles/bibigrid/tasks/042-slurm.yaml +++ b/resources/playbook/roles/bibigrid/tasks/042-slurm.yaml @@ -16,7 +16,7 @@ Pin: version 23.11.* Pin-Priority: 1001 dest: /etc/apt/preferences.d/slurm-bibigrid - mode: '0311' + mode: "0o311" - name: Install slurm-bibigrid package apt: @@ -34,7 +34,7 @@ dest: /etc/munge/munge.key owner: munge group: munge - mode: 0600 + mode: "0o600" notify: - munge @@ -42,7 +42,7 @@ file: path: "{{ item }}" 
state: directory - mode: 0775 + mode: "0o775" owner: root group: slurm with_items: @@ -55,7 +55,7 @@ path: "/etc/systemd/system/{{ item }}.service.d" group: root owner: root - mode: "0755" + mode: "0o755" state: directory with_items: - slurmd @@ -65,7 +65,7 @@ copy: src: "slurm/systemd/{{ item }}.override" dest: "/etc/systemd/system/{{ item }}.service.d/override.conf" - mode: "0644" + mode: "0o644" owner: root group: root with_items: @@ -89,7 +89,7 @@ dest: /etc/slurm/slurm.conf owner: slurm group: root - mode: 0444 + mode: "0o444" - name: Create Job Container configuration template: @@ -97,7 +97,7 @@ dest: /etc/slurm/job_container.conf owner: slurm group: root - mode: 0444 + mode: "0o444" - name: Slurm cgroup configuration copy: @@ -105,7 +105,7 @@ dest: /etc/slurm/cgroup.conf owner: slurm group: root - mode: 0444 + mode: "0o444" - name: Restart slurmd systemd: diff --git a/resources/playbook/roles/bibigrid/tasks/900-theia.yaml b/resources/playbook/roles/bibigrid/tasks/900-theia.yaml index af8b19a8..57c8de26 100644 --- a/resources/playbook/roles/bibigrid/tasks/900-theia.yaml +++ b/resources/playbook/roles/bibigrid/tasks/900-theia.yaml @@ -12,7 +12,7 @@ file: path: "{{ nvm_install_dir }}" state: directory - mode: 0755 + mode: "0o755" - name: Set fact 'theia_ide_user' when not defined set_fact: @@ -75,13 +75,13 @@ file: path: "{{ theia_ide_install_dir }}" state: directory - mode: 0755 + mode: "0o755" - name: Copy IDE configuration to IDE build dir template: src: theia/package.json.j2 dest: "{{ theia_ide_install_dir }}/package.json" - mode: 0644 + mode: "0o644" - name: Build ide shell: | @@ -98,13 +98,13 @@ template: src: theia/theia-ide.sh.j2 dest: "{{ theia_ide_install_dir }}/theia-ide.sh" - mode: 0755 + mode: "0o755" - name: Generate systemd service template: src: theia/theia-ide.service.j2 dest: /etc/systemd/system/theia-ide.service - mode: 0644 + mode: "0o644" - name: Enable and Start service systemd: diff --git a/resources/playbook/roles/bibigrid/tasks/999-apt-reactivate-auto-update.yaml b/resources/playbook/roles/bibigrid/tasks/999-apt-reactivate-auto-update.yaml index 8270a8fa..ee7fb676 100644 --- a/resources/playbook/roles/bibigrid/tasks/999-apt-reactivate-auto-update.yaml +++ b/resources/playbook/roles/bibigrid/tasks/999-apt-reactivate-auto-update.yaml @@ -4,4 +4,4 @@ dest: /etc/apt/apt.conf.d/20auto-upgrades owner: root group: root - mode: 0644 + mode: "0o644" diff --git a/resources/playbook/roles_user/resistance_nextflow/tasks/main.yml b/resources/playbook/roles_user/resistance_nextflow/tasks/main.yml index 0b497694..9501559c 100644 --- a/resources/playbook/roles_user/resistance_nextflow/tasks/main.yml +++ b/resources/playbook/roles_user/resistance_nextflow/tasks/main.yml @@ -11,8 +11,8 @@ - name: Install Java JRE on Debian/Ubuntu become: True apt: - name: default-jre # Package name for Java JRE on Debian-based systems - state: present # Ensure that the package is present, you can use "latest" as well + name: default-jre + state: present - name: Get Nextflow shell: wget -qO- https://get.nextflow.io | bash @@ -26,8 +26,8 @@ group: ubuntu mode: '0775' -- name: Execute Nextflow workflow - become_user: ubuntu - shell: ./nextflow run resFinder.nf -profile slurm - args: - chdir: "/vol/spool" # Change to the directory where your workflow resides +#- name: Execute Nextflow workflow +# become_user: ubuntu +# shell: ./nextflow run resFinder.nf -profile slurm # run your workflow +# args: +# chdir: "/vol/spool" # Change to the directory where your workflow resides
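A recurring change throughout the playbook tasks above is the conversion of bare file modes such as `mode: 0644` to the quoted explicit-octal form `mode: "0o644"`. Quoting the value keeps YAML from interpreting the mode as a number (a leading-zero literal is read as octal, while a value without the leading zero would be read as decimal), so the permissions Ansible applies are unambiguous. A minimal sketch of the convention on a hypothetical task (name and paths are placeholders, not files from this repository):

```yaml
# Hypothetical task showing the quoted explicit-octal mode convention used in this patch.
- name: Copy an example configuration file
  copy:
    src: example.conf
    dest: /etc/example/example.conf
    owner: root
    group: root
    mode: "0o644"  # quoted string with explicit octal prefix instead of a bare 0644
```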