* renamed volumeSize to bootVolumeSize to avoid name issues
* added implementation for adding volumes to permanent workers (they are not deleted)
* implemented creating and terminating volumes without filesystem for permanent workers
* fully working for permanent workers. masterMount is broken now, but also replaced. Will be fixed.
* Added volume creation to create_server. Not yet working.
* hostvar for each host
* Fixed information handling and naming issues
* fixed host yaml creation
* removed unnecessary prints
* improved readability, fixed minor bugs
* added volume deletion and set volume_ap_version explicitly
* removed prints from test_provider.py
* improved readability greatly. Fixed overwriting host vars bug
* snapshot and existing volumes can now be attached to master and workers on startup
* snapshot and existing volumes can now be attached to master and workers on startup
* removed mountPoint from a log message in case no mount point is specified
* fixed lsblk not finding item.device due to race condition
* improved comments and naming
* removed server automount. This is now handled by a single automount task for both master and workers
* allows now to start new permanent volumes if a name is given. One could consider adding tmp to not named volumes for additional clarity
* fixed wrong function call
* renamed nfs_mount to nfs_shares
* added semipermanent as an option
* fixed wrong method of default values for Ansible
* started reworking
* added volumes and changed bootVolumes
* updated bibigrid.yaml and aligned naming of bootVolume and volume
* added newline at end of file
* removed superfluous provider parameter
* pleased linting
* removed argument from function call
* moved host vars creation, vars deletion, added comments
* largely reworked how volumes are attached to servers to be more explicit
* small naming fixes
* updated priority order of permanent and semiPermanent. Updated documentation to new explicit bool setup. Added type as a key.
* fixed bug regarding dontUploadCredentials
* updated schema validation
* Update linting.yml
* Update linting.yml
* Update linting.yml
* Update linting.yml
* added __init__.py where appropriate
* update bibigrid.yaml for more explicit volumes documentation
* volumes are now validated and fixed old state of masterInstance in validate_schema.py
* Update linting.yml
* Update linting.yml
* fixed longtime naming bug for unknown openstack exceptions
* saves more info in .mem file
* moved structure of tests and added a basic integration_test file that needs to be expanded and improved
* moved tests
* added "not ready yet"
* updated bootVolume documentation
* moved tests, added __init__.py files for better discovery. minor fixes
* updated tests and comments
* updated tests and comments
* updated tests, code and comments for ansible_configuration
* updated tests for ansible_configurator
* fixed test_ansible_configurator.py
* fixed test_configuration_handler.py
* improved exception messages
* pleased ansible linter
* fixed terminate return values test
* improved naming
* added tests to make sure that the server regex only deletes bibigrid servers with a fitting cluster id, and same for volumes
* pleased pylint
* fixed validation issue when using exists in master
* removed forgotten print
* fixed description bug
* final bugfixes
* pleased linter
* fixed too many positional arguments
commit 7569163 (1 parent: 7bdb8f0) — 53 changed files with 1,507 additions and 712 deletions.
@@ -1,105 +1,126 @@
# See https://cloud.denbi.de/wiki/Tutorials/BiBiGrid/ (after update)
# See https://github.com/BiBiServ/bibigrid/blob/master/documentation/markdown/features/configuration.md
# First configuration also holds general cluster information and must include the master.
# All other configurations must not include another master, but exactly one vpngtw instead (keys like master).
# For an easy introduction see https://github.com/deNBI/bibigrid_clum
# For more detailed information see https://github.com/BiBiServ/bibigrid/blob/master/documentation/markdown/features/configuration.md

- infrastructure: openstack # former mode. Describes which cloud provider is used (others are not implemented yet)
cloud: openstack # name of the clouds.yaml cloud-specification key (which is the value of the top-level key clouds)
- # -- BEGIN: GENERAL CLUSTER INFORMATION --
# The following options configure cluster-wide keys
# Modify these according to your requirements

# -- BEGIN: GENERAL CLUSTER INFORMATION --
# sshTimeout: 5 # number of attempts to connect to instances during startup, with delay in between
# cloudScheduling:
#   sshTimeout: 5 # like sshTimeout but during on demand scheduling on the running cluster

## sshPublicKeyFiles listed here will be added to access the cluster. A temporary key is created by bibigrid itself.
#sshPublicKeyFiles:
#  - [public key one]
## sshPublicKeyFiles listed here will be added to the master's authorized_keys. A temporary key is stored at ~/.config/bibigrid/keys
# sshPublicKeyFiles:
#   - [public key one]

## Volumes and snapshots that will be mounted to master
#masterMounts: (optional) # WARNING: will overwrite unidentified filesystems
#  - name: [volume name]
#    mountPoint: [where to mount to] # (optional)
# masterMounts: DEPRECATED -- see `volumes` key for each instance instead

#nfsShares: # /vol/spool/ is automatically created as a nfs
#  - [nfsShare one]
# nfsShares: # list of nfs shares. /vol/spool/ is automatically created as an nfs if nfs is true
#   - [nfsShare one]

# userRoles: # see ansible_hosts for all options
## Ansible Related
# userRoles: # see ansible_hosts for all 'hosts' options
#   - hosts:
#       - "master"
#     roles: # roles placed in resources/playbook/roles_user
#       - name: "resistance_nextflow"
#     varsFiles: # (optional)
#       - [...]

## Uncomment if you don't want to assign a public ip to the master; for internal clusters (Tuebingen).
## If you use a gateway or start a cluster from the cloud, your master does not need a public ip.
# useMasterWithPublicIp: False # defaults to True; if False, no public ip (floating ip) will be allocated
# gateway: # if you want to use a gateway for create.
#   ip: # IP of the gateway to use
#   portFunction: 30000 + oct4 # variables are called: oct1.oct2.oct3.oct4

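To make the gateway comment above concrete, here is a hypothetical filled-in gateway block. The IP is a placeholder, and it assumes (per the comment's `oct1.oct2.oct3.oct4` naming) that the octets refer to the target instance's internal IP, so `30000 + oct4` would map an instance whose internal IP ends in `.17` to gateway port 30017:

```yaml
# hypothetical sketch; ip is a placeholder, not a real gateway
useMasterWithPublicIp: False
gateway:
  ip: 129.70.51.6            # public IP of your gateway host (placeholder)
  portFunction: 30000 + oct4 # instance with internal IP a.b.c.17 -> port 30017
```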
# deleteTmpKeypairAfter: False
# dontUploadCredentials: False
## Only relevant for specific projects (e.g. SimpleVM)
# deleteTmpKeypairAfter: False # warning: if you don't pass a key via sshPublicKeyFiles you lose access!
# dontUploadCredentials: False # warning: enabling this prevents you from scheduling on demand!

## Additional Software
# zabbix: False
# nfs: False
# ide: False # installs a web ide on the master node. A nice way to view your cluster (like Visual Studio Code)

### Slurm Related
# elastic_scheduling: # for large or slow clusters, increasing these timeouts might be necessary to avoid failures
#   SuspendTimeout: 60 # after SuspendTimeout seconds, slurm allows powering up the node again
#   ResumeTimeout: 1200 # if a node doesn't start within ResumeTimeout seconds, the start is considered failed

# Other keys - these default to False
# Usually Ignored
## localFS: True
## localDNSlookup: True
# cloudScheduling:
#   sshTimeout: 5 # like sshTimeout but during on demand scheduling on the running cluster

# zabbix: True
# nfs: True
# ide: True # A nice way to view your cluster as if you were using Visual Studio Code
# useMasterAsCompute: True

useMasterAsCompute: True
# -- END: GENERAL CLUSTER INFORMATION --

# bootFromVolume: False
# terminateBootVolume: True
# volumeSize: 50
# waitForServices: # existing service name that runs after an instance is launched. BiBiGrid's playbook will wait until the service is "stopped" to avoid issues
# -- BEGIN: MASTER CLOUD INFORMATION --
infrastructure: openstack # former mode. Describes which cloud provider is used (others are not implemented yet)
cloud: openstack # name of the clouds.yaml cloud-specification key (which is the value of the top-level key clouds)

# waitForServices: # list of existing service names that affect apt. BiBiGrid's playbook will wait until the service is "stopped" to avoid issues
#   - de.NBI_Bielefeld_environment.service # uncomment for cloud site Bielefeld

# master configuration
## master configuration
masterInstance:
  type: # existing type/flavor on your cloud. See launch instance>flavor for options
  image: # existing active image on your cloud. Consider using regex to prevent image updates from breaking your running cluster
  type: # existing type/flavor from your cloud. See launch instance>flavor for options
  image: # existing active image from your cloud. Consider using regex to prevent image updates from breaking your running cluster
  # features: # list
  #   - feature1
  # partitions: # list
  # bootVolume: None
  # bootFromVolume: True
  # terminateBootVolume: True
  # volumeSize: 50

# -- END: GENERAL CLUSTER INFORMATION --
  #   - partition1
  # bootVolume: # optional
  #   name: # optional; if you want to boot from a specific volume
  #   terminate: True # whether the volume is terminated on server termination
  #   size: 50
  # volumes: # optional
  #   - name: volumeName # empty for temporary volumes
  #     snapshot: snapshotName # optional; to create the volume from a snapshot
  #     mountPoint: /vol/mountPath
  #     size: 50
  #     fstype: ext4 # must support chown
  #     type: # storage type; available values depend on your location; for Bielefeld CEPH_HDD, CEPH_NVME
  ## Select at most one of the following options; otherwise temporary is picked
  #     exists: False # if True, looks for an existing volume with the exact name. count must be 1. The volume is never deleted.
  #     permanent: False # if True, the volume is never deleted; overrides semiPermanent if both are given
  #     semiPermanent: False # if True, the volume is only deleted during cluster termination

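To illustrate the explicit volume setup documented above, the following is a minimal, hypothetical masterInstance sketch (flavor, image, and volume names are placeholders) that boots from a terminating 50 GB boot volume and attaches one named, semiPermanent data volume:

```yaml
# hypothetical example; flavor, image, and names are placeholders
masterInstance:
  type: de.NBI tiny        # placeholder flavor
  image: Ubuntu 22.04 LTS  # placeholder image
  bootVolume:
    terminate: True        # boot volume is removed with the server
    size: 50
  volumes:
    - name: spoolVolume    # named, so the volume is re-identifiable
      mountPoint: /vol/data
      size: 100
      fstype: ext4
      semiPermanent: True  # survives the server, deleted only on cluster termination
```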
# fallbackOnOtherImage: False # if True, the most similar image by name will be picked. A regex can also be given instead.

# worker configuration
## worker configuration
# workerInstances:
#   - type: # existing type/flavor on your cloud. See launch instance>flavor for options
#   - type: # existing type/flavor from your cloud. See launch instance>flavor for options
#     image: # same as master. Consider using regex to prevent image updates from breaking your running cluster
#     count: # any number of workers you would like to create with the set type, image combination
#     count: 1 # number of workers you would like to create with the set type, image combination
#     # features: # list
#     # partitions: # list
#     # bootVolume: None
#     # bootFromVolume: True
#     # terminateBootVolume: True
#     # volumeSize: 50

# Depends on cloud image
sshUser: # for example ubuntu

# Depends on cloud site and project
subnet: # existing subnet on your cloud. See https://openstack.cebitec.uni-bielefeld.de/project/networks/
# or network:

# Uncomment if no full DNS service for started instances is available.
# Currently the case in Berlin, DKFZ, Heidelberg and Tuebingen.
# localDNSLookup: True

# features: # list

# elastic_scheduling: # for large or slow clusters, increasing these timeouts might be necessary to avoid failures
#   SuspendTimeout: 60 # after SuspendTimeout seconds, slurm allows powering up the node again
#   ResumeTimeout: 1200 # if a node doesn't start within ResumeTimeout seconds, the start is considered failed
#     # partitions: # list of slurm features that all nodes of this group have
#     # bootVolume: # optional
#     #   name: # optional; if you want to boot from a specific volume
#     #   terminate: True # whether the volume is terminated on server termination
#     #   size: 50
#     # volumes: # optional
#     #   - name: volumeName # optional
#     #     snapshot: snapshotName # optional; to create the volume from a snapshot
#     #     mountPoint: /vol/mountPath # optional; not mounted if no path is given
#     #     size: 50
#     #     fstype: ext4 # must support chown
#     #     type: # storage type; available values depend on your location; for Bielefeld CEPH_HDD, CEPH_NVME
#     ## Select at most one of the following options; otherwise temporary is picked
#     #     exists: False # if True, looks for an existing volume with the exact name. count must be 1. The volume is never deleted.
#     #     permanent: False # if True, the volume is never deleted; overrides semiPermanent if both are given
#     #     semiPermanent: False # if True, the volume is only deleted during cluster termination

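Analogously, a hypothetical workerInstances entry (flavor, image, and snapshot name are placeholders) that gives each of two workers a permanent volume created from a snapshot could look like:

```yaml
# hypothetical example; flavor, image, and snapshot name are placeholders
workerInstances:
  - type: de.NBI small       # placeholder flavor
    image: Ubuntu 22.04 LTS  # placeholder image
    count: 2
    volumes:
      - name: workerData
        snapshot: workerDataSnapshot # placeholder; volume is created from this snapshot
        mountPoint: /vol/scratch
        size: 50
        fstype: ext4
        permanent: True      # never deleted automatically
```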
# Depends on image
sshUser: # for example 'ubuntu'

# Depends on project
subnet: # existing subnet from your cloud. See https://openstack.cebitec.uni-bielefeld.de/project/networks/
# network: # only if no subnet is given

# features: # list of slurm features that all nodes of this cloud have
#   - feature1

# bootVolume: # optional (cloud wide)
#   name: # optional; if you want to boot from a specific volume
#   terminate: True # whether the volume is terminated on server termination
#   size: 50

# - [next configurations]