From 040b92bcc703029604b4cabc4a65f9362f2b5a10 Mon Sep 17 00:00:00 2001 From: anjastrunk <119566837+anjastrunk@users.noreply.github.com> Date: Tue, 15 Oct 2024 10:29:28 +0200 Subject: [PATCH 1/7] Fix typo in Volume Type Standard Decision Record (#782) * Fix typo Signed-off-by: anjastrunk <119566837+anjastrunk@users.noreply.github.com> * Update Standards/scs-0111-v1-volume-type-decisions.md Co-authored-by: josephineSei <128813814+josephineSei@users.noreply.github.com> Signed-off-by: anjastrunk <119566837+anjastrunk@users.noreply.github.com> --------- Signed-off-by: anjastrunk <119566837+anjastrunk@users.noreply.github.com> Co-authored-by: josephineSei <128813814+josephineSei@users.noreply.github.com> --- Standards/scs-0111-v1-volume-type-decisions.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/Standards/scs-0111-v1-volume-type-decisions.md b/Standards/scs-0111-v1-volume-type-decisions.md index 28fc32e8b..aaf3e522f 100644 --- a/Standards/scs-0111-v1-volume-type-decisions.md +++ b/Standards/scs-0111-v1-volume-type-decisions.md @@ -7,7 +7,7 @@ track: IaaS ## Introduction -Volumes in OpenStack are virtual drives. They are managed by the storage service Cinder, which abstracts creation and usage of many different storage backends. While it is possible to use a backend like lvm which can reside on the same host as the hypervisor, the SCS wants to make a more clear differentiation between volumes and the ephemeral storage of a virtual machine. For all SCS deployments we want to assume that volumes are always residing in a storage backend that is NOT on the same host as a hypervisor - in short terms: Volumes are network storage. Ephemeral storage on the other hand is the only storage residing on a compute host. It is created by creating a VM directly from an Image and is automatically los as soon as the VM cease to exist. Volumes on the other hand have to be created from Images and only after that can be used for VMs. 
They are persistent and will remain in the last state a VM has written on them before they cease to exit. Being persistent and not relying on the host where the VM resides, Volumes can easily be attached to another VM in case of a node outage and VMs be migrated way more easily, because only metadata and data in RAM has to be shifted to another host, accelerating any migration or evacuation of a VM. +Volumes in OpenStack are virtual drives. They are managed by the storage service Cinder, which abstracts creation and usage of many different storage backends. While it is possible to use a backend like lvm which can reside on the same host as the hypervisor, this decision record wants to make a more clear differentiation between volumes and the ephemeral storage of a virtual machine. For all SCS deployments we want to assume that volumes are always residing in a storage backend that is NOT on the same host as a hypervisor - in short terms: Volumes are network storage. Ephemeral storage on the other hand is the only storage residing on a compute host. It is created by creating a VM directly from an Image and is automatically lost as soon as the VM cease to exist. Volumes on the other hand have to be created from Images and only after that can be used for VMs. They are persistent and will remain in the last state a VM has written on them before they cease to exit. Being persistent and not relying on the host where the VM resides, Volumes can easily be attached to another VM in case of a node outage and VMs be migrated way more easily, because only metadata and data in RAM has to be shifted to another host, accelerating any migration or evacuation of a VM. Volume Types are used to classify volumes and provide a basic decision for what kind of volume should be created. These volume types can sometimes very be backend-specific, and it might be hard for a user to choose the most suitable volume type, if there is more than one default type. 
Nevertheless, most of the configuration is done in the backends themselves, so volume types only work as a rough classification. From 2494eb11526762679e99226866d8b08ec4ecd349 Mon Sep 17 00:00:00 2001 From: Michal Gubricky Date: Tue, 15 Oct 2024 13:27:26 +0200 Subject: [PATCH 2/7] Add golang linter (#771) * Add golang linter Signed-off-by: michal.gubricky * Remove unnecessary comment in dev-prerequests Signed-off-by: michal.gubricky --------- Signed-off-by: michal.gubricky --- .github/workflows/lint-golang.yml | 28 ++++ Tests/kaas/kaas-sonobuoy-tests/Makefile | 17 +- .../scs_k8s_conformance_tests/main_test.go | 146 +++++++++--------- .../scs_0200_smoke_test.go | 11 +- 4 files changed, 122 insertions(+), 80 deletions(-) create mode 100644 .github/workflows/lint-golang.yml diff --git a/.github/workflows/lint-golang.yml b/.github/workflows/lint-golang.yml new file mode 100644 index 000000000..faf7fdc8c --- /dev/null +++ b/.github/workflows/lint-golang.yml @@ -0,0 +1,28 @@ +name: Check Go syntax + +on: + push: + paths: + - 'Tests/kaas/kaas-sonobuoy-tests/**/*.go' + - .github/workflows/lint-go.yml + +jobs: + lint-go-syntax: + runs-on: ubuntu-latest + steps: + - uses: actions/checkout@v4 + + - name: Set up Go + uses: actions/setup-go@v4 + with: + go-version: '1.23' + + # Install golangci-lint + - name: Install golangci-lint + run: | + curl -sSfL https://raw.githubusercontent.com/golangci/golangci-lint/master/install.sh | sh -s -- -b $(go env GOPATH)/bin v1.61.0 + + # Run golangci-lint + - name: Run golangci-lint + working-directory: Tests/kaas/kaas-sonobuoy-tests + run: golangci-lint run ./... -v diff --git a/Tests/kaas/kaas-sonobuoy-tests/Makefile b/Tests/kaas/kaas-sonobuoy-tests/Makefile index b7099cee7..dffc6d7a2 100644 --- a/Tests/kaas/kaas-sonobuoy-tests/Makefile +++ b/Tests/kaas/kaas-sonobuoy-tests/Makefile @@ -96,10 +96,19 @@ test-function: @echo "only run tests for: $${TESTFUNCTION_CODE}" DEVELOPMENT_MODE=createcluster go test -run=$${TESTFUNCTION_CODE} ./... 
|| true - -lint: - @echo "NOT YET IMPLEMENTED" - +lint: check-golangci-lint + @echo "[Running golangci-lint...]" + @golangci-lint run ./... -v || true + +GOLANGCI_LINT_VERSION ?= v1.61.0 +check-golangci-lint: + @if ! [ -x "$$(command -v golangci-lint)" ]; then \ + echo "[golangci-lint not found, installing...]"; \ + go install github.com/golangci/golangci-lint/cmd/golangci-lint@$(GOLANGCI_LINT_VERSION); \ + echo "[golangci-lint installed]"; \ + else \ + echo "[golangci-lint is already installed]"; \ + fi dev-clean-result: @rm -rf *.tar.gz || true diff --git a/Tests/kaas/kaas-sonobuoy-tests/scs_k8s_conformance_tests/main_test.go b/Tests/kaas/kaas-sonobuoy-tests/scs_k8s_conformance_tests/main_test.go index 8359b1b0f..98d305a5b 100644 --- a/Tests/kaas/kaas-sonobuoy-tests/scs_k8s_conformance_tests/main_test.go +++ b/Tests/kaas/kaas-sonobuoy-tests/scs_k8s_conformance_tests/main_test.go @@ -3,9 +3,10 @@ package scs_k8s_tests import ( "context" "fmt" + "log" "os" "testing" - "log" + plugin_helper "github.com/vmware-tanzu/sonobuoy-plugins/plugin-helper" v1 "k8s.io/api/core/v1" "sigs.k8s.io/e2e-framework/pkg/env" @@ -13,11 +14,16 @@ import ( "sigs.k8s.io/e2e-framework/pkg/envfuncs" ) +// Define a custom type for the context key +type nsContextKey string + +// Define a custom type for context keys +type contextKey string const ( ProgressReporterCtxKey = "SONOBUOY_PROGRESS_REPORTER" - NamespacePrefixKey = "NS_PREFIX" - DevelopmentModeKey = "DEVELOPMENT_MODE" + NamespacePrefixKey = "NS_PREFIX" + DevelopmentModeKey = "DEVELOPMENT_MODE" ) var testenv env.Environment @@ -33,82 +39,82 @@ func TestMain(m *testing.M) { updateReporter := plugin_helper.NewProgressReporter(0) developmentMode := os.Getenv(DevelopmentModeKey) - log.Printf("Setup test enviornment for: %#v", developmentMode ) - - switch KubernetesEnviornment := developmentMode; KubernetesEnviornment { - - case "createcluster": - log.Println("Create kind cluster for test") - testenv = env.New() - kindClusterName := 
envconf.RandomName("gotestcluster", 16) - //~ namespace := envconf.RandomName("testnamespace", 16) - - testenv.Setup( - envfuncs.CreateKindCluster(kindClusterName), - ) - - testenv.Finish( - //~ envfuncs.DeleteNamespace(namespace), - envfuncs.DestroyKindCluster(kindClusterName), - ) - - case "usecluster": - log.Println("Use existing k8s cluster for the test") - log.Println("Not Yet Implemented") - //~ testenv = env.NewFromFlags() - //~ KubeConfig:= os.Getenv(KUBECONFIGFILE) - //~ testenv = env.NewWithKubeConfig(KubeConfig) - - default: - // Assume we are running in the cluster as a Sonobuoy plugin. - log.Println("Running tests inside k8s cluster") - testenv = env.NewInClusterConfig() - - testenv.Setup(func(ctx context.Context, config *envconf.Config) (context.Context, error) { - // Try and create the client; doing it before all the tests allows the tests to assume - // it can be created without error and they can just use config.Client(). - _,err:=config.NewClient() - return context.WithValue(ctx,ProgressReporterCtxKey,updateReporter) ,err - }) - - testenv.Finish( - func(ctx context.Context, cfg *envconf.Config) (context.Context, error) { - log.Println("Finished go test suite") - //~ if err := ???; err != nil{ - //~ return ctx, err - //~ } - return ctx, nil - }, - ) - - } + log.Printf("Setup test enviornment for: %#v", developmentMode) + + switch KubernetesEnviornment := developmentMode; KubernetesEnviornment { + + case "createcluster": + log.Println("Create kind cluster for test") + testenv = env.New() + kindClusterName := envconf.RandomName("gotestcluster", 16) + //~ namespace := envconf.RandomName("testnamespace", 16) + + testenv.Setup( + envfuncs.CreateKindCluster(kindClusterName), + ) + + testenv.Finish( + //~ envfuncs.DeleteNamespace(namespace), + envfuncs.DestroyKindCluster(kindClusterName), + ) + + case "usecluster": + log.Println("Use existing k8s cluster for the test") + log.Println("Not Yet Implemented") + //~ testenv = env.NewFromFlags() + //~ 
KubeConfig:= os.Getenv(KUBECONFIGFILE) + //~ testenv = env.NewWithKubeConfig(KubeConfig) + + default: + // Assume we are running in the cluster as a Sonobuoy plugin. + log.Println("Running tests inside k8s cluster") + testenv = env.NewInClusterConfig() + + testenv.Setup(func(ctx context.Context, config *envconf.Config) (context.Context, error) { + // Try and create the client; doing it before all the tests allows the tests to assume + // it can be created without error and they can just use config.Client(). + _, err := config.NewClient() + return context.WithValue(ctx, contextKey(ProgressReporterCtxKey), updateReporter), err + }) + + testenv.Finish( + func(ctx context.Context, cfg *envconf.Config) (context.Context, error) { + log.Println("Finished go test suite") + //~ if err := ???; err != nil{ + //~ return ctx, err + //~ } + return ctx, nil + }, + ) + + } testenv.BeforeEachTest(func(ctx context.Context, cfg *envconf.Config, t *testing.T) (context.Context, error) { - fmt.Println("BeforeEachTest") + fmt.Println("BeforeEachTest") updateReporter.StartTest(t.Name()) return createNSForTest(ctx, cfg, t, runID) }) testenv.AfterEachTest(func(ctx context.Context, cfg *envconf.Config, t *testing.T) (context.Context, error) { - fmt.Println("AfterEachTest") - updateReporter.StopTest(t.Name(),t.Failed(),t.Skipped(),nil) + fmt.Println("AfterEachTest") + updateReporter.StopTest(t.Name(), t.Failed(), t.Skipped(), nil) return deleteNSForTest(ctx, cfg, t, runID) }) /* - testenv.BeforeEachFeature(func(ctx context.Context, config *envconf.Config, info features.Feature) (context.Context, error) { - // Note that you can also add logic here for before a feature is tested. There may be - // more than one feature in a test. - fmt.Println("BeforeEachFeature") - return ctx, nil - }) - - testenv.AfterEachFeature(func(ctx context.Context, config *envconf.Config, info features.Feature) (context.Context, error) { - // Note that you can also add logic here for after a feature is tested. 
There may be - // more than one feature in a test. - fmt.Println("AfterEachFeature") - return ctx, nil - }) + testenv.BeforeEachFeature(func(ctx context.Context, config *envconf.Config, info features.Feature) (context.Context, error) { + // Note that you can also add logic here for before a feature is tested. There may be + // more than one feature in a test. + fmt.Println("BeforeEachFeature") + return ctx, nil + }) + + testenv.AfterEachFeature(func(ctx context.Context, config *envconf.Config, info features.Feature) (context.Context, error) { + // Note that you can also add logic here for after a feature is tested. There may be + // more than one feature in a test. + fmt.Println("AfterEachFeature") + return ctx, nil + }) */ os.Exit(testenv.Run(m)) @@ -136,6 +142,6 @@ func deleteNSForTest(ctx context.Context, cfg *envconf.Config, t *testing.T, run return ctx, cfg.Client().Resources().Delete(ctx, &nsObj) } -func nsKey(t *testing.T) string { - return "NS-for-%v" + t.Name() +func nsKey(t *testing.T) nsContextKey { + return nsContextKey("NS-for-" + t.Name()) } diff --git a/Tests/kaas/kaas-sonobuoy-tests/scs_k8s_conformance_tests/scs_0200_smoke_test.go b/Tests/kaas/kaas-sonobuoy-tests/scs_k8s_conformance_tests/scs_0200_smoke_test.go index 83909b41a..62ec43e3d 100644 --- a/Tests/kaas/kaas-sonobuoy-tests/scs_k8s_conformance_tests/scs_0200_smoke_test.go +++ b/Tests/kaas/kaas-sonobuoy-tests/scs_k8s_conformance_tests/scs_0200_smoke_test.go @@ -1,16 +1,15 @@ package scs_k8s_tests import ( + "os" "testing" - "os" ) - func Test_scs_0200_smoke(t *testing.T) { - // This test ensures that no DevelopmentMode was set - // when using this test-suite productively - developmentMode := os.Getenv(DevelopmentModeKey) + // This test ensures that no DevelopmentMode was set + // when using this test-suite productively + developmentMode := os.Getenv(DevelopmentModeKey) if developmentMode != "" { - t.Errorf("developmentMode is set to = %v; want None", developmentMode ) + 
t.Errorf("developmentMode is set to = %v; want None", developmentMode) } } From e2922eddd04ec2d3f2d09e9efdd854717b94b1f0 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Matthias=20B=C3=BCchse?= Date: Tue, 15 Oct 2024 13:02:51 +0000 Subject: [PATCH 3/7] Remove redundant clouds table (#778) MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Signed-off-by: Matthias Büchse --- README.md | 15 +-------------- 1 file changed, 1 insertion(+), 14 deletions(-) diff --git a/README.md b/README.md index 2685faddb..5052b25bf 100644 --- a/README.md +++ b/README.md @@ -1,23 +1,10 @@ - # Sovereign Cloud Stack – Standards and Certification SCS unifies the best of cloud computing in a certified standard. With a decentralized and federated cloud stack, SCS puts users in control of their data and fosters trust in clouds, backed by a global open-source community. ## SCS compatible clouds -This is a list of clouds that we test on a nightly basis against our `scs-compatible` certification level. 
- -| Name | Description | Operator | _SCS-compatible IaaS_ Compliance | HealthMon | -| -------------------------------------------------------------------------------------------------------------- | ------------------------------------------------- | ----------------------------- | :---------------------------------------------------------------------------------------------------------------------------------------------------: | :----------------------------------------------------------------------------------------------------------------------: | -| [gx-scs](https://github.com/SovereignCloudStack/docs/blob/main/community/cloud-resources/plusserver-gx-scs.md) | Dev environment provided for SCS & GAIA-X context | plusserver GmbH | [![Compliance Status](https://img.shields.io/github/actions/workflow/status/SovereignCloudStack/standards/check-gx-scs-v4.yml?label=v4)](https://github.com/SovereignCloudStack/standards/actions/workflows/check-gx-scs-v4.yml) | [HM](https://health.gx-scs.sovereignit.cloud:3000/) | -| [pluscloud open](https://www.plusserver.com/en/products/pluscloud-open)
- prod1
- prod2
- prod3
- prod4 | Public cloud for customers (4 regions) | plusserver GmbH |  
- prod1 [![Compliance Status](https://img.shields.io/github/actions/workflow/status/SovereignCloudStack/standards/check-pco-prod1-v4.yml?label=v4)](https://github.com/SovereignCloudStack/standards/actions/workflows/check-pco-prod1-v4.yml)
- prod2 [![Compliance Status](https://img.shields.io/github/actions/workflow/status/SovereignCloudStack/standards/check-pco-prod2-v4.yml?label=v4)](https://github.com/SovereignCloudStack/standards/actions/workflows/check-pco-prod2-v4.yml)
- prod3 [![Compliance Status](https://img.shields.io/github/actions/workflow/status/SovereignCloudStack/standards/check-pco-prod3-v4.yml?label=v4)](https://github.com/SovereignCloudStack/standards/actions/workflows/check-pco-prod3-v4.yml)
- prod4 [![Compliance Status](https://img.shields.io/github/actions/workflow/status/SovereignCloudStack/standards/check-pco-prod4-v4.yml?label=v4)](https://github.com/SovereignCloudStack/standards/actions/workflows/check-pco-prod4-v4.yml) |  
[HM1](https://health.prod1.plusserver.sovereignit.cloud:3000/d/9ltTEmlnk/openstack-health-monitor2?orgId=1&var-mycloud=plus-pco)
[HM2](https://health.prod1.plusserver.sovereignit.cloud:3000/d/9ltTEmlnk/openstack-health-monitor2?orgId=1&var-mycloud=plus-prod2)
[HM3](https://health.prod1.plusserver.sovereignit.cloud:3000/d/9ltTEmlnk/openstack-health-monitor2?orgId=1&var-mycloud=plus-prod3)
[HM4](https://health.prod1.plusserver.sovereignit.cloud:3000/d/9ltTEmlnk/openstack-health-monitor2?orgId=1&var-mycloud=plus-prod4) | -| [Wavestack](https://www.noris.de/wavestack-cloud/) | Public cloud for customers | noris network AG/Wavecon GmbH | [![Compliance Status](https://img.shields.io/github/actions/workflow/status/SovereignCloudStack/standards/check-wavestack-v4.yml?label=v4)](https://github.com/SovereignCloudStack/standards/actions/workflows/check-wavestack-v4.yml) | [HM](https://health.wavestack1.sovereignit.cloud:3000/) | -| [REGIO.cloud](https://regio.digital) | Public cloud for customers | OSISM GmbH | [![Compliance Status](https://img.shields.io/github/actions/workflow/status/SovereignCloudStack/standards/check-regio-a-v4.yml?label=v4)](https://github.com/SovereignCloudStack/standards/actions/workflows/check-regio-a-v4.yml) | broken | -| [CNDS](https://cnds.io/) | Public cloud for customers | artcodix GmbH | [![Compliance Status](https://img.shields.io/github/actions/workflow/status/SovereignCloudStack/standards/check-artcodix-v4.yml?label=v4)](https://github.com/SovereignCloudStack/standards/actions/workflows/check-artcodix-v4.yml) | [HM](https://ohm.muc.cloud.cnds.io/) | -| [aov.cloud](https://www.aov.de/) | Community cloud for customers | aov IT.Services GmbH | (soon) | [HM](https://health.aov.cloud/) | -| PoC WG-Cloud OSBA | Cloud PoC for FITKO (yaook-based) | Cloud&Heat Technologies GmbH | [![Compliance Status](https://img.shields.io/github/actions/workflow/status/SovereignCloudStack/standards/check-poc-wgcloud-v4.yml?label=v4)](https://github.com/SovereignCloudStack/standards/actions/workflows/check-poc-wgcloud-v4.yml) | [HM](https://health.poc-wgcloud.osba.sovereignit.cloud:3000/d/9ltTEmlnk/openstack-health-monitor2?var-mycloud=poc-wgcloud&orgId=1) | -| PoC KDO | Cloud PoC for FITKO | KDO Service GmbH / OSISM GmbH | [![Compliance 
Status](https://img.shields.io/github/actions/workflow/status/SovereignCloudStack/standards/check-poc-kdo-v4.yml?label=v4)](https://github.com/SovereignCloudStack/standards/actions/workflows/check-poc-kdo-v4.yml) | (soon) | -| [syseleven](https://www.syseleven.de/en/products-services/openstack-cloud/)
- dus2
- ham1 | Public OpenStack Cloud (2 SCS regions) | SysEleven GmbH |  
- dus2 [![Compliance Status](https://img.shields.io/github/actions/workflow/status/SovereignCloudStack/standards/check-syseleven-dus2-v4.yml?label=v4)](https://github.com/SovereignCloudStack/standards/actions/workflows/check-syseleven-dus2-v4.yml)
- ham1 [![Compliance Status](https://img.shields.io/github/actions/workflow/status/SovereignCloudStack/standards/check-syseleven-ham1-v4.yml?label=v4)](https://github.com/SovereignCloudStack/standards/actions/workflows/check-syseleven-ham1-v4.yml) |  
(soon)
(soon) | +See [Compliant clouds overview](https://docs.scs.community/standards/certification/overview) on our docs page. ## SCS standards overview From 4a275ae48f49d36b369fb109374746426c31fd97 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Matthias=20B=C3=BCchse?= Date: Wed, 16 Oct 2024 02:33:57 +0000 Subject: [PATCH 4/7] Add visual clue to clouds table (#784) MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit resolves #783 Signed-off-by: Matthias Büchse Co-authored-by: Kurt Garloff --- compliance-monitor/monitor.py | 28 +++++++++++++-------- compliance-monitor/templates/overview.md.j2 | 26 +++++++++---------- 2 files changed, 31 insertions(+), 23 deletions(-) diff --git a/compliance-monitor/monitor.py b/compliance-monitor/monitor.py index ff9ed5ce1..aa02cbae1 100755 --- a/compliance-monitor/monitor.py +++ b/compliance-monitor/monitor.py @@ -284,6 +284,10 @@ def evaluate(self, scope_results): # always include draft (but only at the end) relevant.extend(by_validity['draft']) passed = [vname for vname in relevant if version_results[vname]['result'] == 1] + if passed: + summary = 1 if self.versions[passed[0]].validity in ('effective', 'warn') else -1 + else: + summary = 0 return { 'name': self.name, 'versions': version_results, @@ -293,6 +297,7 @@ def evaluate(self, scope_results): vname + ASTERISK_LOOKUP[self.versions[vname].validity] for vname in passed ]), + 'summary': summary, } def update_lookup(self, target_dict): @@ -679,15 +684,17 @@ async def post_results( conn.commit() -def passed_filter(results, subject, scope): - """Jinja filter to pick list of passed versions from `results` for given `subject` and `scope`""" - subject_data = results.get(subject) - if not subject_data: - return "" - scope_data = subject_data.get(scope) - if not scope_data: - return "" - return scope_data['passed_str'] +def pick_filter(results, subject, scope): + """Jinja filter to pick scope results from `results` for given `subject` and `scope`""" + return 
results.get(subject, {}).get(scope, {}) + + +def summary_filter(scope_results): + """Jinja filter to construct summary from `scope_results`""" + passed_str = scope_results.get('passed_str', '') or '–' + summary = scope_results.get('summary', 0) + color = {1: '✅'}.get(summary, '🛑') # instead of 🟢🔴 (hard to distinguish for color-blind folks) + return f'{color} {passed_str}' def verdict_filter(value): @@ -721,7 +728,8 @@ def reload_static_config(*args, do_ensure_schema=False): if __name__ == "__main__": logging.basicConfig(format='%(levelname)s: %(message)s', level=logging.INFO) env.filters.update( - passed=passed_filter, + pick=pick_filter, + summary=summary_filter, verdict=verdict_filter, verdict_check=verdict_check_filter, markdown=markdown, diff --git a/compliance-monitor/templates/overview.md.j2 b/compliance-monitor/templates/overview.md.j2 index f37340479..36e3ced23 100644 --- a/compliance-monitor/templates/overview.md.j2 +++ b/compliance-monitor/templates/overview.md.j2 @@ -6,37 +6,37 @@ for the time being to have the highest degree of control | Name | Description | Operator | [SCS-compatible IaaS](https://docs.scs.community/standards/scs-compatible-iaas/) | HealthMon | |-------|--------------|-----------|----------------------|:----------:| | [gx-scs](https://github.com/SovereignCloudStack/docs/blob/main/community/cloud-resources/plusserver-gx-scs.md) | Dev environment provided for SCS & GAIA-X context | plusserver GmbH | -{#- #} [{{ results | passed('gx-scs', iaas) or '–' }}]({{ detail_url('gx-scs', iaas) }}) {# -#} +{#- #} [{{ results | pick('gx-scs', iaas) | summary }}]({{ detail_url('gx-scs', iaas) }}) {# -#} | [HM](https://health.gx-scs.sovereignit.cloud:3000/) | | [aov.cloud](https://www.aov.de/) | Community cloud for customers | aov IT.Services GmbH | -{#- #} [{{ results | passed('aov', iaas) or '–' }}]({{ detail_url('aov', iaas) }}) {# -#} +{#- #} [{{ results | pick('aov', iaas) | summary }}]({{ detail_url('aov', iaas) }}) {# -#} | 
[HM](https://health.aov.cloud/) | | [CNDS](https://cnds.io/) | Public cloud for customers | artcodix GmbH | -{#- #} [{{ results | passed('artcodix', iaas) or '–' }}]({{ detail_url('artcodix', iaas) }}) {# -#} +{#- #} [{{ results | pick('artcodix', iaas) | summary }}]({{ detail_url('artcodix', iaas) }}) {# -#} | [HM](https://ohm.muc.cloud.cnds.io/) | | [pluscloud open](https://www.plusserver.com/en/products/pluscloud-open)
(4 regions) | Public cloud for customers | plusserver GmbH | {# #} -{#- #}prod1: [{{ results | passed('pco-prod1', iaas) or '–' }}]({{ detail_url('pco-prod1', iaas) }}){# -#} +{#- #}prod1: [{{ results | pick('pco-prod1', iaas) | summary }}]({{ detail_url('pco-prod1', iaas) }}){# -#}
-{#- #}prod2: [{{ results | passed('pco-prod2', iaas) or '–' }}]({{ detail_url('pco-prod2', iaas) }}){# -#} +{#- #}prod2: [{{ results | pick('pco-prod2', iaas) | summary }}]({{ detail_url('pco-prod2', iaas) }}){# -#}
-{#- #}prod3: [{{ results | passed('pco-prod3', iaas) or '–' }}]({{ detail_url('pco-prod3', iaas) }}){# -#} +{#- #}prod3: [{{ results | pick('pco-prod3', iaas) | summary }}]({{ detail_url('pco-prod3', iaas) }}){# -#}
-{#- #}prod4: [{{ results | passed('pco-prod4', iaas) or '–' }}]({{ detail_url('pco-prod4', iaas) }}) {# -#} +{#- #}prod4: [{{ results | pick('pco-prod4', iaas) | summary }}]({{ detail_url('pco-prod4', iaas) }}) {# -#} | [HM1](https://health.prod1.plusserver.sovereignit.cloud:3000/d/9ltTEmlnk/openstack-health-monitor2?orgId=1&var-mycloud=plus-pco)
[HM2](https://health.prod1.plusserver.sovereignit.cloud:3000/d/9ltTEmlnk/openstack-health-monitor2?orgId=1&var-mycloud=plus-prod2)
[HM3](https://health.prod1.plusserver.sovereignit.cloud:3000/d/9ltTEmlnk/openstack-health-monitor2?orgId=1&var-mycloud=plus-prod3)
[HM4](https://health.prod1.plusserver.sovereignit.cloud:3000/d/9ltTEmlnk/openstack-health-monitor2?orgId=1&var-mycloud=plus-prod4) | | PoC KDO | Cloud PoC for FITKO | KDO Service GmbH / OSISM GmbH | -{#- #} [{{ results | passed('poc-kdo', iaas) or '–' }}]({{ detail_url('poc-kdo', iaas) }}) {# -#} +{#- #} [{{ results | pick('poc-kdo', iaas) | summary }}]({{ detail_url('poc-kdo', iaas) }}) {# -#} | (soon) | | PoC WG-Cloud OSBA | Cloud PoC for FITKO | Cloud&Heat Technologies GmbH | -{#- #} [{{ results | passed('poc-wgcloud', iaas) or '–' }}]({{ detail_url('poc-wgcloud', iaas) }}) {# -#} +{#- #} [{{ results | pick('poc-wgcloud', iaas) | summary }}]({{ detail_url('poc-wgcloud', iaas) }}) {# -#} | [HM](https://health.poc-wgcloud.osba.sovereignit.cloud:3000/d/9ltTEmlnk/openstack-health-monitor2?var-mycloud=poc-wgcloud&orgId=1) | | [REGIO.cloud](https://regio.digital) | Public cloud for customers | OSISM GmbH | -{#- #} [{{ results | passed('regio-a', iaas) or '–' }}]({{ detail_url('regio-a', iaas) }}) {# -#} +{#- #} [{{ results | pick('regio-a', iaas) | summary }}]({{ detail_url('regio-a', iaas) }}) {# -#} | [HM](https://apimon.services.regio.digital/public-dashboards/17cf094a47404398a5b8e35a4a3968d4?orgId=1&refresh=5m) | | [syseleven](https://www.syseleven.de/en/products-services/openstack-cloud/)
(2 SCS regions) | Public OpenStack Cloud | SysEleven GmbH | {# #} -{#- #}dus2: [{{ results | passed('syseleven-dus2', iaas) or '–' }}]({{ detail_url('syseleven-dus2', iaas) }}){# -#} +{#- #}dus2: [{{ results | pick('syseleven-dus2', iaas) | summary }}]({{ detail_url('syseleven-dus2', iaas) }}){# -#}
-{#- #}ham1: [{{ results | passed('syseleven-ham1', iaas) or '–' }}]({{ detail_url('syseleven-ham1', iaas) }}) {# -#} +{#- #}ham1: [{{ results | pick('syseleven-ham1', iaas) | summary }}]({{ detail_url('syseleven-ham1', iaas) }}) {# -#} | (soon)
(soon) | | [Wavestack](https://www.noris.de/wavestack-cloud/) | Public cloud for customers | noris network AG/Wavecon GmbH | -{#- #} [{{ results | passed('wavestack', iaas) or '–' }}]({{ detail_url('wavestack', iaas) }}) {# -#} +{#- #} [{{ results | pick('wavestack', iaas) | summary }}]({{ detail_url('wavestack', iaas) }}) {# -#} | [HM](https://health.wavestack1.sovereignit.cloud:3000/) | From ec6b4c28009ed06c636dc56f09b36b8bb266de61 Mon Sep 17 00:00:00 2001 From: josephineSei <128813814+josephineSei@users.noreply.github.com> Date: Wed, 16 Oct 2024 14:13:24 +0200 Subject: [PATCH 5/7] Rename scs-0120-w1-Availability-Zones-Standard.md to scs-0121-w1-Availability-Zones-Standard.md (#785) Signed-off-by: josephineSei <128813814+josephineSei@users.noreply.github.com> --- ...nes-Standard.md => scs-0121-w1-Availability-Zones-Standard.md} | 0 1 file changed, 0 insertions(+), 0 deletions(-) rename Standards/{scs-0120-w1-Availability-Zones-Standard.md => scs-0121-w1-Availability-Zones-Standard.md} (100%) diff --git a/Standards/scs-0120-w1-Availability-Zones-Standard.md b/Standards/scs-0121-w1-Availability-Zones-Standard.md similarity index 100% rename from Standards/scs-0120-w1-Availability-Zones-Standard.md rename to Standards/scs-0121-w1-Availability-Zones-Standard.md From 1e24ed4e103add941c9ac8305df06d89b15aad96 Mon Sep 17 00:00:00 2001 From: Kurt Garloff Date: Thu, 17 Oct 2024 18:31:54 -0400 Subject: [PATCH 6/7] Feat/add gpu vram (#780) MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit * Add GPU table and VRAM into specification. To be done: Adjust flavor name parser, pretty printer, generator. * Version 3.2, adjust examples. * Fix empty line and extra space. * Minor addition of information for AMD and intel. * Typo. * Note about 1/7 uncertainties. Nvidia spelling. Also mention older generations ... * Appease markdownlint. * More appeasement for markdownlint. 
One true fix (broken link) And tweak the double-space test for tolerating two spaces after a | in a table. * One more fix against double spaces. * add vram and vramperf to GPU (retrofitting v1 and v2) * bugfix: use correct variable * appease flake8 * GPU VRAM always comes with CU spec: Use "and" wording. Just for a bit better readability. * More specific meaning of GPU h modifiers. h on the SMs/CUs/EUs: High frequency h on the VRAM: High bandwidth Again, this is really only to differentiate if a vendor has several otherwise similar models that have a material difference in frequency or bandwidth, such as e.g. a GDDR6 vs an HBM2e variant ... or a low-power, low-frequency variant. Signed-off-by: Kurt Garloff Signed-off-by: Matthias Büchse Co-authored-by: Matthias Büchse --- .markdownlint-cli2.jsonc | 5 +- Standards/scs-0100-v3-flavor-naming.md | 52 ++++++--- ...w1-flavor-naming-implementation-testing.md | 104 +++++++++++++++++- Tests/iaas/flavor-naming/cli.py | 2 +- Tests/iaas/flavor-naming/flavor_names.py | 20 ++-- 5 files changed, 155 insertions(+), 28 deletions(-) diff --git a/.markdownlint-cli2.jsonc b/.markdownlint-cli2.jsonc index 94fca3275..4d44024bd 100644 --- a/.markdownlint-cli2.jsonc +++ b/.markdownlint-cli2.jsonc @@ -43,9 +43,10 @@ { "name": "double-spaces", "message": "Avoid double spaces", - "searchPattern": "/([^\\s>]) ([^\\s|])/g", + "searchPattern": "/([^\\s>|]) ([^\\s|])/g", "replace": "$1 $2", - "skipCode": true + "skipCode": true, + "tables": false } ] } diff --git a/Standards/scs-0100-v3-flavor-naming.md b/Standards/scs-0100-v3-flavor-naming.md index 0c7fea124..587bde220 100644 --- a/Standards/scs-0100-v3-flavor-naming.md +++ b/Standards/scs-0100-v3-flavor-naming.md @@ -14,7 +14,7 @@ description: | ## Introduction -This is the standard v3.1 for SCS Release 5. +This is the standard v3.2 for SCS Release 8. Note that we intend to only extend it (so it's always backwards compatible), but try to avoid changing in incompatible ways. 
(See at the end for the v1 to v2 transition where we have not met that @@ -417,7 +417,7 @@ is more significant. ### [OPTIONAL] GPU support -Format: `_`\[`G/g`\]X\[N\]\[`-`M\]\[`h`\] +Format: `_`\[`G/g`\]X\[N\[`-`M\[`h`\]\[`-`V\[`h`\]\]\]\] This extension provides more details on the specific GPU: @@ -425,7 +425,9 @@ This extension provides more details on the specific GPU: - vendor (X) - generation (N) - number (M) of processing units that are exposed (for pass-through) or assigned; see table below for vendor-specific terminology -- high-performance indicator (`h`) +- high-frequency indicator (`h`) for compute units +- amount of video memory (V) in GiB +- an indicator for high-bandwidth memory Note that the vendor letter X is mandatory, generation and processing units are optional. @@ -440,13 +442,29 @@ for AMD GCN-x=0.x, RDNA1=1, C/RDNA2=2, C/RDNA3=3, C/RDNA3.5=3.5, C/RDNA4=4, ... for Intel Gen9=0.9, Xe(12.1/DG1)=1, Xe(12.2)=2, Arc(12.7/DG2)=3 ... (Note: This may need further work to properly reflect what's out there.) -The optional `h` suffix to the compute unit count indicates high-performance (e.g. high freq or special -high bandwidth gfx memory such as HBM); -`h` can be duplicated for even higher performance. +The optional `h` suffix to the compute unit count indicates high-frequency GPU compute units. +It is not normally recommended to use it except if there are several variants of cards within +a generation of GPUs and with similar number of SMs/CUs/EUs. +In case there are even more than two variants, the letter `h` can be duplicated for even +higher frquencies. -Example: `SCS-16V-64-500s_GNa-14h` -This flavor has a pass-through GPU nVidia Ampere with 14 SMs and either high-bandwidth memory or specially high frequencies. -Looking through GPU specs you could guess it's 1/4 of an A30. 
+Please note that there are GPUs from one generation and vendor that have vastly different sizes +(or different fractions are being passed to an instance with multi-instance-GPUs). The number +M allows to differentiate between them and have an indicator of the compute capability and +parallelism. M can not at all be compared between different generations let alone different +vendors. + +The amount of video memory dedicated to the instance can be indicated by V (in binary +Gigabytes). This number needs to be an integer - fractional memory sizes must be rounded +down. An optional `h` can be used to indicate high bandwidth memory (such as HBM2+) with +bandwidths well above 1GiB/s. + +Example: `SCS-16V-64-500s_GNa-14-6h` +This flavor has a pass-through GPU nVidia Ampere with 14 SMs and 6 GiB of high-bandwidth video +memory. Looking through GPU specs you could guess it's 1/4 of an A30. + +We have a table with common GPUs in the +[implementation hints for this standard](scs-0100-w1-flavor-naming-implementation-testing.md) ### [OPTIONAL] Infiniband @@ -490,14 +508,14 @@ an image is considered broken by the SCS team. 
## Proposal Examples -| Example | Decoding | -| ------------------------- | ---------------------------------------------------------------------------------------------- | -| SCS-2C-4-10n | 2 dedicated cores (x86-64), 4GiB RAM, 10GB network disk | -| SCS-8Ti-32-50p_i1 | 8 dedicated hyperthreads (insecure), Skylake, 32GiB RAM, 50GB local NVMe | -| SCS-1L-1u-5 | 1 vCPU (heavily oversubscribed), 1GiB Ram (no ECC), 5GB disk (unspecific) | -| SCS-16T-64-200s_GNa-64_ib | 16 dedicated threads, 64GiB RAM, 200GB local SSD, Infiniband, 64 Passthrough nVidia Ampere SMs | -| SCS-4C-16-2x200p_a1 | 4 dedicated Arm64 cores (A76 class), 16GiB RAM, 2x200GB local NVMe drives | -| SCS-1V-0.5 | 1 vCPU, 0.5GiB RAM, no disk (boot from cinder volume) | +| Example | Decoding | +| ------------------------------ | ---------------------------------------------------------------------------------------------- | +| `SCS-2C-4-10n` | 2 dedicated cores (x86-64), 4GiB RAM, 10GB network disk | +| `SCS-8Ti-32-50p_i1` | 8 dedicated hyperthreads (insecure), Skylake, 32GiB RAM, 50GB local NVMe | +| `SCS-1L-1u-5` | 1 vCPU (heavily oversubscribed), 1GiB Ram (no ECC), 5GB disk (unspecific) | +| `SCS-16T-64-200s_GNa-72-24_ib` | 16 dedicated threads, 64GiB RAM, 200GB local SSD, Infiniband, 72 Passthrough nVidia Ampere SMs | +| `SCS-4C-16-2x200p_a1` | 4 dedicated Arm64 cores (A76 class), 16GiB RAM, 2x200GB local NVMe drives | +| `SCS-1V-0.5` | 1 vCPU, 0.5GiB RAM, no disk (boot from cinder volume) | ## Previous standard versions diff --git a/Standards/scs-0100-w1-flavor-naming-implementation-testing.md b/Standards/scs-0100-w1-flavor-naming-implementation-testing.md index 71756e07d..0783216d6 100644 --- a/Standards/scs-0100-w1-flavor-naming-implementation-testing.md +++ b/Standards/scs-0100-w1-flavor-naming-implementation-testing.md @@ -32,7 +32,8 @@ See the [README](https://github.com/SovereignCloudStack/standards/tree/main/Test for more details. 
The functionality of this script is also (partially) exposed via the web page -[https://flavors.scs.community/](https://flavors.scs.community/). +[https://flavors.scs.community/](https://flavors.scs.community/), which can both +parse SCS flavors names as well as generate them. With the OpenStack tooling (`python3-openstackclient`, `OS_CLOUD`) in place, you can call `cli.py -v parse v3 $(openstack flavor list -f value -c Name)` to get a report @@ -45,6 +46,107 @@ will create a whole set of flavors in one go. To that end, it provides different options: either the standard mandatory and possibly recommended flavors can be created, or the user can set a file containing his flavors. +### GPU table + +The most commonly used datacenter GPUs are listed here, showing what GPUs (or partitions +of a GPU) result in what GPU part of the flavor name. + +#### Nvidia (`N`) + +We show the most popular recent generations here. older one are of course possible as well. + +##### Ampere (`a`) + +One Streaming Multiprocessor on Ampere has 64 (A30, A100) or 128 Cuda Cores (A10, A40). + +GPUs without MIG (one SM has 128 Cude Cores and 4 Tensor Cores): + +| Nvidia GPU | Tensor C | Cuda Cores | SMs | VRAM | SCS name piece | +|------------|----------|------------|-----|-----------|----------------| +| A10 | 288 | 9216 | 72 | 24G GDDR6 | `GNa-72-24` | +| A40 | 336 | 10752 | 84 | 48G GDDR6 | `GNa-84-48` | + +GPUs with Multi-Instance-GPU (MIG), where GPUs can be partitioned and the partitions handed +out as as pass-through PCIe devices to instances. One SM corresponds to 64 Cuda Cores and +4 Tensor Cores. 
+ +| Nvidia GPU | Fraction | Tensor C | Cuda Cores | SMs | VRAM | SCS GPU name | +|------------|----------|----------|------------|-----|-----------|----------------| +| A30 | 1/1 | 224 | 3584 | 56 | 24G HBM2 | `GNa-56-24` | +| A30 | 1/2 | 112 | 1792 | 28 | 12G HBM2 | `GNa-28-12` | +| A30 | 1/4 | 56 | 896 | 14 | 6G HBM2 | `GNa-14-6` | +| A30X | 1/1 | 224 | 3584 | 56 | 24G HBM2e | `GNa-56h-24h` | +| A100 | 1/1 | 432 | 6912 | 108 | 80G HBM2e | `GNa-108h-80h` | +| A100 | 1/2 | 216 | 3456 | 54 | 40G HBM2e | `GNa-54h-40h` | +| A100 | 1/4 | 108 | 1728 | 27 | 20G HBM2e | `GNa-27h-20h` | +| A100 | 1/7 | 60+ | 960+ | 15+| 10G HBM2e | `GNa-15h-10h`+ | +| A100X | 1/1 | 432 | 6912 | 108 | 80G HBM2e | `GNa-108-80h` | + +[+] The precise numbers for the 1/7 MIG configurations are not known by the author of +this document and need validation. + +##### Ada Lovelace (`l`) + +No MIG support, 128 Cuda Cores and 4 Tensor Cores per SM. + +| Nvidia GPU | Tensor C | Cuda Cores | SMs | VRAM | SCS name piece | +|------------|----------|------------|-----|-----------|----------------| +| L4 | 232 | 7424 | 58 | 24G GDDR6 | `GNl-58-24` | +| L40 | 568 | 18176 | 142 | 48G GDDR6 | `GNl-142-48` | +| L40G | 568 | 18176 | 142 | 48G GDDR6 | `GNl-142h-48` | +| L40S | 568 | 18176 | 142 | 48G GDDR6 | `GNl-142hh-48` | + +##### Grace Hopper (`g`) + +These have MIG support and 128 Cuda Cores and 4 Tensor Cores per SM. + +| Nvidia GPU | Fraction | Tensor C | Cuda Cores | SMs | VRAM | SCS GPU name | +|------------|----------|----------|------------|-----|------------|----------------| +| H100 | 1/1 | 528 | 16896 | 132 | 80G HBM3 | `GNg-132-80h` | +| H100 | 1/2 | 264 | 8448 | 66 | 40G HBM3 | `GNg-66-40h` | +| H100 | 1/4 | 132 | 4224 | 33 | 20G HBM3 | `GNg-33-20h` | +| H100 | 1/7 | 72+ | 2304+ | 18+| 10G HBM3 | `GNg-18-10h`+ | +| H200 | 1/1 | 528 | 16896 | 132 | 141G HBM3e | `GNg-132-141h` | +| H200 | 1/2 | 264 | 8448 | 66 | 70G HBM3e | `GNg-66-70h` | +| ...
| + +[+] The precise numbers for the 1/7 MIG configurations are not known by the author of +this document and need validation. + +#### AMD Radeon (`A`) + +##### CDNA 2 (`2`) + +One CU contains 64 Stream Processors. + +| AMD Instinct| Stream Proc | CUs | VRAM | SCS name piece | +|-------------|-------------|-----|------------|----------------| +| Inst MI210 | 6656 | 104 | 64G HBM2e | `GA2-104-64h` | +| Inst MI250 | 13312 | 208 | 128G HBM2e | `GA2-208-128h` | +| Inst MI250X | 14080 | 220 | 128G HBM2e | `GA2-220-128h` | + +##### CDNA 3 (`3`) + +SRIOV partitioning is possible, resulting in pass-through for +up to 8 partitions, somewhat similar to Nvidia MIG. 4 Tensor +Cores and 64 Stream Processors per CU. + +| AMD GPU | Tensor C | Stream Proc | CUs | VRAM | SCS name piece | +|-------------|----------|-------------|-----|------------|----------------| +| Inst MI300X | 1216 | 19456 | 304 | 192G HBM3 | `GA3-304-192h` | +| Inst MI325X | 1216 | 19456 | 304 | 288G HBM3 | `GA3-304-288h` | + +#### intel Xe (`I`) + +##### Xe-HPC (Ponte Vecchio) (`12.7`) + +1 EU corresponds to one Tensor Core and contains 128 Shading Units.
+ +| intel DC GPU | Tensor C | Shading U | EUs | VRAM | SCS name piece | +|--------------|----------|-----------|-----|------------|-------------------| +| Max 1100 | 56 | 7168 | 56 | 48G HBM2e | `GI12.7-56-48h` | +| Max 1550 | 128 | 16384 | 128 | 128G HBM2e | `GI12.7-128-128h` | + ## Automated tests ### Errors diff --git a/Tests/iaas/flavor-naming/cli.py b/Tests/iaas/flavor-naming/cli.py index 86969cbbb..796b6a733 100755 --- a/Tests/iaas/flavor-naming/cli.py +++ b/Tests/iaas/flavor-naming/cli.py @@ -72,7 +72,7 @@ def parse(cfg, version, name, output='none'): if flavorname is None: print(f"NOT an SCS flavor: {namestr}") elif output == 'prose': - printv(name, end=': ') + printv(namestr, end=': ') print(f"{prettyname(flavorname)}") elif output == 'yaml': print(yaml.dump(flavorname_to_dict(flavorname), explicit_start=True)) diff --git a/Tests/iaas/flavor-naming/flavor_names.py b/Tests/iaas/flavor-naming/flavor_names.py index f3d799060..d856a8d7f 100644 --- a/Tests/iaas/flavor-naming/flavor_names.py +++ b/Tests/iaas/flavor-naming/flavor_names.py @@ -212,7 +212,7 @@ class GPU: type = "GPU" component_name = "gpu" gputype = TblAttr("Type", {"g": "vGPU", "G": "Pass-Through GPU"}) - brand = TblAttr("Brand", {"N": "nVidia", "A": "AMD", "I": "Intel"}) + brand = TblAttr("Brand", {"N": "Nvidia", "A": "AMD", "I": "Intel"}) gen = DepTblAttr("Gen", brand, { "N": {'': '(unspecified)', "f": "Fermi", "k": "Kepler", "m": "Maxwell", "p": "Pascal", "v": "Volta", "t": "Turing", "a": "Ampere", "l": "AdaLovelace", "g": "GraceHopper"}, @@ -222,7 +222,9 @@ class GPU: "3": "Arc/Gen12.7/DG2"}, }) cu = OptIntAttr("#.N:SMs/A:CUs/I:EUs") - perf = TblAttr("Performance", {"": "Std Perf", "h": "High Perf", "hh": "Very High Perf", "hhh": "Very Very High Perf"}) + perf = TblAttr("Frequency", {"": "Std Freq", "h": "High Freq", "hh": "Very High Freq"}) + vram = OptIntAttr("#.V:GiB VRAM") + vramperf = TblAttr("Bandwidth", {"": "Std BW {<~1GiB/s)", "h": "High BW", "hh": "Very High BW"}) class IB: @@ 
-278,7 +280,7 @@ class Outputter: hype = "_%s" hwvirt = "_%?" cpubrand = "_%s%0%s" - gpu = "_%s%s%s%-%s" + gpu = "_%s%s%s%-%s%-%s" ib = "_%?" def output_component(self, pattern, component, parts): @@ -341,7 +343,7 @@ class SyntaxV1: hwvirt = re.compile(r"\-(hwv)") # cpubrand needs final lookahead assertion to exclude confusion with _ib extension cpubrand = re.compile(r"\-([izar])([0-9]*)(h*)(?=$|\-)") - gpu = re.compile(r"\-([gG])([NAI])([^:h]*)(?::([0-9]+)|)(h*)") + gpu = re.compile(r"\-([gG])([NAI])([^:h]*)(?::([0-9]+)|)(h*)(?::([0-9]+)|)(h*)") ib = re.compile(r"\-(ib)") @staticmethod @@ -366,7 +368,7 @@ class SyntaxV2: hwvirt = re.compile(r"_(hwv)") # cpubrand needs final lookahead assertion to exclude confusion with _ib extension cpubrand = re.compile(r"_([izar])([0-9]*)(h*)(?=$|_)") - gpu = re.compile(r"_([gG])([NAI])([^\-h]*)(?:\-([0-9]+)|)(h*)") + gpu = re.compile(r"_([gG])([NAI])([^\-h]*)(?:\-([0-9]+)|)(h*)(?:\-([0-9]+)|)(h*)") ib = re.compile(r"_(ib)") @staticmethod @@ -697,10 +699,14 @@ def prettyname(flavorname, prefix=""): if flavorname.gpu: stg += "and " + _tbl_out(flavorname.gpu, "gputype") stg += _tbl_out(flavorname.gpu, "brand") - stg += _tbl_out(flavorname.gpu, "perf", True) stg += _tbl_out(flavorname.gpu, "gen", True) if flavorname.gpu.cu is not None: - stg += f"(w/ {flavorname.gpu.cu} SMs/CUs/EUs) " + stg += f"(w/ {flavorname.gpu.cu} {_tbl_out(flavorname.gpu, 'perf', True)}SMs/CUs/EUs" + # Can not specify VRAM without CUs + if flavorname.gpu.vram: + stg += f" and {flavorname.gpu.vram} GiB {_tbl_out(flavorname.gpu, 'vramperf', True)}VRAM) " + else: + stg += ") " # IB if flavorname.ib: stg += "and Infiniband " From 52fd69cb9d5b2ad7b04a2231866ae262cd2d1c02 Mon Sep 17 00:00:00 2001 From: Kurt Garloff Date: Sat, 19 Oct 2024 07:57:51 -0400 Subject: [PATCH 7/7] Shorten GPU names by dropping h. Fix intel Gen 12.7. (#786) MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit * Shorten GPU names by dropping h. 
Fix intel Gen 12.7. Plus some typos. * Implement 'shorten' method for each name part * fix typo, thanks unit tests * bugfix: parameter got shifted * Output shortened name in interactive CLI generator. Signed-off-by: Kurt Garloff Signed-off-by: Matthias Büchse Co-authored-by: Matthias Büchse --- ...w1-flavor-naming-implementation-testing.md | 14 ++--- Tests/iaas/flavor-naming/flavor-name-check.py | 3 ++ Tests/iaas/flavor-naming/flavor_names.py | 51 ++++++++++++++++--- 3 files changed, 53 insertions(+), 15 deletions(-) diff --git a/Standards/scs-0100-w1-flavor-naming-implementation-testing.md b/Standards/scs-0100-w1-flavor-naming-implementation-testing.md index 0783216d6..d9f5f62b2 100644 --- a/Standards/scs-0100-w1-flavor-naming-implementation-testing.md +++ b/Standards/scs-0100-w1-flavor-naming-implementation-testing.md @@ -53,13 +53,13 @@ of a GPU) result in what GPU part of the flavor name. #### Nvidia (`N`) -We show the most popular recent generations here. older one are of course possible as well. +We show the most popular recent generations here. Older one are of course possible as well. ##### Ampere (`a`) One Streaming Multiprocessor on Ampere has 64 (A30, A100) or 128 Cuda Cores (A10, A40). -GPUs without MIG (one SM has 128 Cude Cores and 4 Tensor Cores): +GPUs without MIG (one SM has 128 Cuda Cores and 4 Tensor Cores): | Nvidia GPU | Tensor C | Cuda Cores | SMs | VRAM | SCS name piece | |------------|----------|------------|-----|-----------|----------------| @@ -138,14 +138,14 @@ Cores and 64 Stream Processors per CU. #### intel Xe (`I`) -##### Xe-HPC (Ponte Vecchio) (`12.7`) +##### Xe-HPC (Ponte Vecchio) (`3`) 1 EU corresponds to one Tensor Core and contains 128 Shading Units. 
-| intel DC GPU | Tensor C | Shading U | EUs | VRAM | SCS name piece | -|--------------|----------|-----------|-----|------------|-------------------| -| Max 1100 | 56 | 7168 | 56 | 48G HBM2e | `GI12.7-56-48h` | -| Max 1550 | 128 | 16384 | 128 | 128G HBM2e | `GI12.7-128-128h` | +| intel DC GPU | Tensor C | Shading U | EUs | VRAM | SCS name part | +|--------------|----------|-----------|-----|------------|----------------| +| Max 1100 | 56 | 7168 | 56 | 48G HBM2e | `GI3-56-48h` | +| Max 1550 | 128 | 16384 | 128 | 128G HBM2e | `GI3-128-128h` | ## Automated tests diff --git a/Tests/iaas/flavor-naming/flavor-name-check.py b/Tests/iaas/flavor-naming/flavor-name-check.py index 536372757..e5d395e54 100755 --- a/Tests/iaas/flavor-naming/flavor-name-check.py +++ b/Tests/iaas/flavor-naming/flavor-name-check.py @@ -86,6 +86,9 @@ def main(argv): nm2 = _fnmck.outname(ret2) if nm1 != nm2: print(f"WARNING: {nm1} != {nm2}") + snm = _fnmck.outname(ret.shorten()) + if snm != nm1: + print(f"Shortened name: {snm}") argv = argv[1:] scs = 1 diff --git a/Tests/iaas/flavor-naming/flavor_names.py b/Tests/iaas/flavor-naming/flavor_names.py index d856a8d7f..10ca54da6 100644 --- a/Tests/iaas/flavor-naming/flavor_names.py +++ b/Tests/iaas/flavor-naming/flavor_names.py @@ -162,6 +162,9 @@ class Main: raminsecure = BoolAttr("?no ECC", letter="u") ramoversubscribed = BoolAttr("?RAM Over", letter="o") + def shorten(self): + return self + class Disk: """Class representing the disk part""" @@ -171,6 +174,9 @@ class Disk: disksize = OptIntAttr("#.GB Disk") disktype = TblAttr("Disk type", {'': '(unspecified)', "n": "Networked", "h": "Local HDD", "s": "SSD", "p": "HiPerf NVMe"}) + def shorten(self): + return self + class Hype: """Class repesenting Hypervisor""" @@ -178,6 +184,9 @@ class Hype: component_name = "hype" hype = TblAttr(".Hypervisor", {"kvm": "KVM", "xen": "Xen", "hyv": "Hyper-V", "vmw": "VMware", "bms": "Bare Metal System"}) + def shorten(self): + return None + class HWVirt: """Class 
repesenting support for hardware virtualization""" @@ -185,6 +194,9 @@ class HWVirt: component_name = "hwvirt" hwvirt = BoolAttr("?HardwareVirt", letter="hwv") + def shorten(self): + return None + class CPUBrand: """Class repesenting CPU brand""" @@ -206,6 +218,12 @@ def __init__(self, cpuvendor="i", cpugen=0, perf=""): self.cpugen = cpugen self.perf = perf + def shorten(self): + # For non-x86-64, don't strip out CPU brand for short name, as it contains the architecture + if self.cpuvendor in ('i', 'z'): + return None + return CPUBrand(self.cpuvendor) + class GPU: """Class repesenting GPU support""" @@ -226,6 +244,19 @@ class GPU: vram = OptIntAttr("#.V:GiB VRAM") vramperf = TblAttr("Bandwidth", {"": "Std BW {<~1GiB/s)", "h": "High BW", "hh": "Very High BW"}) + def __init__(self, gputype="g", brand="N", gen='', cu=None, perf='', vram=None, vramperf=''): + self.gputype = gputype + self.brand = brand + self.gen = gen + self.cu = cu + self.perf = perf + self.vram = vram + self.vramperf = vramperf + + def shorten(self): + # remove h modifiers + return GPU(gputype=self.gputype, brand=self.brand, gen=self.gen, cu=self.cu, vram=self.vram) + class IB: """Class representing Infiniband""" @@ -233,6 +264,9 @@ class IB: component_name = "ib" ib = BoolAttr("?IB") + def shorten(self): + return self + class Flavorname: """A flavor name; merely a bunch of components""" @@ -250,14 +284,15 @@ def __init__( def shorten(self): """return canonically shortened name as recommended in the standard""" - if self.hype is None and self.hwvirt is None and self.cpubrand is None: - return self - # For non-x86-64, don't strip out CPU brand for short name, as it contains the architecture - if self.cpubrand and self.cpubrand.cpuvendor not in ('i', 'z'): - return Flavorname(cpuram=self.cpuram, disk=self.disk, - cpubrand=CPUBrand(self.cpubrand.cpuvendor), - gpu=self.gpu, ib=self.ib) - return Flavorname(cpuram=self.cpuram, disk=self.disk, gpu=self.gpu, ib=self.ib) + return Flavorname( + 
cpuram=self.cpuram and self.cpuram.shorten(), + disk=self.disk and self.disk.shorten(), + hype=self.hype and self.hype.shorten(), + hwvirt=self.hwvirt and self.hwvirt.shorten(), + cpubrand=self.cpubrand and self.cpubrand.shorten(), + gpu=self.gpu and self.gpu.shorten(), + ib=self.ib and self.ib.shorten(), + ) class Outputter:
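The extended GPU syntax and the per-component `shorten()` introduced in this series can be tried out in isolation. The following is a minimal sketch, not the code from `flavor_names.py`: only the regular expression is copied from `SyntaxV2.gpu` above, while `GpuPart`, `parse_gpu` and `name()` are hypothetical helpers made up for illustration. `GpuPart.shorten()` mirrors the idea of `GPU.shorten()`, i.e. dropping both `h` modifiers.

```python
import re
from dataclasses import dataclass, replace
from typing import Optional

# Pattern copied from SyntaxV2.gpu in the patch above.
GPU_RE = re.compile(r"_([gG])([NAI])([^\-h]*)(?:\-([0-9]+)|)(h*)(?:\-([0-9]+)|)(h*)")


@dataclass(frozen=True)
class GpuPart:
    """Hypothetical stand-in for the GPU component (not the real GPU class)."""
    gputype: str                # "g" = vGPU, "G" = pass-through
    brand: str                  # "N", "A" or "I"
    gen: str = ""               # generation, e.g. "a" for Ampere
    cu: Optional[int] = None    # SMs/CUs/EUs
    perf: str = ""              # "h" modifiers on the compute units
    vram: Optional[int] = None  # video memory in GiB
    vramperf: str = ""          # "h" modifiers on the video memory

    def shorten(self) -> "GpuPart":
        # Same idea as GPU.shorten() in the patch: remove the h modifiers.
        return replace(self, perf="", vramperf="")

    def name(self) -> str:
        # Hypothetical formatter; VRAM is only emitted together with CUs.
        out = f"_{self.gputype}{self.brand}{self.gen}"
        if self.cu is not None:
            out += f"-{self.cu}{self.perf}"
            if self.vram is not None:
                out += f"-{self.vram}{self.vramperf}"
        return out


def parse_gpu(flavor: str) -> Optional[GpuPart]:
    """Return the GPU part of a v2/v3 style flavor name, or None if absent."""
    m = GPU_RE.search(flavor)
    if m is None:
        return None
    gputype, brand, gen, cu, perf, vram, vramperf = m.groups()
    return GpuPart(gputype, brand, gen,
                   int(cu) if cu else None, perf,
                   int(vram) if vram else None, vramperf)


if __name__ == "__main__":
    gpu = parse_gpu("SCS-16V-64-500s_GNa-14-6h")
    print(gpu)
    print(gpu.shorten().name())  # _GNa-14-6
```

The sketch searches the whole flavor name for the GPU part instead of walking through the components in order, which is a simplification compared to the real parser.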