diff --git a/.github/workflows/lint-golang.yml b/.github/workflows/lint-golang.yml
new file mode 100644
index 000000000..faf7fdc8c
--- /dev/null
+++ b/.github/workflows/lint-golang.yml
@@ -0,0 +1,28 @@
+name: Check Go syntax
+
+on:
+ push:
+ paths:
+ - 'Tests/kaas/kaas-sonobuoy-tests/**/*.go'
+ - .github/workflows/lint-golang.yml
+
+jobs:
+ lint-go-syntax:
+ runs-on: ubuntu-latest
+ steps:
+ - uses: actions/checkout@v4
+
+ - name: Set up Go
+ uses: actions/setup-go@v4
+ with:
+ go-version: '1.23'
+
+ # Install golangci-lint
+ - name: Install golangci-lint
+ run: |
+ curl -sSfL https://raw.githubusercontent.com/golangci/golangci-lint/master/install.sh | sh -s -- -b $(go env GOPATH)/bin v1.61.0
+
+ # Run golangci-lint
+ - name: Run golangci-lint
+ working-directory: Tests/kaas/kaas-sonobuoy-tests
+ run: golangci-lint run ./... -v
diff --git a/.markdownlint-cli2.jsonc b/.markdownlint-cli2.jsonc
index 94fca3275..4d44024bd 100644
--- a/.markdownlint-cli2.jsonc
+++ b/.markdownlint-cli2.jsonc
@@ -43,9 +43,10 @@
{
"name": "double-spaces",
"message": "Avoid double spaces",
- "searchPattern": "/([^\\s>]) ([^\\s|])/g",
+ "searchPattern": "/([^\\s>|]) ([^\\s|])/g",
"replace": "$1 $2",
- "skipCode": true
+ "skipCode": true,
+ "tables": false
}
]
}
diff --git a/Drafts/node-to-node-encryption.md b/Drafts/node-to-node-encryption.md
new file mode 100644
index 000000000..4234b64db
--- /dev/null
+++ b/Drafts/node-to-node-encryption.md
@@ -0,0 +1,529 @@
+---
+title: _End-to-End Encryption between Customer Workloads_
+type: Decision Record
+status: Proposal
+track: IaaS
+---
+
+## Abstract
+
+This document explores options for developing end-to-end (E2E) encryption for
+VMs, Magnum workloads, and container layers to enhance security between user
+services. It includes a detailed review of various technologies, feedback from
+the OpenStack community, and the decision-making process that led to selecting
+VXLANs with the OpenStack ML2 plugin and its later abandonment in favour of the
+native openvswitch-ipsec solution.
+
+## Terminology
+
+| Term | Meaning |
+|---|---|
+| CSP | Cloud service provider; in this document the term also covers the operator of a private cloud |
+| VM | Virtual machine, alternatively instance, is a virtualized compute resource that functions as a self-contained server for a customer |
+| Node | Machine under CSP administration which hosts cloud services and compute instances |
+
+## Context
+
+### Motivation
+
+The security of customer/user workloads is one of a CSP's main concerns. With
+larger and more diverse cloud instances, parts of the underlying physical
+infrastructure can be outside of the CSP's direct control, either when
+interconnecting data centers via the public internet or when renting
+infrastructure from a third party. Many security breaches occur due to
+actions of malicious or negligent in-house operators. While some of the burden
+lies with customers, who should secure their own workloads, the CSP should have
+the option to transparently protect the data pathways between instances, more
+so for private clouds, where CSP and customer are the same entity or parts of
+the same entity.
+
+In RFC8926[^rfc] it is stated:
+> A tenant system in a customer premises (private data center) may want to
+> connect to tenant systems on their tenant overlay network in a public cloud
+> data center, or a tenant may want to have its tenant systems located in
+> multiple geographically separated data centers for high availability. Geneve
+> data traffic between tenant systems across such separated networks should be
+> protected from threats when traversing public networks. Any Geneve overlay
+> data leaving the data center network beyond the operator's security domain
+> SHOULD be secured by encryption mechanisms, such as IPsec or other VPN
+> technologies, to protect the communications between the NVEs when they are
+> geographically separated over untrusted network links.
+
+We aren't considering intra-node communication, i.e. traffic inside one host
+node between different VMs, potentially of multiple tenants, as this is a
+question of tenant isolation, not of networking security, and encryption here
+would possibly be a redundant measure. Isolation of VMs is handled by OpenStack
+on multiple levels - overlay tunneling protocols, routing rules on the
+networking level, network namespaces on the kernel level and hypervisor
+isolation mechanisms. All of this communication stays inside the node, and any
+malicious agent with high enough access to the node itself to observe/tamper
+with the internal traffic would presumably have access to the encryption keys
+themselves, rendering the encryption ineffective.
+
+### Potential threats in detail
+
+We are assuming that:
+
+* the customer workloads are not executed within secure enclaves (e.g. Software
+Guard Extensions (SGX)) and aren't using security measures like end-to-end
+encryption themselves, either relying on the CSP for security or, in the case
+of a private cloud, being run by the operator of the cloud themselves
+* the CSP OpenStack administrators are deemed trustworthy since they possess
+root access to the host nodes, with access to keys and certificates, enabling
+them to bypass any form of internode communication encryption
+* a third party or an independent team manages physical network communication
+between nodes within a colocation setting or the communication passes unsafe
+public infrastructure in the case of a single stretched instance spanning
+multiple data centers
+
+#### Man in the Middle Attack
+
+Considering the assumptions and the objective to enforce end-to-end (E2E)
+encryption for user workloads, our primary security concern is averting
+man-in-the-middle (MITM) attacks. These can be categorized into two distinct
+forms: active and passive.
+
+##### Passive Attacks - Eavesdropping
+
+Consider the scenario where an untrusted individual with physical access to
+the data center, such as a third-party network administrator, engages in
+'passive' covert surveillance, silently monitoring unencrypted traffic
+without interfering with data integrity or network operations.
+
+Wiretapping is a common technique employed in such espionage. It involves the
+unauthorized attachment to network cabling, enabling the covert observation of
+data transit. This activity typically goes unnoticed as it does not disrupt
+the flow of network traffic, although it may occasionally introduce minor
+transmission errors.
+
+An alternative strategy involves deploying an interception device that
+captures and retransmits data, which could potentially introduce network
+latency or, if deployed disruptively, cause connectivity interruptions. Such
+devices can be concealed by synchronizing their installation with network
+downtime, maintenance periods, or less conspicuous times like power outages.
+They could also be strategically placed in less secure, more accessible
+locations, such as inter-building network links. This applies to wiretapping
+as well.
+
+Furthermore, the vulnerability extends to network devices, where an attacker
+could exploit unsecured management ports or leverage compromised remote
+management tools (like IPMI) to gain unauthorized access. Such access points,
+especially those not routinely monitored like backup VPNs, present additional
+security risks.
+
+Below is a conceptual diagram depicting potential vulnerabilities in an
+OpenStack deployment across dual regions, highlighting how these passive
+eavesdropping techniques could be facilitated.
+
+![image](https://github.com/SovereignCloudStack/issues/assets/1249759/f5b7edf3-d259-4b2a-8632-c877934f3e31)
+
+##### Active Attacks - Spoofing / Tampering
+
+Active network attacks like spoofing and tampering exploit various access
+points, often leveraging vulnerabilities uncovered during passive eavesdropping
+phases. These attacks actively manipulate or introduce unauthorized
+communications on the network.
+
+Spoofing involves an attacker masquerading as another device or user within the
+network. This deception can manifest in several ways:
+
+* **ARP Spoofing:** The attacker sends forged ARP (Address Resolution Protocol)
+ messages onto the network. This can redirect network traffic flow to the
+ attacker's machine, intercepting, modifying, or blocking data before it
+ reaches its intended destination.
+* **DNS Spoofing:** By responding with falsified DNS replies, an attacker can
+  reroute traffic to malicious sites, enabling further compromise or data
+  exfiltration.
+* **IP Spoofing:** The attacker disguises their network identity by falsifying
+ IP address information in packets, tricking the network into accepting them
+ as legitimate traffic. This can be particularly damaging if encryption is not
+ enforced, enabling the attacker to interact with databases, invoke APIs, or
+ execute unauthorized commands while appearing as a trusted entity.
+
+Moreover, when an active interception device is in place, attackers can extend
+their capabilities to traffic filtering. They might selectively delete or alter
+logs and metrics to erase traces of their intrusion or fabricate system
+performance data, thus obscuring the true nature of their activities.
+
+### Preliminary considerations
+
+Initially we wanted to create a plugin for Neutron[^ne] using eBPF[^eb] to
+automatically secure the traffic between VMs. We presented the idea in a
+team IaaS call[^ia]. After the initial round of feedback, specific requirements
+emerged.
+
+#### Utilize existing solutions
+
+Leverage existing technologies and frameworks as much as possible. This
+approach aims to reduce development time and ensure the solution is built on
+proven, reliable foundations. Potential technologies include:
+
+* **OVS[^sw] + IPSec[^ip]**: Provides an overlay network and has built-in
+ support for encryption using IPsec. Leveraging OVS can minimize development
+ time since it is already integrated with OpenStack.
+* **Neutron[^ne] with eBPF[^eb]**: Using eBPF[^eb] could provide fine-grained
+ control over packet filtering and encryption directly in the kernel.
+* **TripleO[^to] (with IPsec)**: TripleO[^to] tool set for OpenStack deployment
+ supports IPsec tunnels between nodes.
+* **Neutron[^ne] + Cilium[^ci]**: Cilium is an open source, cloud native
+ eBPF[^eb]-based networking solution, including transparent encryption tools.
+* **Tailscale[^ta]** is a mesh VPN based on WireGuard[^wg] that simplifies the
+ setup of secure, encrypted networks. This could be a potential alternative
+ to managing encrypted connections in OpenStack environments.
+
+#### Upstream integration
+
+Move as much of the development work upstream into existing OpenStack projects.
+This will help ensure the solution is maintained by the wider OpenStack
+community, reducing the risk of it becoming unmaintained or unusable in the
+future. This means to collaborate with the OpenStack community to contribute
+changes upstream, particularly in projects like Neutron[^ne], OVN[^ov],
+kolla[^kl] and ansible-kolla[^ka].
+
+#### Address threat modeling issues
+
+"We should not encrypt something just for the sake of encryption." The solution
+must address the specific security issues identified in the
+[threat modeling](#potential-threats-in-detail). This ideally includes
+protecting against both passive (eavesdropping) and active (spoofing,
+tampering) MITM attacks. Encryption mechanisms on all communication channels
+between VMs, containers and hosts prevent successful eavesdropping, while
+authentication and integrity checks prevent spoofing and tampering. For example,
+IPsec[^ip] provides mechanisms for both encryption and integrity verification.
+
+#### Performance impact and ease of use
+
+Evaluate the performance impact of the encryption solution and ensure it is
+minimal. Performance benchmarking should be conducted to assess the impact of
+the encryption solution on network throughput and latency. For trusted local
+scenarios, opting out should be possible. The solution should also be easy to use
+and manage, both for administrators and ideally fully transparent for
+end-users. This may involve developing user-friendly interfaces and automated
+tools for key management and configuration.
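+
+A simple way to gauge the performance impact mentioned above is to compare
+throughput and latency between two VMs on different nodes with encryption
+disabled and enabled. A minimal sketch using iperf3 and ping (addresses are
+illustrative):
+
+```bash
+# On VM A (server side):
+iperf3 -s
+
+# On VM B (client side): measure TCP throughput for 30 seconds
+iperf3 -c 10.0.0.5 -t 30
+
+# Round-trip latency between the two VMs:
+ping -c 100 10.0.0.5
+```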
+
+#### Avoid redundant encryption
+
+If possible, develop a mechanism to detect and avoid encrypting traffic that is
+already encrypted. This will help optimize performance and resource usage.
+
+By focusing on these detailed requirements and considerations, we aim to
+develop a robust, efficient, and sustainable E2E encryption solution for
+OpenStack environments. This solution will not only enhance security for user
+workloads but also ensure long-term maintainability and ease of use.
+
+### Exploration of technologies
+
+Based on the result of the threat modeling and presentation, we explored the
+following technologies and also reached out to the OpenStack mailing list for
+additional comments.
+
+This section provides a brief explanation of OpenStack networking and design
+decisions for encryption between customer workloads.
+
+#### Networking in OpenStack
+
+The foundation of networking in OpenStack is the Neutron[^ne] project,
+providing networking as a service (NaaS). It creates and manages network
+resources such as switches, routers, subnets, firewalls and load balancers,
+uses plugin architecture to support different physical network implementation
+options and is accessible to admin or other services through API.
+
+Another integral part is Open vSwitch (OVS)[^sw] - a widely adopted virtual
+switch implementation. It is not strictly necessary, as Neutron is quite
+flexible about the components used to implement the infrastructure, but it
+tends to be the agent of choice and is the current default agent for Neutron.
+It responds to environment changes, supports accounting and monitoring
+protocols and maintains the OVSDB state database. It manages virtual ports,
+bridges and tunnels on hosts.
+
+Open Virtual Networking (OVN[^ov]) is a logical abstraction layer on top of OVS,
+developed by the same community, which has become the default controller driver
+for Neutron. It manages logical networks insulated from the underlying
+physical/virtual networks by encapsulation. It replaces the need for OVS agents
+running on each host and supports L2 switching, distributed L3 routing, access
+control and load balancing.
+
+#### Encryption options
+
+##### MACsec[^ms]
+
+A layer 2 security protocol, defined by the IEEE standard 802.1AE. It allows
+securing an Ethernet link for almost all traffic, including control protocols
+like DHCP and ARP. It is mostly implemented in hardware, in routers and
+switches, but software implementations exist, notably a Linux kernel module.
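+
+As a rough sketch of the software implementation, a MACsec interface can be
+configured with iproute2 roughly as follows (interface name, peer MAC address
+and keys are placeholders; the peer needs the mirrored configuration):
+
+```bash
+# Create a MACsec device on top of the physical link and enable encryption
+ip link add link eth0 macsec0 type macsec encrypt on
+# Configure the transmit secure association (key is a placeholder)
+ip macsec add macsec0 tx sa 0 pn 1 on key 01 <128-bit-hex-key>
+# Register the peer and its receive secure association
+ip macsec add macsec0 rx port 1 address <peer-mac>
+ip macsec add macsec0 rx port 1 address <peer-mac> sa 0 pn 1 on key 02 <128-bit-hex-key>
+ip link set macsec0 up
+```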
+
+##### eBPF[^eb]-based encryption with Linux Kernel Crypto API
+
+The Berkeley Packet Filter (BPF), a packet-filtering technology in the Linux
+kernel, uses a specialized in-kernel virtual machine to run filters on the
+networking stack. eBPF extends this principle into a general-purpose facility
+that can run sandboxed programs in the kernel without changing kernel code or
+loading modules. High-performance networking, observability and security are
+natural use cases, with projects like Cilium[^ci] implementing transparent
+in-kernel packet encryption on top of it. The Linux kernel itself also provides
+an encryption framework called the Linux Kernel Crypto API[^lkc], which such
+solutions use.
+
+##### IPsec[^ip]
+
+Internet Protocol security is a suite of protocols for network security on
+layer 3, providing authentication and packet encryption, used for example in
+Virtual Private Network (VPN) setups. It is an IETF[^ie] specification with
+various open source and commercial implementations. For historical
+reasons[^ipwh] it defines two main transmission protocols,
+Authentication Header (AH) and Encapsulating Security Payload (ESP), where only
+the latter provides encryption in addition to authentication and integrity. Key
+negotiation uses the IKE (v1/v2) protocol to establish and maintain
+Security Associations (SAs).
+
+##### WireGuard[^wg]
+
+Aims to be a simple and fast open source secure network tunneling solution
+working on layer 3, utilizing state-of-the-art cryptography while maintaining a
+much simpler codebase and runtime setup than alternative solutions[^wgwp]. The
+focus is on fast in-kernel encryption. WireGuard[^wg] adds new network
+interfaces, manageable by standard tooling (ifconfig, route, ...), which act as
+tunneling interfaces. The main mechanism, called _Cryptokey routing_, consists
+of tables associating public keys of endpoints with the IPs allowed inside the
+given tunnels. These behave as routing tables when sending and as access
+control lists (ACLs) when receiving packets. All packets are sent over UDP.
+Built-in roaming is achieved by both server and clients being able to update
+the peer list by examining where correctly authenticated data originates from.
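+
+A minimal sketch of a WireGuard[^wg] configuration illustrating Cryptokey
+routing (keys, addresses and ports are placeholders):
+
+```ini
+[Interface]
+PrivateKey = <this-node-private-key>
+ListenPort = 51820
+
+[Peer]
+# Cryptokey routing: traffic for 10.10.0.2 is sent to (and only accepted from)
+# the holder of this public key
+PublicKey = <peer-public-key>
+AllowedIPs = 10.10.0.2/32
+Endpoint = 192.0.2.20:51820
+```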
+
+### Solution proposals
+
+#### TripleO[^to] with IPsec[^ip]
+
+> TripleO is a project aimed at installing, upgrading and operating OpenStack
+> clouds using OpenStack's own cloud facilities as the foundation - building on
+> Nova, Ironic, Neutron and Heat to automate cloud management at datacenter
+> scale
+
+This project is retired as of February 2024, but its approach was considered
+for adoption.
+
+Its deployment allowed for IPsec[^ip] encryption of node communication. When
+utilized, two types of tunnels were created in overcloud: node-to-node tunnels
+for each two nodes on the same network, for all networks those nodes were on,
+and Virtual IP tunnels. Each node hosting the Virtual IP would open a tunnel
+for any node in the specific network that can properly authenticate.
+
+#### OVN[^ov] + IPsec[^ip]
+
+There is support in the OVN[^ov] project for IPsec[^ip] encryption of tunnel
+traffic[^oit]. A daemon running in each chassis automatically manages and
+monitors IPsec[^ip] tunnel states.
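+
+Following the referenced tutorial[^oit], enabling this is essentially a matter
+of switching one knob in the northbound database and pointing each chassis at
+its key material (paths are illustrative):
+
+```bash
+# Enable IPsec for all OVN tunnels (run against the OVN northbound DB):
+ovn-nbctl set nb_global . ipsec=true
+
+# On each chassis: configure the chassis certificate, private key and CA cert
+ovs-vsctl set Open_vSwitch . \
+    other_config:certificate=/etc/ipsec-keys/chassis-cert.pem \
+    other_config:private_key=/etc/ipsec-keys/chassis-privkey.pem \
+    other_config:ca_cert=/etc/ipsec-keys/cacert.pem
+```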
+
+#### Neutron[^ne] + Cilium[^ci]
+
+Another potential architecture involves a Neutron[^ne] plugin hooking an
+eBPF[^eb] proxy on each interface and moving internal traffic via an encrypted
+Cilium[^ci] mesh. Cilium uses IPsec[^ip] or WireGuard[^wg] to transparently
+encrypt node-to-node traffic. There were some attempts to integrate Cilium[^ci]
+with OpenStack[^neci1][^neci2], but we didn't find any concrete projects
+which would leverage the transparent encryption ability of Cilium[^ci] in an
+OpenStack environment. This path would presumably require significant
+development.
+
+#### Neutron[^ne] + Calico[^ca]
+
+The Calico[^ca] project in its community open source version provides
+node-to-node encryption using WireGuard[^wg]. Despite being primarily a
+Kubernetes networking project, it provides an OpenStack integration[^caos] via
+a Neutron[^ne] plugin and deploying the necessary subset of tools like etcd,
+Calico agent Felix, routing daemon BIRD and a DHCP agent.
+
+### Proof of concept implementations
+
+#### Neutron Plugin
+
+Initially the potential for developing a specialized Neutron plugin was
+investigated and a simple skeleton implementation for testing purposes was
+devised.
+
+In-house development was later abandoned in favor of a more sustainable
+solution using existing technologies, as discussed in the
+[preliminary considerations](#preliminary-considerations).
+
+#### Manual setup
+
+We created a working proof of concept by manually setting up VXLAN tunnels
+between nodes. While this solution ensures no impact on OpenStack and is easy
+to set up, it has limitations, such as unencrypted data transmission if the
+connection breaks. To mitigate this, we proposed using a dedicated subnet
+present only inside the IPsec[^ip] tunnels.
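+
+For illustration, such a tunnel endpoint was brought up manually with iproute2
+along these lines (interface names, VNI and addresses are illustrative; the
+peer node mirrors the configuration with the addresses swapped):
+
+```bash
+# VXLAN interface towards the peer node, carried over the IPsec-protected path
+ip link add vxlan42 type vxlan id 42 dev eth1 remote 192.0.2.20 dstport 4789
+# Dedicated subnet that only exists inside the tunnel
+ip addr add 10.42.0.1/24 dev vxlan42
+ip link set vxlan42 up
+```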
+
+We presented the idea to the kolla-ansible[^ka] project, but it was deemed out
+of scope. Instead, we were directed towards a native Open vSwitch solution
+supporting IPsec[^ip]. This requires creating a new OpenStack service
+(working name: openstack-ipsec) and a role to manage chassis keys and run the
+openstack-ipsec container on each node.
+
+#### Proof of concept (PoC) implementation
+
+In our second proof of concept, we decided to implement support for
+openstack-ipsec. The initial step involved creating a new container image
+within the kolla[^kl] project specifically for this purpose.
+
+##### Architecture
+
+When Neutron[^ne] uses OVN[^ov] as its controller, it instructs it to create
+the necessary virtual networking infrastructure (logical switches, routers,
+etc.), in particular to create Geneve tunnels between compute nodes. These
+tunnels are used to carry traffic between instances on different compute nodes.
+
+In the PoC setup, the Libreswan[^ls] suite runs on each compute node and
+manages the IPsec[^ip] tunnels. It encrypts the traffic flowing over the Geneve
+tunnels, ensuring that data is secure as it traverses the physical network. In
+the setup phase it establishes IPsec tunnels between compute nodes by
+negotiating the necessary security parameters (encryption, authentication,
+etc.). Once the tunnels are established, Libreswan[^ls] monitors and manages
+them, ensuring that the encryption keys are periodically refreshed and that the
+tunnels remain up. It also dynamically adds and removes tunnels based on
+changes in the network topology.
+
+A packet originating from a VM on one compute node and destined for a VM on
+a different node is processed by OVS and encapsulated into a Geneve tunnel.
+Before the Geneve-encapsulated packet leaves the compute node, it passes
+through the IPsec layer managed by Libreswan, which applies encryption. The
+encrypted packet traverses the physical network to the destination compute
+node. On the destination node, Libreswan[^ls] decrypts the packet, and OVN[^ov]
+handles decapsulation and forwards it to the target VM.
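+
+For troubleshooting, the tunnel and SA state can be inspected on a compute
+node; a sketch using the tooling shipped with OVS and Libreswan:
+
+```bash
+# Show the IPsec tunnels that ovs-monitor-ipsec manages for the Geneve mesh
+ovs-appctl -t ovs-monitor-ipsec tunnels/show
+
+# Show established Security Associations and per-tunnel traffic counters
+ipsec status
+ipsec trafficstatus
+```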
+
+##### Challenges
+
+While implementing the openstack-ipsec image we encountered a significant
+challenge: the `ovs-ctl start-ovs-ipsec` command could not run inside the
+container because it requires a running init system (init.d/systemd) to start
+the IPsec daemon immediately after OVS[^sw] deploys the configuration. We
+attempted to use supervisor to manage the processes within the container.
+However, this solution forced a manual start of the IPsec daemon before
+ovs-ctl had the chance to create the appropriate configurations.
+
+Another challenge was the requirement for both the IPsec daemon and ovs-ipsec
+to run within a single container. This added complexity to the container
+configuration and management, making it harder to ensure both services operated
+correctly and efficiently.
+
+##### Additional infrastructure
+
+A new Ansible role for generating chassis keys and distributing them to the
+respective machines was created. This utility also handles the configuration on
+each machine. Managing and creating production certificates is up to the user,
+which is also true for the backend TLS certificates in kolla-ansible[^ka].
+While this management should be handled within the same process, it currently
+poses a risk of downtime when certificates expire, as it requires careful
+management and timely renewal of certificates.
+
+The new container image was designed to include all necessary
+components for openstack-ipsec. Using supervisor to manage the IPsec daemon
+within the container involved creating configuration files to ensure all
+services start correctly. However, integrating supervisor introduced additional
+complexity and potential points of failure.
+
+##### Possible improvements
+
+The PoC doesn't currently offer an opt-out for disabling the encryption for a
+specific group of nodes, where the operator deems it detrimental because the
+links are virtual or because security is already handled in some other layer of
+the stack. This could be implemented as a further customization available to
+the operator to encrypt only a subset of the Geneve tunnels, either in a
+blacklist or whitelist manner.
+
+Further refinement is needed to ensure ovs-ctl and the IPsec daemon start and
+configure correctly within the container environment. Exploring alternative
+process management tools or improving the configuration of supervisor could
+help achieve a more robust solution.
+
+Implementing automated certificate management could mitigate the risks
+associated with manual certificate renewals. Tools like Certbot or integration
+with existing Public Key Infrastructure (PKI) solutions might be beneficial.
+
+Engaging with the upstream Open vSwitch community to address containerization
+challenges and improve support for running ovs-ctl within containers could lead
+to a more sustainable solution.
+
+## Decision
+
+The final proof of concept implementation demonstrated the feasibility of
+implementing transparent IPsec[^ip] encryption between nodes in an OVN[^ov]
+logical networking setup in OpenStack.
+To recapitulate our preliminary considerations:
+
+### Utilize existing solutions
+
+The implementation in kolla-ansible[^ka] is unintrusive, provided by a
+self-contained new kolla[^kl] container, which only adds an IPsec[^ip]
+tunneling support module to OVS[^sw] - already an integral part of OpenStack
+networking - and a mature open source toolkit, Libreswan[^ls]. OVN[^ov] also
+has native support in OpenStack and has become the default controller for
+Neutron[^ne].
+
+### Address threat modeling issues
+
+As discussed in the [motivation](#motivation) and [threat
+modelling](#potential-threats-in-detail) sections, our concern lies with the
+potentially vulnerable physical infrastructure between nodes inside or between
+data centers. In this case, ensuring encryption and integrity of packets before
+they leave any node addresses these threats, while avoiding the complexity of
+securing the communication on the VM level, where frequent additions, deletions
+and migrations could render such a system complicated and error-prone. We also
+don't needlessly encrypt VM communication inside a single node.
+
+### Avoid redundant encryption
+
+As the encryption happens inside tunnels dedicated to inter-node workload
+communication, isolated on their own network and also inside Geneve tunnels, no
+cloud service data - which may already be encrypted at higher levels (TLS) -
+passes through them. As for the workload communication itself, detecting
+higher-layer encryption in a way that would allow IPsec[^ip] to avoid redundant
+encryption is complex and would require custom modifications or non-standard
+solutions. It's usually safer and more straightforward to allow the redundancy,
+ensuring security at multiple layers, rather than trying to eliminate it.
+
+### Performance impact and ease of use
+
+Setup is straightforward for the operator: there is just a flag to enable or
+disable the IPsec[^ip] encryption inside the Geneve tunnels and the need to set
+the Neutron[^ne] agent to OVN[^ov]. No other configuration is necessary. The
+only other administrative burden is the deployment of certificates to the
+provided configuration directory on the control node.
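+
+As a sketch of what this could look like in `globals.yml` (the
+`ovn_ipsec_enabled` variable name is hypothetical and depends on the final
+kolla-ansible implementation):
+
+```yaml
+# Use OVN as the Neutron backend (existing kolla-ansible setting)
+neutron_plugin_agent: "ovn"
+# Hypothetical toggle for IPsec encryption of the Geneve tunnels
+ovn_ipsec_enabled: "yes"
+```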
+
+Certificate management for this solution can and should be handled in the same
+way as for the backend service certificates which are part of the ongoing
+efforts to provide complete service communication encryption in kolla-ansible.
+Currently the management of these certificates is partially left to external
+processes, but if a toolset or a process were devised inside the project, this
+solution would fit in.
+
+### Upstream integration
+
+The potential for upstream adoption and long-term maintainability makes this a
+promising direction for securing inter-node communication in OpenStack
+environments.
+
+## References
+
+[^ne]: [Neutron](https://docs.openstack.org/neutron/latest/) - networking as a service (NaaS) in OpenStack
+[^eb]: [eBPF](https://en.wikipedia.org/wiki/EBPF)
+[^ia]: Team IaaS call [minutes](https://github.com/SovereignCloudStack/minutes/blob/main/iaas/20240214.md)
+[^sw]: [open vSwitch](https://www.openvswitch.org/)
+[^ip]: [IPsec](https://en.wikipedia.org/wiki/IPsec)
+[^ipwh]: [Why is IPsec so complicated](https://destcert.com/resources/why-the-is-ipsec-so-complicated/)
+[^to]: [TripleO](https://docs.openstack.org/developer/tripleo-docs/) - OpenStack on OpenStack
+[^ci]: [Cilium](https://cilium.io/)
+[^ca]: [Calico](https://docs.tigera.io/calico/latest/about)
+[^caos]: [Calico for OpenStack](https://docs.tigera.io/calico/latest/getting-started/openstack/overview)
+[^ta]: [Tailscale](https://tailscale.com/solutions/devops)
+[^ov]: [Open Virtual Network](https://www.ovn.org/en/) (OVN)
+[^oit]: [OVN IPsec tutorial](https://docs.ovn.org/en/latest/tutorials/ovn-ipsec.html)
+[^kl]: [kolla](https://opendev.org/openstack/kolla) project
+[^ka]: [kolla-ansible](https://docs.openstack.org/kolla-ansible/latest/) project
+[^wg]: [WireGuard](https://www.wireguard.com/)
+[^wgwp]: WireGuard [white paper](https://www.wireguard.com/papers/wireguard.pdf)
+[^ie]: [Internet Engineering Task Force](https://www.ietf.org/) (IETF)
+[^rfc]: [RFC8926](https://datatracker.ietf.org/doc/html/rfc8926#name-inter-data-center-traffic)
+[^lkc]: [Linux Kernel Crypto API](https://www.kernel.org/doc/html/v4.10/crypto/index.html)
+[^ls]: [Libreswan](https://libreswan.org/) VPN software
+[^ms]: [MACsec standard](https://en.wikipedia.org/wiki/IEEE_802.1AE)
+[^neci1]: [Neutron + Cilium architecture example](https://gist.github.com/oblazek/466a9ae836f663f8349b71e76abaee7e)
+[^neci2]: [Neutron + Cilium Proposal](https://github.com/cilium/cilium/issues/13433)
diff --git a/README.md b/README.md
index 2685faddb..5052b25bf 100644
--- a/README.md
+++ b/README.md
@@ -1,23 +1,10 @@
-
# Sovereign Cloud Stack – Standards and Certification
SCS unifies the best of cloud computing in a certified standard. With a decentralized and federated cloud stack, SCS puts users in control of their data and fosters trust in clouds, backed by a global open-source community.
## SCS compatible clouds
-This is a list of clouds that we test on a nightly basis against our `scs-compatible` certification level.
-
-| Name | Description | Operator | _SCS-compatible IaaS_ Compliance | HealthMon |
-| -------------------------------------------------------------------------------------------------------------- | ------------------------------------------------- | ----------------------------- | :---------------------------------------------------------------------------------------------------------------------------------------------------: | :----------------------------------------------------------------------------------------------------------------------: |
-| [gx-scs](https://github.com/SovereignCloudStack/docs/blob/main/community/cloud-resources/plusserver-gx-scs.md) | Dev environment provided for SCS & GAIA-X context | plusserver GmbH | [![Compliance Status](https://img.shields.io/github/actions/workflow/status/SovereignCloudStack/standards/check-gx-scs-v4.yml?label=v4)](https://github.com/SovereignCloudStack/standards/actions/workflows/check-gx-scs-v4.yml) | [HM](https://health.gx-scs.sovereignit.cloud:3000/) |
-| [pluscloud open](https://www.plusserver.com/en/products/pluscloud-open)
- prod1
- prod2
- prod3
- prod4 | Public cloud for customers (4 regions) | plusserver GmbH |
- prod1 [![Compliance Status](https://img.shields.io/github/actions/workflow/status/SovereignCloudStack/standards/check-pco-prod1-v4.yml?label=v4)](https://github.com/SovereignCloudStack/standards/actions/workflows/check-pco-prod1-v4.yml)
- prod2 [![Compliance Status](https://img.shields.io/github/actions/workflow/status/SovereignCloudStack/standards/check-pco-prod2-v4.yml?label=v4)](https://github.com/SovereignCloudStack/standards/actions/workflows/check-pco-prod2-v4.yml)
- prod3 [![Compliance Status](https://img.shields.io/github/actions/workflow/status/SovereignCloudStack/standards/check-pco-prod3-v4.yml?label=v4)](https://github.com/SovereignCloudStack/standards/actions/workflows/check-pco-prod3-v4.yml)
- prod4 [![Compliance Status](https://img.shields.io/github/actions/workflow/status/SovereignCloudStack/standards/check-pco-prod4-v4.yml?label=v4)](https://github.com/SovereignCloudStack/standards/actions/workflows/check-pco-prod4-v4.yml) |
[HM1](https://health.prod1.plusserver.sovereignit.cloud:3000/d/9ltTEmlnk/openstack-health-monitor2?orgId=1&var-mycloud=plus-pco)
[HM2](https://health.prod1.plusserver.sovereignit.cloud:3000/d/9ltTEmlnk/openstack-health-monitor2?orgId=1&var-mycloud=plus-prod2)
[HM3](https://health.prod1.plusserver.sovereignit.cloud:3000/d/9ltTEmlnk/openstack-health-monitor2?orgId=1&var-mycloud=plus-prod3)
[HM4](https://health.prod1.plusserver.sovereignit.cloud:3000/d/9ltTEmlnk/openstack-health-monitor2?orgId=1&var-mycloud=plus-prod4) |
-| [Wavestack](https://www.noris.de/wavestack-cloud/) | Public cloud for customers | noris network AG/Wavecon GmbH | [![Compliance Status](https://img.shields.io/github/actions/workflow/status/SovereignCloudStack/standards/check-wavestack-v4.yml?label=v4)](https://github.com/SovereignCloudStack/standards/actions/workflows/check-wavestack-v4.yml) | [HM](https://health.wavestack1.sovereignit.cloud:3000/) |
-| [REGIO.cloud](https://regio.digital) | Public cloud for customers | OSISM GmbH | [![Compliance Status](https://img.shields.io/github/actions/workflow/status/SovereignCloudStack/standards/check-regio-a-v4.yml?label=v4)](https://github.com/SovereignCloudStack/standards/actions/workflows/check-regio-a-v4.yml) | broken |
-| [CNDS](https://cnds.io/) | Public cloud for customers | artcodix GmbH | [![Compliance Status](https://img.shields.io/github/actions/workflow/status/SovereignCloudStack/standards/check-artcodix-v4.yml?label=v4)](https://github.com/SovereignCloudStack/standards/actions/workflows/check-artcodix-v4.yml) | [HM](https://ohm.muc.cloud.cnds.io/) |
-| [aov.cloud](https://www.aov.de/) | Community cloud for customers | aov IT.Services GmbH | (soon) | [HM](https://health.aov.cloud/) |
-| PoC WG-Cloud OSBA | Cloud PoC for FITKO (yaook-based) | Cloud&Heat Technologies GmbH | [![Compliance Status](https://img.shields.io/github/actions/workflow/status/SovereignCloudStack/standards/check-poc-wgcloud-v4.yml?label=v4)](https://github.com/SovereignCloudStack/standards/actions/workflows/check-poc-wgcloud-v4.yml) | [HM](https://health.poc-wgcloud.osba.sovereignit.cloud:3000/d/9ltTEmlnk/openstack-health-monitor2?var-mycloud=poc-wgcloud&orgId=1) |
-| PoC KDO | Cloud PoC for FITKO | KDO Service GmbH / OSISM GmbH | [![Compliance Status](https://img.shields.io/github/actions/workflow/status/SovereignCloudStack/standards/check-poc-kdo-v4.yml?label=v4)](https://github.com/SovereignCloudStack/standards/actions/workflows/check-poc-kdo-v4.yml) | (soon) |
-| [syseleven](https://www.syseleven.de/en/products-services/openstack-cloud/)
- dus2
- ham1 | Public OpenStack Cloud (2 SCS regions) | SysEleven GmbH |
- dus2 [![Compliance Status](https://img.shields.io/github/actions/workflow/status/SovereignCloudStack/standards/check-syseleven-dus2-v4.yml?label=v4)](https://github.com/SovereignCloudStack/standards/actions/workflows/check-syseleven-dus2-v4.yml)
- ham1 [![Compliance Status](https://img.shields.io/github/actions/workflow/status/SovereignCloudStack/standards/check-syseleven-ham1-v4.yml?label=v4)](https://github.com/SovereignCloudStack/standards/actions/workflows/check-syseleven-ham1-v4.yml) |
(soon)
(soon) |
+See [Compliant clouds overview](https://docs.scs.community/standards/certification/overview) on our docs page.
## SCS standards overview
diff --git a/Standards/scs-0100-v3-flavor-naming.md b/Standards/scs-0100-v3-flavor-naming.md
index 0c7fea124..587bde220 100644
--- a/Standards/scs-0100-v3-flavor-naming.md
+++ b/Standards/scs-0100-v3-flavor-naming.md
@@ -14,7 +14,7 @@ description: |
## Introduction
-This is the standard v3.1 for SCS Release 5.
+This is the standard v3.2 for SCS Release 8.
Note that we intend to only extend it (so it's always backwards compatible),
but try to avoid changing in incompatible ways.
(See at the end for the v1 to v2 transition where we have not met that
@@ -417,7 +417,7 @@ is more significant.
### [OPTIONAL] GPU support
-Format: `_`\[`G/g`\]X\[N\]\[`-`M\]\[`h`\]
+Format: `_`\[`G/g`\]X\[N\[`-`M\[`h`\]\[`-`V\[`h`\]\]\]\]
This extension provides more details on the specific GPU:
@@ -425,7 +425,9 @@ This extension provides more details on the specific GPU:
- vendor (X)
- generation (N)
- number (M) of processing units that are exposed (for pass-through) or assigned; see table below for vendor-specific terminology
-- high-performance indicator (`h`)
+- high-frequency indicator (`h`) for compute units
+- amount of video memory (V) in GiB
+- an indicator (`h`) for high-bandwidth memory
Note that the vendor letter X is mandatory, generation and processing units are optional.
@@ -440,13 +442,29 @@ for AMD GCN-x=0.x, RDNA1=1, C/RDNA2=2, C/RDNA3=3, C/RDNA3.5=3.5, C/RDNA4=4, ...
for Intel Gen9=0.9, Xe(12.1/DG1)=1, Xe(12.2)=2, Arc(12.7/DG2)=3 ...
(Note: This may need further work to properly reflect what's out there.)
-The optional `h` suffix to the compute unit count indicates high-performance (e.g. high freq or special
-high bandwidth gfx memory such as HBM);
-`h` can be duplicated for even higher performance.
+The optional `h` suffix to the compute unit count indicates high-frequency GPU compute units.
+Using it is not normally recommended except when there are several variants of cards within
+a generation of GPUs with a similar number of SMs/CUs/EUs.
+In case there are more than two variants, the letter `h` can be duplicated for even
+higher frequencies.
-Example: `SCS-16V-64-500s_GNa-14h`
-This flavor has a pass-through GPU nVidia Ampere with 14 SMs and either high-bandwidth memory or specially high frequencies.
-Looking through GPU specs you could guess it's 1/4 of an A30.
+Please note that there are GPUs from one generation and vendor that have vastly different sizes
+(or different fractions of them are passed to an instance with multi-instance GPUs). The number
+M allows differentiating between them and serves as an indicator of compute capability and
+parallelism. M cannot be compared between different generations, let alone different
+vendors.
+
+The amount of video memory dedicated to the instance can be indicated by V (in binary
+gigabytes, GiB). This number needs to be an integer - fractional memory sizes must be rounded
+down. An optional `h` can be used to indicate high-bandwidth memory (such as HBM2+) with
+bandwidths well above 1 GiB/s.
+
+Example: `SCS-16V-64-500s_GNa-14-6h`
+This flavor has a pass-through GPU nVidia Ampere with 14 SMs and 6 GiB of high-bandwidth video
+memory. Looking through GPU specs you could guess it's 1/4 of an A30.
+
+We have a table with common GPUs in the
+[implementation hints for this standard](scs-0100-w1-flavor-naming-implementation-testing.md).
### [OPTIONAL] Infiniband
@@ -490,14 +508,14 @@ an image is considered broken by the SCS team.
## Proposal Examples
-| Example | Decoding |
-| ------------------------- | ---------------------------------------------------------------------------------------------- |
-| SCS-2C-4-10n | 2 dedicated cores (x86-64), 4GiB RAM, 10GB network disk |
-| SCS-8Ti-32-50p_i1 | 8 dedicated hyperthreads (insecure), Skylake, 32GiB RAM, 50GB local NVMe |
-| SCS-1L-1u-5 | 1 vCPU (heavily oversubscribed), 1GiB Ram (no ECC), 5GB disk (unspecific) |
-| SCS-16T-64-200s_GNa-64_ib | 16 dedicated threads, 64GiB RAM, 200GB local SSD, Infiniband, 64 Passthrough nVidia Ampere SMs |
-| SCS-4C-16-2x200p_a1 | 4 dedicated Arm64 cores (A76 class), 16GiB RAM, 2x200GB local NVMe drives |
-| SCS-1V-0.5 | 1 vCPU, 0.5GiB RAM, no disk (boot from cinder volume) |
+| Example | Decoding |
+| ------------------------------ | ---------------------------------------------------------------------------------------------- |
+| `SCS-2C-4-10n` | 2 dedicated cores (x86-64), 4GiB RAM, 10GB network disk |
+| `SCS-8Ti-32-50p_i1` | 8 dedicated hyperthreads (insecure), Skylake, 32GiB RAM, 50GB local NVMe |
+| `SCS-1L-1u-5` | 1 vCPU (heavily oversubscribed), 1GiB Ram (no ECC), 5GB disk (unspecific) |
+| `SCS-16T-64-200s_GNa-72-24_ib` | 16 dedicated threads, 64GiB RAM, 200GB local SSD, Infiniband, 72 Passthrough nVidia Ampere SMs |
+| `SCS-4C-16-2x200p_a1` | 4 dedicated Arm64 cores (A76 class), 16GiB RAM, 2x200GB local NVMe drives |
+| `SCS-1V-0.5` | 1 vCPU, 0.5GiB RAM, no disk (boot from cinder volume) |
## Previous standard versions
diff --git a/Standards/scs-0100-w1-flavor-naming-implementation-testing.md b/Standards/scs-0100-w1-flavor-naming-implementation-testing.md
index 71756e07d..d9f5f62b2 100644
--- a/Standards/scs-0100-w1-flavor-naming-implementation-testing.md
+++ b/Standards/scs-0100-w1-flavor-naming-implementation-testing.md
@@ -32,7 +32,8 @@ See the [README](https://github.com/SovereignCloudStack/standards/tree/main/Test
for more details.
The functionality of this script is also (partially) exposed via the web page
-[https://flavors.scs.community/](https://flavors.scs.community/).
+[https://flavors.scs.community/](https://flavors.scs.community/), which can both
+parse SCS flavor names and generate them.
With the OpenStack tooling (`python3-openstackclient`, `OS_CLOUD`) in place, you can call
`cli.py -v parse v3 $(openstack flavor list -f value -c Name)` to get a report
@@ -45,6 +46,107 @@ will create a whole set of flavors in one go.
To that end, it provides different options: either the standard mandatory and
possibly recommended flavors can be created, or the user can set a file containing his flavors.
+### GPU table
+
+The most commonly used datacenter GPUs are listed here, showing what GPUs (or partitions
+of a GPU) result in what GPU part of the flavor name.
+
+#### Nvidia (`N`)
+
+We show the most popular recent generations here. Older ones are of course possible as well.
+
+##### Ampere (`a`)
+
+One Streaming Multiprocessor on Ampere has 64 (A30, A100) or 128 Cuda Cores (A10, A40).
+
+GPUs without MIG (one SM has 128 Cuda Cores and 4 Tensor Cores):
+
+| Nvidia GPU | Tensor C | Cuda Cores | SMs | VRAM | SCS name piece |
+|------------|----------|------------|-----|-----------|----------------|
+| A10 | 288 | 9216 | 72 | 24G GDDR6 | `GNa-72-24` |
+| A40 | 336 | 10752 | 84 | 48G GDDR6 | `GNa-84-48` |
+
+GPUs with Multi-Instance GPU (MIG) support, where GPUs can be partitioned and the partitions handed
+out as pass-through PCIe devices to instances. One SM corresponds to 64 Cuda Cores and
+4 Tensor Cores.
+
+| Nvidia GPU | Fraction | Tensor C | Cuda Cores | SMs | VRAM | SCS GPU name |
+|------------|----------|----------|------------|-----|-----------|----------------|
+| A30 | 1/1 | 224 | 3584 | 56 | 24G HBM2 | `GNa-56-24` |
+| A30 | 1/2 | 112 | 1792 | 28 | 12G HBM2 | `GNa-28-12` |
+| A30 | 1/4 | 56 | 896 | 14 | 6G HBM2 | `GNa-14-6` |
+| A30X | 1/1 | 224 | 3584 | 56 | 24G HBM2e | `GNa-56h-24h` |
+| A100 | 1/1 | 432 | 6912 | 108 | 80G HBM2e | `GNa-108h-80h` |
+| A100 | 1/2 | 216 | 3456 | 54 | 40G HBM2e | `GNa-54h-40h` |
+| A100 | 1/4 | 108 | 1728 | 27 | 20G HBM2e | `GNa-27h-20h` |
+| A100 | 1/7 | 60+ | 960+ | 15+| 10G HBM2e | `GNa-15h-10h`+ |
+| A100X | 1/1 | 432 | 6912 | 108 | 80G HBM2e | `GNa-108-80h` |
+
+[+] The precise numbers for the 1/7 MIG configurations are not known by the author of
+this document and need validation.
+
+##### Ada Lovelace (`l`)
+
+No MIG support, 128 Cuda Cores and 4 Tensor Cores per SM.
+
+| Nvidia GPU | Tensor C | Cuda Cores | SMs | VRAM | SCS name piece |
+|------------|----------|------------|-----|-----------|----------------|
+| L4 | 232 | 7424 | 58 | 24G GDDR6 | `GNl-58-24` |
+| L40 | 568 | 18176 | 142 | 48G GDDR6 | `GNl-142-48` |
+| L40G | 568 | 18176 | 142 | 48G GDDR6 | `GNl-142h-48` |
+| L40S | 568 | 18176 | 142 | 48G GDDR6 | `GNl-142hh-48` |
+
+##### Grace Hopper (`g`)
+
+These have MIG support and 128 Cuda Cores and 4 Tensor Cores per SM.
+
+| Nvidia GPU | Fraction | Tensor C | Cuda Cores | SMs | VRAM | SCS GPU name |
+|------------|----------|----------|------------|-----|------------|----------------|
+| H100 | 1/1 | 528 | 16896 | 132 | 80G HBM3 | `GNg-132-80h` |
+| H100 | 1/2 | 264 | 8448 | 66 | 40G HBM3 | `GNg-66-40h` |
+| H100 | 1/4 | 132 | 4224 | 33 | 20G HBM3 | `GNg-33-20h` |
+| H100 | 1/7 | 72+ | 2304+ | 18+| 10G HBM3 | `GNg-18-10h`+ |
+| H200 | 1/1 | 528 | 16896 | 132 | 141G HBM3e | `GNg-132-141h` |
+| H200 | 1/2 | 264 | 8448 | 66 | 70G HBM3e | `GNg-66-70h` |
+| ... |
+
+[+] The precise numbers for the 1/7 MIG configurations are not known by the author of
+this document and need validation.
+
+#### AMD Radeon (`A`)
+
+##### CDNA 2 (`2`)
+
+One CU contains 64 Stream Processors.
+
+| AMD Instinct| Stream Proc | CUs | VRAM | SCS name piece |
+|-------------|-------------|-----|------------|----------------|
+| Inst MI210 | 6656 | 104 | 64G HBM2e | `GA2-104-64h` |
+| Inst MI250 | 13312 | 208 | 128G HBM2e | `GA2-208-128h` |
+| Inst MI250X | 14080 | 220 | 128G HBM2e | `GA2-220-128h` |
+
+##### CDNA 3 (`3`)
+
+SRIOV partitioning is possible, resulting in pass-through for
+up to 8 partitions, somewhat similar to Nvidia MIG. 4 Tensor
+Cores and 64 Stream Processors per CU.
+
+| AMD GPU | Tensor C | Stream Proc | CUs | VRAM | SCS name piece |
+|-------------|----------|-------------|-----|------------|----------------|
+| Inst MI300X | 1216 | 19456 | 304 | 192G HBM3 | `GA3-304-192h` |
+| Inst MI325X | 1216 | 19456 | 304 | 288G HBM3 | `GA3-304-288h` |
+
+#### intel Xe (`I`)
+
+##### Xe-HPC (Ponte Vecchio) (`3`)
+
+1 EU corresponds to one Tensor Core and contains 128 Shading Units.
+
+| intel DC GPU | Tensor C | Shading U | EUs | VRAM | SCS name part |
+|--------------|----------|-----------|-----|------------|----------------|
+| Max 1100 | 56 | 7168 | 56 | 48G HBM2e | `GI3-56-48h` |
+| Max 1550 | 128 | 16384 | 128 | 128G HBM2e | `GI3-128-128h` |
+
## Automated tests
### Errors
diff --git a/Standards/scs-0111-v1-volume-type-decisions.md b/Standards/scs-0111-v1-volume-type-decisions.md
index 28fc32e8b..aaf3e522f 100644
--- a/Standards/scs-0111-v1-volume-type-decisions.md
+++ b/Standards/scs-0111-v1-volume-type-decisions.md
@@ -7,7 +7,7 @@ track: IaaS
## Introduction
-Volumes in OpenStack are virtual drives. They are managed by the storage service Cinder, which abstracts creation and usage of many different storage backends. While it is possible to use a backend like lvm which can reside on the same host as the hypervisor, the SCS wants to make a more clear differentiation between volumes and the ephemeral storage of a virtual machine. For all SCS deployments we want to assume that volumes are always residing in a storage backend that is NOT on the same host as a hypervisor - in short terms: Volumes are network storage. Ephemeral storage on the other hand is the only storage residing on a compute host. It is created by creating a VM directly from an Image and is automatically los as soon as the VM cease to exist. Volumes on the other hand have to be created from Images and only after that can be used for VMs. They are persistent and will remain in the last state a VM has written on them before they cease to exit. Being persistent and not relying on the host where the VM resides, Volumes can easily be attached to another VM in case of a node outage and VMs be migrated way more easily, because only metadata and data in RAM has to be shifted to another host, accelerating any migration or evacuation of a VM.
+Volumes in OpenStack are virtual drives. They are managed by the storage service Cinder, which abstracts creation and usage of many different storage backends. While it is possible to use a backend like lvm which can reside on the same host as the hypervisor, this decision record wants to make a clearer differentiation between volumes and the ephemeral storage of a virtual machine. For all SCS deployments we want to assume that volumes are always residing in a storage backend that is NOT on the same host as a hypervisor - in short terms: Volumes are network storage. Ephemeral storage on the other hand is the only storage residing on a compute host. It is created by creating a VM directly from an Image and is automatically lost as soon as the VM ceases to exist. Volumes on the other hand have to be created from Images and only after that can be used for VMs. They are persistent and will remain in the last state a VM has written on them before they cease to exist. Being persistent and not relying on the host where the VM resides, Volumes can easily be attached to another VM in case of a node outage and VMs can be migrated way more easily, because only metadata and data in RAM has to be shifted to another host, accelerating any migration or evacuation of a VM.
Volume Types are used to classify volumes and provide a basic decision for what kind of volume should be created. These volume types can sometimes very be backend-specific, and it might be hard for a user to choose the most suitable volume type, if there is more than one default type. Nevertheless, most of the configuration is done in the backends themselves, so volume types only work as a rough classification.
diff --git a/Standards/scs-0120-w1-Availability-Zones-Standard.md b/Standards/scs-0121-w1-Availability-Zones-Standard.md
similarity index 100%
rename from Standards/scs-0120-w1-Availability-Zones-Standard.md
rename to Standards/scs-0121-w1-Availability-Zones-Standard.md
diff --git a/Tests/iaas/flavor-naming/cli.py b/Tests/iaas/flavor-naming/cli.py
index 86969cbbb..796b6a733 100755
--- a/Tests/iaas/flavor-naming/cli.py
+++ b/Tests/iaas/flavor-naming/cli.py
@@ -72,7 +72,7 @@ def parse(cfg, version, name, output='none'):
if flavorname is None:
print(f"NOT an SCS flavor: {namestr}")
elif output == 'prose':
- printv(name, end=': ')
+ printv(namestr, end=': ')
print(f"{prettyname(flavorname)}")
elif output == 'yaml':
print(yaml.dump(flavorname_to_dict(flavorname), explicit_start=True))
diff --git a/Tests/iaas/flavor-naming/flavor-name-check.py b/Tests/iaas/flavor-naming/flavor-name-check.py
index 536372757..e5d395e54 100755
--- a/Tests/iaas/flavor-naming/flavor-name-check.py
+++ b/Tests/iaas/flavor-naming/flavor-name-check.py
@@ -86,6 +86,9 @@ def main(argv):
nm2 = _fnmck.outname(ret2)
if nm1 != nm2:
print(f"WARNING: {nm1} != {nm2}")
+ snm = _fnmck.outname(ret.shorten())
+ if snm != nm1:
+ print(f"Shortened name: {snm}")
argv = argv[1:]
scs = 1
diff --git a/Tests/iaas/flavor-naming/flavor_names.py b/Tests/iaas/flavor-naming/flavor_names.py
index f3d799060..10ca54da6 100644
--- a/Tests/iaas/flavor-naming/flavor_names.py
+++ b/Tests/iaas/flavor-naming/flavor_names.py
@@ -162,6 +162,9 @@ class Main:
raminsecure = BoolAttr("?no ECC", letter="u")
ramoversubscribed = BoolAttr("?RAM Over", letter="o")
+ def shorten(self):
+ return self
+
class Disk:
"""Class representing the disk part"""
@@ -171,6 +174,9 @@ class Disk:
disksize = OptIntAttr("#.GB Disk")
disktype = TblAttr("Disk type", {'': '(unspecified)', "n": "Networked", "h": "Local HDD", "s": "SSD", "p": "HiPerf NVMe"})
+ def shorten(self):
+ return self
+
class Hype:
"""Class repesenting Hypervisor"""
@@ -178,6 +184,9 @@ class Hype:
component_name = "hype"
hype = TblAttr(".Hypervisor", {"kvm": "KVM", "xen": "Xen", "hyv": "Hyper-V", "vmw": "VMware", "bms": "Bare Metal System"})
+ def shorten(self):
+ return None
+
class HWVirt:
"""Class repesenting support for hardware virtualization"""
@@ -185,6 +194,9 @@ class HWVirt:
component_name = "hwvirt"
hwvirt = BoolAttr("?HardwareVirt", letter="hwv")
+ def shorten(self):
+ return None
+
class CPUBrand:
"""Class repesenting CPU brand"""
@@ -206,13 +218,19 @@ def __init__(self, cpuvendor="i", cpugen=0, perf=""):
self.cpugen = cpugen
self.perf = perf
+ def shorten(self):
+ # For non-x86-64, don't strip out CPU brand for short name, as it contains the architecture
+ if self.cpuvendor in ('i', 'z'):
+ return None
+ return CPUBrand(self.cpuvendor)
+
class GPU:
"""Class repesenting GPU support"""
type = "GPU"
component_name = "gpu"
gputype = TblAttr("Type", {"g": "vGPU", "G": "Pass-Through GPU"})
- brand = TblAttr("Brand", {"N": "nVidia", "A": "AMD", "I": "Intel"})
+ brand = TblAttr("Brand", {"N": "Nvidia", "A": "AMD", "I": "Intel"})
gen = DepTblAttr("Gen", brand, {
"N": {'': '(unspecified)', "f": "Fermi", "k": "Kepler", "m": "Maxwell", "p": "Pascal",
"v": "Volta", "t": "Turing", "a": "Ampere", "l": "AdaLovelace", "g": "GraceHopper"},
@@ -222,7 +240,22 @@ class GPU:
"3": "Arc/Gen12.7/DG2"},
})
cu = OptIntAttr("#.N:SMs/A:CUs/I:EUs")
- perf = TblAttr("Performance", {"": "Std Perf", "h": "High Perf", "hh": "Very High Perf", "hhh": "Very Very High Perf"})
+ perf = TblAttr("Frequency", {"": "Std Freq", "h": "High Freq", "hh": "Very High Freq"})
+ vram = OptIntAttr("#.V:GiB VRAM")
+ vramperf = TblAttr("Bandwidth", {"": "Std BW (<~1GiB/s)", "h": "High BW", "hh": "Very High BW"})
+
+ def __init__(self, gputype="g", brand="N", gen='', cu=None, perf='', vram=None, vramperf=''):
+ self.gputype = gputype
+ self.brand = brand
+ self.gen = gen
+ self.cu = cu
+ self.perf = perf
+ self.vram = vram
+ self.vramperf = vramperf
+
+ def shorten(self):
+ # remove h modifiers
+ return GPU(gputype=self.gputype, brand=self.brand, gen=self.gen, cu=self.cu, vram=self.vram)
class IB:
@@ -231,6 +264,9 @@ class IB:
component_name = "ib"
ib = BoolAttr("?IB")
+ def shorten(self):
+ return self
+
class Flavorname:
"""A flavor name; merely a bunch of components"""
@@ -248,14 +284,15 @@ def __init__(
def shorten(self):
"""return canonically shortened name as recommended in the standard"""
- if self.hype is None and self.hwvirt is None and self.cpubrand is None:
- return self
- # For non-x86-64, don't strip out CPU brand for short name, as it contains the architecture
- if self.cpubrand and self.cpubrand.cpuvendor not in ('i', 'z'):
- return Flavorname(cpuram=self.cpuram, disk=self.disk,
- cpubrand=CPUBrand(self.cpubrand.cpuvendor),
- gpu=self.gpu, ib=self.ib)
- return Flavorname(cpuram=self.cpuram, disk=self.disk, gpu=self.gpu, ib=self.ib)
+ return Flavorname(
+ cpuram=self.cpuram and self.cpuram.shorten(),
+ disk=self.disk and self.disk.shorten(),
+ hype=self.hype and self.hype.shorten(),
+ hwvirt=self.hwvirt and self.hwvirt.shorten(),
+ cpubrand=self.cpubrand and self.cpubrand.shorten(),
+ gpu=self.gpu and self.gpu.shorten(),
+ ib=self.ib and self.ib.shorten(),
+ )
class Outputter:
@@ -278,7 +315,7 @@ class Outputter:
hype = "_%s"
hwvirt = "_%?"
cpubrand = "_%s%0%s"
- gpu = "_%s%s%s%-%s"
+ gpu = "_%s%s%s%-%s%-%s"
ib = "_%?"
def output_component(self, pattern, component, parts):
@@ -341,7 +378,7 @@ class SyntaxV1:
hwvirt = re.compile(r"\-(hwv)")
# cpubrand needs final lookahead assertion to exclude confusion with _ib extension
cpubrand = re.compile(r"\-([izar])([0-9]*)(h*)(?=$|\-)")
- gpu = re.compile(r"\-([gG])([NAI])([^:h]*)(?::([0-9]+)|)(h*)")
+ gpu = re.compile(r"\-([gG])([NAI])([^:h]*)(?::([0-9]+)|)(h*)(?::([0-9]+)|)(h*)")
ib = re.compile(r"\-(ib)")
@staticmethod
@@ -366,7 +403,7 @@ class SyntaxV2:
hwvirt = re.compile(r"_(hwv)")
# cpubrand needs final lookahead assertion to exclude confusion with _ib extension
cpubrand = re.compile(r"_([izar])([0-9]*)(h*)(?=$|_)")
- gpu = re.compile(r"_([gG])([NAI])([^\-h]*)(?:\-([0-9]+)|)(h*)")
+ gpu = re.compile(r"_([gG])([NAI])([^\-h]*)(?:\-([0-9]+)|)(h*)(?:\-([0-9]+)|)(h*)")
ib = re.compile(r"_(ib)")
@staticmethod
@@ -697,10 +734,14 @@ def prettyname(flavorname, prefix=""):
if flavorname.gpu:
stg += "and " + _tbl_out(flavorname.gpu, "gputype")
stg += _tbl_out(flavorname.gpu, "brand")
- stg += _tbl_out(flavorname.gpu, "perf", True)
stg += _tbl_out(flavorname.gpu, "gen", True)
if flavorname.gpu.cu is not None:
- stg += f"(w/ {flavorname.gpu.cu} SMs/CUs/EUs) "
+ stg += f"(w/ {flavorname.gpu.cu} {_tbl_out(flavorname.gpu, 'perf', True)}SMs/CUs/EUs"
+ # Can not specify VRAM without CUs
+ if flavorname.gpu.vram:
+ stg += f" and {flavorname.gpu.vram} GiB {_tbl_out(flavorname.gpu, 'vramperf', True)}VRAM) "
+ else:
+ stg += ") "
# IB
if flavorname.ib:
stg += "and Infiniband "
diff --git a/Tests/kaas/kaas-sonobuoy-tests/Makefile b/Tests/kaas/kaas-sonobuoy-tests/Makefile
index b7099cee7..dffc6d7a2 100644
--- a/Tests/kaas/kaas-sonobuoy-tests/Makefile
+++ b/Tests/kaas/kaas-sonobuoy-tests/Makefile
@@ -96,10 +96,19 @@ test-function:
@echo "only run tests for: $${TESTFUNCTION_CODE}"
DEVELOPMENT_MODE=createcluster go test -run=$${TESTFUNCTION_CODE} ./... || true
-
-lint:
- @echo "NOT YET IMPLEMENTED"
-
+lint: check-golangci-lint
+ @echo "[Running golangci-lint...]"
+ @golangci-lint run ./... -v || true
+
+GOLANGCI_LINT_VERSION ?= v1.61.0
+check-golangci-lint:
+ @if ! [ -x "$$(command -v golangci-lint)" ]; then \
+ echo "[golangci-lint not found, installing...]"; \
+ go install github.com/golangci/golangci-lint/cmd/golangci-lint@$(GOLANGCI_LINT_VERSION); \
+ echo "[golangci-lint installed]"; \
+ else \
+ echo "[golangci-lint is already installed]"; \
+ fi
dev-clean-result:
@rm -rf *.tar.gz || true
diff --git a/Tests/kaas/kaas-sonobuoy-tests/scs_k8s_conformance_tests/main_test.go b/Tests/kaas/kaas-sonobuoy-tests/scs_k8s_conformance_tests/main_test.go
index 8359b1b0f..98d305a5b 100644
--- a/Tests/kaas/kaas-sonobuoy-tests/scs_k8s_conformance_tests/main_test.go
+++ b/Tests/kaas/kaas-sonobuoy-tests/scs_k8s_conformance_tests/main_test.go
@@ -3,9 +3,10 @@ package scs_k8s_tests
import (
"context"
"fmt"
+ "log"
"os"
"testing"
- "log"
+
plugin_helper "github.com/vmware-tanzu/sonobuoy-plugins/plugin-helper"
v1 "k8s.io/api/core/v1"
"sigs.k8s.io/e2e-framework/pkg/env"
@@ -13,11 +14,16 @@ import (
"sigs.k8s.io/e2e-framework/pkg/envfuncs"
)
+// nsContextKey is the context key type used for per-test namespace names
+type nsContextKey string
+
+// contextKey is the context key type for values stored in the test environment context;
+// a dedicated type avoids collisions with plain string keys in context.WithValue
+type contextKey string
const (
ProgressReporterCtxKey = "SONOBUOY_PROGRESS_REPORTER"
- NamespacePrefixKey = "NS_PREFIX"
- DevelopmentModeKey = "DEVELOPMENT_MODE"
+ NamespacePrefixKey = "NS_PREFIX"
+ DevelopmentModeKey = "DEVELOPMENT_MODE"
)
var testenv env.Environment
@@ -33,82 +39,82 @@ func TestMain(m *testing.M) {
updateReporter := plugin_helper.NewProgressReporter(0)
developmentMode := os.Getenv(DevelopmentModeKey)
- log.Printf("Setup test enviornment for: %#v", developmentMode )
-
- switch KubernetesEnviornment := developmentMode; KubernetesEnviornment {
-
- case "createcluster":
- log.Println("Create kind cluster for test")
- testenv = env.New()
- kindClusterName := envconf.RandomName("gotestcluster", 16)
- //~ namespace := envconf.RandomName("testnamespace", 16)
-
- testenv.Setup(
- envfuncs.CreateKindCluster(kindClusterName),
- )
-
- testenv.Finish(
- //~ envfuncs.DeleteNamespace(namespace),
- envfuncs.DestroyKindCluster(kindClusterName),
- )
-
- case "usecluster":
- log.Println("Use existing k8s cluster for the test")
- log.Println("Not Yet Implemented")
- //~ testenv = env.NewFromFlags()
- //~ KubeConfig:= os.Getenv(KUBECONFIGFILE)
- //~ testenv = env.NewWithKubeConfig(KubeConfig)
-
- default:
- // Assume we are running in the cluster as a Sonobuoy plugin.
- log.Println("Running tests inside k8s cluster")
- testenv = env.NewInClusterConfig()
-
- testenv.Setup(func(ctx context.Context, config *envconf.Config) (context.Context, error) {
- // Try and create the client; doing it before all the tests allows the tests to assume
- // it can be created without error and they can just use config.Client().
- _,err:=config.NewClient()
- return context.WithValue(ctx,ProgressReporterCtxKey,updateReporter) ,err
- })
-
- testenv.Finish(
- func(ctx context.Context, cfg *envconf.Config) (context.Context, error) {
- log.Println("Finished go test suite")
- //~ if err := ???; err != nil{
- //~ return ctx, err
- //~ }
- return ctx, nil
- },
- )
-
- }
+ log.Printf("Setup test environment for: %#v", developmentMode)
+
+ switch KubernetesEnvironment := developmentMode; KubernetesEnvironment {
+
+ case "createcluster":
+ log.Println("Create kind cluster for test")
+ testenv = env.New()
+ kindClusterName := envconf.RandomName("gotestcluster", 16)
+ //~ namespace := envconf.RandomName("testnamespace", 16)
+
+ testenv.Setup(
+ envfuncs.CreateKindCluster(kindClusterName),
+ )
+
+ testenv.Finish(
+ //~ envfuncs.DeleteNamespace(namespace),
+ envfuncs.DestroyKindCluster(kindClusterName),
+ )
+
+ case "usecluster":
+ log.Println("Use existing k8s cluster for the test")
+ log.Println("Not Yet Implemented")
+ //~ testenv = env.NewFromFlags()
+ //~ KubeConfig:= os.Getenv(KUBECONFIGFILE)
+ //~ testenv = env.NewWithKubeConfig(KubeConfig)
+
+ default:
+ // Assume we are running in the cluster as a Sonobuoy plugin.
+ log.Println("Running tests inside k8s cluster")
+ testenv = env.NewInClusterConfig()
+
+ testenv.Setup(func(ctx context.Context, config *envconf.Config) (context.Context, error) {
+ // Try and create the client; doing it before all the tests allows the tests to assume
+ // it can be created without error and they can just use config.Client().
+ _, err := config.NewClient()
+ return context.WithValue(ctx, contextKey(ProgressReporterCtxKey), updateReporter), err
+ })
+
+ testenv.Finish(
+ func(ctx context.Context, cfg *envconf.Config) (context.Context, error) {
+ log.Println("Finished go test suite")
+ //~ if err := ???; err != nil{
+ //~ return ctx, err
+ //~ }
+ return ctx, nil
+ },
+ )
+
+ }
testenv.BeforeEachTest(func(ctx context.Context, cfg *envconf.Config, t *testing.T) (context.Context, error) {
- fmt.Println("BeforeEachTest")
+ fmt.Println("BeforeEachTest")
updateReporter.StartTest(t.Name())
return createNSForTest(ctx, cfg, t, runID)
})
testenv.AfterEachTest(func(ctx context.Context, cfg *envconf.Config, t *testing.T) (context.Context, error) {
- fmt.Println("AfterEachTest")
- updateReporter.StopTest(t.Name(),t.Failed(),t.Skipped(),nil)
+ fmt.Println("AfterEachTest")
+ updateReporter.StopTest(t.Name(), t.Failed(), t.Skipped(), nil)
return deleteNSForTest(ctx, cfg, t, runID)
})
/*
- testenv.BeforeEachFeature(func(ctx context.Context, config *envconf.Config, info features.Feature) (context.Context, error) {
- // Note that you can also add logic here for before a feature is tested. There may be
- // more than one feature in a test.
- fmt.Println("BeforeEachFeature")
- return ctx, nil
- })
-
- testenv.AfterEachFeature(func(ctx context.Context, config *envconf.Config, info features.Feature) (context.Context, error) {
- // Note that you can also add logic here for after a feature is tested. There may be
- // more than one feature in a test.
- fmt.Println("AfterEachFeature")
- return ctx, nil
- })
+ testenv.BeforeEachFeature(func(ctx context.Context, config *envconf.Config, info features.Feature) (context.Context, error) {
+ // Note that you can also add logic here for before a feature is tested. There may be
+ // more than one feature in a test.
+ fmt.Println("BeforeEachFeature")
+ return ctx, nil
+ })
+
+ testenv.AfterEachFeature(func(ctx context.Context, config *envconf.Config, info features.Feature) (context.Context, error) {
+ // Note that you can also add logic here for after a feature is tested. There may be
+ // more than one feature in a test.
+ fmt.Println("AfterEachFeature")
+ return ctx, nil
+ })
*/
os.Exit(testenv.Run(m))
@@ -136,6 +142,6 @@ func deleteNSForTest(ctx context.Context, cfg *envconf.Config, t *testing.T, run
return ctx, cfg.Client().Resources().Delete(ctx, &nsObj)
}
-func nsKey(t *testing.T) string {
- return "NS-for-%v" + t.Name()
+func nsKey(t *testing.T) nsContextKey {
+ return nsContextKey("NS-for-" + t.Name())
}
diff --git a/Tests/kaas/kaas-sonobuoy-tests/scs_k8s_conformance_tests/scs_0200_smoke_test.go b/Tests/kaas/kaas-sonobuoy-tests/scs_k8s_conformance_tests/scs_0200_smoke_test.go
index 83909b41a..62ec43e3d 100644
--- a/Tests/kaas/kaas-sonobuoy-tests/scs_k8s_conformance_tests/scs_0200_smoke_test.go
+++ b/Tests/kaas/kaas-sonobuoy-tests/scs_k8s_conformance_tests/scs_0200_smoke_test.go
@@ -1,16 +1,15 @@
package scs_k8s_tests
import (
+ "os"
"testing"
- "os"
)
-
func Test_scs_0200_smoke(t *testing.T) {
- // This test ensures that no DevelopmentMode was set
- // when using this test-suite productively
- developmentMode := os.Getenv(DevelopmentModeKey)
+ // This test ensures that DevelopmentMode is not set
+ // when this test suite is used productively
+ developmentMode := os.Getenv(DevelopmentModeKey)
if developmentMode != "" {
- t.Errorf("developmentMode is set to = %v; want None", developmentMode )
+ t.Errorf("developmentMode is set to %v; want it unset", developmentMode)
}
}
diff --git a/compliance-monitor/README.md b/compliance-monitor/README.md
index daf20d9cd..b504b6eba 100644
--- a/compliance-monitor/README.md
+++ b/compliance-monitor/README.md
@@ -227,9 +227,3 @@ Returns compliance details for given subject and scope.
### GET /{view_type}/scope/{scopeuuid}
Returns spec overview for the given scope.
-
-### GET /subjects
-
-Returns the list of subjects (together with activity status).
-
-### POST /subjects
diff --git a/compliance-monitor/bootstrap.yaml b/compliance-monitor/bootstrap.yaml
index 8928923f9..50b722703 100644
--- a/compliance-monitor/bootstrap.yaml
+++ b/compliance-monitor/bootstrap.yaml
@@ -63,52 +63,3 @@ accounts:
- public_key: "AAAAC3NzaC1lZDI1NTE5AAAAILufk4C7e0eQQIkmUDK8GB2IoiDjYtv6mx2eE8wZ3VWT"
public_key_type: "ssh-ed25519"
public_key_name: "primary"
-subjects:
- gx-scs:
- active: true
- name: gx-scs
- provider: plusserver GmbH
- artcodix:
- active: true
- name: CNDS
- provider: artcodix GmbH
- pco-prod1:
- active: true
- name: pluscloud open prod1
- provider: plusserver GmbH
- pco-prod2:
- active: true
- name: pluscloud open prod2
- provider: plusserver GmbH
- pco-prod3:
- active: true
- name: pluscloud open prod3
- provider: plusserver GmbH
- pco-prod4:
- active: true
- name: pluscloud open prod4
- provider: plusserver GmbH
- poc-kdo:
- active: true
- name: PoC KDO
- provider: KDO Service GmbH / OSISM GmbH
- poc-wgcloud:
- active: true
- name: PoC WG-Cloud OSBA
- provider: Cloud&Heat Technologies GmbH
- syseleven-dus2:
- active: true
- name: SysEleven dus2
- provider: SysEleven GmbH
- syseleven-ham1:
- active: true
- name: SysEleven ham1
- provider: SysEleven GmbH
- regio-a:
- active: true
- name: REGIO.cloud
- provider: OSISM GmbH
- wavestack:
- active: true
- name: Wavestack
- provider: noris network AG/Wavecon GmbH
diff --git a/compliance-monitor/monitor.py b/compliance-monitor/monitor.py
index 10deef50f..aa02cbae1 100755
--- a/compliance-monitor/monitor.py
+++ b/compliance-monitor/monitor.py
@@ -1,4 +1,17 @@
#!/usr/bin/env python3
+# AN IMPORTANT NOTE ON CONCURRENCY:
+# This server is based on uvicorn and, as such, is not multi-threaded.
+# (It could use multiple processes, but we don't do that yet.)
+# Consequently, we don't need to use any measures for thread-safety.
+# However, if we do at some point enable the use of multiple processes,
+# we should make sure that all processes are "on the same page" with regard
+# to basic data such as certificate scopes, templates, and accounts.
+# One way to achieve this synchronicity could be to use the Postgres server
+# more; however, I hope that more efficient ways are possible.
+# Also, it is quite likely that the signal SIGHUP could no longer be used
+# to trigger a reload. In any case, the `uvicorn.run` call would have to be
+# fundamentally changed:
+# > You must pass the application as an import string to enable 'reload' or 'workers'.
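+# A hypothetical multi-worker invocation could then look roughly like this (untested sketch):
+#   uvicorn.run("monitor:app", host="0.0.0.0", port=8080, workers=4)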
from collections import defaultdict
from datetime import date, datetime, timedelta
from enum import Enum
@@ -7,11 +20,9 @@
import os
import os.path
from shutil import which
+import signal
from subprocess import run
from tempfile import NamedTemporaryFile
-# _thread: low-level library, but (contrary to the name) not private
-# https://docs.python.org/3/library/_thread.html
-from _thread import allocate_lock, get_ident
from typing import Annotated, Optional
from fastapi import Depends, FastAPI, HTTPException, Request, Response, status
@@ -30,8 +41,7 @@
db_find_account, db_update_account, db_update_publickey, db_filter_publickeys, db_get_reports,
db_get_keys, db_insert_report, db_get_recent_results2, db_patch_approval2, db_get_report,
db_ensure_schema, db_get_apikeys, db_update_apikey, db_filter_apikeys, db_clear_delegates,
- db_patch_subject, db_get_subjects, db_insert_result2, db_get_relevant_results2, db_add_delegate,
- db_find_subjects,
+ db_find_subjects, db_insert_result2, db_get_relevant_results2, db_add_delegate,
)
@@ -117,10 +127,7 @@ class ViewType(Enum):
templates_map = {
k: None for k in REQUIRED_TEMPLATES
}
-# map thread id (cf. `get_ident`) to a dict that maps scope uuids to scope documents
-# -- access this using function `get_scopes`
-_scopes = defaultdict(dict) # thread-local storage (similar to threading.local, but more efficient)
-_scopes_lock = allocate_lock() # mutex lock so threads can add their local storage without races
+_scopes = {} # map scope uuid to `PrecomputedScope` instance
class TimestampEncoder(json.JSONEncoder):
@@ -216,8 +223,6 @@ def import_bootstrap(bootstrap_path, conn):
db_filter_apikeys(cur, accountid, lambda keyid, *_: keyid in keyids)
keyids = set(db_update_publickey(cur, accountid, key) for key in account.get("keys", ()))
db_filter_publickeys(cur, accountid, lambda keyid, *_: keyid in keyids)
- for subject, record in subjects.items():
- db_patch_subject(cur, {'subject': subject, **record})
conn.commit()
@@ -279,6 +284,10 @@ def evaluate(self, scope_results):
# always include draft (but only at the end)
relevant.extend(by_validity['draft'])
passed = [vname for vname in relevant if version_results[vname]['result'] == 1]
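+ # summary: 1 if the first passed version is effective/warn, -1 if it is deprecated/draft, 0 if no version passed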
+ if passed:
+ summary = 1 if self.versions[passed[0]].validity in ('effective', 'warn') else -1
+ else:
+ summary = 0
return {
'name': self.name,
'versions': version_results,
@@ -288,6 +297,7 @@ def evaluate(self, scope_results):
vname + ASTERISK_LOOKUP[self.versions[vname].validity]
for vname in passed
]),
+ 'summary': summary,
}
def update_lookup(self, target_dict):
@@ -326,19 +336,8 @@ def import_cert_yaml_dir(yaml_path, target_dict):
def get_scopes():
- """returns thread-local copy of the scopes dict"""
- ident = get_ident()
- with _scopes_lock:
- yaml_path = _scopes['_yaml_path']
- counter = _scopes['_counter']
- current = _scopes.get(ident)
- if current is None:
- _scopes[ident] = current = {'_counter': -1}
- if current['_counter'] != counter:
- current.clear()
- import_cert_yaml_dir(yaml_path, current)
- current['_counter'] = counter
- return current
+ """returns the scopes dict"""
+ return _scopes
def import_templates(template_dir, env, templates):
@@ -685,48 +684,17 @@ async def post_results(
conn.commit()
-@app.get("/subjects")
-async def get_subjects(
- request: Request,
- account: Annotated[tuple[str, str], Depends(auth)],
- conn: Annotated[connection, Depends(get_conn)],
- active: Optional[bool] = None, limit: int = 10, skip: int = 0,
-):
- """get subjects, potentially filtered by activity status"""
- check_role(account, roles=ROLES['read_any'])
- with conn.cursor() as cur:
- return db_get_subjects(cur, active, limit, skip)
+def pick_filter(results, subject, scope):
+ """Jinja filter to pick scope results from `results` for given `subject` and `scope`"""
+ return results.get(subject, {}).get(scope, {})
-@app.post("/subjects")
-async def post_subjects(
- request: Request,
- account: Annotated[tuple[str, str], Depends(auth)],
- conn: Annotated[connection, Depends(get_conn)],
-):
- """post approvals to this endpoint"""
- check_role(account, roles=ROLES['admin'])
- content_type = request.headers['content-type']
- if content_type not in ('application/json', ):
- raise HTTPException(status_code=500, detail="Unsupported content type")
- body = await request.body()
- document = json.loads(body.decode("utf-8"))
- records = [document] if isinstance(document, dict) else document
- with conn.cursor() as cur:
- for record in records:
- db_patch_subject(cur, record)
- conn.commit()
-
-
-def passed_filter(results, subject, scope):
- """Jinja filter to pick list of passed versions from `results` for given `subject` and `scope`"""
- subject_data = results.get(subject)
- if not subject_data:
- return ""
- scope_data = subject_data.get(scope)
- if not scope_data:
- return ""
- return scope_data['passed_str']
+def summary_filter(scope_results):
+ """Jinja filter to construct summary from `scope_results`"""
+ passed_str = scope_results.get('passed_str', '') or '–'
+ summary = scope_results.get('summary', 0)
+ color = {1: '✅'}.get(summary, '🛑') # instead of 🟢🔴 (hard to distinguish for color-blind folks)
+ return f'{color} {passed_str}'
def verdict_filter(value):
@@ -741,22 +709,31 @@ def verdict_check_filter(value):
return {1: '✔', -1: '✘'}.get(value, '⚠')
+def reload_static_config(*args, do_ensure_schema=False):
+ # allow arbitrary arguments so this can readily be used as a signal handler
+ logger.info("loading static config")
+ scopes = {}
+ import_cert_yaml_dir(settings.yaml_path, scopes)
+ # import successful: only NOW destructively update global _scopes
+ _scopes.clear()
+ _scopes.update(scopes)
+ import_templates(settings.template_path, env=env, templates=templates_map)
+ validate_templates(templates=templates_map)
+ with mk_conn(settings=settings) as conn:
+ if do_ensure_schema:
+ db_ensure_schema(conn)
+ import_bootstrap(settings.bootstrap_path, conn=conn)
+
+
if __name__ == "__main__":
logging.basicConfig(format='%(levelname)s: %(message)s', level=logging.INFO)
env.filters.update(
- passed=passed_filter,
+ pick=pick_filter,
+ summary=summary_filter,
verdict=verdict_filter,
verdict_check=verdict_check_filter,
markdown=markdown,
)
- with mk_conn(settings=settings) as conn:
- db_ensure_schema(conn)
- import_bootstrap(settings.bootstrap_path, conn=conn)
- _scopes.update({
- '_yaml_path': settings.yaml_path,
- '_counter': 0,
- })
- _ = get_scopes() # make sure they can be read
- import_templates(settings.template_path, env=env, templates=templates_map)
- validate_templates(templates=templates_map)
+ reload_static_config(do_ensure_schema=True)
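+ # static config (scopes, templates, bootstrap data) can be reloaded at runtime via SIGHUP, e.g. `kill -HUP <pid>`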
+ signal.signal(signal.SIGHUP, reload_static_config)
uvicorn.run(app, host='0.0.0.0', port=8080, log_level="info", workers=1)
diff --git a/compliance-monitor/sql.py b/compliance-monitor/sql.py
index e0a1160a9..b4c2549e0 100644
--- a/compliance-monitor/sql.py
+++ b/compliance-monitor/sql.py
@@ -7,7 +7,6 @@
# use ... (Ellipsis) here to indicate that no default value exists (will lead to error if no value is given)
ACCOUNT_DEFAULTS = {'subject': ..., 'api_key': ..., 'roles': ...}
PUBLIC_KEY_DEFAULTS = {'public_key': ..., 'public_key_type': ..., 'public_key_name': ...}
-SUBJECT_DEFAULTS = {'subject': ..., 'name': ..., 'provider': None, 'active': False}
class SchemaVersionError(Exception):
@@ -78,12 +77,6 @@ def db_ensure_schema_common(cur: cursor):
accountid integer NOT NULL REFERENCES account ON DELETE CASCADE ON UPDATE CASCADE,
UNIQUE (accountid, keyname)
);
- CREATE TABLE IF NOT EXISTS subject (
- subject text PRIMARY KEY,
- active boolean,
- name text,
- provider text
- );
CREATE TABLE IF NOT EXISTS report (
reportid SERIAL PRIMARY KEY,
reportuuid text UNIQUE,
@@ -409,29 +402,3 @@ def db_patch_approval2(cur: cursor, record):
RETURNING resultid;''', record)
resultid, = cur.fetchone()
return resultid
-
-
-def db_get_subjects(cur: cursor, active: bool, limit, skip):
- """list subjects"""
- columns = ('subject', 'active', 'name', 'provider')
- cur.execute(sql.SQL('''
- SELECT subject, active, name, provider
- FROM subject
- {where_clause}
- LIMIT %(limit)s OFFSET %(skip)s;''').format(
- where_clause=make_where_clause(
- None if active is None else sql.SQL('active = %(active)s'),
- ),
- ), {"limit": limit, "skip": skip, "active": active})
- return [{col: val for col, val in zip(columns, row)} for row in cur.fetchall()]
-
-
-def db_patch_subject(cur: cursor, record: dict):
- sanitized = sanitize_record(record, SUBJECT_DEFAULTS)
- cur.execute('''
- INSERT INTO subject (subject, active, name, provider)
- VALUES (%(subject)s, %(active)s, %(name)s, %(provider)s)
- ON CONFLICT (subject)
- DO UPDATE
- SET active = EXCLUDED.active, name = EXCLUDED.name, provider = EXCLUDED.provider
- ;''', sanitized)
diff --git a/compliance-monitor/templates/overview.md.j2 b/compliance-monitor/templates/overview.md.j2
index f37340479..36e3ced23 100644
--- a/compliance-monitor/templates/overview.md.j2
+++ b/compliance-monitor/templates/overview.md.j2
@@ -6,37 +6,37 @@ for the time being to have the highest degree of control
| Name | Description | Operator | [SCS-compatible IaaS](https://docs.scs.community/standards/scs-compatible-iaas/) | HealthMon |
|-------|--------------|-----------|----------------------|:----------:|
| [gx-scs](https://github.com/SovereignCloudStack/docs/blob/main/community/cloud-resources/plusserver-gx-scs.md) | Dev environment provided for SCS & GAIA-X context | plusserver GmbH |
-{#- #} [{{ results | passed('gx-scs', iaas) or '–' }}]({{ detail_url('gx-scs', iaas) }}) {# -#}
+{#- #} [{{ results | pick('gx-scs', iaas) | summary }}]({{ detail_url('gx-scs', iaas) }}) {# -#}
| [HM](https://health.gx-scs.sovereignit.cloud:3000/) |
| [aov.cloud](https://www.aov.de/) | Community cloud for customers | aov IT.Services GmbH |
-{#- #} [{{ results | passed('aov', iaas) or '–' }}]({{ detail_url('aov', iaas) }}) {# -#}
+{#- #} [{{ results | pick('aov', iaas) | summary }}]({{ detail_url('aov', iaas) }}) {# -#}
| [HM](https://health.aov.cloud/) |
| [CNDS](https://cnds.io/) | Public cloud for customers | artcodix GmbH |
-{#- #} [{{ results | passed('artcodix', iaas) or '–' }}]({{ detail_url('artcodix', iaas) }}) {# -#}
+{#- #} [{{ results | pick('artcodix', iaas) | summary }}]({{ detail_url('artcodix', iaas) }}) {# -#}
| [HM](https://ohm.muc.cloud.cnds.io/) |
| [pluscloud open](https://www.plusserver.com/en/products/pluscloud-open)
(4 regions) | Public cloud for customers | plusserver GmbH | {# #}
-{#- #}prod1: [{{ results | passed('pco-prod1', iaas) or '–' }}]({{ detail_url('pco-prod1', iaas) }}){# -#}
+{#- #}prod1: [{{ results | pick('pco-prod1', iaas) | summary }}]({{ detail_url('pco-prod1', iaas) }}){# -#}
-{#- #}prod2: [{{ results | passed('pco-prod2', iaas) or '–' }}]({{ detail_url('pco-prod2', iaas) }}){# -#}
+{#- #}prod2: [{{ results | pick('pco-prod2', iaas) | summary }}]({{ detail_url('pco-prod2', iaas) }}){# -#}
-{#- #}prod3: [{{ results | passed('pco-prod3', iaas) or '–' }}]({{ detail_url('pco-prod3', iaas) }}){# -#}
+{#- #}prod3: [{{ results | pick('pco-prod3', iaas) | summary }}]({{ detail_url('pco-prod3', iaas) }}){# -#}
-{#- #}prod4: [{{ results | passed('pco-prod4', iaas) or '–' }}]({{ detail_url('pco-prod4', iaas) }}) {# -#}
+{#- #}prod4: [{{ results | pick('pco-prod4', iaas) | summary }}]({{ detail_url('pco-prod4', iaas) }}) {# -#}
| [HM1](https://health.prod1.plusserver.sovereignit.cloud:3000/d/9ltTEmlnk/openstack-health-monitor2?orgId=1&var-mycloud=plus-pco)
[HM2](https://health.prod1.plusserver.sovereignit.cloud:3000/d/9ltTEmlnk/openstack-health-monitor2?orgId=1&var-mycloud=plus-prod2)
[HM3](https://health.prod1.plusserver.sovereignit.cloud:3000/d/9ltTEmlnk/openstack-health-monitor2?orgId=1&var-mycloud=plus-prod3)
[HM4](https://health.prod1.plusserver.sovereignit.cloud:3000/d/9ltTEmlnk/openstack-health-monitor2?orgId=1&var-mycloud=plus-prod4) |
| PoC KDO | Cloud PoC for FITKO | KDO Service GmbH / OSISM GmbH |
-{#- #} [{{ results | passed('poc-kdo', iaas) or '–' }}]({{ detail_url('poc-kdo', iaas) }}) {# -#}
+{#- #} [{{ results | pick('poc-kdo', iaas) | summary }}]({{ detail_url('poc-kdo', iaas) }}) {# -#}
| (soon) |
| PoC WG-Cloud OSBA | Cloud PoC for FITKO | Cloud&Heat Technologies GmbH |
-{#- #} [{{ results | passed('poc-wgcloud', iaas) or '–' }}]({{ detail_url('poc-wgcloud', iaas) }}) {# -#}
+{#- #} [{{ results | pick('poc-wgcloud', iaas) | summary }}]({{ detail_url('poc-wgcloud', iaas) }}) {# -#}
| [HM](https://health.poc-wgcloud.osba.sovereignit.cloud:3000/d/9ltTEmlnk/openstack-health-monitor2?var-mycloud=poc-wgcloud&orgId=1) |
| [REGIO.cloud](https://regio.digital) | Public cloud for customers | OSISM GmbH |
-{#- #} [{{ results | passed('regio-a', iaas) or '–' }}]({{ detail_url('regio-a', iaas) }}) {# -#}
+{#- #} [{{ results | pick('regio-a', iaas) | summary }}]({{ detail_url('regio-a', iaas) }}) {# -#}
| [HM](https://apimon.services.regio.digital/public-dashboards/17cf094a47404398a5b8e35a4a3968d4?orgId=1&refresh=5m) |
| [syseleven](https://www.syseleven.de/en/products-services/openstack-cloud/)
(2 SCS regions) | Public OpenStack Cloud | SysEleven GmbH | {# #}
-{#- #}dus2: [{{ results | passed('syseleven-dus2', iaas) or '–' }}]({{ detail_url('syseleven-dus2', iaas) }}){# -#}
+{#- #}dus2: [{{ results | pick('syseleven-dus2', iaas) | summary }}]({{ detail_url('syseleven-dus2', iaas) }}){# -#}
-{#- #}ham1: [{{ results | passed('syseleven-ham1', iaas) or '–' }}]({{ detail_url('syseleven-ham1', iaas) }}) {# -#}
+{#- #}ham1: [{{ results | pick('syseleven-ham1', iaas) | summary }}]({{ detail_url('syseleven-ham1', iaas) }}) {# -#}
| (soon)
(soon) |
| [Wavestack](https://www.noris.de/wavestack-cloud/) | Public cloud for customers | noris network AG/Wavecon GmbH |
-{#- #} [{{ results | passed('wavestack', iaas) or '–' }}]({{ detail_url('wavestack', iaas) }}) {# -#}
+{#- #} [{{ results | pick('wavestack', iaas) | summary }}]({{ detail_url('wavestack', iaas) }}) {# -#}
| [HM](https://health.wavestack1.sovereignit.cloud:3000/) |