The Hybrid Cloud Operator for Object Storage is a Kubernetes operator designed for hybrid-cloud, multi-team Kubernetes platforms. It allows teams to deploy and manage their own object storage via Kubernetes, without cloud-provider-specific provisioning.
In classical cloud environments object storage is typically managed by a central platform team via infrastructure automation such as Terraform. When many teams are active on such a platform this creates a bottleneck: the central platform team must handle every request for object storage buckets and accounts. With this operator, teams can manage buckets on their own. And because the operator integrates with the Kubernetes API, teams get the same unified interface for all their deployments: Kubernetes YAML.
Additionally the operator provides a consistent interface regardless of the environment (cloud provider, on-premises) the Kubernetes cluster runs in. In use cases where teams have to deploy to clusters running in different environments, they get the same interface on all clusters and do not have to concern themselves with the differences.
Main features:
- Provides Kubernetes custom resources for deploying and managing object storage buckets
- Abstracted, unified API regardless of target environment (cloud, on-premise)
- Currently supported backends:
- Azure Storage Account blob services
Planned features:
- Support for AWS S3 backend
- Support for on-prem environments using MinIO
- New CRD to manage access to the buckets (other services/users can request read/write access to a bucket via separate credentials)
To test out the operator you currently need an Azure account with a service principal and a Kubernetes cluster (you can for example create a local one with k3d) on which you have cluster-admin permissions.
- Run `helm repo add hybrid-cloud-object-storage-operator https://maibornwolff.github.io/hybrid-cloud-object-storage-operator/` to prepare the helm repository.
- Prepare a secret with the Azure credentials as `operator-creds.yaml` and apply it to the cluster (e.g. `kubectl apply -f operator-creds.yaml`):

  ```yaml
  apiVersion: v1
  kind: Secret
  type: Opaque
  metadata:
    name: hybrid-cloud-object-storage-operator-creds
    namespace: default
  stringData:
    AZURE_SUBSCRIPTION_ID: "<your-azure-subscription-id>"
    AZURE_TENANT_ID: "<your-azure-tenant-id>"
    AZURE_CLIENT_ID: "<service-principal-client-id>"
    AZURE_CLIENT_SECRET: "<service-principal-password>"
  ```

- Prepare the operator config as `operator-values.yaml`:

  ```yaml
  serviceAccount:
    create: true
  envSecret: hybrid-cloud-object-storage-operator-creds
  operatorConfig: |
    handler_on_resume: false
    backend: azureblob
    backends:
      azureblob:
        subscription_id: <your-azure-subscription-id>
        location: westeurope
        name_pattern: "<your-prefix>{name}"
        resource_group: <your-azure-resource-group>
        kind: StorageV2
        access_tier: Hot
        sku:
          name: "Standard_LRS"
        allow_anonymous_access: true
        parameters:
          network:
            public_access: true
  ```

- Run `helm install hybrid-cloud-object-storage-operator-crds hybrid-cloud-object-storage-operator/hybrid-cloud-object-storage-operator-crds` to install the CRDs for the operator.
- Run `helm install hybrid-cloud-object-storage-operator hybrid-cloud-object-storage-operator/hybrid-cloud-object-storage-operator -f operator-values.yaml` to install the operator.
- Check that the pod of the operator is running and healthy: `kubectl get pods -l app.kubernetes.io/name=hybrid-cloud-object-storage-operator`.
- Create your first bucket: `kubectl apply -f examples/azureblob.yaml`.
- Check in Azure that the new storage account has been created.
- Retrieve the credentials for the storage account: `kubectl get secret demoteam-storage-credentials -o jsonpath="{.data.key}" | base64 -d`
- After you are finished, delete the bucket again: `kubectl delete -f examples/azureblob.yaml`
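The repository ships the example manifest; a minimal bucket along the same lines could look like the sketch below (names are illustrative, chosen to match the credentials secret used above; see the custom resource reference for all options):

```yaml
apiVersion: hybridcloud.maibornwolff.de/v1alpha1
kind: ObjectStorageBucket
metadata:
  name: demoteam-storage        # illustrative bucket name
  namespace: default
spec:
  containers:
    - name: assets              # for Azure at least one container is required
  credentialsSecret: demoteam-storage-credentials  # secret the operator will create
```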
To achieve its hybrid-cloud feature the operator abstracts between the generic API (the `ObjectStorageBucket` custom resource) and the concrete implementation for a specific cloud service. The concrete implementations are called backends. You can configure which backends should be active in the configuration. If several backends are active the user can also select one.
The operator can be configured using a YAML-based config file. This is the complete configuration file with all options. Please refer to the comments on each line for explanations:

```yaml
handler_on_resume: false  # If set to true the operator will reconcile every available resource on restart even if there were no changes
backend: azureblob  # Default backend to use, required
allowed_backends: []  # List of backends the users can select from. If the list is empty the default backend is always used regardless of whether the user selects a backend
backends:  # Configuration for the different backends. Required fields are only required if the backend is used
  azureblob:
    subscription_id: 1-2-3-4-5  # Azure subscription id to provision storage accounts in, required
    location: westeurope  # Location to provision storage accounts in, required
    name_pattern: "{namespace}{name}"  # Pattern to use for naming storage accounts in Azure. The variables {namespace} and {name} can be used and will be replaced by metadata.namespace and metadata.name of the custom object, required
    resource_group: foobar-rg  # Resource group to provision the storage accounts in, required
    delete_fake: false  # If enabled, on delete the storage account will not actually be deleted but only tagged, optional
    lock_from_deletion: false  # If enabled an Azure lock will be set on the storage account, requires owner permissions for the operator, optional
    tags: {}  # Extra tags to add to the storage account resource in Azure. The variables {namespace} and {name} can be used, optional
    kind: StorageV2  # Kind to use for the storage accounts, optional
    access_tier: Hot  # Access tier for the storage accounts, can be Hot or Cool, optional
    hns_enabled: false  # Enable hierarchical namespace (only during bucket creation), optional
    sku:
      name: "Standard_LRS"  # Name of the SKU to use for the storage accounts. If this is set, the settings of classes are ignored
    classes:  # List of size classes the user can select from, optional
      lrs:  # Name of the class, required
        name: "Standard_LRS"  # Name of the SKU in Azure, required
        hns_enabled: false  # Enable hierarchical namespace (only during bucket creation), optional
      grs:
        name: "Standard_GRS"
    default_class: lrs  # Name of the class to use as default if the user-provided one is invalid or not available, required if classes should be usable
    allow_anonymous_access: false  # If set to true users can configure their storage accounts to allow anonymous access to blobs
    network:
      allow_azure_services: true  # If enabled a firewall rule will be added so that Azure services can access the storage account, optional
      vnets:  # List of vnets the storage account should allow access from. Each vnet listed here must have Microsoft.Storage added to the ServiceEndpoints collection of the subnet, optional
        - vnet: foobar-vnet  # Name of the virtual network, required
          subnet: default  # Name of the subnet, required
    backup:  # Configuration for use of Azure Backup services. vault_name and either policy_name or default_class with classes are mandatory if you want to use Azure Backup
      vault_name: foobar-vault  # Name of the existing backup vault. Make sure the storage account has the role assignment "Storage Account Backup Contributor" for the according vault
      policy_name: 123123123  # The policy within the backup vault to use
      default_class: 60d  # The name of a class defined in classes
      classes:  # List of classes the user can select from
        60d:  # Name of the class
          name: backup-policy-60d  # Name of the policy in the backup vault
        90d:
          name: backup-policy-90d
    parameters:  # Fields here define defaults for parameters that also exist in the CRD and are used if the parameter is not set in the custom object supplied by the user
      network:
        public_access: false  # If set to true no network restrictions are placed on the storage account, if set to false access is only possible through vnet and firewall rules, optional
        firewall_rules:  # List of firewall rules to add to the storage account. Only take effect if public_access is set to false, optional
          - name: foobar  # Name of the rule, required
            cidr: 10.1.2.0/24  # CIDR (with a suffix < 30) or single IP, required
      versioning:
        enabled: false  # If set to true data versioning will be enabled on the storage account
      delete_retention:
        enabled: false  # If set to true retention of deleted data will be enabled, optional
        days: 2  # Number of days to keep deleted data, optional
      sftp:  # The SFTP feature can only be enabled at creation time of the storage account. Background: the hierarchical namespace setting is needed for SFTP and will be set implicitly, but it can only be set at creation time.
        enabled: false  # Enable the SFTP interface, optional
      backup:
        enabled: false  # If enabled, the storage accounts will be added to an existing backup vault by default. Backup instances are not cleaned up together with the object storage buckets, for recovery purposes
```
Single configuration options can also be provided via environment variables: the complete path is concatenated using underscores, written in uppercase and prefixed with `HYBRIDCLOUD_`. As an example: `backends.azureblob.subscription_id` becomes `HYBRIDCLOUD_BACKENDS_AZUREBLOB_SUBSCRIPTION_ID`.
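The mapping rule can be sketched in a couple of lines of Python (a hypothetical helper for illustration, not part of the operator's code):

```python
def config_path_to_env(path: str) -> str:
    """Map a dotted config path to the operator's environment variable form:
    dots become underscores, everything is uppercased and prefixed."""
    return "HYBRIDCLOUD_" + path.replace(".", "_").upper()

# Example from the docs:
# config_path_to_env("backends.azureblob.subscription_id")
#   -> "HYBRIDCLOUD_BACKENDS_AZUREBLOB_SUBSCRIPTION_ID"
```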
To protect storage accounts against accidental deletion you can enable `lock_from_deletion` in the azureblob backend. When enabled the operator will create a delete lock on the resource in Azure. Note that the operator will not remove that lock when the object in Kubernetes is deleted; you have to do that yourself via either the Azure CLI or the Azure Portal so the operator can delete the storage account. If that is not done, the Kubernetes object cannot be deleted and any `kubectl delete` call will hang until the lock is manually removed.
The azure backend also supports a feature called fake deletion (via the option `delete_fake`) where the storage account is not actually deleted but only tagged as deleted when the Kubernetes custom object is deleted. This can be used in situations where the operator is freshly introduced in an environment where the users have little experience with this type of declarative management and you want to reduce the risk of accidental data loss.
For the azureblob backend there are several ways to protect the storage accounts from external access. One is on the network layer, by disabling network access to the accounts from outside the cluster (via `parameters.network.public_access`, `parameters.network.firewall_rules` and `network.vnets`). The other is on the access layer, by disallowing anonymous access (via `allow_anonymous_access`; this only gives users the right to configure anonymous access, so unless a user specifically does that only authenticated access is possible).
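On the bucket side this maps to the `network` section of the custom resource; a sketch of a locked-down manifest fragment (rule name and CIDR are made up for illustration):

```yaml
spec:
  network:
    publicAccess: false       # only vnets and the firewall rules below may reach the account
    firewallRules:
      - name: office          # hypothetical rule name
        cidr: 203.0.113.0/24  # hypothetical external range that should keep access
```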
The azureblob backend supports backups using Azure Backup vaults. To enable Azure Backup, first set the field `backup.vault_name` (the existing backup vault to use) and either `backup.policy_name` (the existing policy to use) or `backup.default_class` together with `backup.classes` (the existing policies to use). Then you can either enable backups by default using the field `parameters.backup.enabled` or configure backup per manifest using the field `backup.enabled`. Note: the configuration in the manifest overrides the global operator configuration.
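Putting those fields together, a backend configuration that turns backups on by default might look like this (vault and policy names are placeholders):

```yaml
backends:
  azureblob:
    backup:
      vault_name: foobar-vault      # existing backup vault
      default_class: 60d
      classes:
        60d:
          name: backup-policy-60d   # existing policy in the vault
    parameters:
      backup:
        enabled: true               # back up all buckets unless a manifest opts out
```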
For the operator to interact with Azure it needs credentials. For local testing it can pick up the token from the Azure CLI, but for real deployments it needs a dedicated service principal. Supply the credentials for the service principal using the environment variables `AZURE_SUBSCRIPTION_ID`, `AZURE_TENANT_ID`, `AZURE_CLIENT_ID` and `AZURE_CLIENT_SECRET` (if you deploy via the helm chart use the `envSecret` value). Depending on the backend the operator requires the following Azure permissions within the scope of the resource group it deploys to:

- `Microsoft.Storage/*`
- `Microsoft.Authorization/locks/*`, optional, if you want the operator to set delete locks
Instead of using a service principal the operator can also be used with a managed identity using Workload Identity. To do so:

- Make sure your AKS cluster is configured with the workload identity feature.
- Create the managed identity and assign it the needed permissions (e.g. the Contributor role).
- Establish the federated identity (as described in the linked guide); as subject use `system:serviceaccount:default:hybrid-cloud-object-storage-operator` (assuming you install the operator into the `default` namespace and do not change the name of the serviceaccount).
- Configure the operator to use the workload identity. Add the following to the values:

  ```yaml
  podLabels:
    azure.workload.identity/use: "true"
  serviceAccount:
    annotations:
      azure.workload.identity/client-id: <client-id-of-managed-identity>
  ```

- You do not need to provide a secret with credentials as described for the service principal.
The operator can be deployed via helm chart:

- Run `helm repo add hybrid-cloud-object-storage-operator https://maibornwolff.github.io/hybrid-cloud-object-storage-operator/` to prepare the helm repository.
- Run `helm install hybrid-cloud-object-storage-operator-crds hybrid-cloud-object-storage-operator/hybrid-cloud-object-storage-operator-crds` to install the CRDs for the operator.
- Run `helm install hybrid-cloud-object-storage-operator hybrid-cloud-object-storage-operator/hybrid-cloud-object-storage-operator -f values.yaml` to install the operator.
Configuration of the operator is done via helm values. For a full list of the available values see the values.yaml in the chart. These are the important ones:

- `operatorConfig`: overwrite this with your specific operator config
- `envSecret`: name of a secret with sensitive credentials (e.g. Azure service principal credentials)
- `serviceAccount.create`: either set this to `true` or create the serviceaccount with appropriate permissions yourself and set `serviceAccount.name` to its name
The operator is completely controlled via Kubernetes custom resources (`ObjectStorageBucket`). Once a bucket object is created the operator will utilize one of its backends to provision an actual object storage bucket.

The `ObjectStorageBucket` resource has the following options:
```yaml
apiVersion: hybridcloud.maibornwolff.de/v1alpha1
kind: ObjectStorageBucket
metadata:
  name: teamfoo  # Name of the bucket, based on this a name in the backend will be generated, required
  namespace: default  # Kubernetes namespace, required
spec:
  backend: azureblob  # Name of the backend to use, optional, should be left empty unless provided by the admin
  interface: azureblob  # Interface to use for the bucket, defaults to the native interface of the backend (Azure Storage API for azureblob), optional
  network:  # Network-related features, optional
    publicAccess: false  # If set to false access to the bucket is only possible from inside the cluster and the network ranges specified under firewallRules, optional
    firewallRules:  # If the backend supports it, a list of firewall rules to configure access from outside the cluster, optional
      - name: foobar  # Name of the rule, required
        cidr: 10.1.2.0/24  # CIDR (with a suffix < 30) or single IP, required
  security:
    anonymousAccess: false  # If set to true anonymous access can be enabled for containers, optional
    cors:  # A list of CORS rules to configure for the bucket, relevant if the bucket is used as a sort of fileserver, optional
      - name: foobar  # Name of the CORS rule, required
        allowedOrigins:  # List of origins to allow, required
          - https://my.origin.site
        allowedMethods:  # List of HTTP methods to allow (must be one or more of ["DELETE", "GET", "HEAD", "MERGE", "OPTIONS", "PATCH", "POST", "PUT"]), required
          - GET
        exposedHeaders:  # List of HTTP response headers to expose to the client, wildcard "*" is allowed, required
          - "*"
        allowedHeaders:  # List of HTTP headers to allow in the request, wildcard "*" is allowed, required
          - "*"
        maxAgeInSeconds: 200  # Time in seconds the client should cache preflight OPTIONS responses, required
  dataRetention:  # Settings related to data retention, optional; will not be respected if backup.enabled = true
    versioning:  # Settings related to versioning, optional
      enabled: false  # Enable versioning in the storage account, optional
    deleteRetention:  # Settings related to delete retention, optional
      enabled: false  # Enable retention on delete, optional
      retentionPeriodInDays: 1  # Days to keep deleted data, optional
  backup:
    enabled: false  # Override the default backup strategy configured in the global operator config
  size:
    class: lrs  # Resource class to use, available classes are specified by the operator admin. If sku is specified in the operator config, this setting is ignored. Optional
  lifecycle:  # Define rules which determine the lifecycle of blob resources, optional
    rules:
      - name: foobar-rule  # Lifecycle rule name, optional
        blobPrefix: foobar  # Prefix of blob resources to apply the rule to, required
        deleteDaysAfterModification: 30  # Delete blob resources this many days after their last modification, required
  containers:  # Only relevant for Azure, list of containers to create in the bucket, for Azure at least one is required. Containers not on the list will be removed from the storage account, including their data
    - name: assets  # Name of the container, required
      anonymousAccess: false  # If set to true objects in the container can be accessed without authentication/authorization, only relevant if security.anonymousAccess is set to true, optional
  sftp:  # The SFTP feature can only be enabled at creation time of the storage account. Background: the hierarchical namespace setting is needed for SFTP and will be set implicitly, but it can only be set at creation time.
    enabled: true  # Enable the SFTP interface, required
    users:  # Users that can access the bucket via the SFTP protocol
      - username: techuser  # Username, required
        access:  # Defines which resources can be accessed by the user. Currently only blob resources are supported.
          - container: assets  # Name of the container
            permissions:  # List of the operations the user can do. Possible values are READ, WRITE, DELETE, LIST, CREATE
              - READ
              - LIST
        sshKeys:  # Public key authentication is supported, required
          - description: just a sample description  # Key description
            publicKey:  # Public key, required
  credentialsSecret: teamfoo-storage-credentials  # Name of a secret where the credentials for the bucket should be stored, required
```
Depending on the operator configuration, buckets are by default protected from external or unauthenticated (anonymous) access. If anonymous access is configured, CORS must very often also be configured. For more details on CORS see the AWS S3 CORS docs or the Azure Storage CORS docs.
A service/application that wants to access the bucket should depend on the credentials secret and use its values for the connection. That way it is independent of the actual backend. The keys provided in the secret depend on the interface. For azure blob the fields are: `interface` (set to `azureblob`), `endpoint`, `key`, `connection_string` (connection string with key and endpoint as accepted by the Azure storage libraries).
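For example, a deployment could wire the connection string from the secret into its pods like this (a sketch; the secret name matches the `credentialsSecret` from the resource example above, the image and env var name are up to the application):

```yaml
# Fragment of a Deployment pod template
containers:
  - name: app
    image: my-app:latest  # placeholder image
    env:
      - name: STORAGE_CONNECTION_STRING  # name chosen by the application
        valueFrom:
          secretKeyRef:
            name: teamfoo-storage-credentials  # secret created by the operator
            key: connection_string
```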
If needed, the access credentials for the bucket can be reset/rotated. Add a label `operator/action` with the value `rotate-keys` to the bucket object in Kubernetes. The operator will pick up the label, rotate or regenerate the credentials, update the credentials secret and remove the label from the object to signal completion. You are responsible for restarting any applications/pods using the credentials.
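The trigger is just a label on the bucket object, set e.g. with `kubectl label` or by adding it to the manifest (bucket name taken from the resource example above):

```yaml
metadata:
  name: teamfoo
  labels:
    operator/action: rotate-keys  # removed by the operator once rotation is done
```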
The operator is implemented in Python using the Kopf framework.
To run it locally follow these steps:

- Create and activate a local Python virtualenv
- Install dependencies: `pip install -r requirements.txt`
- Set up a local Kubernetes cluster, e.g. with k3d: `k3d cluster create`
- Apply the CRDs in your local cluster: `kubectl apply -f helm/hybrid-cloud-object-storage-operator-crds/templates/`
- If you want to deploy to Azure: either have the Azure CLI installed and configured with an active login or export the following environment variables: `AZURE_TENANT_ID`, `AZURE_CLIENT_ID`, `AZURE_CLIENT_SECRET`
- Adapt the `config.yaml` to suit your needs
- Run `kopf run main.py -A`
- In another window apply some objects to the cluster to trigger the operator (see the `examples` folder)
The code is structured in the following packages:

- `handlers`: implements the operator interface for the provided custom resources, reacts to create/update/delete events in handler functions
- `backends`: backends for the different environments (currently Azure)
- `util`: helper and utility functions
- Kopf marks every object it manages with a finalizer. That means that if the operator is down or doesn't work, a `kubectl delete` will hang. To work around that, edit the object in question (`kubectl edit <type> <name>`) and remove the finalizer from the metadata. After that you can delete the object normally. Note that in this case the operator will not take care of cleaning up any Azure resources.
- If the operator encounters an exception while processing an event in a handler, the handler will be retried after a short back-off time. During development you can then stop the operator, make changes to the code and start the operator again. Kopf will pick up where it left off and rerun the failed handler.
- When a handler was successful but you still want to rerun it, you need to fake a change in the object being handled. The easiest way is adding or changing a label.