This guide is intended for users of the Opensearch Operator. If you want to contribute to the development of the Operator, please see the Design documents and the Developer guide instead.
The Operator can be easily installed using Helm:
- Add the helm repo:
helm repo add opensearch-operator https://opensearch-project.github.io/opensearch-k8s-operator/
- Install the Operator:
helm install opensearch-operator opensearch-operator/opensearch-operator
Follow the instructions in this video to install the Operator:
A few notes on operator releases:
- Please see the project README for a compatibility matrix which operator release is compatible with which OpenSearch release.
- The userguide in the repository corresponds to the current development state of the code. To view the documentation for a specific released version switch to that tag in the Github menu.
- We track feature requests as Github Issues. If you are missing a feature and find an issue for it, please be aware that an issue ticket closed as completed only means that feature has been implemented in the development version. After that it might still take some for the feature to be contained in a release. If you are unsure, please check the list of releases in our Github project if your feature is mentioned in the release notes.
After you have successfully installed the Operator, you can deploy your first OpenSearch cluster. This is done by creating an OpenSearchCluster
custom object in Kubernetes or using Helm.
An OpenSearch cluster can be easily deployed using Helm. Follow the instructions in Cluster Chart Guide to install a cluster.
Create a file cluster.yaml
with the following content:
apiVersion: opensearch.opster.io/v1
kind: OpenSearchCluster
metadata:
name: my-first-cluster
namespace: default
spec:
general:
serviceName: my-first-cluster
version: 2.3.0
dashboards:
enable: true
version: 2.3.0
replicas: 1
resources:
requests:
memory: "512Mi"
cpu: "200m"
limits:
memory: "512Mi"
cpu: "200m"
nodePools:
- component: nodes
replicas: 3
diskSize: "5Gi"
nodeSelector:
resources:
requests:
memory: "2Gi"
cpu: "500m"
limits:
memory: "2Gi"
cpu: "500m"
roles:
- "cluster_manager"
- "data"
Then run kubectl apply -f cluster.yaml
. If you watch the cluster (e.g. watch -n 2 kubectl get pods
), you will see that after a few seconds the Operator will create several pods. First, a bootstrap pod will be created (my-first-cluster-bootstrap-0
) that helps with initial master discovery. Then three pods for the OpenSearch cluster will be created (my-first-cluster-masters-0/1/2
), and one pod for the dashboards instance. After the pods are appearing as ready, which normally takes about 1-2 minutes, you can connect to your cluster using port-forwarding.
Run kubectl port-forward svc/my-first-cluster-dashboards 5601
, then open http://localhost:5601 in your browser and log in with the default demo credentials admin / admin
.
Alternatively, if you want to access the OpenSearch REST API, run: kubectl port-forward svc/my-first-cluster 9200
. Then open a second terminal and run: curl -k -u admin:admin https://localhost:9200/_cat/nodes?v
. You should see the three deployed pods listed.
If you'd like to delete your cluster, run: kubectl delete -f cluster.yaml
. The Operator will then clean up and delete any Kubernetes resources created for the cluster. Note that this will not delete the persistent volumes for the cluster, in most cases. For a complete cleanup, run: kubectl delete pvc -l opster.io/opensearch-cluster=my-first-cluster
to also delete the PVCs.
The minimal cluster you deployed in this section is only intended for demo purposes. Please see the next sections on how to configure and manage the different aspects of your cluster.
Single-Node clusters are currently not supported. Your cluster must have at least 3 nodes with the master/cluster_manager
role configured.
The majority of this guide deals with configuring and managing OpenSearch clusters. But there are some general options that can be configured for the operator itself. All of this is done using helm values your provide during installation: helm install opensearch-operator opensearch-operator/opensearch-operator -f values.yaml
.
For a list of all possible values see the chart default values.yaml. Some important ones:
manager:
# Log level of the operator. Possible values: debug, info, warn, error
loglevel: info
# If specified, the operator will be restricted to watch objects only in the desired namespace. Defaults is to watch all namespaces.
watchNamespace:
# Configure extra environment variables for the operator. You can also pull them from secrets or configmaps
extraEnv: []
# - name: MY_ENV
# value: somevalue
There have been situations reported where the operator is leaking memory. To help diagnose these situations the standard go pprof endpoints can be enabled by adding the following to your values.yaml
:
manager:
pprofEndpointsEnabled: true
The access the endpoints you will need to use a port-forward as for security reasons the endpoints are only exposed on localhost inside the pod: kubectl port-forward deployment/opensearch-operator-controller-manager 6060
. Then from another terminal you can use the go pprof tool, e.g.: go tool pprof http://localhost:6060/debug/pprof/heap
.
The main job of the operator is to deploy and manage OpenSearch clusters. As such it offers a wide range of options to configure clusters.
OpenSearch clusters are composed of one or more node pools, with each representing a logical group of nodes that have the same role. Each node pool can have its own resources. For each configured nodepool the operator will create a Kubernetes StatefulSet. It also creates a Kubernetes service object for each nodepool so you can communicate with a specfic nodepool if you want.
spec:
nodePools:
- component: masters
replicas: 3 # The number of replicas
diskSize: "30Gi" # The disk size to use
resources: # The resource requests and limits for that nodepool
requests:
memory: "2Gi"
cpu: "500m"
limits:
memory: "2Gi"
cpu: "500m"
roles: # The roles the nodes should have
- "cluster_manager"
- "data"
- component: nodes
replicas: 3
diskSize: "10Gi"
nodeSelector:
resources:
requests:
memory: "2Gi"
cpu: "500m"
limits:
memory: "2Gi"
cpu: "500m"
roles:
- "data"
Additional configuration options are available for node pools and are documented in this guide in later sections.
The Operator automatically generates the main OpenSearch configuration file opensearch.yml
based on the parameters you provide in the different sections (e.g. TLS configuration). If you need to add your own settings, you can do that using the additionalConfig
field in the cluster spec:
spec:
general:
# ...
additionalConfig:
some.config.option: somevalue
# ...
nodePools:
- component: masters
# ...
additionalConfig:
some.other.config: foobar
Using spec.general.additionalConfig
you can add settings to all nodes, using nodePools[].additionalConfig
you can add settings to only a pool of nodes. The settings must be provided as a map of strings, so use the flat form of any setting. If the value you want to provide is not a string, put it in quotes (for example "true"
or "1234"
). The Operator merges its own generated settings with whatever extra settings you provide. Note that basic settings like node.name
, node.roles
, cluster.name
and settings related to network and discovery are set by the Operator and cannot be overwritten using additionalConfig
. The value of spec.general.additionalConfig
is also used for configuring the bootstrap pod. To overwrite the values of the bootstrap pod, set the field spec.bootstrap.additionalConfig
.
Note that changing any of the additionalConfig
will trigger a rolling restart of the cluster. If want to avoid that please use the Cluster Settings API to change them at runtime.
For security reasons, encryption is required for communication with the OpenSearch cluster and between cluster nodes. If you do not configure any encryption, OpenSearch will use the included demo TLS certificates, which are not ideal for most active deployments.
Depending on your requirements, the Operator offers two ways of managing TLS certificates. You can either supply your own certificates, or the Operator will generate its own CA and sign certificates for all nodes using that CA. The second option is recommended, unless you want to directly expose your OpenSearch cluster outside your Kubernetes cluster, or your organization has rules about using self-signed certificates for internal communication.
⚠️ Clusters with operator-generated certificates will stop working after 1 year: Make sure you have tested certificate renewals in your cluster before putting it in production!
TLS certificates are used in three places, and each can be configured independently.
OpenSearch cluster nodes communicate with each other using the OpenSearch transport protocol (port 9300 by default). This is not exposed externally, so in almost all cases, generated certificates should be adequate.
To configure node transport security you can use the following fields in the OpenSearchCluster
custom resource:
# ...
spec:
security:
tls: # Everything related to TLS configuration
transport: # Configuration of the transport endpoint
generate: true # Have the operator generate and sign certificates
perNode: true # Separate certificate per node
secret:
name: # Name of the secret that contains the provided certificate
caSecret:
name: # Name of the secret that contains a CA the operator should use
nodesDn: [] # List of certificate DNs allowed to connect
adminDn: [] # List of certificate DNs that should get admin access
# ...
To have the Operator generate the certificates, you only need to set the generate
and perNode
fields to true
(all other fields can be omitted). The Operator will then generate a CA certificate and one certificate per node, and then use the CA to sign the node certificates. These certificates are valid for one year. Note that the Operator does not currently have certificate renewal implemented.
Alternatively, you can provide the certificates yourself (e.g. if your organization has an internal CA). You can either provide one certificate to be used by all nodes or provide a certificate for each node (recommended). In this mode, set generate: false
and perNode
to true
or false
depending on whether you're providing per-node certificates.
If you provide just one certificate, it must be placed in a Kubernetes TLS secret (with the fields ca.crt
, tls.key
and tls.crt
, must all be PEM-encoded), and you must provide the name of the secret as secret.name
. If you want to keep the CA certificate separate, you can place it in a separate secret and supply that as caSecret.name
. If you provide one certificate per node, you must place all certificates into one secret (including the ca.crt
) with a <hostname>.key
and <hostname>.crt
for each node. The hostname is defined as <cluster-name>-<nodepool-component>-<index>
(e.g. my-first-cluster-masters-0
).
If you provide the certificates yourself, you must also provide the list of certificate DNs in nodesDn
, wildcards can be used (e.g. "CN=my-first-cluster-*,OU=my-org"
).
If you provide your own node certificates you must also provide an admin cert that the operator can use for managing the cluster:
spec:
security:
config:
adminSecret:
name: my-first-cluster-admin-cert # The secret must have keys tls.crt and tls.key
Make sure the DN of the certificate is set in the adminDn
field.
Each OpenSearch cluster node exposes the REST API using HTTPS (by default port 9200).
To configure HTTP API security, the following fields in the OpenSearchCluster
custom resource are available:
# ...
spec:
security:
tls: # Everything related to TLS configuration
http: # Configuration of the HTTP endpoint
generate: true # Have the Operator generate and sign certificates
secret:
name: # Name of the secret that contains the provided certificate
caSecret:
name: # Name of the secret that contains a CA the Operator should use
# ...
Again, you have the option of either letting the Operator generate and sign the certificates or providing your own. The only difference between node transport certificates and node HTTP/REST APIs is that per-node certificate are not possible here. In all other respects the two work the same way.
If you provide your own certificates, please make sure the following names are added as SubjectAltNames (SAN): <cluster-name>
, <cluster-name>.<namespace>
, <cluster-name>.<namespace>.svc
,<cluster-name>.<namespace>.svc.cluster.local
.
Directly exposing the node HTTP port outside the Kubernetes cluster is not recommended. Rather than doing so, you should configure an ingress. The ingress can then also present a certificate from an accredited CA (for example LetsEncrypt) and hide self-signed certificates that are being used internally. In this way, the nodes should be supplied internally with properly signed certificates.
You can extend the functionality of OpenSearch via plugins. Commonly used ones are snapshot repository plugins for external backups (e.g. to AWS S3 or Azure Blog Storage). The operator has support to automatically install such plugins during setup.
To install a plugin for opensearch add it to the list under general.pluginsList
:
general:
version: 2.3.0
httpPort: 9200
vendor: opensearch
serviceName: my-cluster
pluginsList: ["repository-s3","https://github.com/aiven/prometheus-exporter-plugin-for-opensearch/releases/download/1.3.0.0/prometheus-exporter-1.3.0.0.zip"]
To install a plugin for opensearch dashboards add it to the list under dashboards.pluginsList
:
dashboards:
enable: true
version: 2.4.1
pluginsList:
- sample-plugin-name
Please note:
- Bundled plugins do not have to be added to the list, they are installed automatically
- You can provide either a plugin name or a complete to the plugin zip. The items you provide are passed to the
bin/opensearch-plugin install <plugin-name>
command. - Updating the list for an already installed cluster will lead to a rolling restart of all opensearch nodes to install the new plugin.
- If your plugin requires additional configuration you must provide that either through
additionalConfig
(see section Configuring opensearch.yml) or as secrets in the opensearch keystore (see section Add secrets to keystore).
Some OpenSearch features (e.g. snapshot repository plugins) require sensitive configuration. This is handled via the opensearch keystore. The operator allows you to populate this keystore using Kubernetes secrets.
To do so add the secrets under the general.keystore
section:
general:
# ...
keystore:
- secret:
name: credentials
- secret:
name: some-other-secret
With this configuration all keys of the secrets will become keys in the keystore.
If you only want to load some keys from a secret or rename the existing keys, you can add key mappings as a map:
general:
# ...
keystore:
- secret:
name: many-secret-values
keyMappings:
# Only read "sensitive-value" from the secret, keep its name.
sensitive-value: sensitive-value
- secret:
name: credentials
keyMappings:
# Renames key accessKey in secret to s3.client.default.access_key in keystore
accessKey: s3.client.default.access_key
password: s3.client.default.secret_key
Note that only provided keys will be loaded from the secret! Any keys not specified will be ignored.
What is SmartScaler?
SmartScaler is a mechanism built into the Operator that enables nodes to be safely removed from the cluster. When a node is being removed from a cluster, the safe drain process ensures that all of its data is transferred to other nodes in the cluster before the node is taken offline. This prevents any data loss or corruption that could occur if the node were simply shut down or disconnected without first transferring its data to other nodes.
During the safe drain process, the node being removed is marked as "draining", which means that it will no longer receive any new requests. Instead, it will only process outstanding requests until its workload has been completed. Once all requests have been processed, the node will begin transferring its data to other nodes in the cluster. The safe drain process will continue until all data has been transferred and the node is no longer part of the cluster. Only after that, the OMC will turn down the node.
To configure the amount of memory allocated to the OpenSearch nodes, configure the heap size using the JVM args. This operation is expected to have no downtime and the cluster should be operational.
Recommendation: Set to half of memory request
spec:
nodePools:
- component: nodes
replicas: 3
diskSize: "10Gi"
jvm: -Xmx1024M -Xms1024M
resources:
requests:
memory: "2Gi"
cpu: "500m"
limits:
memory: "2Gi"
cpu: "500m"
roles:
- "data"
If jvm
is not provided then the java heap size will be set to half of
resources.requests.memory
which is the recommend value for data nodes
If jvm
is not provided and resources.requests.memory
does not exist
then value will be -Xmx512M -Xms512M
.
We don't support dynamic values depending on the node type for now.
OpenSearch requires the Linux kernel vm.max_map_count
option to be set to at least 262144. You can either set this yourself on the Kubernetes hosts using sysctl or you can let the operator take care of it by adding the following option to your cluster spec:
spec:
general:
setVMMaxMapCount: true
This will configure an init container for each opensearch pod that executes the needed sysctl
command. By default the init container uses a busybox image. If you want to change that (for example to use an image from a private registry), see Custom init helper.
You can configure the snapshot repositories for the OpenSearch cluster through the operator. Using general.snapshotRepositories
settings you can configure multiple snapshot repositories. Once the snapshot repository is configured a user can create custom _ism
policies through dashboard to backup indexes.
spec:
general:
snapshotRepositories:
- name: my_s3_repository_1
type: s3
settings:
bucket: opensearch-s3-snapshot
region: us-east-1
base_path: os-snapshot
- name: my_s3_repository_3
type: s3
settings:
bucket: opensearch-s3-snapshot
region: us-east-1
base_path: os-snapshot_1
Before configuring snapshotRepositories
for a cluster, please ensure the following prerequisites are met:
-
The right cloud provider native plugins are installed. For example:
spec: general: pluginsList: ["repository-s3"]
-
The required roles/permissions for the backend cloud are pre-created. An example AWS IAM role added for kubernetes nodes so that snapshots can be published to the
opensearch-s3-snapshot
s3 bucket:{ "Statement": [ { "Action": [ "s3:ListBucket", "s3:GetBucketLocation", "s3:ListBucketMultipartUploads", "s3:ListBucketVersions" ], "Effect": "Allow", "Resource": [ "arn:aws:s3:::opensearch-s3-snapshot" ] }, { "Action": [ "s3:GetObject", "s3:PutObject", "s3:DeleteObject", "s3:AbortMultipartUpload", "s3:ListMultipartUploadParts" ], "Effect": "Allow", "Resource": [ "arn:aws:s3:::opensearch-s3-snapshot/*" ] } ], "Version": "2012-10-17" }
The operator can automatically deploy and manage a OpenSearch Dashboards instance. To do so add the following section to your cluster spec:
# ...
spec:
dashboards:
enable: true # Set this to true to enable the Dashboards deployment
version: 2.3.0 # The Dashboards version to deploy. This should match the configured opensearch version
replicas: 1 # The number of replicas to deploy
You can customize the OpenSearch Dashboards configuration (opensearch_dashboards.yml
) using the additionalConfig
field in the dashboards section of the OpenSearchCluster
custom resource:
apiVersion: opensearch.opster.io/v1
kind: OpenSearchCluster
#...
spec:
dashboards:
additionalConfig:
opensearch_security.auth.type: "proxy"
opensearch.requestHeadersWhitelist: |
["securitytenant","Authorization","x-forwarded-for","x-auth-request-access-token", "x-auth-request-email", "x-auth-request-groups"]
opensearch_security.multitenancy.enabled: "true"
You can for example use this to set up any of the backend authentication types for Dashboards.
Note that the configuration must be valid or the Dashboards instance will fail to start.
There are situations where you need to store sensitive information inside the dashboards configuration file (for example a client secret for OpenIDConnect). To do this safely you can utilize the fact that OpenSearch Dashboards does variable substitution in its configuration file.
For this to work you need to create a secret with the sensitive information (for example dashboards-oidc-config
) and then mount that secret as an environment variable into the Dashboards pod (see the section on Adding environment variables to pods on how to do that). You can then reference any keys from that secret in your dashboards configuration.
As an example this is a part of a cluster spec:
spec:
dashboards:
env:
- name: OPENID_CLIENT_SECRET
valueFrom:
secretKeyRef:
name: dashboards-oidc-config
key: client_secret
additionalConfig:
opensearch_security.openid.client_secret: "${OPENID_CLIENT_SECRET}"
Note that changing the value in the secret has no direct influence on the dashboards config. For this to take effect you need to restart the dashboards pods.
When using OpenSearch behind a reverse proxy on a subpath (e.g. /logs
) you have to configure a base path. This can be achieved by setting the base path field in the configuraiton of OpenSearch Dashboards. Behind the scenes the correct configuration options are automatically added to the dashboards configuration.
apiVersion: opensearch.opster.io/v1
kind: OpenSearchCluster
...
spec:
dashboards:
enable: true
basePath: "/logs"
This also sets the server.rewriteBasePath
option to true
. So if you expose Dashboards via an ingress controller you must configure it appropriately.
OpenSearch Dashboards can expose its API/UI via HTTP or HTTPS. It is unencrypted by default. Similar to how the operator handles TLS for the opensearch nodes, to secure the connection you can either let the Operator generate and sign a certificate, or provide your own. The following fields in the OpenSearchCluster
custom resource are available to configure it:
# ...
spec:
dashboards:
enable: true # Deploy Dashboards component
tls:
enable: true # Configure TLS
generate: true # Have the Operator generate and sign a certificate
secret:
name: # Name of the secret that contains the provided certificate
caSecret:
name: # Name of the secret that contains a CA the Operator should use
# ...
To let the Operator generate the certificate, just set tls.enable: true
and tls.generate: true
(the other fields under tls
can be ommitted). Again, as with the node certificates, you can supply your own CA via caSecret.name
for the Operator to use.
If you want to use your own certificate, you need to provide it as a Kubernetes TLS secret (with fields tls.key
and tls.crt
) and provide the name as secret.name
.
If you want to expose Dashboards outside of the cluster, it is recommended to use Operator-generated certificates internally and let an Ingress present a valid certificate from an accredited CA (e.g. LetsEncrypt).
Besides configuring OpenSearch itself, the operator also allows you to customize how the operator deploys the opensearch and dashboards pods.
By default, the Operator will create OpenSearch node pools with persistent storage from the default Storage Class. This behaviour can be changed per node pool. You may supply an alternative storage class and access mode, or configure hostPath or emptyDir storage.
The available storage options are:
The default option is persistent storage via PVCs. You can explicity define the storageClass
if needed:
nodePools:
- component: masters
replicas: 3
diskSize: 30
roles:
- "data"
- "master"
persistence:
pvc:
storageClass: mystorageclass # Set the name of the storage class to be used
accessModes: # You can change the accessMode
- ReadWriteOnce
If you do not want to use persistent storage you can use the emptyDir
option. Beware that this can lead to data loss, so you should only use this option for testing, or for data that is otherwise persisted.
nodePools:
- component: masters
replicas: 3
diskSize: 30
roles:
- "data"
- "master"
persistence:
emptyDir: {} # This configures emptyDir
If you are using emptyDir, it is recommended that you set spec.general.drainDataNodes
to be true
. This will ensure that shards are drained from the pods before rolling upgrades or restart operations are performed.
As a last option you can hose a hostPath
. Please note that hostPath is strongly discouraged, and if you do choose this option, then you must also configure affinity for the node pool to ensure that multiple pods do not schedule to the same Kubernetes host.
nodePools:
- component: masters
replicas: 3
diskSize: 30
roles:
- "data"
- "master"
persistence:
hostPath:
path: "/var/opensearch" # Define the path on the host here
You can set the security context for the Opensearch pods and the Dashboard pod. This is useful when you want to define privilege and access control settings for a Pod or Container. To specify security settings for Pods, include the podSecurityContext
field and for containers, include the securityContext
field.
The structure is the same for both Opensearch pods (in spec.general
) and the Dashboard pod (in spec.dashboards
):
spec:
general:
podSecurityContext:
runAsUser: 1000
runAsGroup: 1000
runAsNonRoot: true
securityContext:
allowPrivilegeEscalation: false
privileged: false
dashboards:
podSecurityContext:
fsGroup: 1000
runAsNonRoot: true
securityContext:
capabilities:
drop:
- ALL
privileged: false
The Opensearch pods by default launch an init container to configure the volume. This container needs to run with root permissions and does not use any defined securityContext. If your kubernetes environment does not allow containers with the root user you need to disable this init helper. In this situation also make sure to set general.setVMMaxMapCount
to false
as this feature also launches an init container with root.
Note that the bootstrap pod started during initial cluster setup uses the same (pod)securityContext as the Opensearch pods (with the same limitations for the init containers).
You can add additional labels or annotations on the nodepool configuration. This is useful for integration with other applications such as a service mesh, or configuring a prometheus scrape endpoint:
In addition, annotations can also be configured globally using the spec.general.annotations
field in the Kubernetes specification. These annotations are not only limited to the node pool but also extend to Kubernetes services, providing flexibility and additional information to enhance the Kubernetes environment.
spec:
nodePools:
- component: masters
replicas: 3
diskSize: "5Gi"
labels: # Add any extra labels as key-value pairs here
someLabelKey: someLabelValue
annotations: # Add any extra annotations as key-value pairs here
someAnnotationKey: someAnnotationValue
nodeSelector:
resources:
requests:
memory: "2Gi"
cpu: "500m"
limits:
memory: "2Gi"
cpu: "500m"
roles:
- "data"
- "master"
Any annotations and labels defined will be added directly to the pods of the nodepools.
You can add labels or annotations to the dashboard pod specification. This is helpful if you want the dashboard to be part of a service mesh or integrate with other applications that rely on labels or annotations.
spec:
dashboards:
enable: true
version: 1.3.1
replicas: 1
labels: # Add any extra labels as key-value pairs here
someLabelKey: someLabelValue
annotations: # Add any extra annotations as key-value pairs here
someAnnotationKey: someAnnotationValue
Any annotations and labels defined will be added directly to the dashboards pods.
You can configure OpenSearch nodes to use a PriorityClass
using the name of the priority class. This is useful to prevent unwanted evictions of your OpenSearch nodes.
spec:
nodePools:
- component: masters
replicas: 3
diskSize: "5Gi"
priorityClassName: somePriorityClassName
resources:
requests:
memory: "2Gi"
cpu: "500m"
limits:
memory: "2Gi"
cpu: "500m"
roles:
- "master"
Sometimes it is neccessary to mount ConfigMaps, Secrets, emptyDir, projected volumes, or CSI volumes into the Opensearch pods as volumes to provide additional configuration (e.g. plugin config files). This can be achieved by providing an array of additional volumes to mount to the custom resource. This option is located in either spec.general.additionalVolumes
or spec.dashboards.additionalVolumes
. The format is as follows:
spec:
general:
additionalVolumes:
- name: example-configmap
path: /path/to/mount/volume
#subPath: mykey # Add this to mount only a specific key of the configmap/secret
configMap:
name: config-map-name
restartPods: true #set this to true to restart the pods when the content of the configMap changes
- name: temp
path: /tmp
emptyDir: {}
- name: example-csi-volume
path: /path/to/mount/volume
#subPath: "subpath" # Add this to mount the CSI volume at a specific subpath
csi:
driver: csi-driver-name
readOnly: true
volumeAttributes:
secretProviderClass: example-secret-provider-class
- name: example-projected-volume
path: /path/to/mount/volume
projected:
sources:
serviceAccountToken:
path: "token"
dashboards:
additionalVolumes:
- name: example-secret
path: /path/to/mount/volume
secret:
secretName: secret-name
The defined volumes are added to all pods of the opensearch cluster. It is currently not possible to define them per nodepool.
The operator allows you to add your own environment variables to the opensearch pods and the Dashboards pods. You can provide the value as a string literal or mount it from a secret or configmap.
The structure is the same for both opensearch and dashboards:
spec:
dashboards:
env:
- name: MY_ENV_VAR
value: "myvalue"
- name: MY_SECRET_VAR
valueFrom:
secretKeyRef:
name: my-secret
key: some_key
- name: MY_CONFIGMAP_VAR
valueFrom:
configMapKeyRef:
name: my-configmap
key: some_key
nodePools:
- component: nodes
env:
- name: MY_ENV_VAR
value: "myvalue"
# the other options are supported here as well
If your Kubernetes cluster is configured with a custom domain name (default is cluster.local
) you need to configure the operator accordingly in order for internal routing to work properly. This can be achieved by setting manager.dnsBase
in the helm chart values.
manager:
# ...
dnsBase: custom.domain
During cluster initialization the operator uses init containers as helpers. For these containers a busybox image is used ( specifically docker.io/busybox:latest
). In case you are working in an offline environment and the cluster cannot access the registry or you want to customize the image, you can override the image used by specifying the initHelper
image in your cluster spec:
spec:
initHelper:
# You can either only specify the version
version: "1.27.2-buildcustom"
# or specify a totally different image
image: "mycustomrepo.cr/mycustombusybox:myversion"
# Additionally you can define the imagePullPolicy
imagePullPolicy: IfNotPresent
# and imagePullSecrets if needed
imagePullSecrets:
- name: docker-pull-secret
Init container run without any resource constraints, but it's possible to specify resource requests and limits by adding a resources section to the YAML definition. This allows you to control the amount of CPU and memory allocated to the init container, it's helps to ensure that it doesn't starve other containers, by setting appropriate resource limits.
spec:
initHelper:
resources:
requests:
memory: "128Mi"
cpu: "250m"
limits:
memory: "512Mi"
cpu: "500m"
In some cases, you may want to avoid the chmod
init container (e.g. on OpenShift or if your cluster blocks containers running as root
).
It can be disabled by adding the following to your values.yaml
:
manager:
extraEnv:
- name: SKIP_INIT_CONTAINER
value: "true"
The PDB (Pod Disruption Budget) is a Kubernetes resource that helps ensure the high availability of applications by defining the acceptable disruption level during maintenance or unexpected events.
It specifies the minimum number of pods that must remain available to maintain the desired level of service.
The PDB definition is unique for every nodePool.
You must provide either minAvailable
or maxUnavailable
to configure PDB, but not both.
apiVersion: opensearch.opster.io/v1
kind: OpenSearchCluster
...
spec:
nodePools:
- component: masters
replicas: 3
diskSize: "30Gi"
pdb:
enable: true
minAvailable: 3
- component: datas
replicas: 7
diskSize: "100Gi"
pdb:
enable: true
maxUnavailable: 2
If you want to expose the Dashboards instance of your cluster for users/services outside of your Kubernetes cluster, the recommended way is to do this via ingress.
A simple example:
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
name: opensearch-dashboards
namespace: default
spec:
tls:
- hosts:
- dashboards.my.company
rules:
- host: dashboards.my.company
http:
paths:
- backend:
service:
name: my-cluster-dashboards
port:
number: 5601
path: "/(.*)"
pathType: ImplementationSpecific
Note: If you have enabled HTTPS for dashboards you need to instruct your ingress-controller to use a HTTPS connection internally. This is specific for the controller you are using (e.g. nginx-ingress, traefik, ...).
You can customize the Kubernetes Service object that the operator generates for the Dashboards deployment.
Supported Service Types
- ClusterIP (default)
- NodePort
- LoadBalancer
When using type LoadBalancer you can optionally set the load balancer source ranges.
apiVersion: opensearch.opster.io/v1
kind: OpenSearchCluster
...
spec:
dashboards:
service:
type: LoadBalancer # Set one of the supported types
loadBalancerSourceRanges: "10.0.0.0/24, 192.168.0.0/24" # Optional, add source ranges for a loadbalancer
If you want to expose the REST API of OpenSearch outside your Kubernetes cluster, the recommended way is to do this via ingress. Internally you should use self-signed certificates (you can let the operator generate them), and then let the ingress use a certificate from an accepted CA (for example LetsEncrypt or a company-internal CA). That way you do not have the hassle of supplying custom certificates to the opensearch cluster but your users still see valid certificates.
If the cluster nodes do not spins up before the threshold reaches and the pod restarts the timeouts and thresholds can be configured per node as per the requirements.
apiVersion: opensearch.opster.io/v1
kind: OpenSearchCluster
...
spec:
nodePools:
- component: masters
replicas: 3
diskSize: "30Gi"
probes:
liveness:
initialDelaySeconds: 10
periodSeconds: 20
timeoutSeconds: 5
successThreshold: 1
failureThreshold: 10
startup:
initialDelaySeconds: 10
periodSeconds: 20
timeoutSeconds: 5
successThreshold: 1
failureThreshold: 10
readiness:
initialDelaySeconds: 60
periodSeconds: 30
timeoutSeconds: 30
failureThreshold: 5
In addition to the information provided in the previous sections on how to specify resource requirements for the node pools, it is also possible to specify resources for all entities created by the operator for more advanced use cases.
The operator generates many pods via resources such as jobs, stateful sets, replica sets, and others, which utilize InitContainers. The following configuration allows you to specify a default resources config for all InitContainer.
apiVersion: opensearch.opster.io/v1
kind: OpenSearchCluster
...
spec:
initHelper:
resources:
requests:
memory: "50Mi"
cpu: "50m"
limits:
memory: "200Mi"
cpu: "200m"
You can also configure the resources for the security update job as shown below.
apiVersion: opensearch.opster.io/v1
kind: OpenSearchCluster
...
spec:
security:
config:
updateJob:
resources:
limits:
cpu: "100m"
memory: "100Mi"
requests:
cpu: "100m"
memory: "100Mi"
Please note that the examples provided here do not reflect actual resource requirements. You may need to conduct further testing to properly adjust the resources based on your specific needs.
The operator contains several features that automate management tasks that might be needed during the cluster lifecycle. The different available options are documented here.
This operator automatically handles common failure scenarios and restarts crashed pods, normally this is done in a one-by-one fashion to maintain quorum and cluster stability.
In case the operator detects several crashed or missing pods (for a nodepool) at the same time it will switch into a special recovery mode and start all pods at once and allow the cluster to form a new quorum. This parallel recovery mode is currently experimental and only works with PVC-backed storage as it uses the number of existing PVCs to determine the number of missing pods. The recovery is done by temporarily changing the statefulset underlying each nodepool and setting the podManagementPolicy
to Parallel
. If you encounter problems with it, you can disable it by redeploying the operator and adding manager.parallelRecoveryEnabled: false
to your values.yaml
. Please also report any problems by opening an issue in the operator github project.
The recovery mode also kicks in if you deleted your cluster but kept the PVCs around and are then reinstalling the cluster.
If the cluster is using emptyDir i.e. every node pool is using emptyDir, the operator starts recovery in case of these failure scenarios:
- More than half the master nodes are missing or crashed and thus, the quorum is broken.
- All data nodes are missing or crashed and thus, no data node is available.
But since the cluster is using emptyDir, data is lost and not recoverable. So, it is impossible to restore the cluster to its old state. Therefore, the operator deletes and recreates the entire OpenSearch cluster.
The operator supports automatic rolling version upgrades. To do so simply change the general.version
in your cluster spec and reapply it:
spec:
general:
version: 1.2.3
The Operator will then perform a rolling upgrade and restart the nodes one-by-one, waiting after each node for the cluster to stabilize and have a green cluster status. Depending on the number of nodes and the size of the data stored this can take some time.
Downgrades and upgrades that span more than one major version are not supported, as this will put the OpenSearch cluster in an unsupported state. If you are using emptyDir storage for data nodes, it is recommended to set general.drainDataNodes
to true
, otherwise you might lose data.
As explained in the section Configuring opensearch.yml you can add extra opensearch configuration to your cluster. Changing this configuration on an already installed cluster will be detected by the operator and it will do a rolling restart of all cluster nodes to apply that new configuration. The same goes for nodepool-specific configuration like resources
, annotation
or labels
.
If your underlying storage supports online volume expansion the operator can orchestrate that action for you.
To increase the disk volume size set the diskSize
of a nodepool to the desired value and re-apply the cluster spec yaml. This operation is expected to have no downtime and the cluster should be operational.
The following considerations should be taken into account in order to increase the PVC size.
- This only works for PVC-based persistence
- Before considering the expansion of the the cluster disk, make sure the volumes/data is backed up in desired format, so that any failure can be tolerated by restoring from the backup.
- Make sure the cluster storage class has
allowVolumeExpansion: true
before applying the newdiskSize
. For more details checkout the kubernetes storage classes document. - Once the above step is done, the cluster yaml can be applied with new
diskSize
value, to all decalared nodepool components or to single component. - It is best recommended not to apply any new changes to the cluster along with volume expansion.
- Make sure the declared size definitions are proper and consistent, example if the
diskSize
is inG
orGi
, make sure the same size definitions are followed for expansion.
Note: To change the diskSize
from G
to Gi
or vice-versa, first make sure data is backed up and make sure the right conversion number is identified, so that the underlying volume has the same value and then re-apply the cluster yaml. This will make sure the statefulset is re-created with right value in VolueClaimTemplates, this operation is expected to have no downtime.
An important part of any OpenSearch cluster is the user and role management to give users access to the cluster (via the opensearch-security plugin). By default the operator will use the included demo securityconfig with default users (see internal_users.yml for a list of users). For any production installation you should swap that out with your own configuration. There are two ways to do that with the operator:
- Defining your own securityconfig
- Managing users and roles via kubernetes resources
Note that currently a combination of both approaches is not possible. Once you use the CRDs you cannot provide your own securityconfig as those would overwrite each other. We are working on a feature to merge these options.
You can provide your own securityconfig (see the entire demo securityconfig as an example and the Access control documentation of the OpenSearch project) with your own users and roles. To do that, you must provide a secret with all the required securityconfig yaml files.
The Operator can be controlled using the following fields in the OpenSearchCluster
custom resource:
# ...
spec:
security:
config: # Everything related to the securityconfig
securityConfigSecret:
name: # Name of the secret that contains the securityconfig files
adminSecret:
name: # Name of a secret that contains the admin client certificate
adminCredentialsSecret:
name: # Name of a secret that contains username/password for admin access
# ...
Provide the name of the secret that contains your securityconfig yaml files as securityconfigSecret.name
. In the secret, you can provide the files that you want to configure. The operator will only apply the files present in the secret. Note that OpenSearch requires all the files to be applied when the cluster is first created. So, the files that you do not provide in the securityconfig secret, the operator will use the default files provided in the opensearch-security plugin. See opensearch-security for the list of all configuration files and their default values.
If you don't want to use the default files, you must provide at least a minimum configuration for the file. Example:
tenants.yml: |-
_meta:
type: "tenants"
config_version: 2
These minimum configuration files can later be removed from the secret so that you don't overwrite the resources created via the CRDs or the REST APIs when modifying other configuration files.
In addition, you must provide the name of a secret as adminCredentialsSecret.name
that has fields username
and password
for a user that the Operator can use for communicating with OpenSearch (currently used for getting the cluster status, doing health checks and coordinating node draining during cluster scaling operations). This user must be defined in your securityconfig and must have appropriate permissions (currently admin).
You must also configure TLS transport (see Node Transport). You can either let the operator generate all needed certificates or supply them yourself. If you use your own certificates you must also provide an admin certificate that the operator can use to apply the securityconfig.
If you provided your own certificate for node transport communication, then you must also provide an admin client certificate (as a Kubernetes TLS secret with fields ca.crt
, tls.key
and tls.crt
) as adminSecret.name
. The DN of the certificate must be listed under security.tls.transport.adminDn
. Be advised that the adminDn
and nodesDn
must be defined in a way that the admin certficate cannot be used or recognized as a node certficiate, otherwise OpenSearch will reject any authentication request using the admin certificate.
To apply the securityconfig to the OpenSearch cluster, the Operator uses a separate Kubernetes job (named <cluster-name>-securityconfig-update
). This job is run during the initial provisioning of the cluster. The Operator also monitors the secret with the securityconfig for any changes and then reruns the update job to apply the new config. Note that the Operator only checks for changes in certain intervals, so it might take a minute or two for the changes to be applied. If the changes are not applied after a few minutes, please use 'kubectl' to check the logs of the pod of the <cluster-name>-securityconfig-update
job. If you have an error in your configuration it will be reported there.
The operator provides custom kubernetes resources that allow you to create/update/manage security configuration resources such as users, roles, action groups etc. as kubernetes objects.
It is possible to manage Opensearch users in Kubernetes with the operator. The operator will not modify users that already exist. You can create an example user as follows:
apiVersion: opensearch.opster.io/v1
kind: OpensearchUser
metadata:
name: sample-user
namespace: default
spec:
opensearchCluster:
name: my-first-cluster
passwordFrom:
name: sample-user-password
key: password
backendRoles:
- kibanauser
The namespace of the OpenSearchUser
must be the namespace the OpenSearch cluster itself is deployed in.
Note that a secret called sample-user-password
will need to exist in the default
namespace with the base64 encoded password in the password
key.
It is possible to manage Opensearch roles in Kubernetes with the operator. The operator will not modify roles that already exist. You can create an example role as follows:
apiVersion: opensearch.opster.io/v1
kind: OpensearchRole
metadata:
name: sample-role
namespace: default
spec:
opensearchCluster:
name: my-first-cluster
clusterPermissions:
- cluster_composite_ops
- cluster_monitor
indexPermissions:
- indexPatterns:
- logs*
allowedActions:
- index
- read
The operator allows you link any number of users, backend roles and roles with a OpensearchUserRoleBinding. Each user in the binding will be granted each role. E.g:
apiVersion: opensearch.opster.io/v1
kind: OpensearchUserRoleBinding
metadata:
name: sample-urb
namespace: default
spec:
opensearchCluster:
name: my-first-cluster
users:
- sample-user
backendRoles:
- sample-backend-role
roles:
- sample-role
It is possible to manage Opensearch action groups in Kubernetes with the operator. The operator will not modify action groups that already exist. You can create an example action group as follows:
apiVersion: opensearch.opster.io/v1
kind: OpensearchActionGroup
metadata:
name: sample-action-group
namespace: default
spec:
opensearchCluster:
name: my-first-cluster
allowedActions:
- indices:admin/aliases/get
- indices:admin/aliases/exists
type: index
description: Sample action group
It is possible to manage Opensearch tenants in Kubernetes with the operator. The operator will not modify tenants that already exist. You can create an example tenant as follows:
apiVersion: opensearch.opster.io/v1
kind: OpensearchTenant
metadata:
name: sample-tenant
namespace: default
spec:
opensearchCluster:
name: my-first-cluster
description: Sample tenant
In order to create your cluster with an adminuser different from the default admin:admin
you will have to walk through the following steps:
First you will have to create a secret with your admin user configuration (in this example admin-credentials-secret
):
apiVersion: v1
kind: Secret
metadata:
name: admin-credentials-secret
type: Opaque
data:
# admin
username: YWRtaW4=
# admin123
password: YWRtaW4xMjM=
Then you have to create your own securityconfig and store it in a secret (securityconfig-secret
in this example). You can take a look at securityconfig-secret.yaml for how such a secret should look like.
Make sure that the password hash of the admin user corresponds to the password you stored in the admin-credentials-secret
.
Notice that inside securityconfig-secret
You must edit the hash
of the admin user before creating the secret. if you have python 3.x installed on your machine you can use the following command to hash your password: python -c 'import bcrypt; print(bcrypt.hashpw("admin123".encode("utf-8"), bcrypt.gensalt(12, prefix=b"2a")).decode("utf-8"))'
internal_users.yml: |-
_meta:
type: "internalusers"
config_version: 2
admin:
hash: "$2y$12$lJsHWchewGVcGlYgE3js/O4bkTZynETyXChAITarCHLz8cuaueIyq" <------- change that hash to your new password hash
reserved: true
backend_roles:
- "admin"
description: "Demo admin user"
The last thing that you have to do is to add that security configuration to your cluster spec:
security:
config:
adminCredentialsSecret:
name: admin-credentials-secret # The secret with the admin credentials for the operator to use
securityConfigSecret:
name: securityconfig-secret # The secret containing your customized securityconfig
tls:
transport:
generate: true
http:
generate: true
Changing the admin password after the cluster has been created is possible via the same way. You must update your securityconfig (in the securityconfig-secret
) and the content of the admin-credentials-secret
to both reflect the new password. Note that currently the operator cannot make changes in the securityconfig itself. As such you must always update the securityconfig in the secret with the new password and in addition provide it via the credentials secret so that the operator can still access the cluster.
Dashboards requires an opensearch user to connect to the cluster. By default Dashboards is configured to use the demo admin user. If you supply your own securityconfig and want to change the credentials Dashboards should use, you must create a secret with keys username
and password
that contains the new credentials and then supply that secret to the operator via the cluster spec:
spec:
dashboards:
opensearchCredentialsSecret:
name: dashboards-credentials # This is the name of your secret that contains the credentials for Dashboards to use
The operator allows you to install and enable the Aiven monitoring plugin for OpenSearch on your cluster as a built-in feature. If enabled the operator will install the aiven plugin into the opensearch pods and generate a Prometheus ServiceMonitor object to configure the plugin for scraping.
This feature needs internet connectivity to download the plugin. if you are working in a restricted environment, please download the plugin zip for your cluster version (example for 2.3.0: https://github.com/aiven/prometheus-exporter-plugin-for-opensearch/releases/download/2.3.0.0/prometheus-exporter-2.3.0.0.zip
) and provide it at a location the operator can reach. Configure that URL as pluginURL
in the monitoring config. By default the convention shown below in the example will be used if no pluginUrl
is specified.
By default the Opensearch admin user will be used to access the monitoring API. If you want to use a separate user with limited permissions you need to create that user using either of the following options:
a. Create new applicative User using OpenSearch API/UI, create new secret with 'username' and 'password' keys and provide that secret name under monitoringUserSecret
.
b. Use Our OpenSearchUser CRD and provide the secret under monitoringUserSecret.
To configure monitoring you can add the following fields to your cluster spec:
apiVersion: opensearch.opster.io/v1
kind: OpenSearchCluster
metadata:
name: my-first-cluster
namespace: default
spec:
general:
version: <YOUR_CLUSTER_VERSION>
monitoring:
enable: true # Enable or disable the monitoring plugin
labels: # The labels add for ServiceMonitor
someLabelKey: someLabelValue
scrapeInterval: 30s # The scrape interval for Prometheus
monitoringUserSecret: monitoring-user-secret # Optional, name of a secret with username/password for prometheus to acces the plugin metrics endpoint with, defaults to the admin user
pluginUrl: https://github.com/aiven/prometheus-exporter-plugin-for-opensearch/releases/download/<YOUR_CLUSTER_VERSION>.0/prometheus-exporter-<YOUR_CLUSTER_VERSION>.0.zip # Optional, custom URL for the monitoring plugin
tlsConfig: # Optional, use this to override the tlsConfig of the generated ServiceMonitor, only the following provided options can be set currently
serverName: "testserver.test.local"
insecureSkipVerify: true # The operator currently does not allow configuring the ServiceMonitor with certificates, so this needs to be set
# ...
The operator provides a custom Kubernetes resource that allow you to create/update/manage ISM policies using Kubernetes objects.
It is possible to manage OpenSearch ISM policies in Kubernetes with the operator. Fields in the CRD directly maps to the OpenSearch ISM Policy structure. The operator will not modify policies that already exist. You can create an example policy as follows:
apiVersion: opensearch.opster.io/v1
kind: OpenSearchISMPolicy
metadata:
name: sample-policy
spec:
opensearchCluster:
name: my-first-cluster
description: Sample policy
policyId: sample-policy
defaultState: hot
states:
- name: hot
actions:
- replicaCount:
numberOfReplicas: 4
transitions:
- stateName: warm
conditions:
minIndexAge: "10d"
- name: warm
actions:
- replicaCount:
numberOfReplicas: 2
transitions:
- stateName: delete
conditions:
minIndexAge: "30d"
- name: delete
actions:
- delete: {}
The namespace of the OpenSearchISMPolicy
must be the namespace the OpenSearch cluster itself is deployed in. policyId
is an optional field, and if not provided metadata.name
is used as the default.
The operator provides the OpensearchIndexTemplate and OpensearchComponentTemplate CRDs, which is used for managing index and component templates respectively.
The two CRD specifications attempts to be as close as possible to what the OpenSearch API expects, with some changes from snake_case to camelCase.
The fields that have been changed, is index_patterns
to indexPatterns
(OpensearchIndexTemplate only), composed_of
to composedOf
(OpensearchIndexTemplate only) and template.aliases.<alias>.is_write_index
to template.aliases.<alias>.isWriteIndex
.
The following example creates a component template for setting the number of shards and replicas, together with specifying a specific time format for documents:
apiVersion: opensearch.opster.io/v1
kind: OpensearchComponentTemplate
metadata:
name: sample-component-template
spec:
opensearchCluster:
name: my-first-cluster
template: # required
aliases: # optional
my_alias: {}
settings: # optional
number_of_shards: 2
number_of_replicas: 1
mappings: # optional
properties:
timestamp:
type: date
format: yyyy-MM-dd HH:mm:ss||yyyy-MM-dd||epoch_millis
value:
type: double
version: 1 # optional
_meta: # optional
description: example description
The following index template makes use of the above component template (see composedOf
) for all indices which follows the logs-2020-01-*
index pattern:
apiVersion: opensearch.opster.io/v1
kind: OpensearchIndexTemplate
metadata:
name: sample-index-template
spec:
opensearchCluster:
name: my-first-cluster
name: logs_template # name of the index template - defaults to metadata.name. Can't be updated in-place
indexPatterns: # required index patterns
- "logs-2020-01-*"
composedOf: # optional
- sample-component-template
priority: 100 # optional
template: {} # optional
version: 1 # optional
_meta: {} # optional
Note: the .spec.name
is immutable, meaning that it cannot be changed after the resources have been deployed to a Kubernetes cluster