Add druid horizontal scaling
Signed-off-by: Tapajit Chandra Paul <[email protected]>
tapojit047 committed Oct 18, 2024
1 parent e74acb0 commit c697fe2
Showing 21 changed files with 2,295 additions and 3 deletions.
1 change: 0 additions & 1 deletion docs/guides/druid/backup/application-level/index.md
@@ -52,7 +52,6 @@ This section will demonstrate how to take application-level backup of a `Druid`

## Deploy Sample Druid Database


**Create External Dependency (Deep Storage):**

One of the external dependencies of Druid is deep storage, where the segments are stored. It is a storage mechanism that Apache Druid does not provide itself. **Amazon S3**, **Google Cloud Storage**, **Azure Blob Storage**, **S3-compatible storage** (like **MinIO**), or **HDFS** are generally convenient options for deep storage.
49 changes: 49 additions & 0 deletions docs/guides/druid/clustering/topology-cluster-guide/index.md
@@ -35,6 +35,55 @@ Before proceeding:

> Note: The yaml files used in this tutorial are stored in [docs/guides/druid/clustering/topology-cluster-guide/yamls](https://github.com/kubedb/docs/tree/{{< param "info.version" >}}/docs/guides/druid/clustering/topology-cluster-guide/yamls) folder in GitHub repository [kubedb/docs](https://github.com/kubedb/docs).

## Create External Dependency (Deep Storage)

Before proceeding further, we need to prepare deep storage, which is one of the external dependencies of Druid and is used for storing the segments. It is a storage mechanism that Apache Druid does not provide itself. **Amazon S3**, **Google Cloud Storage**, **Azure Blob Storage**, **S3-compatible storage** (like **MinIO**), or **HDFS** are generally convenient options for deep storage.

In this tutorial, we will run a `minio-server` as deep storage in our local `kind` cluster using `minio-operator` and create a bucket named `druid` in it, which the deployed Druid database will use.

```bash

$ helm repo add minio https://operator.min.io/
$ helm repo update minio
$ helm upgrade --install --namespace "minio-operator" --create-namespace "minio-operator" minio/operator --set operator.replicaCount=1

$ helm upgrade --install --namespace "demo" --create-namespace druid-minio minio/tenant \
--set tenant.pools[0].servers=1 \
--set tenant.pools[0].volumesPerServer=1 \
--set tenant.pools[0].size=1Gi \
--set tenant.certificate.requestAutoCert=false \
--set tenant.buckets[0].name="druid" \
--set tenant.pools[0].name="default"

```
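Before creating the connection Secret, it is worth confirming that the MinIO tenant has come up. A minimal check, assuming the chart's default tenant name `myminio` (the same tenant behind the `myminio-hl` endpoint used in the Secret below), might look like this; pod and service names are generated by the MinIO operator, so they can differ in your cluster:

```bash
# Check that the MinIO tenant pods in the demo namespace are Running and Ready.
$ kubectl get pods -n demo

# The tenant exposes a headless service (myminio-hl here) that the
# deep storage Secret below refers to.
$ kubectl get svc -n demo
```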

Now we need to create a `Secret` named `deep-storage-config`. It contains the connection information that the Druid database will use to connect to the deep storage.

```yaml
apiVersion: v1
kind: Secret
metadata:
name: deep-storage-config
namespace: demo
stringData:
druid.storage.type: "s3"
druid.storage.bucket: "druid"
druid.storage.baseKey: "druid/segments"
druid.s3.accessKey: "minio"
druid.s3.secretKey: "minio123"
druid.s3.protocol: "http"
druid.s3.enablePathStyleAccess: "true"
druid.s3.endpoint.signingRegion: "us-east-1"
druid.s3.endpoint.url: "http://myminio-hl.demo.svc.cluster.local:9000/"
```
Let’s create the `deep-storage-config` Secret shown above:

```bash
$ kubectl create -f https://github.com/kubedb/docs/raw/{{< param "info.version" >}}/docs/guides/druid/backup/application-level/examples/deep-storage-config.yaml
secret/deep-storage-config created
```
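If you want to double-check what was created, the commands below list the Secret and the keys it carries; this is just a quick sanity check and not required for the rest of the guide:

```bash
# List the Secret and show which keys it holds (values stay base64-encoded).
$ kubectl get secret deep-storage-config -n demo
$ kubectl describe secret deep-storage-config -n demo
```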

## Deploy Druid Cluster

The following is an example `Druid` object which creates a Druid cluster of six node types (coordinators, overlords, brokers, routers, historicals, and middleManagers), each with one replica.
53 changes: 52 additions & 1 deletion docs/guides/druid/configuration/druid-cluster.md
@@ -21,7 +21,7 @@ In Druid cluster, there are six nodes available coordinators, overlords, brokers

- At first, you need to have a Kubernetes cluster, and the `kubectl` command-line tool must be configured to communicate with your cluster. If you do not already have a cluster, you can create one by using [kind](https://kind.sigs.k8s.io/docs/user/quick-start/).

- - Now, install KubeDB cli on your workstation and KubeDB operator in your cluster following the steps [here](/docs/setup/README.md) and make sure to include the flags `--set global.featureGates.Druid=true` to ensure **Druid CRD** and `--set global.featureGates.ZooKeeper=true` to ensure **ZooKeeper CRD** as Druid depends on ZooKeeper for external dependency with helm command.
- Now, install the KubeDB CLI on your workstation and the KubeDB operator in your cluster following the steps [here](/docs/setup/README.md). In the helm command, make sure to include the flag `--set global.featureGates.Druid=true` to enable the **Druid CRD** and `--set global.featureGates.ZooKeeper=true` to enable the **ZooKeeper CRD**, since Druid depends on ZooKeeper as an external dependency.

- To keep things isolated, this tutorial uses a separate namespace called `demo` throughout.

@@ -48,6 +48,57 @@ standard (default) rancher.io/local-path Delete WaitForFirstConsume

Here, we have `standard` StorageClass in our cluster from [Local Path Provisioner](https://github.com/rancher/local-path-provisioner).

Before deploying the `Druid` cluster, we need to prepare its external dependencies.

## Create External Dependency (Deep Storage)

Before proceeding further, we need to prepare deep storage, which is one of the external dependencies of Druid and is used for storing the segments. It is a storage mechanism that Apache Druid does not provide itself. **Amazon S3**, **Google Cloud Storage**, **Azure Blob Storage**, **S3-compatible storage** (like **MinIO**), or **HDFS** are generally convenient options for deep storage.

In this tutorial, we will run a `minio-server` as deep storage in our local `kind` cluster using `minio-operator` and create a bucket named `druid` in it, which the deployed Druid database will use.

```bash

$ helm repo add minio https://operator.min.io/
$ helm repo update minio
$ helm upgrade --install --namespace "minio-operator" --create-namespace "minio-operator" minio/operator --set operator.replicaCount=1

$ helm upgrade --install --namespace "demo" --create-namespace druid-minio minio/tenant \
--set tenant.pools[0].servers=1 \
--set tenant.pools[0].volumesPerServer=1 \
--set tenant.pools[0].size=1Gi \
--set tenant.certificate.requestAutoCert=false \
--set tenant.buckets[0].name="druid" \
--set tenant.pools[0].name="default"

```

Now we need to create a `Secret` named `deep-storage-config`. It contains the connection information that the Druid database will use to connect to the deep storage.

```yaml
apiVersion: v1
kind: Secret
metadata:
name: deep-storage-config
namespace: demo
stringData:
druid.storage.type: "s3"
druid.storage.bucket: "druid"
druid.storage.baseKey: "druid/segments"
druid.s3.accessKey: "minio"
druid.s3.secretKey: "minio123"
druid.s3.protocol: "http"
druid.s3.enablePathStyleAccess: "true"
druid.s3.endpoint.signingRegion: "us-east-1"
druid.s3.endpoint.url: "http://myminio-hl.demo.svc.cluster.local:9000/"
```
Let’s create the `deep-storage-config` Secret shown above:

```bash
$ kubectl create -f https://github.com/kubedb/docs/raw/{{< param "info.version" >}}/docs/guides/druid/backup/application-level/examples/deep-storage-config.yaml
secret/deep-storage-config created
```
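One way to verify the stored values, sketched below, is to decode a single key with `jsonpath`; note that the dots in the key name have to be escaped:

```bash
# Decode druid.storage.type from the Secret; it should print "s3".
$ kubectl get secret deep-storage-config -n demo \
    -o jsonpath='{.data.druid\.storage\.type}' | base64 -d
s3
```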

## Use Custom Configuration

Say we want to change some of the default configurations of the Druid middleManagers. Let's create the `middleManagers.properties` file with our desired configurations, as sketched below.
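As a rough sketch of the flow this section describes, writing the properties to a file and packaging them into a Secret could look like the following; the property value and the Secret name here are illustrative assumptions, not necessarily the ones used later in this guide:

```bash
# Write the desired middleManagers settings to a local properties file.
# druid.worker.capacity is a standard middleManager property; the value is an example.
$ cat <<EOF > middleManagers.properties
druid.worker.capacity=2
EOF

# Package the file into a Secret (illustrative name) so it can be referenced
# from the Druid custom resource.
$ kubectl create secret generic druid-custom-config -n demo \
    --from-file=./middleManagers.properties
```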
10 changes: 10 additions & 0 deletions docs/guides/druid/scaling/_index.md
@@ -0,0 +1,10 @@
---
title: Scaling Druid
menu:
docs_{{ .version }}:
identifier: guides-druid-scaling
name: Scaling
parent: guides-druid
weight: 70
menu_name: docs_{{ .version }}
---
10 changes: 10 additions & 0 deletions docs/guides/druid/scaling/horizontal-scaling/_index.md
@@ -0,0 +1,10 @@
---
title: Horizontal Scaling
menu:
docs_{{ .version }}:
identifier: guides-druid-scaling-horizontal-scaling
name: Horizontal Scaling
parent: guides-druid-scaling
weight: 10
menu_name: docs_{{ .version }}
---