feat: add monitoring stack with prometheus and grafana #14

Merged · 2 commits · Jan 17, 2024

Changes from all commits
2 changes: 1 addition & 1 deletion .github/workflows/tf-helm-validate.yml
@@ -32,4 +32,4 @@ jobs:

- name: Terraform Validate
id: validate
- run: terraform validate
+ run: terraform validate -json
66 changes: 65 additions & 1 deletion README.md
@@ -97,6 +97,70 @@ istioctl analyze

> **NOTE**: Add the `sidecar.istio.io/inject: "false"` annotation to the metadata section of the pod template. This will prevent the Istio sidecar from being injected into that specific pod.

## Monitoring Stack

To set up a monitoring stack, we will use [Prometheus](https://prometheus.io/) and [Grafana](https://grafana.com/).
Instead of installing separate Helm charts for each application, we will use the [kube-prometheus-stack](https://github.com/prometheus-community/helm-charts/blob/main/charts/kube-prometheus-stack/README.md) chart developed by the Prometheus community, which includes the Grafana chart as a dependency.

> NOTE: This chart was formerly named `prometheus-operator`; it was renamed to reflect more clearly that it installs the kube-prometheus project stack, in which the Prometheus Operator is only one component.

### Working with kube-prometheus-stack

1. Get the Helm repository information

```bash
helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
helm repo update
```

2. By default, this chart installs additional dependent charts:

- [prometheus-community/kube-state-metrics](https://github.com/prometheus-community/helm-charts/tree/main/charts/kube-state-metrics)
- [prometheus-community/prometheus-node-exporter](https://github.com/prometheus-community/helm-charts/tree/main/charts/prometheus-node-exporter)
- [grafana/grafana](https://github.com/grafana/helm-charts/tree/main/charts/grafana)

> NOTE: To disable dependencies during installation, see [multiple releases](https://github.com/prometheus-community/helm-charts/blob/main/charts/kube-prometheus-stack/README.md#multiple-releases); a values sketch for this appears after this list.

3. To configure the kube-prometheus-stack helm chart, refer to the [documentation](https://github.com/prometheus-community/helm-charts/blob/main/charts/kube-prometheus-stack/README.md#configuration). To see the default values, use the command:

```bash
helm show values prometheus-community/kube-prometheus-stack
```
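
To keep your configuration under version control, you can save these defaults into the values file that `example.tfvars` references and trim from there. A minimal sketch, assuming you run it from the `root/` directory:

```bash
# Save the chart defaults as a starting point for your own values file
helm show values prometheus-community/kube-prometheus-stack > kube_prometheus_values.yaml
```

Likewise, the dependent charts from step 2 can be toggled off in that file. A hedged sketch; the keys below follow the chart's default values, so verify them against the `helm show values` output:

```yaml
# Disable dependent charts you do not need (verify key names against
# the chart's default values)
kubeStateMetrics:
  enabled: false
nodeExporter:
  enabled: false
grafana:
  enabled: false
```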

> [!IMPORTANT]
> [Workaround for known issues on GKE](https://github.com/prometheus-community/helm-charts/blob/main/charts/kube-prometheus-stack/README.md#running-on-private-gke-clusters)
> When Google configures the control plane for private clusters, it automatically configures VPC peering between your Kubernetes cluster's network and a separate Google-managed project. To restrict what Google can access within your cluster, the configured firewall rules restrict access to your Kubernetes pods. This means that in order to use the webhook component with a GKE private cluster, you must configure an additional firewall rule to allow the GKE control plane access to your webhook pod.
> You can read more about adding firewall rules for the GKE control plane nodes in the [GKE docs](https://cloud.google.com/kubernetes-engine/docs/how-to/private-clusters#add_firewall_rules).
> Alternatively, you can disable the hooks by setting `prometheusOperator.admissionWebhooks.enabled=false`.
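
Since this repository installs the chart through Terraform (see the `kube_prometheus` module below), that flag is most naturally set in the values file rather than on the Helm CLI. A minimal sketch:

```yaml
# Workaround for private GKE clusters: skip the admission webhooks
# so the control plane does not need firewall access to the webhook pod
prometheusOperator:
  admissionWebhooks:
    enabled: false
```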

## Configuring the chart values

For chart-specific `values.yaml` files, refer to each chart's documentation and create the respective `values.yaml` based on the dummy `values.yaml` file. You can also use the `example.*.yaml` files in the `root/` directory to view example values for each chart.
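
One hypothetical way to seed the local values files from the provided examples, using the file names that `example.tfvars` expects:

```bash
cd root/
# Copied files are gitignored (*values.yaml), so local settings stay out of the repo
cp example.prometheus_grafana.yaml kube_prometheus_values.yaml
cp example.infra.yaml infra_values.yaml
```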

## Infrastructure Setup

Once we have all our chart `values.yaml` files configured, we can apply our Terraform configuration to install the Helm charts to our Kubernetes cluster.

- Initialize Terraform

```bash
terraform init
```

- Validate the Terraform infrastructure configuration as code

```bash
terraform validate -json
```

- Plan the infrastructure setup

```bash
terraform plan -var-file="prod.tfvars"
```

- Apply the configuration to the Kubernetes cluster after verifying the plan from the previous steps

```bash
terraform apply --auto-approve -var-file="prod.tfvars"
```
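
Once the apply finishes, it is worth sanity-checking the release. A minimal sketch; the Grafana service name assumes the chart's default `<release>-grafana` naming for the `kube-prometheus-stack` release in the `prometheus` namespace:

```bash
# Verify that the monitoring pods come up
kubectl get pods -n prometheus

# Reach the Grafana UI locally at http://localhost:3000
kubectl port-forward svc/kube-prometheus-stack-grafana 3000:80 -n prometheus
```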
Binary file removed modules/charts/webapp-helm-chart-1.8.2.tar.gz
Binary file added modules/charts/webapp-helm-chart-1.8.3.tar.gz
12 changes: 12 additions & 0 deletions modules/kube_prometheus/main.tf
@@ -0,0 +1,12 @@
resource "helm_release" "kube_prometheus_chart" {
name = "kube-prometheus-stack"
namespace = "prometheus"
create_namespace = true
repository = "https://prometheus-community.github.io/helm-charts"
chart = "kube-prometheus-stack"
timeout = var.timeout
cleanup_on_fail = true
force_update = false
wait = false
values = ["${file(var.kube_prometheus_values_file)}"]
}
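
A hypothetical root-module invocation for this module; the `source` path and variable wiring are assumptions based on the repository layout and `example.tfvars`:

```hcl
# Hypothetical root/main.tf excerpt wiring the module to the tfvars entries
module "kube_prometheus" {
  source                      = "../modules/kube_prometheus"
  timeout                     = var.timeout
  kube_prometheus_values_file = var.kube_prometheus_values_file
}
```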
2 changes: 2 additions & 0 deletions modules/kube_prometheus/variables.tf
@@ -0,0 +1,2 @@
variable "timeout" {}
variable "kube_prometheus_values_file" {}
9 changes: 9 additions & 0 deletions modules/namespace/main.tf
@@ -31,3 +31,12 @@ resource "kubernetes_namespace" "istio_ingress" {
name = "istio-ingress"
}
}

resource "kubernetes_namespace" "prometheus" {
metadata {
# labels = {
# istio-injection = "enabled"
# }
name = "prometheus"
}
}
4 changes: 2 additions & 2 deletions root/.gitignore
@@ -32,6 +32,6 @@ override.tf.json
# example: *tfplan*

# variables
dev.tfvars
prod.tfvars
*.tfvars
*values.yaml
!example*
113 changes: 113 additions & 0 deletions root/example.infra.yaml
@@ -0,0 +1,113 @@
replicaCount: 1
image:
  repository: quay.io/pwncorp
  name: consumer
  tag: 1.1.3
  pullPolicy: Always
initContainer:
  repository: quay.io/pwncorp
  name: initconsumer
  tag: 1.0.1
  pullPolicy: Always
imagePullSecrets:
  type: kubernetes.io/dockerconfigjson
  dockerConfig: b2theW1yaGFja2VyCg==

namespace: deps
deployStrat:
  rolling: RollingUpdate
  maxSurge: 1
  maxUnavailable: 0
progressDeadlineSeconds: 120
minReadySeconds: 30
configs:
  kafka_port: "9094"
  client_id: webapp
  topic: healthcheck
  db: consumer
  dbport: "5432"
  app_dbschema: app
secret:
  type: Opaque
  username: consumer_user
  password: consumer@pswd
podLabel:
  app: consumer
service:
  type: ClusterIP
  port: 80
resources:
  limits:
    memory: 512Mi
    cpu: "0.8"
  requests:
    memory: 128Mi
    cpu: "0.4"
psql:
  enabled: true
postgresql:
  image:
    tag: 15.5.0-debian-11-r5
  auth:
    username: consumer_user
    password: consumer@pswd
    database: consumer
  primary:
    persistence:
      size: 1Gi
    labels:
      app: consumer-db
    podLabels:
      app: consumer-db
    resources:
      limits:
        memory: 1024Mi
        cpu: "1"
      requests:
        memory: 512Mi
        cpu: "0.5"
kafka:
  listeners:
    client:
      protocol: PLAINTEXT
    controller:
      protocol: PLAINTEXT
    interbroker:
      protocol: PLAINTEXT
    external:
      protocol: PLAINTEXT
  controller:
    replicaCount: 0
  broker:
    replicaCount: 3
    persistence:
      size: 1Gi
    resources:
      limits:
        memory: 1024Mi
        cpu: "1"
      requests:
        memory: 512Mi
        cpu: "0.5"
  serviceAccount:
    create: false
  provisioning:
    enabled: true
    numPartitions: 3
    replicationFactor: 1
    podAnnotations:
      sidecar.istio.io/inject: "false"
    topics:
      - name: healthcheck
        partitions: 3
        replicationFactor: 1
        config:
          max.message.bytes: 64000
          flush.messages: 1
  kraft:
    enabled: false
  zookeeper:
    enabled: true
    persistence:
      size: 1Gi
66 changes: 66 additions & 0 deletions root/example.prometheus_grafana.yaml
@@ -0,0 +1,66 @@
## Default values for kube-prometheus helm chart: `helm show values prometheus-community/kube-prometheus-stack`

## Using default values from https://github.com/grafana/helm-charts/blob/main/charts/grafana/values.yaml
##
grafana:
  enabled: true

  ## Deploy default dashboards
  ##
  defaultDashboardsEnabled: true

  ## Timezone for the default dashboards
  ## Other options are: browser or a specific timezone, i.e. Europe/Luxembourg
  ##
  defaultDashboardsTimezone: utc

  ## Editable flag for the default dashboards
  ##
  defaultDashboardsEditable: true

  adminPassword: prom-operator

  ## Passed to grafana subchart and used by servicemonitor below
  ##
  service:
    portName: http-web
    type: LoadBalancer

## Deploy a Prometheus instance
##
prometheus:
  enabled: true

  ## Configuration for Prometheus service
  ##
  service:
    ## Port for Prometheus Service to listen on
    ##
    port: 9090

    ## To be used with a proxy extraContainer port
    targetPort: 9090

    ## List of IP addresses at which the Prometheus server service is available
    ## Ref: https://kubernetes.io/docs/user-guide/services/#external-ips
    ##
    externalIPs: []

    ## Port to expose on each node
    ## Only used if service.type is 'NodePort'
    ##
    nodePort: 30090

    ## Loadbalancer IP
    ## Only use if service.type is "LoadBalancer"
    loadBalancerIP: ""
    loadBalancerSourceRanges: []

    ## Denotes if this Service desires to route external traffic to node-local or cluster-wide endpoints
    ##
    externalTrafficPolicy: Cluster

    ## Service type
    ##
    type: LoadBalancer
13 changes: 7 additions & 6 deletions root/example.tfvars
@@ -1,6 +1,7 @@
-timeout = 600
-infra_values_file = "./infra_values.yaml"
-webapp_values_file = "./webapp_values.yaml"
-chart_path = "../modules/charts"
-webapp_chart = "webapp-helm-chart-1.1.3.tar.gz"
-infra_chart = "infra-helm-chart-1.4.0.tar.gz"
+timeout                     = 600
+infra_values_file           = "./infra_values.yaml"
+webapp_values_file          = "./webapp_values.yaml"
+kube_prometheus_values_file = "./kube_prometheus_values.yaml"
+chart_path                  = "../modules/charts"
+webapp_chart                = "webapp-helm-chart-1.1.3.tar.gz"
+infra_chart                 = "infra-helm-chart-1.4.0.tar.gz"