From a6b91e093ab690b3927038b15714e4fd852491ec Mon Sep 17 00:00:00 2001 From: Heitor Tashiro Sergent Date: Wed, 24 Jul 2024 16:12:08 -0500 Subject: [PATCH] Add k6-operator docs to Set up (#1559) * chore: add basic structure for k6-operator docs * chore: hide reference for now * k6-operator: add main sections from the repo Readme.md * Add a short introduction to the index page * Update install-k6-operator.md * Update upgrade-k6-operator.md * Update troubleshooting.md * Update executing-k6-scripts-with-testrun-crd.md * Update extensions.md * Update extensions.md * Update common-options.md * Update scheduling-tests.md * k6-operator: fix typos * k6-operator: add content for troubleshooting.md * chore: replaces instances of k6-operator with k6 Operator * chore: add uninstall instructions * chore: hide Upgrade k6 Operator page * chore: add Use the k6 operator with Grafana Cloud k6 page * chore: review troubleshooting doc * chore: update Namespaced deployment heading to Watch namespace * Apply suggestions from code review Co-authored-by: Olha Yevtushenko * Move docs to next and v0.52.x folders * Remove docs from v0.50.x folder --------- Co-authored-by: Olha Yevtushenko Co-authored-by: Olha Yevtushenko --- .../set-up/set-up-distributed-k6/_index.md | 19 ++ .../install-k6-operator.md | 114 ++++++++ .../set-up-distributed-k6/troubleshooting.md | 260 ++++++++++++++++++ .../upgrade-k6-operator.md | 10 + .../set-up-distributed-k6/usage/_index.md | 10 + .../usage/common-options.md | 57 ++++ .../executing-k6-scripts-with-testrun-crd.md | 219 +++++++++++++++ .../set-up-distributed-k6/usage/extensions.md | 61 ++++ .../usage/k6-operator-to-gck6.md | 73 +++++ .../set-up-distributed-k6/usage/reference.md | 12 + .../usage/scheduling-tests.md | 106 +++++++ .../set-up/set-up-distributed-k6/_index.md | 19 ++ .../install-k6-operator.md | 114 ++++++++ .../set-up-distributed-k6/troubleshooting.md | 260 ++++++++++++++++++ .../upgrade-k6-operator.md | 10 + 
.../set-up-distributed-k6/usage/_index.md | 10 + .../usage/common-options.md | 57 ++++ .../executing-k6-scripts-with-testrun-crd.md | 219 +++++++++++++++ .../set-up-distributed-k6/usage/extensions.md | 61 ++++ .../usage/k6-operator-to-gck6.md | 73 +++++ .../set-up-distributed-k6/usage/reference.md | 12 + .../usage/scheduling-tests.md | 106 +++++++ 22 files changed, 1882 insertions(+) create mode 100644 docs/sources/next/set-up/set-up-distributed-k6/_index.md create mode 100644 docs/sources/next/set-up/set-up-distributed-k6/install-k6-operator.md create mode 100644 docs/sources/next/set-up/set-up-distributed-k6/troubleshooting.md create mode 100644 docs/sources/next/set-up/set-up-distributed-k6/upgrade-k6-operator.md create mode 100644 docs/sources/next/set-up/set-up-distributed-k6/usage/_index.md create mode 100644 docs/sources/next/set-up/set-up-distributed-k6/usage/common-options.md create mode 100644 docs/sources/next/set-up/set-up-distributed-k6/usage/executing-k6-scripts-with-testrun-crd.md create mode 100644 docs/sources/next/set-up/set-up-distributed-k6/usage/extensions.md create mode 100644 docs/sources/next/set-up/set-up-distributed-k6/usage/k6-operator-to-gck6.md create mode 100644 docs/sources/next/set-up/set-up-distributed-k6/usage/reference.md create mode 100644 docs/sources/next/set-up/set-up-distributed-k6/usage/scheduling-tests.md create mode 100644 docs/sources/v0.52.x/set-up/set-up-distributed-k6/_index.md create mode 100644 docs/sources/v0.52.x/set-up/set-up-distributed-k6/install-k6-operator.md create mode 100644 docs/sources/v0.52.x/set-up/set-up-distributed-k6/troubleshooting.md create mode 100644 docs/sources/v0.52.x/set-up/set-up-distributed-k6/upgrade-k6-operator.md create mode 100644 docs/sources/v0.52.x/set-up/set-up-distributed-k6/usage/_index.md create mode 100644 docs/sources/v0.52.x/set-up/set-up-distributed-k6/usage/common-options.md create mode 100644 
docs/sources/v0.52.x/set-up/set-up-distributed-k6/usage/executing-k6-scripts-with-testrun-crd.md create mode 100644 docs/sources/v0.52.x/set-up/set-up-distributed-k6/usage/extensions.md create mode 100644 docs/sources/v0.52.x/set-up/set-up-distributed-k6/usage/k6-operator-to-gck6.md create mode 100644 docs/sources/v0.52.x/set-up/set-up-distributed-k6/usage/reference.md create mode 100644 docs/sources/v0.52.x/set-up/set-up-distributed-k6/usage/scheduling-tests.md diff --git a/docs/sources/next/set-up/set-up-distributed-k6/_index.md b/docs/sources/next/set-up/set-up-distributed-k6/_index.md new file mode 100644 index 0000000000..06687090b0 --- /dev/null +++ b/docs/sources/next/set-up/set-up-distributed-k6/_index.md @@ -0,0 +1,19 @@ +--- +weight: 150 +title: Set up distributed k6 +--- + +# Set up distributed k6 + +It's possible to run large load tests even when using a single node, or single machine. But, depending on your use case, you might also want to run a distributed Grafana k6 test in your own infrastructure. + +A couple of reasons why you might want to do this: + +- You run your application in Kubernetes and would like k6 to be executed in the same fashion as all your other infrastructure components. +- You want to run your tests within your private network for security and/or privacy reasons. + +[k6 Operator](https://github.com/grafana/k6-operator) is a Kubernetes operator that you can use to run distributed k6 tests in your cluster. 
+ +This section includes the following topics: + +{{< section depth=2 >}} diff --git a/docs/sources/next/set-up/set-up-distributed-k6/install-k6-operator.md b/docs/sources/next/set-up/set-up-distributed-k6/install-k6-operator.md new file mode 100644 index 0000000000..315ec003ca --- /dev/null +++ b/docs/sources/next/set-up/set-up-distributed-k6/install-k6-operator.md @@ -0,0 +1,114 @@ +--- +weight: 100 +title: Install k6 Operator +--- + +# Install k6 Operator + +This guide provides step-by-step instructions on how to install k6 Operator. + +## Before you begin + +To install k6 Operator, you'll need: + +- A Kubernetes cluster, along with access to it. +- [kubectl](https://kubernetes.io/docs/tasks/tools/#kubectl). + +## Deploy the operator + +There are three options that you can use to deploy the k6 Operator. + +### Deploy with bundle + +The easiest way to install the operator is with the bundle: + +```bash +curl https://raw.githubusercontent.com/grafana/k6-operator/main/bundle.yaml | kubectl apply -f - +``` + +The bundle includes default manifests for k6 Operator, including a `k6-operator-system` namespace and a k6 Operator deployment with the latest tagged Docker image. Customizations can be made on top of this manifest as needed, for example, with `kustomize`. + +### Deploy with Helm + +Helm releases of k6 Operator are published together with other Grafana Helm charts. You can install it with the following commands: + +```bash +helm repo add grafana https://grafana.github.io/helm-charts +helm repo update +helm install k6-operator grafana/k6-operator +``` + +You can also pass additional configuration options with a `values.yaml` file: + +```bash +helm install k6-operator grafana/k6-operator -f values.yaml +``` + +Refer to the [k6 Operator samples folder](https://github.com/grafana/k6-operator/blob/main/charts/k6-operator/samples/customAnnotationsAndLabels.yaml) for an example file. 
+ +You can find a complete list of Helm options in the [k6 Operator charts folder](https://github.com/grafana/k6-operator/blob/main/charts/k6-operator/README.md). + +### Deploy with Makefile + +In order to install the operator with a Makefile, you'll need: + +- [go](https://go.dev/doc/install) +- [kustomize](https://kubectl.docs.kubernetes.io/installation/kustomize/) + +A more manual, low-level way to install the k6 operator is by running the command below: + +```bash +make deploy +``` + +This method may be more useful for development of the k6 Operator, depending on specifics of the setup. + +## Install the CRD + +The k6 Operator includes custom resources called `TestRun`, `PrivateLoadZone`, and `K6`. They're automatically installed when you do a deployment or install a bundle, but you can also manually install them by running: + +```bash +make install +``` + +{{< admonition type="warning" >}} + +The `K6` CRD has been replaced by the `TestRun` CRD and will be deprecated in the future. We recommend using the `TestRun` CRD. + +{{< /admonition >}} + +## Watch namespace + +By default, the k6 Operator watches the `TestRun` and `PrivateLoadZone` custom resources in all namespaces. You can also configure the k6 Operator to watch a specific namespace by setting the `WATCH_NAMESPACE` environment variable for the operator's deployment: + +```yaml +apiVersion: apps/v1 +kind: Deployment +metadata: + name: k6-operator-controller-manager + namespace: k6-operator-system +spec: + template: + spec: + containers: + - name: manager + image: ghcr.io/grafana/k6-operator:controller-v0.0.14 + env: + - name: WATCH_NAMESPACE + value: 'some-ns' +# ... 
+``` + +## Uninstall k6 Operator + +You can remove all of the resources created by the k6 Operator with `bundle`: + +```bash +curl https://raw.githubusercontent.com/grafana/k6-operator/main/bundle.yaml | kubectl delete -f - +``` + +Or with the `make` command: + +```bash +make delete +``` diff --git a/docs/sources/next/set-up/set-up-distributed-k6/troubleshooting.md b/docs/sources/next/set-up/set-up-distributed-k6/troubleshooting.md new file mode 100644 index 0000000000..a8bf60698b --- /dev/null +++ b/docs/sources/next/set-up/set-up-distributed-k6/troubleshooting.md @@ -0,0 +1,260 @@ +--- +weight: 400 +title: Troubleshooting +--- + +# Troubleshooting + +This topic includes instructions to help you troubleshoot common issues with the k6 Operator. + +## Common tricks + +### Test your script locally + +Always run your script locally before trying to run it with the k6 Operator: + +```bash +k6 run script.js +``` + +If you're using environment variables or CLI options, pass them in as well: + +```bash +MY_ENV_VAR=foo k6 run script.js --tag my_tag=bar +``` + +That ensures that the script has correct syntax and can be parsed with k6 in the first place. Additionally, running locally can help you check if the configured options are doing what you expect. If there are any errors or unexpected results in the output of `k6 run`, make sure to fix those prior to deploying the script elsewhere. + +### `TestRun` deployment + +#### The pods + +In case of one `TestRun` Custom Resource (CR) creation with `parallelism: n`, there are certain repeating patterns: + +1. There will be `n + 2` Jobs (with corresponding Pods) created: initializer, starter, and `n` runners. +1. If any of these Jobs didn't result in a Pod being deployed, there must be an issue with that Job. Some commands that can help here: + + ```bash + kubectl get jobs -A + kubectl describe job mytest-initializer + ``` + +1. 
If one of the Pods was deployed but finished with `Error`, you can check its logs with the following command: + + ```bash + kubectl logs mytest-initializer-xxxxx + ``` + +If the Pods seem to be working but not producing an expected result and there's not enough information in the logs, you can use the k6 [verbose option](https://grafana.com/docs/k6//using-k6/k6-options/#options) in the `TestRun` spec: + +```yaml +apiVersion: k6.io/v1alpha1 +kind: TestRun +metadata: + name: k6-sample +spec: + parallelism: 2 + script: + configMap: + name: 'test' + file: 'test.js' + arguments: --verbose +``` + +#### k6 Operator + +Another source of info is the k6 Operator itself. It's deployed as a Kubernetes `Deployment`, with `replicas: 1` by default, and its logs together with observations about the Pods from the previous section usually contain enough information to help you diagnose any issues. With the standard deployment, the logs of the k6 Operator can be checked with: + +```bash +kubectl -n k6-operator-system -c manager logs k6-operator-controller-manager-xxxxxxxx-xxxxx +``` + +#### Inspect `TestRun` resource + +After you deploy a `TestRun` CR, you can inspect it the same way as any other resource: + +```bash +kubectl describe testrun my-testrun +``` + +Firstly, check if the spec is as expected. Then, see the current status: + +```yaml +Status: + Conditions: + Last Transition Time: 2024-01-17T10:30:01Z + Message: + Reason: CloudTestRunFalse + Status: False + Type: CloudTestRun + Last Transition Time: 2024-01-17T10:29:58Z + Message: + Reason: TestRunPreparation + Status: Unknown + Type: TestRunRunning + Last Transition Time: 2024-01-17T10:29:58Z + Message: + Reason: CloudTestRunAbortedFalse + Status: False + Type: CloudTestRunAborted + Last Transition Time: 2024-01-17T10:29:58Z + Message: + Reason: CloudPLZTestRunFalse + Status: False + Type: CloudPLZTestRun + Stage: error +``` + +If `Stage` is equal to `error`, you can check the logs of k6 Operator. 
+ +Conditions can be used as a source of info as well, but it's a more advanced troubleshooting option that should be used if the previous steps weren't enough to diagnose the issue. Note that conditions that start with the `Cloud` prefix only matter in the setting of k6 Cloud test runs, for example, for cloud output and PLZ test runs. + +### `PrivateLoadZone` deployment + +If the `PrivateLoadZone` CR was successfully created in Kubernetes, it should become visible in your account in Grafana Cloud k6 (GCk6) interface soon afterwards. If it doesn't appear in the UI, then there is likely a problem to troubleshoot. + +First, go over the [guide](https://grafana.com/docs/grafana-cloud/k6/author-run/private-load-zone-v2/) to double-check if all the steps have been done correctly and successfully. + +Unlike `TestRun` deployment, when a `PrivateLoadZone` is first created, there are no additional resources deployed. So, the only source for troubleshooting are the logs of k6 Operator. See the [previous subsection](#k6-operator) on how to access its logs. Any errors there might be a hint to diagnose the issue. Refer to [PrivateLoadZone: subscription error](#privateloadzone-subscription-error) for more details. + +### Running tests in `PrivateLoadZone` + +Each time a user runs a test in a PLZ, for example with `k6 cloud script.js`, there is a corresponding `TestRun` being deployed by the k6 Operator. This `TestRun` will be deployed in the same namespace as its `PrivateLoadZone`. If the test is misbehaving, for example, it errors out, or doesn't produce the expected result, then you can check: + +1. If there are any messages in the GCk6 UI. +2. If there are any messages in the output of the `k6 cloud` command. +3. 
The resources and their logs, the same way as with a [standalone `TestRun` deployment](#testrun-deployment). + +## Common scenarios + +### Issues with environment variables + +Refer to [Environment variables](https://github.com/grafana/k6-operator/blob/main/docs/env-vars.md) for details on how to pass environment variables to the k6 Operator. + +### Tags not working + +Tags are a rather common source of errors when using the k6 Operator. For example, the following tags would lead to parsing errors: + +```yaml + arguments: --tag product_id="Test A" + # or + arguments: --tag foo=\"bar\" +``` + +You can see those errors in the logs of either the initializer or the runner Pod, for example: + +```bash +time="2024-01-11T11:11:27Z" level=error msg="invalid argument \"product_id=\\\"Test\" for \"--tag\" flag: parse error on line 1, column 12: bare \" in non-quoted-field" +``` + +This is a common problem with character escaping. You can find an [issue](https://github.com/grafana/k6-operator/issues/211) in the k6 Operator repository that can be upvoted. + +### Initializer logs an error but it's not about tags + +This can happen if you skipped the preparation step described in [Test your script locally](#test-your-script-locally). One command that you can use to help diagnose issues with your script is the following: + +```bash +k6 inspect --execution-requirements script.js +``` + +That command is a shortened version of what the initializer Pod is executing. If the command produces an error, there's a problem with the script itself and it should be solved outside of the k6 Operator. The error itself may contain a hint to what's wrong, such as a syntax error. + +If the standalone `k6 inspect --execution-requirements` executes successfully, then it's likely a problem with `TestRun` deployment specific to your Kubernetes setup. A couple of recommendations here are: + +- Review the output of the initializer Pod: is it logged by the k6 process or by something else? 
+ - :information_source: k6 Operator expects the initializer logs to contain only the output of `k6 inspect`. If there are any other log lines present, then the k6 Operator will fail to parse it and the test won't start. Refer to this [issue](https://github.com/grafana/k6-operator/issues/193) for more details. +- Check events in the initializer Job and Pod as they may contain another hint about what's wrong. + +### Non-existent ServiceAccount + +A ServiceAccount can be defined as `serviceAccountName` in a PrivateLoadZone, and as `runner.serviceAccountName` in a TestRun CRD. If the specified ServiceAccount doesn't exist, k6 Operator will successfully create Jobs but corresponding Pods will fail to be deployed, and the k6 Operator will wait indefinitely for Pods to be `Ready`. This error can be best seen in the events of the Job: + +```bash +kubectl describe job plz-test-xxxxxx-initializer +... +Events: + Warning FailedCreate 57s (x4 over 2m7s) job-controller Error creating: pods "plz-test-xxxxxx-initializer-" is forbidden: error looking up service account plz-ns/plz-sa: serviceaccount "plz-sa" not found +``` + +k6 Operator doesn't try to analyze such scenarios on its own, but you can refer to the following [issue](https://github.com/grafana/k6-operator/issues/260) for improvements. + +#### How to fix + +To fix this issue, the incorrect `serviceAccountName` must be corrected, and the TestRun or PrivateLoadZone resource must be re-deployed. + +### Non-existent `nodeSelector` + +`nodeSelector` can be defined as `nodeSelector` in a PrivateLoadZone, and as `runner.nodeSelector` in the TestRun CRD. + +This case is very similar to the [ServiceAccount](#non-existent-serviceaccount): the Pod creation will fail, but the error is slightly different: + +```bash +kubectl describe pod plz-test-xxxxxx-initializer-xxxxx +... +Events: + Warning FailedScheduling 48s (x5 over 4m6s) default-scheduler 0/1 nodes are available: 1 node(s) didn't match Pod's node affinity/selector. 
+``` + +#### How to fix + +To fix this issue, the incorrect `nodeSelector` must be corrected and the TestRun or PrivateLoadZone resource must be re-deployed. + +### Insufficient resources + +A related problem can happen when the cluster does not have sufficient resources to deploy the runners. There's a higher probability of hitting this issue when setting small CPU and memory limits for runners or using options like `nodeSelector`, `runner.affinity` or `runner.topologySpreadConstraints`, and not having a set of nodes matching the spec. Alternatively, it can happen if there is a high number of runners required for the test (via `parallelism` in TestRun or during PLZ test run) and autoscaling of the cluster has limits on the maximum number of nodes, and can't provide the required resources on time or at all. + +This case is somewhat similar to the previous two: the k6 Operator will wait indefinitely and can be monitored with events in Jobs and Pods. If it's possible to fix the issue with insufficient resources on-the-fly, for example, by adding more nodes, k6 Operator will attempt to continue executing a test run. + +### OOM of a runner Pod + +If there's at least one runner Pod that OOM-ed, the whole test will be [stuck](https://github.com/grafana/k6-operator/issues/251) and will have to be deleted manually: + +```bash +kubectl -f my-test.yaml delete +# or +kubectl delete testrun my-test +``` + +In case of OOM, it makes sense to review the k6 script to understand what kind of resource usage this script requires. It may be that the k6 script can be improved to be more performant. Then, set the `spec.runner.resources` in the TestRun CRD, or `spec.resources` in the PrivateLoadZone CRD accordingly. + +### PrivateLoadZone: subscription error + +If there's an issue with your Grafana Cloud k6 subscription, there will be a 400 error in the logs with the message detailing the problem. 
For example: + +```bash +"Received error `(400) You have reached the maximum Number of private load zones your organization is allowed to have. Please contact support if you want to create more.`. Message from server ``" +``` + +To fix this issue, check your organization settings in Grafana Cloud k6 or contact Support. + +### PrivateLoadZone: Wrong token + +There can be two major problems with the authentication token: + +1. If the token wasn't created, or was created in a wrong location, the logs will show the following error: + + ```bash + Failed to load k6 Cloud token {"namespace": "plz-ns", "name": "my-plz", "reconcileID": "67c8bc73-f45b-4c7f-a9ad-4fd0ffb4d5f6", "name": "token-with-wrong-name", "secretNamespace": "plz-ns", "error": "Secret \"token-with-wrong-name\" not found"} + ``` + +2. If the token contains a corrupted value, or it's not an organizational token, the logs will show the following error: + + ```bash + "Received error `(403) Authentication token incorrect or expired`. Message from server ``" + ``` + +### PrivateLoadZone: Networking setup + +If you see any dial or connection errors in the logs of the k6 Operator, it makes sense to double-check the networking setup. For a PrivateLoadZone to operate, outbound traffic to Grafana Cloud k6 [must be allowed](https://grafana.com/docs/grafana-cloud/k6/author-run/private-load-zone-v2/#before-you-begin). To check the reachability of Grafana Cloud k6 endpoints: + +```bash +kubectl apply -f https://k8s.io/examples/admin/dns/dnsutils.yaml +kubectl exec -it dnsutils -- nslookup ingest.k6.io +kubectl exec -it dnsutils -- nslookup api.k6.io +``` + +For more resources on troubleshooting networking, refer to the [Kubernetes docs](https://kubernetes.io/docs/tasks/administer-cluster/dns-debugging-resolution/). + +### PrivateLoadZone: Insufficient resources + +The PrivateLoadZone insufficient resources problem is similar to [insufficient resources issue](#insufficient-resources). 
But, when running a PrivateLoadZone test, the k6 Operator will wait only for a timeout period. When the timeout period is up, the test will be aborted by Grafana Cloud k6 and marked as such, both in the PrivateLoadZone and in Grafana Cloud k6. In other words, there is a time limit to fix this issue without restarting the test run. diff --git a/docs/sources/next/set-up/set-up-distributed-k6/upgrade-k6-operator.md b/docs/sources/next/set-up/set-up-distributed-k6/upgrade-k6-operator.md new file mode 100644 index 0000000000..2a46ef392e --- /dev/null +++ b/docs/sources/next/set-up/set-up-distributed-k6/upgrade-k6-operator.md @@ -0,0 +1,10 @@ +--- +weight: 200 +title: Upgrade k6 Operator +_build: + list: false +--- + +# Upgrade k6 Operator + + diff --git a/docs/sources/next/set-up/set-up-distributed-k6/usage/_index.md b/docs/sources/next/set-up/set-up-distributed-k6/usage/_index.md new file mode 100644 index 0000000000..48ddb3b67c --- /dev/null +++ b/docs/sources/next/set-up/set-up-distributed-k6/usage/_index.md @@ -0,0 +1,10 @@ +--- +weight: 300 +title: Usage +--- + +# Usage + +This section includes the following topics: + +{{< section depth=2 >}} diff --git a/docs/sources/next/set-up/set-up-distributed-k6/usage/common-options.md b/docs/sources/next/set-up/set-up-distributed-k6/usage/common-options.md new file mode 100644 index 0000000000..43d95d8625 --- /dev/null +++ b/docs/sources/next/set-up/set-up-distributed-k6/usage/common-options.md @@ -0,0 +1,57 @@ +--- +weight: 300 +title: Common options +--- + +# Common options + + + +The only options that are required as part of the `TestRun` CRD spec are `script` and `parallelism`. This guide covers some of the most common options. + +## Parallelism + +`parallelism` defines how many instances of k6 runners you want to create. Each instance is assigned an equal execution segment. 
For instance, if your test script is configured to run 200 VUs and `parallelism` is set to 4, the k6 Operator creates four k6 jobs, each running 50 VUs to achieve the desired VU count. + +## Separate + +`separate: true` indicates that the jobs created need to be distributed across different nodes. This is useful if you're running a test with a really high VU count and want to make sure the resources of each node won't become a bottleneck. + +## Service account + +If you want to use a custom Service Account you'll need to pass it into both the starter and the runner object: + +```yaml +apiVersion: k6.io/v1alpha1 +kind: TestRun +metadata: + name: +spec: + script: + configMap: + name: '' + runner: + serviceAccountName: + starter: + serviceAccountName: +``` + +## Runner + +Defines options for the test runner pods. The non-exhaustive list includes: + +- Passing resource limits and requests. +- Passing in labels and annotations. +- Passing in affinity and anti-affinity. +- Passing in a custom image. + +## Starter + +Defines options for the starter pod. The non-exhaustive list includes: + +- Passing in a custom image. +- Passing in labels and annotations. + +## Initializer + +By default, the initializer Job is defined with the same options as the runner Jobs, but its options can be overwritten by setting `.spec.initializer`. diff --git a/docs/sources/next/set-up/set-up-distributed-k6/usage/executing-k6-scripts-with-testrun-crd.md b/docs/sources/next/set-up/set-up-distributed-k6/usage/executing-k6-scripts-with-testrun-crd.md new file mode 100644 index 0000000000..64a7c626cb --- /dev/null +++ b/docs/sources/next/set-up/set-up-distributed-k6/usage/executing-k6-scripts-with-testrun-crd.md @@ -0,0 +1,219 @@ +--- +weight: 100 +title: Run k6 scripts with TestRun CRD +--- + +# Run k6 scripts with TestRun CRD + +This guide covers how you can configure your k6 scripts to run using the k6 Operator. 
+ +## Defining test scripts + +There are several ways to configure scripts in the `TestRun` CRD. The operator uses `configMap`, `volumeClaim` and `localFile` to serve test scripts to the jobs. + +### ConfigMap + +The main way to configure a script is to create a `ConfigMap` with the script contents: + +```bash +kubectl create configmap my-test --from-file /path/to/my/test.js +``` + +Then specify it in `TestRun`: + +```yaml + script: + configMap: + name: my-test + file: test.js +``` + +{{< admonition type="note" >}} + +A single `ConfigMap` has a size limit of 1,048,576 bytes (1 MiB). If you need to have a larger test file, you have to use a `volumeClaim` or a `localFile` instead. + +{{< /admonition >}} + +### VolumeClaim + +If you have a PVC with the name `stress-test-volumeClaim` containing your script and any other supporting files, you can pass it to the test like this: + +```yaml +spec: + script: + volumeClaim: + name: 'stress-test-volumeClaim' + # test.js should exist inside the /test/ folder. + # All the js files and directories test.js is importing + # should be inside the same directory as well. + file: 'test.js' +``` + +The pods will expect to find the script files in the `/test/` folder. If `volumeClaim` fails, that's the first place to check. The latest initializer pod doesn't generate any logs, and it exits with an error when it can't find the file. Refer to [this GitHub issue](https://github.com/grafana/k6-operator/issues/143) for potential improvements. 
+ +#### Sample directory structure + +``` +├── test +│ ├── requests +│ │ ├── stress-test.js +│ ├── test.js +``` + +In the preceding example, `test.js` imports a function from `stress-test.js` and these files would look like this: + +```js +// test.js +import stressTest from './requests/stress-test.js'; + +export const options = { + vus: 50, + duration: '10s', +}; + +export default function () { + stressTest(); +} +``` + +```js +// stress-test.js +import { sleep, check } from 'k6'; +import http from 'k6/http'; + +export default () => { + const res = http.get('https://test-api.k6.io'); + check(res, { + 'status is 200': () => res.status === 200, + }); + sleep(1); +}; +``` + +### LocalFile + +If the script is present in the filesystem of a custom runner image, it can be accessed with the `localFile` option: + +```yaml +spec: + parallelism: 4 + script: + localFile: /test/test.js + runner: + image: +``` + +{{< admonition type="note" >}} + +If there is any limitation on the usage of `volumeClaim` in your cluster, you can use the `localFile` option. We recommend using `volumeClaim` if possible. + +{{< /admonition >}} + +### Multi-file tests + +In case your k6 script is split between multiple JavaScript files, you can create a `ConfigMap` with several data entries like this: + +```bash +kubectl create configmap scenarios-test --from-file test.js --from-file utils.js +``` + +If there are too many files to specify manually, using `kubectl` with a folder might be an option as well: + +```bash +kubectl create configmap scenarios-test --from-file=./test +``` + +Alternatively, you can create an archive with k6: + +```bash +k6 archive test.js [args] +``` + +The `k6 archive` command creates an `archive.tar` in your current folder. 
You can then use that file in the `ConfigMap`, the same way as a JavaScript file: + +```bash +kubectl create configmap scenarios-test --from-file=archive.tar +``` + +If you use an archive, you must edit your YAML file for the `TestRun` deployment so that the `file` option is set to the correct entrypoint for the `k6 run` command: + +```yaml +# ... +spec: + script: + configMap: + name: 'scenarios-test' + file: 'archive.tar' # <-- change here +``` + +## Run tests + +Tests are executed by applying the custom resource `TestRun` to a cluster where the k6 Operator is running. Additional optional properties of the `TestRun` CRD allow you to control some key aspects of a distributed execution. For example: + +```yaml +# k6-resource.yml + +apiVersion: k6.io/v1alpha1 +kind: TestRun +metadata: + name: k6-sample +spec: + parallelism: 4 + script: + configMap: + name: k6-test + file: test.js + separate: false + runner: + image: + metadata: + labels: + cool-label: foo + annotations: + cool-annotation: bar + securityContext: + runAsUser: 1000 + runAsGroup: 1000 + runAsNonRoot: true + resources: + limits: + cpu: 200m + memory: 1000Mi + requests: + cpu: 100m + memory: 500Mi + starter: + image: + metadata: + labels: + cool-label: foo + annotations: + cool-annotation: bar + securityContext: + runAsUser: 2000 + runAsGroup: 2000 + runAsNonRoot: true +``` + +A `TestRun` CR is created with this command: + +```bash +kubectl apply -f /path/to/your/k6-resource.yml +``` + +## Clean up resources + +After completing a test run, you need to clean up the test jobs that were created: + +```bash +kubectl delete -f /path/to/your/k6-resource.yml +``` + +Alternatively, you can configure the automatic deletion of all resources with the `cleanup` option: + +```yaml +spec: + cleanup: 'post' +``` + +With the `cleanup` option set, k6 Operator removes the `TestRun` CR and all created resources once the test run ends. 
diff --git a/docs/sources/next/set-up/set-up-distributed-k6/usage/extensions.md b/docs/sources/next/set-up/set-up-distributed-k6/usage/extensions.md new file mode 100644 index 0000000000..e654032755 --- /dev/null +++ b/docs/sources/next/set-up/set-up-distributed-k6/usage/extensions.md @@ -0,0 +1,61 @@ +--- +weight: 200 +title: Use k6 Operator with k6 extensions +--- + +# Use k6 Operator with k6 extensions + +By default, the k6 Operator uses `ghcr.io/grafana/k6-operator:latest-runner` as the container image for the test jobs. + +If you want to use k6 [extensions](https://grafana.com/docs/k6//extensions/) built with [xk6](https://github.com/grafana/xk6), you'll need to create your own image and override the `image` property on the `TestRun` Kubernetes resource. + +For example, this is a `Dockerfile` that builds a k6 binary with the `xk6-output-influxdb` extension: + +```Dockerfile +# Build the k6 binary with the extension +FROM golang:1.20 as builder + +RUN go install go.k6.io/xk6/cmd/xk6@latest + +# For our example, we'll add support for output of test metrics to InfluxDB v2. +# Feel free to add other extensions using the '--with ...'. +RUN xk6 build \ + --with github.com/grafana/xk6-output-influxdb@latest \ + --output /k6 + +# Use the operator's base image and override the k6 binary +FROM grafana/k6:latest +COPY --from=builder /k6 /usr/bin/k6 +``` + +You can build the image based on this `Dockerfile` by executing: + +```bash +docker build -t k6-extended:local . +``` + +After the build completes, you can push the resulting `k6-extended:local` image to an image repository accessible to your Kubernetes cluster. 
+ +You can then use that image as follows: + +```yaml +# k6-resource-with-extensions.yml + +apiVersion: k6.io/v1alpha1 +kind: TestRun +metadata: + name: k6-sample-with-extensions +spec: + parallelism: 4 + script: + configMap: + name: my-stress-test + file: test.js + runner: + image: k6-extended:local + env: + - name: K6_OUT + value: xk6-influxdb=http://influxdb.somewhere:8086/demo +``` + +Note that this example overrides the default image with `k6-extended:local`, and it includes the environment variables required by the `xk6-output-influxdb` extension. diff --git a/docs/sources/next/set-up/set-up-distributed-k6/usage/k6-operator-to-gck6.md b/docs/sources/next/set-up/set-up-distributed-k6/usage/k6-operator-to-gck6.md new file mode 100644 index 0000000000..600decab73 --- /dev/null +++ b/docs/sources/next/set-up/set-up-distributed-k6/usage/k6-operator-to-gck6.md @@ -0,0 +1,73 @@ +--- +weight: 250 +title: Use the k6 Operator with Grafana Cloud k6 +--- + +# Use the k6 Operator with Grafana Cloud k6 + +Grafana Cloud k6 is the Grafana Cloud offering of k6, which gives you access to all of k6's capabilities, while Grafana handles the infrastructure, storage, and metrics aggregation and insights from your tests. + +When using the k6 Operator, you can still leverage Grafana Cloud k6 to get access to the metric storage and analysis that the platform offers. + +There are two ways to use the k6 Operator with Grafana Cloud k6: Private Load Zones and Cloud output. + +## Before you begin + +To use the k6 Operator with Grafana Cloud k6, you’ll need: + +- A [Grafana Cloud account](https://grafana.com/auth/sign-up/create-user). + +## Private Load Zones + +Private Load Zones (PLZ) are load zones that you can host inside your network by using the k6 Operator. You can start a cloud test in a PLZ by referencing it by name from your script, and the test will run on the nodes of your Kubernetes cluster.
+ +Refer to [Set up private load zones](https://grafana.com/docs/grafana-cloud/testing/k6/author-run/private-load-zone-v2/) for more details. + +## Cloud output + +With k6, you can send the [output from a test run to Grafana Cloud k6](https://grafana.com/docs/k6//results-output/real-time/cloud) with the `k6 run --out cloud script.js` command. This feature is also available in the k6 Operator if you have a Grafana Cloud account. + +{{< admonition type="note" >}} + +The cloud output option only supports a `parallelism` value of 20 or less. + +{{< /admonition >}} + +To use this option in k6 Operator, set the argument in YAML: + +```yaml +# ... +script: + configMap: + name: '' +arguments: --out cloud +# ... +``` + +Then, if you installed operator with bundle or Helm, create a secret with the following command: + +```bash +kubectl -n k6-operator-system create secret generic my-cloud-token \ + --from-literal=token= && kubectl -n k6-operator-system label secret my-cloud-token "k6cloud=token" +``` + +Alternatively, if you installed operator with a Makefile, you can uncomment the cloud output section in `config/default/kustomization.yaml` and copy your token from Grafana Cloud k6 there: + +```yaml +# Uncomment this section if you need cloud output and copy-paste your token +secretGenerator: + - name: cloud-token + literals: + - token= + options: + annotations: + kubernetes.io/service-account.name: k6-operator-controller + labels: + k6cloud: token +``` + +After updating the file, run `make deploy`. + +After these steps, you can run k6 with the cloud output and default values of `projectID` and `name`. + +Refer to [Cloud options](https://grafana.com/docs/grafana-cloud/testing/k6/author-run/cloud-scripting-extras/cloud-options/#cloud-options) for details on how to change the `projectID` and `name` options. 
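+
+Putting the cloud output pieces together, a complete `TestRun` might look like the following sketch. The resource name, the `k6-test` ConfigMap, and `test.js` are placeholders chosen for illustration, not values required by the operator:
+
+```yaml
+apiVersion: k6.io/v1alpha1
+kind: TestRun
+metadata:
+  name: k6-sample-cloud-output # hypothetical name
+spec:
+  # The cloud output option only supports a parallelism of 20 or less.
+  parallelism: 4
+  script:
+    configMap:
+      name: k6-test # assumed to exist already
+      file: test.js
+  # Stream the results of the test run to Grafana Cloud k6.
+  arguments: --out cloud
+```
+
+The token secret described above still has to exist before you apply this resource.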
diff --git a/docs/sources/next/set-up/set-up-distributed-k6/usage/reference.md b/docs/sources/next/set-up/set-up-distributed-k6/usage/reference.md new file mode 100644 index 0000000000..f6f8b6d06f --- /dev/null +++ b/docs/sources/next/set-up/set-up-distributed-k6/usage/reference.md @@ -0,0 +1,12 @@ +--- +weight: 500 +title: Reference +_build: + list: false +--- + +# Reference + + + +{{< section depth=2 >}} diff --git a/docs/sources/next/set-up/set-up-distributed-k6/usage/scheduling-tests.md b/docs/sources/next/set-up/set-up-distributed-k6/usage/scheduling-tests.md new file mode 100644 index 0000000000..02fc7503a5 --- /dev/null +++ b/docs/sources/next/set-up/set-up-distributed-k6/usage/scheduling-tests.md @@ -0,0 +1,106 @@ +--- +weight: 400 +title: Schedule k6 tests +--- + +# Schedule k6 tests + +While the k6 Operator doesn't support scheduling k6 tests directly, you can schedule tests with the Kubernetes `CronJob` object. The `CronJob` runs on a schedule and creates and deletes the `TestRun` object. + +Running these tests requires a little more setup than a standalone test run. + +## Create a `ConfigMap` with k6 scripts + +Refer to [Run k6 scripts with `TestRun` CRD](https://grafana.com/docs/k6//set-up/set-up-distributed-k6/usage/executing-k6-scripts-with-testrun-crd/) for details on how to create a `ConfigMap` with k6 scripts.
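+
+For a scheduled setup, it can be convenient to keep the script `ConfigMap` as a manifest next to the other YAML files instead of creating it with `kubectl create configmap`. The following is a sketch of such a manifest. The `my-test` name, the `test.js` key, and the script body are illustrative assumptions:
+
+```yaml
+apiVersion: v1
+kind: ConfigMap
+metadata:
+  name: my-test
+data:
+  # Each key becomes a file name that the TestRun can reference in `script.configMap.file`.
+  test.js: |
+    import http from 'k6/http';
+    import { sleep } from 'k6';
+
+    export default function () {
+      http.get('https://test-api.k6.io');
+      sleep(1);
+    }
+```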
+ +## Create a `ConfigMap` of the YAML file for the `TestRun` job + + + +When using the `make deploy` installation method, add a `configMapGenerator` to the `kustomization.yaml`: + +```yaml +configMapGenerator: + - name: -config + files: + - .yaml +``` + +## Create a `ServiceAccount` for the `CronJob` + +For the `CronJob` to be able to create and delete `TestRun` objects, create a service account, together with a `Role` and `RoleBinding` that grant it access to `testruns`: + +```yaml +--- +apiVersion: rbac.authorization.k8s.io/v1 +kind: Role +metadata: + name: k6- +rules: + - apiGroups: + - k6.io + resources: + - testruns + verbs: + - create + - delete + - get + - list + - patch + - update + - watch +--- +kind: RoleBinding +apiVersion: rbac.authorization.k8s.io/v1 +metadata: + name: k6- +roleRef: + kind: Role + name: k6- + apiGroup: rbac.authorization.k8s.io +subjects: + - kind: ServiceAccount + name: k6- + namespace: +--- +apiVersion: v1 +kind: ServiceAccount +metadata: + name: k6- +``` + +## Create a `CronJob` + +This is an example of how to define a `CronJob` in a YAML file: + +```yaml +# snapshotter.yml +apiVersion: batch/v1 +kind: CronJob +metadata: + name: -cron +spec: + schedule: '' + concurrencyPolicy: Forbid + jobTemplate: + spec: + template: + spec: + serviceAccountName: k6 + containers: + - name: kubectl + image: bitnami/kubectl + volumeMounts: + - name: k6-yaml + mountPath: /tmp/ + command: + - /bin/bash + args: + - -c + - 'kubectl delete -f /tmp/.yaml; kubectl apply -f /tmp/.yaml' + restartPolicy: OnFailure + volumes: + - name: k6-yaml + configMap: + name: -config +``` diff --git a/docs/sources/v0.52.x/set-up/set-up-distributed-k6/_index.md b/docs/sources/v0.52.x/set-up/set-up-distributed-k6/_index.md new file mode 100644 index 0000000000..06687090b0 --- /dev/null +++ b/docs/sources/v0.52.x/set-up/set-up-distributed-k6/_index.md @@ -0,0 +1,19 @@ +--- +weight: 150 +title: Set up distributed k6 +--- + +# Set up distributed k6 + +It's possible to run large load tests even when using a single node, or single machine.
But, depending on your use case, you might also want to run a distributed Grafana k6 test in your own infrastructure. + +A couple of reasons why you might want to do this: + +- You run your application in Kubernetes and would like k6 to be executed in the same fashion as all your other infrastructure components. +- You want to run your tests within your private network for security and/or privacy reasons. + +[k6 Operator](https://github.com/grafana/k6-operator) is a Kubernetes operator that you can use to run distributed k6 tests in your cluster. + +This section includes the following topics: + +{{< section depth=2 >}} diff --git a/docs/sources/v0.52.x/set-up/set-up-distributed-k6/install-k6-operator.md b/docs/sources/v0.52.x/set-up/set-up-distributed-k6/install-k6-operator.md new file mode 100644 index 0000000000..315ec003ca --- /dev/null +++ b/docs/sources/v0.52.x/set-up/set-up-distributed-k6/install-k6-operator.md @@ -0,0 +1,114 @@ +--- +weight: 100 +title: Install k6 Operator +--- + +# Install k6 Operator + +This guide provides step-by-step instructions on how to install k6 Operator. + +## Before you begin + +To install k6 Operator, you'll need: + +- A Kubernetes cluster, along with access to it. +- [kubectl](https://kubernetes.io/docs/tasks/tools/#kubectl). + +## Deploy the operator + +There are three different options that you can use to deploy the k6 Operator. + +### Deploy with bundle + +The easiest way to install the operator is with bundle: + +```bash +curl https://raw.githubusercontent.com/grafana/k6-operator/main/bundle.yaml | kubectl apply -f - +``` + +Bundle includes default manifests for k6 Operator, including a `k6-operator-system` namespace and k6 Operator deployment with the latest tagged Docker image. Customizations can be made on top of this manifest as needed, for example, with `kustomize`. + +### Deploy with Helm + +Helm releases of k6 Operator are published together with other Grafana Helm charts. 
You can install it with the following commands: + +```bash +helm repo add grafana https://grafana.github.io/helm-charts +helm repo update +helm install k6-operator grafana/k6-operator +``` + +You can also pass additional configuration options with a `values.yaml` file: + +```bash +helm install k6-operator grafana/k6-operator -f values.yaml +``` + +Refer to the [k6 Operator samples folder](https://github.com/grafana/k6-operator/blob/main/charts/k6-operator/samples/customAnnotationsAndLabels.yaml) for an example file. + +You can find a complete list of Helm options in the [k6 Operator charts folder](https://github.com/grafana/k6-operator/blob/main/charts/k6-operator/README.md). + +### Deploy with Makefile + +In order to install the operator with a Makefile, you'll need: + +- [go](https://go.dev/doc/install) +- [kustomize](https://kubectl.docs.kubernetes.io/installation/kustomize/) + +A more manual, low-level way to install the k6 operator is by running the command below: + +```bash +make deploy +``` + +This method may be more useful for development of the k6 Operator, depending on specifics of the setup. + +## Install the CRD + +The k6 Operator includes custom resources called `TestRun`, `PrivateLoadZone`, and `K6`. They're automatically installed when you do a deployment or install a bundle, but you can also manually install them by running: + +```bash +make install +``` + +{{< admonition type="warning" >}} + +The `K6` CRD has been replaced by the `TestRun` CRD and will be deprecated in the future. We recommend using the `TestRun` CRD. + +{{< /admonition >}} + +## Watch namespace + +By default, the k6 Operator watches the `TestRun` and `PrivateLoadZone` custom resources in all namespaces. 
You can also configure the k6 Operator to watch a specific namespace by setting the `WATCH_NAMESPACE` environment variable for the operator's deployment: + +```yaml +apiVersion: apps/v1 +kind: Deployment +metadata: + name: k6-operator-controller-manager + namespace: k6-operator-system +spec: + template: + spec: + containers: + - name: manager + image: ghcr.io/grafana/k6-operator:controller-v0.0.14 + env: + - name: WATCH_NAMESPACE + value: 'some-ns' +# ... +``` + +## Uninstall k6 Operator + +You can remove all of the resources created by the k6 Operator with `bundle`: + +```bash +curl https://raw.githubusercontent.com/grafana/k6-operator/main/bundle.yaml | kubectl delete -f - +``` + +Or with the `make` command: + +```bash +make delete +``` diff --git a/docs/sources/v0.52.x/set-up/set-up-distributed-k6/troubleshooting.md b/docs/sources/v0.52.x/set-up/set-up-distributed-k6/troubleshooting.md new file mode 100644 index 0000000000..a8bf60698b --- /dev/null +++ b/docs/sources/v0.52.x/set-up/set-up-distributed-k6/troubleshooting.md @@ -0,0 +1,260 @@ +--- +weight: 400 +title: Troubleshooting +--- + +# Troubleshooting + +This topic includes instructions to help you troubleshoot common issues with the k6 Operator. + +## Common tricks + +### Test your script locally + +Always run your script locally before trying to run it with the k6 Operator: + +```bash +k6 run script.js +``` + +If you're using environment variables or CLI options, pass them in as well: + +```bash +MY_ENV_VAR=foo k6 run script.js --tag my_tag=bar +``` + +That ensures that the script has correct syntax and can be parsed with k6 in the first place. Additionally, running locally can help you check if the configured options are doing what you expect. If there are any errors or unexpected results in the output of `k6 run`, make sure to fix those prior to deploying the script elsewhere. 
+ +### `TestRun` deployment + +#### The pods + +In case of one `TestRun` Custom Resource (CR) creation with `parallelism: n`, there are certain repeating patterns: + +1. There will be `n + 2` Jobs (with corresponding Pods) created: initializer, starter, and `n` runners. +1. If any of these Jobs didn't result in a Pod being deployed, there must be an issue with that Job. Some commands that can help here: + + ```bash + kubectl get jobs -A + kubectl describe job mytest-initializer + ``` + +1. If one of the Pods was deployed but finished with `Error`, you can check its logs with the following command: + + ```bash + kubectl logs mytest-initializer-xxxxx + ``` + +If the Pods seem to be working but not producing an expected result and there's not enough information in the logs, you can use the k6 [verbose option](https://grafana.com/docs/k6//using-k6/k6-options/#options) in the `TestRun` spec: + +```yaml +apiVersion: k6.io/v1alpha1 +kind: TestRun +metadata: + name: k6-sample +spec: + parallelism: 2 + script: + configMap: + name: 'test' + file: 'test.js' + arguments: --verbose +``` + +#### k6 Operator + +Another source of info is the k6 Operator itself. It's deployed as a Kubernetes `Deployment`, with `replicas: 1` by default, and its logs together with observations about the Pods from the previous section usually contain enough information to help you diagnose any issues. With the standard deployment, the logs of the k6 Operator can be checked with: + +```bash +kubectl -n k6-operator-system -c manager logs k6-operator-controller-manager-xxxxxxxx-xxxxx +``` + +#### Inspect `TestRun` resource + +After you deploy a `TestRun` CR, you can inspect it the same way as any other resource: + +```bash +kubectl describe testrun my-testrun +``` + +Firstly, check if the spec is as expected. 
Then, see the current status: + +```yaml +Status: + Conditions: + Last Transition Time: 2024-01-17T10:30:01Z + Message: + Reason: CloudTestRunFalse + Status: False + Type: CloudTestRun + Last Transition Time: 2024-01-17T10:29:58Z + Message: + Reason: TestRunPreparation + Status: Unknown + Type: TestRunRunning + Last Transition Time: 2024-01-17T10:29:58Z + Message: + Reason: CloudTestRunAbortedFalse + Status: False + Type: CloudTestRunAborted + Last Transition Time: 2024-01-17T10:29:58Z + Message: + Reason: CloudPLZTestRunFalse + Status: False + Type: CloudPLZTestRun + Stage: error +``` + +If `Stage` is equal to `error`, you can check the logs of k6 Operator. + +Conditions can be used as a source of info as well, but it's a more advanced troubleshooting option that should be used if the previous steps weren't enough to diagnose the issue. Note that conditions that start with the `Cloud` prefix only matter in the setting of k6 Cloud test runs, for example, for cloud output and PLZ test runs. + +### `PrivateLoadZone` deployment + +If the `PrivateLoadZone` CR was successfully created in Kubernetes, it should become visible in your account in Grafana Cloud k6 (GCk6) interface soon afterwards. If it doesn't appear in the UI, then there is likely a problem to troubleshoot. + +First, go over the [guide](https://grafana.com/docs/grafana-cloud/k6/author-run/private-load-zone-v2/) to double-check if all the steps have been done correctly and successfully. + +Unlike `TestRun` deployment, when a `PrivateLoadZone` is first created, there are no additional resources deployed. So, the only source for troubleshooting are the logs of k6 Operator. See the [previous subsection](#k6-operator) on how to access its logs. Any errors there might be a hint to diagnose the issue. Refer to [PrivateLoadZone: subscription error](#privateloadzone-subscription-error) for more details. 
+ +### Running tests in `PrivateLoadZone` + +Each time a user runs a test in a PLZ, for example with `k6 cloud script.js`, the k6 Operator deploys a corresponding `TestRun`. This `TestRun` will be deployed in the same namespace as its `PrivateLoadZone`. If the test is misbehaving, for example, it errors out, or doesn't produce the expected result, then you can check: + +1. If there are any messages in the GCk6 UI. +2. If there are any messages in the output of the `k6 cloud` command. +3. The resources and their logs, the same way as with a [standalone `TestRun` deployment](#testrun-deployment). + +## Common scenarios + +### Issues with environment variables + +Refer to [Environment variables](https://github.com/grafana/k6-operator/blob/main/docs/env-vars.md) for details on how to pass environment variables to the k6 Operator. + +### Tags not working + +Tags are a rather common source of errors when using the k6 Operator. For example, the following tags would lead to parsing errors: + +```yaml + arguments: --tag product_id="Test A" + # or + arguments: --tag foo=\"bar\" +``` + +You can see those errors in the logs of either the initializer or the runner Pod, for example: + +```bash +time="2024-01-11T11:11:27Z" level=error msg="invalid argument \"product_id=\\\"Test\" for \"--tag\" flag: parse error on line 1, column 12: bare \" in non-quoted-field" +``` + +This is a known problem with escaping characters in arguments. You can follow or upvote the corresponding [issue](https://github.com/grafana/k6-operator/issues/211) in the k6 Operator repository. + +### Initializer logs an error but it's not about tags + +This can happen when the script wasn't [tested locally](#test-your-script-locally) first. One command that you can use to help diagnose issues with your script is the following: + +```bash +k6 inspect --execution-requirements script.js +``` + +That command is a shortened version of what the initializer Pod is executing.
If the command produces an error, there's a problem with the script itself and it should be solved outside of the k6 Operator. The error itself may contain a hint to what's wrong, such as a syntax error. + +If the standalone `k6 inspect --execution-requirements` executes successfully, then it's likely a problem with `TestRun` deployment specific to your Kubernetes setup. A couple of recommendations here are: + +- Review the output of the initializer Pod: is it logged by the k6 process or by something else? + - :information_source: k6 Operator expects the initializer logs to contain only the output of `k6 inspect`. If there are any other log lines present, then the k6 Operator will fail to parse it and the test won't start. Refer to this [issue](https://github.com/grafana/k6-operator/issues/193) for more details. +- Check events in the initializer Job and Pod as they may contain another hint about what's wrong. + +### Non-existent ServiceAccount + +A ServiceAccount can be defined as `serviceAccountName` in a PrivateLoadZone, and as `runner.serviceAccountName` in a TestRun CRD. If the specified ServiceAccount doesn't exist, k6 Operator will successfully create Jobs but corresponding Pods will fail to be deployed, and the k6 Operator will wait indefinitely for Pods to be `Ready`. This error can be best seen in the events of the Job: + +```bash +kubectl describe job plz-test-xxxxxx-initializer +... +Events: + Warning FailedCreate 57s (x4 over 2m7s) job-controller Error creating: pods "plz-test-xxxxxx-initializer-" is forbidden: error looking up service account plz-ns/plz-sa: serviceaccount "plz-sa" not found +``` + +k6 Operator doesn't try to analyze such scenarios on its own, but you can refer to the following [issue](https://github.com/grafana/k6-operator/issues/260) for improvements. + +#### How to fix + +To fix this issue, the incorrect `serviceAccountName` must be corrected, and the TestRun or PrivateLoadZone resource must be re-deployed. 
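+
+If the name is correct and the ServiceAccount was simply never created, another option is to create it before re-deploying. The following sketch reuses the `plz-sa` and `plz-ns` names from the error message above; both are examples, not names the k6 Operator requires:
+
+```yaml
+apiVersion: v1
+kind: ServiceAccount
+metadata:
+  # Must match `serviceAccountName` in the PrivateLoadZone
+  # or `runner.serviceAccountName` in the TestRun.
+  name: plz-sa
+  # Must be the namespace where the Jobs are created.
+  namespace: plz-ns
+```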
+ +### Non-existent `nodeSelector` + +`nodeSelector` can be defined as `nodeSelector` in a PrivateLoadZone, and as `runner.nodeSelector` in the TestRun CRD. + +This case is very similar to the [ServiceAccount](#non-existent-serviceaccount): the Pod creation will fail, but the error is slightly different: + +```bash +kubectl describe pod plz-test-xxxxxx-initializer-xxxxx +... +Events: + Warning FailedScheduling 48s (x5 over 4m6s) default-scheduler 0/1 nodes are available: 1 node(s) didn't match Pod's node affinity/selector. +``` + +#### How to fix + +To fix this issue, the incorrect `nodeSelector` must be corrected and the TestRun or PrivateLoadZone resource must be re-deployed. + +### Insufficient resources + +A related problem can happen when the cluster does not have sufficient resources to deploy the runners. There's a higher probability of hitting this issue when setting small CPU and memory limits for runners or using options like `nodeSelector`, `runner.affinity` or `runner.topologySpreadConstraints`, and not having a set of nodes matching the spec. Alternatively, it can happen if there is a high number of runners required for the test (via `parallelism` in TestRun or during PLZ test run) and autoscaling of the cluster has limits on the maximum number of nodes, and can't provide the required resources on time or at all. + +This case is somewhat similar to the previous two: the k6 Operator will wait indefinitely and can be monitored with events in Jobs and Pods. If it's possible to fix the issue with insufficient resources on-the-fly, for example, by adding more nodes, k6 Operator will attempt to continue executing a test run. 
+ +### OOM of a runner Pod + +If there's at least one runner Pod that OOM-ed, the whole test will be [stuck](https://github.com/grafana/k6-operator/issues/251) and will have to be deleted manually: + +```bash +kubectl -f my-test.yaml delete +# or +kubectl delete testrun my-test +``` + +In case of OOM, it makes sense to review the k6 script to understand what kind of resource usage this script requires. It may be that the k6 script can be improved to be more performant. Then, set the `spec.runner.resources` in the TestRun CRD, or `spec.resources` in the PrivateLoadZone CRD accordingly. + +### PrivateLoadZone: subscription error + +If there's an issue with your Grafana Cloud k6 subscription, there will be a 400 error in the logs with the message detailing the problem. For example: + +```bash +"Received error `(400) You have reached the maximum Number of private load zones your organization is allowed to have. Please contact support if you want to create more.`. Message from server ``" +``` + +To fix this issue, check your organization settings in Grafana Cloud k6 or contact Support. + +### PrivateLoadZone: Wrong token + +There can be two major problems with the authentication token: + +1. If the token wasn't created, or was created in a wrong location, the logs will show the following error: + + ```bash + Failed to load k6 Cloud token {"namespace": "plz-ns", "name": "my-plz", "reconcileID": "67c8bc73-f45b-4c7f-a9ad-4fd0ffb4d5f6", "name": "token-with-wrong-name", "secretNamespace": "plz-ns", "error": "Secret \"token-with-wrong-name\" not found"} + ``` + +2. If the token contains a corrupted value, or it's not an organizational token, the logs will show the following error: + + ```bash + "Received error `(403) Authentication token incorrect or expired`. Message from server ``" + ``` + +### PrivateLoadZone: Networking setup + +If you see any dial or connection errors in the logs of the k6 Operator, it makes sense to double-check the networking setup. 
For a PrivateLoadZone to operate, outbound traffic to Grafana Cloud k6 [must be allowed](https://grafana.com/docs/grafana-cloud/k6/author-run/private-load-zone-v2/#before-you-begin). To check the reachability of Grafana Cloud k6 endpoints: + +```bash +kubectl apply -f https://k8s.io/examples/admin/dns/dnsutils.yaml +kubectl exec -it dnsutils -- nslookup ingest.k6.io +kubectl exec -it dnsutils -- nslookup api.k6.io +``` + +For more resources on troubleshooting networking, refer to the [Kubernetes docs](https://kubernetes.io/docs/tasks/administer-cluster/dns-debugging-resolution/). + +### PrivateLoadZone: Insufficient resources + +The PrivateLoadZone insufficient resources problem is similar to [insufficient resources issue](#insufficient-resources). But, when running a PrivateLoadZone test, the k6 Operator will wait only for a timeout period. When the timeout period is up, the test will be aborted by Grafana Cloud k6 and marked as such, both in the PrivateLoadZone and in Grafana Cloud k6. In other words, there is a time limit to fix this issue without restarting the test run. 
diff --git a/docs/sources/v0.52.x/set-up/set-up-distributed-k6/upgrade-k6-operator.md b/docs/sources/v0.52.x/set-up/set-up-distributed-k6/upgrade-k6-operator.md new file mode 100644 index 0000000000..2a46ef392e --- /dev/null +++ b/docs/sources/v0.52.x/set-up/set-up-distributed-k6/upgrade-k6-operator.md @@ -0,0 +1,10 @@ +--- +weight: 200 +title: Upgrade k6 Operator +_build: + list: false +--- + +# Upgrade k6 Operator + + diff --git a/docs/sources/v0.52.x/set-up/set-up-distributed-k6/usage/_index.md b/docs/sources/v0.52.x/set-up/set-up-distributed-k6/usage/_index.md new file mode 100644 index 0000000000..48ddb3b67c --- /dev/null +++ b/docs/sources/v0.52.x/set-up/set-up-distributed-k6/usage/_index.md @@ -0,0 +1,10 @@ +--- +weight: 300 +title: Usage +--- + +# Usage + +This section includes the following topics: + +{{< section depth=2 >}} diff --git a/docs/sources/v0.52.x/set-up/set-up-distributed-k6/usage/common-options.md b/docs/sources/v0.52.x/set-up/set-up-distributed-k6/usage/common-options.md new file mode 100644 index 0000000000..43d95d8625 --- /dev/null +++ b/docs/sources/v0.52.x/set-up/set-up-distributed-k6/usage/common-options.md @@ -0,0 +1,57 @@ +--- +weight: 300 +title: Common options +--- + +# Common options + + + +The only options that are required as part of the `TestRun` CRD spec are `script` and `parallelism`. This guide covers some of the most common options. + +## Parallelism + +`parallelism` defines how many instances of k6 runners you want to create. Each instance is assigned an equal execution segment. For instance, if your test script is configured to run 200 VUs and `parallelism` is set to 4, the k6 Operator creates four k6 jobs, each running 50 VUs to achieve the desired VU count. + +## Separate + +`separate: true` indicates that the jobs created need to be distributed across different nodes. This is useful if you're running a test with a really high VU count and want to make sure the resources of each node won't become a bottleneck. 
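+
+As a sketch of how these two options combine, the following `TestRun` splits a script across four runners and asks for them to be placed on different nodes. The `k6-sample`, `k6-test`, and `test.js` names are placeholders for illustration:
+
+```yaml
+apiVersion: k6.io/v1alpha1
+kind: TestRun
+metadata:
+  name: k6-sample
+spec:
+  # With 200 VUs configured in the script, each of the 4 runners gets 50 VUs.
+  parallelism: 4
+  # Distribute the runner jobs across different nodes.
+  separate: true
+  script:
+    configMap:
+      name: k6-test
+      file: test.js
+```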
+ +## Service account + +If you want to use a custom Service Account you'll need to pass it into both the starter and the runner object: + +```yaml +apiVersion: k6.io/v1alpha1 +kind: TestRun +metadata: + name: +spec: + script: + configMap: + name: '' + runner: + serviceAccountName: + starter: + serviceAccountName: +``` + +## Runner + +Defines options for the test runner pods. The non-exhaustive list includes: + +- Passing resource limits and requests. +- Passing in labels and annotations. +- Passing in affinity and anti-affinity. +- Passing in a custom image. + +## Starter + +Defines options for the starter pod. The non-exhaustive list includes: + +- Passing in a custom image. +- Passing in labels and annotations. + +## Initializer + +By default, the initializer Job is defined with the same options as the runner Jobs, but its options can be overwritten by setting `.spec.initializer`. diff --git a/docs/sources/v0.52.x/set-up/set-up-distributed-k6/usage/executing-k6-scripts-with-testrun-crd.md b/docs/sources/v0.52.x/set-up/set-up-distributed-k6/usage/executing-k6-scripts-with-testrun-crd.md new file mode 100644 index 0000000000..64a7c626cb --- /dev/null +++ b/docs/sources/v0.52.x/set-up/set-up-distributed-k6/usage/executing-k6-scripts-with-testrun-crd.md @@ -0,0 +1,219 @@ +--- +weight: 100 +title: Run k6 scripts with TestRun CRD +--- + +# Run k6 scripts with TestRun CRD + +This guide covers how you can configure your k6 scripts to run using the k6 Operator. + +## Defining test scripts + +There are several ways to configure scripts in the `TestRun` CRD. The operator uses `configMap`, `volumeClaim` and `localFile` to serve test scripts to the jobs. 
+ +### ConfigMap + +The main way to configure a script is to create a `ConfigMap` with the script contents: + +```bash +kubectl create configmap my-test --from-file /path/to/my/test.js +``` + +Then specify it in `TestRun`: + +```yaml + script: + configMap: + name: my-test + file: test.js +``` + +{{< admonition type="note" >}} + +A single `ConfigMap` has a size limit of 1048576 bytes (1 MiB). If you need to have a larger test file, you have to use a `volumeClaim` or a `localFile` instead. + +{{< /admonition >}} + +### VolumeClaim + +If you have a PVC with the name `stress-test-volumeClaim` containing your script and any other supporting files, you can pass it to the test like this: + +```yaml +spec: + script: + volumeClaim: + name: 'stress-test-volumeClaim' + # test.js should exist inside /test/ folder. + # All the js files and directories test.js is importing + # should be inside the same directory as well. + file: 'test.js' +``` + +The pods will expect to find the script files in the `/test/` folder. If `volumeClaim` fails, that's the first place to check. The latest initializer pod doesn't generate any logs, and it exits with an error when it can't find the file. Refer to [this GitHub issue](https://github.com/grafana/k6-operator/issues/143) for potential improvements.
+ +#### Sample directory structure + +``` +├── test +│ ├── requests +│ │ ├── stress-test.js +│ ├── test.js +``` + +In the preceding example, `test.js` imports a function from `stress-test.js` and these files would look like this: + +```js +// test.js +import stressTest from './requests/stress-test.js'; + +export const options = { + vus: 50, + duration: '10s', +}; + +export default function () { + stressTest(); +} +``` + +```js +// stress-test.js +import { sleep, check } from 'k6'; +import http from 'k6/http'; + +export default () => { + const res = http.get('https://test-api.k6.io'); + check(res, { + 'status is 200': () => res.status === 200, + }); + sleep(1); +}; +``` + +### LocalFile + +If the script is present in the filesystem of a custom runner image, it can be accessed with the `localFile` option: + +```yaml +spec: + parallelism: 4 + script: + localFile: /test/test.js + runner: + image: +``` + +{{< admonition type="note" >}} + +If there is any limitation on the usage of `volumeClaim` in your cluster, you can use the `localFile` option. We recommend using `volumeClaim` if possible. + +{{< /admonition >}} + +### Multi-file tests + +In case your k6 script is split between multiple JavaScript files, you can create a `ConfigMap` with several data entries like this: + +```bash +kubectl create configmap scenarios-test --from-file test.js --from-file utils.js +``` + +If there are too many files to specify manually, using `kubectl` with a folder might be an option as well: + +```bash +kubectl create configmap scenarios-test --from-file=./test +``` + +Alternatively, you can create an archive with k6: + +```bash +k6 archive test.js [args] +``` + +The `k6 archive` command creates an `archive.tar` in your current folder. 
You can then use that file in the `ConfigMap`, similarly to a JavaScript script: + +```bash +kubectl create configmap scenarios-test --from-file=archive.tar +``` + +If you use an archive, you must edit your YAML file for the `TestRun` deployment so that the `file` option is set to the correct entrypoint for the `k6 run` command: + +```yaml +# ... +spec: + script: + configMap: + name: 'scenarios-test' + file: 'archive.tar' # <-- change here +``` + +## Run tests + +Tests are executed by applying the custom resource `TestRun` to a cluster where the k6 Operator is running. Additional optional properties of the `TestRun` CRD allow you to control some key aspects of a distributed execution. For example: + +```yaml +# k6-resource.yml + +apiVersion: k6.io/v1alpha1 +kind: TestRun +metadata: + name: k6-sample +spec: + parallelism: 4 + script: + configMap: + name: k6-test + file: test.js + separate: false + runner: + image: + metadata: + labels: + cool-label: foo + annotations: + cool-annotation: bar + securityContext: + runAsUser: 1000 + runAsGroup: 1000 + runAsNonRoot: true + resources: + limits: + cpu: 200m + memory: 1000Mi + requests: + cpu: 100m + memory: 500Mi + starter: + image: + metadata: + labels: + cool-label: foo + annotations: + cool-annotation: bar + securityContext: + runAsUser: 2000 + runAsGroup: 2000 + runAsNonRoot: true +``` + +A `TestRun` CR is created with this command: + +```bash +kubectl apply -f /path/to/your/k6-resource.yml +``` + +## Clean up resources + +After completing a test run, you need to clean up the test jobs that were created: + +```bash +kubectl delete -f /path/to/your/k6-resource.yml +``` + +Alternatively, you can configure the automatic deletion of all resources with the `cleanup` option: + +```yaml +spec: + cleanup: 'post' +``` + +With the `cleanup` option set, k6 Operator removes the `TestRun` CR and all created resources once the test run ends.
diff --git a/docs/sources/v0.52.x/set-up/set-up-distributed-k6/usage/extensions.md b/docs/sources/v0.52.x/set-up/set-up-distributed-k6/usage/extensions.md
new file mode 100644
index 0000000000..e654032755
--- /dev/null
+++ b/docs/sources/v0.52.x/set-up/set-up-distributed-k6/usage/extensions.md
@@ -0,0 +1,61 @@
+---
+weight: 200
+title: Use k6 Operator with k6 extensions
+---
+
+# Use k6 Operator with k6 extensions
+
+By default, the k6 Operator uses `ghcr.io/grafana/k6-operator:latest-runner` as the container image for the test jobs.
+
+If you want to use k6 [extensions](https://grafana.com/docs/k6//extensions/) built with [xk6](https://github.com/grafana/xk6), you'll need to create your own image and override the `image` property on the `TestRun` Kubernetes resource.
+
+For example, this is a `Dockerfile` that builds a k6 binary with the `xk6-output-influxdb` extension:
+
+```Dockerfile
+# Build the k6 binary with the extension
+FROM golang:1.20 as builder
+
+RUN go install go.k6.io/xk6/cmd/xk6@latest
+
+# For our example, we'll add support for output of test metrics to InfluxDB v2.
+# Feel free to add other extensions using the '--with ...'.
+RUN xk6 build \
+    --with github.com/grafana/xk6-output-influxdb@latest \
+    --output /k6
+
+# Use the operator's base image and override the k6 binary
+FROM grafana/k6:latest
+COPY --from=builder /k6 /usr/bin/k6
+```
+
+You can build the image based on this `Dockerfile` by executing:
+
+```bash
+docker build -t k6-extended:local .
+```
+
+After the build completes, you can push the resulting `k6-extended:local` image to an image repository accessible to your Kubernetes cluster.
+
+You can then use that image as follows:
+
+```yaml
+# k6-resource-with-extensions.yml
+
+apiVersion: k6.io/v1alpha1
+kind: TestRun
+metadata:
+  name: k6-sample-with-extensions
+spec:
+  parallelism: 4
+  script:
+    configMap:
+      name: my-stress-test
+      file: test.js
+  runner:
+    image: k6-extended:local
+    env:
+      - name: K6_OUT
+        value: xk6-influxdb=http://influxdb.somewhere:8086/demo
+```
+
+Note that this example overrides the default image with `k6-extended:local`, and it includes environment variables that are required by the `xk6-output-influxdb` extension.
diff --git a/docs/sources/v0.52.x/set-up/set-up-distributed-k6/usage/k6-operator-to-gck6.md b/docs/sources/v0.52.x/set-up/set-up-distributed-k6/usage/k6-operator-to-gck6.md
new file mode 100644
index 0000000000..600decab73
--- /dev/null
+++ b/docs/sources/v0.52.x/set-up/set-up-distributed-k6/usage/k6-operator-to-gck6.md
@@ -0,0 +1,73 @@
+---
+weight: 250
+title: Use the k6 Operator with Grafana Cloud k6
+---
+
+# Use the k6 Operator with Grafana Cloud k6
+
+Grafana Cloud k6 is the Grafana Cloud offering of k6, which gives you access to all of k6's capabilities, while Grafana handles the infrastructure, storage, and metrics aggregation and insights from your tests.
+
+When using the k6 Operator, you can still leverage Grafana Cloud k6 to get access to the metric storage and analysis that the platform offers.
+
+There are two ways to use the k6 Operator with Grafana Cloud k6: Private Load Zones and Cloud output.
+
+## Before you begin
+
+To use the k6 Operator with Grafana Cloud k6, you’ll need:
+
+- A [Grafana Cloud account](https://grafana.com/auth/sign-up/create-user).
+
+## Private Load Zones
+
+Private Load Zones (PLZ) are load zones that you can host inside your network by using the k6 Operator. You can start a cloud test in a PLZ by referencing it by name from your script, and the test will run in the nodes of your Kubernetes cluster.
+
+Refer to [Set up private load zones](https://grafana.com/docs/grafana-cloud/testing/k6/author-run/private-load-zone-v2/) for more details.
+
+## Cloud output
+
+With k6, you can send the [output from a test run to Grafana Cloud k6](https://grafana.com/docs/k6//results-output/real-time/cloud) with the `k6 run --out cloud script.js` command. This feature is also available in the k6 Operator if you have a Grafana Cloud account.
+
+{{< admonition type="note" >}}
+
+The cloud output option only supports a `parallelism` value of 20 or less.
+
+{{< /admonition >}}
+
+To use this option in the k6 Operator, set the argument in YAML:
+
+```yaml
+# ...
+script:
+  configMap:
+    name: ''
+arguments: --out cloud
+# ...
+```
+
+Then, if you installed the operator with the bundle or Helm, create a secret with the following command:
+
+```bash
+kubectl -n k6-operator-system create secret generic my-cloud-token \
+    --from-literal=token= && kubectl -n k6-operator-system label secret my-cloud-token "k6cloud=token"
+```
+
+Alternatively, if you installed the operator with a Makefile, you can uncomment the cloud output section in `config/default/kustomization.yaml` and copy your token from Grafana Cloud k6 there:
+
+```yaml
+# Uncomment this section if you need cloud output and copy-paste your token
+secretGenerator:
+  - name: cloud-token
+    literals:
+      - token=
+    options:
+      annotations:
+        kubernetes.io/service-account.name: k6-operator-controller
+      labels:
+        k6cloud: token
+```
+
+After updating the file, run `make deploy`.
+
+After these steps, you can run k6 with the cloud output and default values of `projectID` and `name`.
+
+Refer to [Cloud options](https://grafana.com/docs/grafana-cloud/testing/k6/author-run/cloud-scripting-extras/cloud-options/#cloud-options) for details on how to change the `projectID` and `name` options.
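+
+Put together, a complete manifest that uses the cloud output might look like the following sketch. The `k6-sample-cloud`, `k6-test`, and `test.js` names are assumptions for illustration, and `parallelism` stays within the limit of 20 noted above:
+
+```yaml
+# Illustrative TestRun that sends results to Grafana Cloud k6
+apiVersion: k6.io/v1alpha1
+kind: TestRun
+metadata:
+  name: k6-sample-cloud # hypothetical name
+spec:
+  parallelism: 4 # the cloud output supports 20 or less
+  script:
+    configMap:
+      name: k6-test # hypothetical ConfigMap that holds test.js
+      file: test.js
+  arguments: --out cloud
+```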
diff --git a/docs/sources/v0.52.x/set-up/set-up-distributed-k6/usage/reference.md b/docs/sources/v0.52.x/set-up/set-up-distributed-k6/usage/reference.md
new file mode 100644
index 0000000000..f6f8b6d06f
--- /dev/null
+++ b/docs/sources/v0.52.x/set-up/set-up-distributed-k6/usage/reference.md
@@ -0,0 +1,12 @@
+---
+weight: 500
+title: Reference
+_build:
+  list: false
+---
+
+# Reference
+
+
+
+{{< section depth=2 >}}
diff --git a/docs/sources/v0.52.x/set-up/set-up-distributed-k6/usage/scheduling-tests.md b/docs/sources/v0.52.x/set-up/set-up-distributed-k6/usage/scheduling-tests.md
new file mode 100644
index 0000000000..02fc7503a5
--- /dev/null
+++ b/docs/sources/v0.52.x/set-up/set-up-distributed-k6/usage/scheduling-tests.md
@@ -0,0 +1,106 @@
+---
+weight: 400
+title: Schedule k6 tests
+---
+
+# Schedule k6 tests
+
+While the k6 Operator doesn't support scheduling k6 tests directly, you can schedule them with the Kubernetes `CronJob` object. The `CronJob` runs on a schedule and handles the creation and deletion of the `TestRun` object.
+
+Running these tests requires a little more setup than a standalone test run.
+
+## Create a `ConfigMap` with k6 scripts
+
+Refer to [Run k6 scripts with `TestRun` CRD](https://grafana.com/docs/k6//set-up/set-up-distributed-k6/usage/executing-k6-scripts-with-testrun-crd/) for details on how to create a `ConfigMap` with k6 scripts.
+
+## Create a `ConfigMap` of the YAML file for the `TestRun` job
+
+
+
+When using the `make deploy` installation method, add a `configMapGenerator` to the `kustomization.yaml`:
+
+```yaml
+configMapGenerator:
+  - name: -config
+    files:
+      - .yaml
+```
+
+## Create a `ServiceAccount` for the `CronJob`
+
+For the `CronJob` to be able to create and delete `TestRun` objects, create a service account and bind it to a role that grants access to `testruns`:
+
+```yaml
+---
+apiVersion: rbac.authorization.k8s.io/v1
+kind: Role
+metadata:
+  name: k6-
+rules:
+  - apiGroups:
+      - k6.io
+    resources:
+      - testruns
+    verbs:
+      - create
+      - delete
+      - get
+      - list
+      - patch
+      - update
+      - watch
+---
+kind: RoleBinding
+apiVersion: rbac.authorization.k8s.io/v1
+metadata:
+  name: k6-
+roleRef:
+  kind: Role
+  name: k6-
+  apiGroup: rbac.authorization.k8s.io
+subjects:
+  - kind: ServiceAccount
+    name: k6-
+    namespace:
+---
+apiVersion: v1
+kind: ServiceAccount
+metadata:
+  name: k6-
+```
+
+## Create a `CronJob`
+
+This is an example of how to define a `CronJob` in a YAML file:
+
+```yaml
+# snapshotter.yml
+apiVersion: batch/v1
+kind: CronJob
+metadata:
+  name: -cron
+spec:
+  schedule: ''
+  concurrencyPolicy: Forbid
+  jobTemplate:
+    spec:
+      template:
+        spec:
+          serviceAccount: k6
+          containers:
+            - name: kubectl
+              image: bitnami/kubectl
+              volumeMounts:
+                - name: k6-yaml
+                  mountPath: /tmp/
+              command:
+                - /bin/bash
+              args:
+                - -c
+                - 'kubectl delete -f /tmp/.yaml; kubectl apply -f /tmp/.yaml'
+          restartPolicy: OnFailure
+          volumes:
+            - name: k6-yaml
+              configMap:
+                name: -config
+```