-
Notifications
You must be signed in to change notification settings - Fork 0
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Adapt benchmarks configuration to be in line with updated charts. #213
base: main
Are you sure you want to change the base?
Conversation
@rodrigo-lourenco-lopes it is about disk space not memory :D |
236a14b
to
16e76ae
Compare
Ups, I misread the intention. Just corrected it.
@Zelldon also it seems that we are having issues pulling our chart dependency, have you seen this issue before? |
I haven't seen this before 🤔 |
This commit also changed the templates and the configuration files, we will have to adapt the templates to this camunda/camunda-platform-helm@28d7927#diff-139e41c5d8eff029161d9db039ef8f0d46df8d6de4e91e42d7abb3c2103ba61a |
1c37bae
to
87e0ae0
Compare
@rodrigo-lourenco-lopes can you try to deploy this into a namespace and check whether it comes up? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @rodrigo-lourenco-lopes I think two or three things we still need to validate
@@ -17,7 +17,7 @@ dependencies: | |||
version: 6.4.0 | |||
condition: "prometheus-elasticsearch-exporter.enabled" | |||
maintainers: | |||
- name: Zelldon |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
❤️
charts/zeebe-benchmark/values.yaml
Outdated
# Persistent Volume Storage Class | ||
storageClass: "ssd" | ||
# Persistent Volume Size | ||
size: 16Gi | ||
size: 32Gi | ||
# Persistent Volume Access Modes | ||
accessModes: [ "ReadWriteOnce" ] | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
❓ I think we have to validate whether the values files also have changed.
@ChrisKujawa It is starting core pods but these are failing to initialize, do we need to change the values.yaml for this? |
Looks to me that a complete new Core section was introduced https://github.com/camunda/camunda-platform-helm/blob/main/charts/camunda-platform-alpha/values.yaml#L1968-L1980 |
I will validate with stakeholders |
ca6bb5f
to
a8af21e
Compare
charts/zeebe-benchmark/Chart.yaml
Outdated
@@ -10,7 +10,7 @@ sources: | |||
dependencies: | |||
- name: camunda-platform | |||
repository: "oci://ghcr.io/camunda/helm" | |||
version: "0.0.0-8.7.0-alpha1" | |||
version: "0.0.0-snapshot-alpha" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
0.0.0-8.7.0-alpha2
. For now we need this to be able to use the fixes we made in the camunda platform helm charts.
1ca03f3
to
12f8052
Compare
@rodrigo-lourenco-lopes sorry for hijacking your PR 😄 I can't assign you as a reviewer, but please have a look at the last 3 commits I made. These should fix benchmarks 🤞 Mine is currently running here @ChrisKujawa If you have the time I'd appreciate it if you could also look at my last 3 commits 🙂 |
charts/zeebe-benchmark/values.yaml
Outdated
pvc: | ||
accessModes: [ "ReadWriteOnce" ] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🤡 Uh real break of the objects names
env: | ||
- name: OPERATE_LOG_APPENDER | ||
value: Stackdriver | ||
- name: OPERATE_LOG_STACKDRIVER_SERVICENAME | ||
value: operate | ||
- name: OPERATE_LOG_STACKDRIVER_SERVICEVERSION | ||
valueFrom: | ||
fieldRef: | ||
fieldPath: metadata.namespace |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Uh that will be painful. This means we will not filter anymore by service name in stackdriver.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So you're saying I should add these env variables to the core application?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In general yes we need some to set the right service name. But this allows to set only one service name before we split between zeebe and operate and gateway
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't fully understand it. I've addded these env variables to the core app now, but the way it sounds that's not sufficient.
|
||
exec /usr/local/camunda/bin/camunda | ||
application.yaml: | | ||
|
||
spring: | ||
profiles: | ||
active: auth | ||
active: "operate,tasklist,broker,auth" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ℹ️ Ah ok so it seems Operate is enabled.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It shouldn't be 🤔 I modified the SPRING_PROFILES_ACTIVE
to disable it.
Okay I've processed some of the review comments. I can run the regular benchmarks fine manually. The realistic benchmarks don't really work. sa soon as I start Operate the readiness probe returns a 503. I can't spend more time on this atm since I'm not medic anymore. @rodrigo-lourenco-lopes feel free to take over again 😄 FWIW, I believe we should merge this PR. It fixes the regular benchmarks so that's more than we have now and a fresh PR for any more changes would be a lot more comprehensible. |
@ChrisKujawa is it ok that we merge this now and start working on the next fixes? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you both 🚀🚀
optional: true | ||
# To run Operate and Tasklist add these to the value of this variable | ||
- name: SPRING_PROFILES_ACTIVE | ||
value: "auth,broker,operate" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
limits: | ||
cpu: 3500m |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We need to check whether this fits in our nodes
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I changed the node selector to a bigger one for these realistic benchmarks 😄
To see the full reasoning check the slack thread: https://camunda.slack.com/archives/C06GF0JPY68/p1732001694427559?thread_ts=1731918845.694709&cid=C06GF0JPY68
The link to the commit in camunda-platform-helm camunda/camunda-platform-helm@28d7927#diff-624a96601c5010c87f441781ea4c0e803b469e0bf2507e7ea86a651b0967ca9d
…, tasklist, operate component
I have autogenerated these because there were so many changes
Add the resources previously used by the gateway to the core app.
We can't just override 1 env variable to activate operate. Instead all the env variables got overridden and thus we were missing most of them. By adding them to this values file we overcome that problem.
These images are not available in the default registry.
7500432
to
413d6d0
Compare
Further context to the latest chages: This is already a very messy PR which migrates these benchmarks to work with the latest changes in the helm charts.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Did you tested it because i feel there are some issues with it still? Not sure whether it makes sense to merge it like that tbh.
@@ -17,7 +17,7 @@ spec: | |||
spec: | |||
containers: | |||
- name: starter | |||
image: "{{ .Values.global.image.repository }}/starter:{{ .Values.global.image.tag }}" | |||
image: "{{ .Values.starter.image.repository }}/starter:{{ .Values.starter.image.tag }}" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
❌ This change was not necessary we can keep the global. Otherwise this would mean we need to change our github actions etc.
@@ -18,7 +18,7 @@ spec: | |||
spec: | |||
containers: | |||
- name: {{ $workerName }}-worker | |||
image: "{{ $.Values.global.image.repository }}/worker:{{ $.Values.global.image.tag }}" | |||
image: "{{ $.Values.workers.image.repository }}/worker:{{ $.Values.workers.image.tag }}" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
❌ same as above
@@ -19,13 +19,13 @@ spec: | |||
spec: | |||
containers: | |||
- name: publisher | |||
image: "gcr.io/zeebe-io/starter:SNAPSHOT" | |||
image: "/starter:" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
❌ this is related to the removal/change of the global value
image: | ||
# Image.repository defines the repository from which to fetch the docker images | ||
repository: "gcr.io/zeebe-io" | ||
# Image.tag defines the tag / version which should be used in the chart | ||
tag: SNAPSHOT | ||
# Image.pullPolicy defines the image pull policy which should be used https://kubernetes.io/docs/concepts/containers/images/#image-pull-policy | ||
pullPolicy: Always |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should be kept
fieldPath: metadata.namespace | ||
- name: OPERATE_LOG_APPENDER | ||
value: Stackdriver | ||
- name: OPERATE_LOG_STACKDRIVER_SERVICENAME |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I dont think this makes sense
@@ -170,74 +180,72 @@ leaderBalancing: | |||
|
|||
# Zeebe configuration to configure Zeebe and Gateway | |||
zeebe: | |||
# Zeebe.config can be used to configure Zeebe Broker and Gateway additional without the need of overwriting all |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why we removed these?
enabled: false | ||
|
||
# ELASTIC | ||
# TODO: this configuration replaces the default provided in the configmaps in camunda-platform-helm, we should ideally try to add the configuration in the charts for these instead. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You need to use the config we have above
runAsUser: 1000 | ||
capabilities: | ||
add: [ "NET_ADMIN" ] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We need this for testing
This is already a very messy PR which migrates these benchmarks to work with the latest changes in the helm charts.
I will create some follow-up issues to:
core.configuration
but this will overide the defaults in the config values in core.configmap in the camunda-platform-helm repo, see https://github.com/camunda/camunda-platform-helm/blob/75a1a164e9c7eb4cf021d1de4eee9b5ae39b6438/charts/camunda-platform-alpha/templates/core/configmap.yaml#L17-L21Additionally this PR also addresses #214 fixing the tests related to assuring correct configurations with the golden files.
Closes: #214
Follow up issues: