Deploy with new scheduler #729

Merged
merged 14 commits into from
Sep 26, 2022
1 change: 1 addition & 0 deletions .travis.yml
@@ -26,6 +26,7 @@ env:
- TRAVIS_KUBE_VERSION=v1.19 OW_INCLUDE_SYSTEM_TESTS=false OW_CONTAINER_FACTORY=kubernetes OW_LEAN_MODE=true
- TRAVIS_KUBE_VERSION=v1.20 OW_INCLUDE_SYSTEM_TESTS=false OW_CONTAINER_FACTORY=kubernetes
- TRAVIS_KUBE_VERSION=v1.21 OW_INCLUDE_SYSTEM_TESTS=false OW_CONTAINER_FACTORY=kubernetes
- TRAVIS_KUBE_VERSION=v1.21 OW_INCLUDE_SYSTEM_TESTS=false OW_CONTAINER_FACTORY=kubernetes OW_SCHEDULER_ENABLED=true

services:
- docker
12 changes: 12 additions & 0 deletions docs/configurationChoices.md
@@ -45,6 +45,16 @@ components is not currently supported:
better management of the database and decouples its lifecycle from that of the OpenWhisk deployment.
- The event providers: alarmprovider and kafkaprovider.

### OpenWhisk Scheduler

By default, the scheduler is disabled. To enable the scheduler, add the following
to your `mycluster.yaml`:

```yaml
scheduler:
  enabled: true
```

### Using an external database

You may want to use an external CouchDB or Cloudant instance instead
@@ -180,6 +190,8 @@ k8s:
enabled: false
```

Currently, etcd persistence is not supported.

### Selectively Deploying Event Providers

The default settings of the Helm chart will deploy OpenWhisk's alarm
2 changes: 1 addition & 1 deletion docs/k8s-custom-build-cluster-scaleup.md
@@ -40,7 +40,7 @@ Modifying the above mentioned parameters, one can easily increase the concurrenc
In order to further increase the scale-up beyond `Small Scale`, one needs to modify the following additional configurations appropriately (on top of the above mentioned):
* `invoker:jvmHeapMB`: jvmHeap memory available to each invoker instance. May or may not require increase based on running functions. For more information check `troubleshooting` below.
* `invoker:containerFactory:_:replicaCount`: number of invoker instances that will be used to handle the incoming workload. By default, there is only one invoker instance which can become overwhelmed if workload goes beyond a certain threshold.
-* `controller:replicaCount`: number of controller instances that will be used to handle the incoming workload. Same as invoker instances.
+* `controller:replicaCount`: number of controller instances that will be used to handle the incoming workload. Same as invoker and scheduler instances.
* `invoker:options`: Log processing at the invoker can become a bottleneck for the KubernetesContainerFactory. One might try disabling invoker log processing by setting it to `-Dwhisk.spi.LogStoreProvider=org.apache.openwhisk.core.containerpool.logging.LogDriverLogStoreProvider`. In general, one needs to offload log processing from the invoker to a node-level log store provider if one is trying to push a large load through the system.

## Troubleshooting
4 changes: 2 additions & 2 deletions docs/k8s-kind.md
@@ -94,8 +94,8 @@ OpenWhisk apihost property to be set to localhost:31001
## Hints and Tips

If you are working on the core OpenWhisk system and want
-to use a locally built controller or invoker image to test
-your changes, you need to push the image to the docker image
+to use a locally built controller, invoker, or scheduler image
+to test your changes, you need to push the image to the docker image
repository inside the `kind` cluster.

For example, suppose I had a local change to the controller
43 changes: 43 additions & 0 deletions helm/openwhisk/templates/_helpers.tpl
@@ -42,6 +42,11 @@ app: {{ template "openwhisk.fullname" . }}
{{ .Release.Name }}-controller.{{ .Release.Namespace }}.svc.{{ .Values.k8s.domain }}
{{- end -}}

{{/* hostname for scheduler */}}
{{- define "openwhisk.scheduler_host" -}}
{{ .Release.Name }}-scheduler.{{ .Release.Namespace }}.svc.{{ .Values.k8s.domain }}
{{- end -}}

{{/* hostname for database */}}
{{- define "openwhisk.db_host" -}}
{{- if .Values.db.external -}}
@@ -68,6 +73,15 @@ app: {{ template "openwhisk.fullname" . }}
{{- end -}}
{{- end -}}

{{/* hostname for etcd */}}
{{- define "openwhisk.etcd_host" -}}
{{- if .Values.etcd.external -}}
{{ .Values.etcd.host }}
{{- else -}}
{{ .Release.Name }}-etcd.{{ .Release.Namespace }}.svc.{{ .Values.k8s.domain }}
{{- end -}}
{{- end -}}
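
For context, the `openwhisk.etcd_host` helper above chooses between an externally managed host and the in-cluster service name. A minimal values sketch for the external case follows. It assumes only the key names these templates read (`etcd.external`, `etcd.host`, `etcd.port`); the hostname and port are placeholders for illustration, not chart defaults.

```yaml
# mycluster.yaml (sketch; all values below are illustrative assumptions)
scheduler:
  enabled: true
etcd:
  external: true                # helper returns etcd.host instead of the service name
  host: etcd.example.internal   # hypothetical hostname
  port: 2379                    # assumed: etcd's conventional client port
```

With these values the helper would return `etcd.example.internal`, and any template that appends `:{{ .Values.etcd.port }}` would produce `etcd.example.internal:2379`.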

{{/* client connection string for zookeeper cluster (server1:port,server2:port, ... serverN:port)*/}}
{{- define "openwhisk.zookeeper_connect" -}}
{{- if .Values.zookeeper.external -}}
@@ -196,10 +210,24 @@ app: {{ template "openwhisk.fullname" . }}
value: {{ .Values.whisk.limits.activation.payload.max | quote }}
{{- end -}}

{{/* Environment variables for configuring etcd */}}
{{- define "openwhisk.etcdConfigEnvVars" -}}
- name: "CONFIG_whisk_cluster_name"
  value: {{ .Values.etcd.clusterName | quote }}
- name: "CONFIG_whisk_etcd_hosts"
  value: {{ include "openwhisk.etcd_host" . }}:{{ .Values.etcd.port }}
- name: "CONFIG_whisk_etcd_lease_timeout"
  value: {{ .Values.etcd.leaseTimeout | quote }}
- name: "CONFIG_whisk_etcd_pool_threads"
  value: {{ .Values.etcd.poolThreads | quote }}
{{- end -}}
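
As an illustration, this is roughly what `openwhisk.etcdConfigEnvVars` would render to for an in-cluster etcd. Everything concrete here is an assumption made for the example: a release named `owdev`, the namespace `openwhisk`, a `k8s.domain` of `cluster.local`, and the etcd values listed in the comment. None of these are confirmed chart defaults.

```yaml
# Rendered sketch. Assumed inputs (illustrative only):
#   etcd.external=false, etcd.clusterName=whisk-cluster, etcd.port=2379,
#   etcd.leaseTimeout=1, etcd.poolThreads=10
- name: "CONFIG_whisk_cluster_name"
  value: "whisk-cluster"
- name: "CONFIG_whisk_etcd_hosts"
  value: owdev-etcd.openwhisk.svc.cluster.local:2379
- name: "CONFIG_whisk_etcd_lease_timeout"
  value: "1"
- name: "CONFIG_whisk_etcd_pool_threads"
  value: "10"
```

The hosts entry is the only value not piped through `quote`. A `host:port` string is still parsed as a YAML string, so the missing quotes should be harmless here.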

{{/* Environment variables for configuring kafka topics */}}
{{- define "openwhisk.kafkaConfigEnvVars" -}}
- name: "CONFIG_whisk_kafka_replicationFactor"
  value: {{ .Values.whisk.kafka.replicationFactor | quote }}
- name: "CONFIG_whisk_kafka_topics_prefix"
  value: {{ .Values.whisk.kafka.topics.prefix | quote }}
- name: "CONFIG_whisk_kafka_topics_cacheInvalidation_retentionBytes"
  value: {{ .Values.whisk.kafka.topics.cacheInvalidation.retentionBytes | quote }}
- name: "CONFIG_whisk_kafka_topics_cacheInvalidation_retentionMs"
@@ -224,12 +252,27 @@ app: {{ template "openwhisk.fullname" . }}
  value: {{ .Values.whisk.kafka.topics.health.retentionMs | quote }}
- name: "CONFIG_whisk_kafka_topics_health_segmentBytes"
  value: {{ .Values.whisk.kafka.topics.health.segmentBytes | quote }}

- name: "CONFIG_whisk_kafka_topics_invoker_retentionBytes"
  value: {{ .Values.whisk.kafka.topics.invoker.retentionBytes | quote }}
- name: "CONFIG_whisk_kafka_topics_invoker_retentionMs"
  value: {{ .Values.whisk.kafka.topics.invoker.retentionMs | quote }}
- name: "CONFIG_whisk_kafka_topics_invoker_segmentBytes"
  value: {{ .Values.whisk.kafka.topics.invoker.segmentBytes | quote }}

- name: "CONFIG_whisk_kafka_topics_scheduler_retentionBytes"
  value: {{ .Values.whisk.kafka.topics.scheduler.retentionBytes | quote }}
- name: "CONFIG_whisk_kafka_topics_scheduler_retentionMs"
  value: {{ .Values.whisk.kafka.topics.scheduler.retentionMs | quote }}
- name: "CONFIG_whisk_kafka_topics_scheduler_segmentBytes"
  value: {{ .Values.whisk.kafka.topics.scheduler.segmentBytes | quote }}

- name: "CONFIG_whisk_kafka_topics_creationAck_retentionBytes"
  value: {{ .Values.whisk.kafka.topics.creationAck.retentionBytes | quote }}
- name: "CONFIG_whisk_kafka_topics_creationAck_retentionMs"
  value: {{ .Values.whisk.kafka.topics.creationAck.retentionMs | quote }}
- name: "CONFIG_whisk_kafka_topics_creationAck_segmentBytes"
  value: {{ .Values.whisk.kafka.topics.creationAck.segmentBytes | quote }}
{{- end -}}

{{/* tlssecretname for ingress */}}
22 changes: 22 additions & 0 deletions helm/openwhisk/templates/_readiness.tpl
@@ -38,6 +38,17 @@
command: ["sh", "-c", 'cacert="/var/run/secrets/kubernetes.io/serviceaccount/ca.crt"; token="$(cat /var/run/secrets/kubernetes.io/serviceaccount/token)"; while true; do rc=$(curl -sS --cacert $cacert --header "Authorization: Bearer $token" https://kubernetes.default.svc/api/v1/namespaces/{{ .Release.Namespace }}/endpoints/{{ .Release.Name }}-kafka | jq -r ".subsets[].addresses | length"); echo "num ready kafka endpoints is $rc"; if [ $rc -gt 0 ]; then echo "Success: ready kafka endpoint!"; break; fi; echo "kafka not ready yet; sleeping for 3 seconds"; sleep 3; done;']
{{- end -}}

{{/* Init container that waits for etcd to be ready */}}
{{- define "openwhisk.readiness.waitForEtcd" -}}
- name: "wait-for-etcd"
  image: "{{- .Values.docker.registry.name -}}{{- .Values.utility.imageName -}}:{{- .Values.utility.imageTag -}}"
  imagePullPolicy: "IfNotPresent"
  env:
  - name: "READINESS_URL"
    value: http://{{ include "openwhisk.etcd_host" . }}:{{ .Values.etcd.port }}/health
  command: ["sh", "-c", "while true; do echo 'checking etcd readiness'; health_result=$(curl -m 5 $READINESS_URL) && echo $health_result | jq -e '. | select(.health==\"true\")'; result=$?; if [ $result -eq 0 ]; then echo 'Success: etcd is ready!'; break; fi; echo '...not ready yet; sleeping 3 seconds before retry'; sleep 3; done;"]
{{- end -}}

{{/* Init container that waits for zookeeper to be ready */}}
{{- define "openwhisk.readiness.waitForZookeeper" -}}
- name: "wait-for-zookeeper"
@@ -57,6 +68,17 @@
command: ["sh", "-c", "result=1; until [ $result -eq 0 ]; do echo 'Checking controller readiness'; wget -T 5 --spider $READINESS_URL; result=$?; sleep 1; done; echo 'Success: controller is ready'"]
{{- end -}}

{{/* Init container that waits for scheduler to be ready */}}
{{- define "openwhisk.readiness.waitForScheduler" -}}
- name: "wait-for-scheduler"
  image: "{{- .Values.docker.registry.name -}}{{- .Values.busybox.imageName -}}:{{- .Values.busybox.imageTag -}}"
  imagePullPolicy: "IfNotPresent"
  env:
  - name: "READINESS_URL"
    value: http://{{ include "openwhisk.scheduler_host" . }}:{{ .Values.scheduler.endpoints.port }}/ping
  command: ["sh", "-c", "result=1; until [ $result -eq 0 ]; do echo 'Checking scheduler readiness'; wget -T 5 --spider $READINESS_URL; result=$?; sleep 1; done; echo 'Success: scheduler is ready'"]
{{- end -}}

{{/* Init container that waits for at least 1 healthy invoker */}}
{{- define "openwhisk.readiness.waitForHealthyInvoker" -}}
- name: "wait-for-healthy-invoker"
29 changes: 28 additions & 1 deletion helm/openwhisk/templates/controller-pod.yaml
@@ -60,6 +60,9 @@ spec:
{{- if not .Values.controller.lean }}
# The controller must wait for kafka and/or couchdb to be ready before it starts
{{ include "openwhisk.readiness.waitForKafka" . | indent 6 }}
{{- if .Values.scheduler.enabled }}
{{ include "openwhisk.readiness.waitForEtcd" . | indent 6 }}
{{- end }}
{{- end }}
{{ include "openwhisk.readiness.waitForCouchDB" . | indent 6 }}
{{- if eq .Values.activationStoreBackend "ElasticSearch" }}
@@ -85,7 +88,7 @@ spec:
- name: controller
containerPort: {{ .Values.controller.port }}
- name: akka-remoting
-containerPort: 2552
+containerPort: 25520
- name: akka-mgmt-http
containerPort: 19999
{{- if .Values.controller.lean }}
@@ -114,6 +117,11 @@ spec:
- name: "TZ"
value: {{ .Values.docker.timezone | quote }}

- name: "POD_IP"
valueFrom:
fieldRef:
fieldPath: status.podIP

- name: "CONFIG_whisk_info_date"
valueFrom:
configMapKeyRef:
@@ -137,6 +145,15 @@ spec:
- name: "RUNTIMES_MANIFEST"
value: {{ template "openwhisk.runtimes_manifest" . }}

# scheduler settings
{{ if .Values.scheduler.enabled }}
- name: "CONFIG_whisk_spi_LoadBalancerProvider"
value: "org.apache.openwhisk.core.loadBalancer.FPCPoolBalancer"

- name: "CONFIG_whisk_spi_EntitlementSpiProvider"
value: "org.apache.openwhisk.core.entitlement.FPCEntitlementProvider"
{{ end }}

# Action limits
{{ include "openwhisk.limitsEnvVars" . | indent 8 }}

@@ -151,11 +168,17 @@ spec:
value: "{{ include "openwhisk.kafka_connect" . }}"
{{ include "openwhisk.kafkaConfigEnvVars" . | indent 8 }}

# etcd properties
{{- if .Values.scheduler.enabled }}
{{ include "openwhisk.etcdConfigEnvVars" . | indent 8 }}
{{- end }}

# properties for DB connection
{{ include "openwhisk.dbEnvVars" . | indent 8 }}

- name: "CONTROLLER_INSTANCES"
value: {{ .Values.controller.replicaCount | quote }}

{{- if gt (int .Values.controller.replicaCount) 1 }}
- name: "CONFIG_whisk_cluster_useClusterBootstrap"
value: "true"
@@ -169,7 +192,11 @@ spec:
value: "name={{ .Release.Name }}-controller"
- name: "CONFIG_akka_discovery_kubernetesApi_podPortName"
value: "akka-mgmt-http"
{{- else }}
- name: "CONFIG_akka_cluster_seedNodes_0"
value: "akka://controller-actor-system@$(POD_IP):25520"
{{- end }}
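
A note on the single-replica branch above: Kubernetes expands a `$(POD_IP)` reference inside an env value only if `POD_IP` is defined earlier in the same container's `env` list. That is presumably why this PR adds the `POD_IP` entry further up. The trimmed sketch below shows the required ordering; the indentation and the enclosing `env:` key are assumptions about where these entries sit in the container spec.

```yaml
env:
  # Defined first: filled in from the pod's status via the downward API.
  - name: "POD_IP"
    valueFrom:
      fieldRef:
        fieldPath: status.podIP
  # Defined later, so $(POD_IP) can be substituted when the container starts.
  - name: "CONFIG_akka_cluster_seedNodes_0"
    value: "akka://controller-actor-system@$(POD_IP):25520"
```

If the reference cannot be resolved, Kubernetes leaves the literal string `$(POD_IP)` in place. In that case the seed node address would be invalid.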

{{- if .Values.metrics.prometheusEnabled }}
- name: "OPENWHISK_ENCODED_CONFIG"
value: {{ template "openwhisk.whiskconfig" . }}
114 changes: 114 additions & 0 deletions helm/openwhisk/templates/etcd-pod.yaml
@@ -0,0 +1,114 @@
#
# Licensed to the Apache Software Foundation (ASF) under one or more
# contributor license agreements. See the NOTICE file distributed with
# this work for additional information regarding copyright ownership.
# The ASF licenses this file to You under the Apache License, Version 2.0
# (the "License"); you may not use this file except in compliance with
# the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#

{{ if not .Values.etcd.external }}
apiVersion: apps/v1
kind: Deployment
metadata:
  name: {{ .Release.Name }}-etcd
  labels:
    name: {{ .Release.Name }}-etcd
{{ include "openwhisk.label_boilerplate" . | indent 4 }}
spec:
  replicas: {{ .Values.etcd.replicaCount }}
  selector:
    matchLabels:
      name: {{ .Release.Name }}-etcd
  {{- if .Values.k8s.persistence.enabled }}
  strategy:
    type: "Recreate"
  {{- end }}
  template:
    metadata:
      labels:
        name: {{ .Release.Name }}-etcd
{{ include "openwhisk.label_boilerplate" . | indent 8 }}
    spec:
      restartPolicy: {{ .Values.etcd.restartPolicy }}

      {{- if .Values.affinity.enabled }}
      affinity:
{{ include "openwhisk.affinity.core" . | indent 8 }}
{{ include "openwhisk.affinity.selfAntiAffinity" ( printf "%s-etcd" .Release.Name | quote ) | indent 8 }}
      {{- end }}

      {{- if .Values.toleration.enabled }}
      tolerations:
{{ include "openwhisk.toleration.core" . | indent 8 }}
      {{- end }}

      {{- if .Values.k8s.persistence.enabled }}
      volumes:
      - name: etcd-data
        persistentVolumeClaim:
          claimName: {{ .Release.Name }}-etcd-pvc
      {{- end }}

      {{- if .Values.k8s.persistence.enabled }}
      initContainers:
      - name: etcd-init
        image: "{{- .Values.docker.registry.name -}}{{- .Values.busybox.imageName -}}:{{- .Values.busybox.imageTag -}}"
        command:
        - chown
        - -v
        - -R
        - 999:999
        - /data
        volumeMounts:
        - mountPath: /data
          name: etcd-data
          readOnly: false
      {{- end }}
{{ include "openwhisk.docker.imagePullSecrets" . | indent 6 }}
      # current command will always restart from scratch (no persistence)
      containers:
      - name: etcd
        image: "{{- .Values.docker.registry.name -}}{{- .Values.etcd.imageName -}}:{{- .Values.etcd.imageTag -}}"
        command:
        - /usr/local/bin/etcd
        - --data-dir=/data
        - --name
        - etcd0
        - --initial-advertise-peer-urls
        - http://127.0.0.1:2480
        - --advertise-client-urls
        - http://0.0.0.0:{{ .Values.etcd.port }}
        - --listen-peer-urls
        - http://127.0.0.1:2480
        - --listen-client-urls
        - http://0.0.0.0:{{ .Values.etcd.port }}
        - --initial-cluster
        - etcd0=http://127.0.0.1:2480
        - --initial-cluster-state
        - new
        - --initial-cluster-token
        - openwhisk-etcd-token
        - --quota-backend-bytes
        - "0"
        - --snapshot-count
        - "100000"
        - --auto-compaction-retention
        - "1"
        - --auto-compaction-mode
        - periodic
        - --log-level
        - info
        imagePullPolicy: {{ .Values.etcd.imagePullPolicy | quote }}
        ports:
        - name: etcd
          containerPort: {{ .Values.etcd.port }}
{{ end }}
34 changes: 34 additions & 0 deletions helm/openwhisk/templates/etcd-pvc.yaml
@@ -0,0 +1,34 @@
#
# Licensed to the Apache Software Foundation (ASF) under one or more
# contributor license agreements. See the NOTICE file distributed with
# this work for additional information regarding copyright ownership.
# The ASF licenses this file to You under the Apache License, Version 2.0
# (the "License"); you may not use this file except in compliance with
# the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#

{{- if and (not .Values.etcd.external) .Values.k8s.persistence.enabled }}
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: {{ .Release.Name }}-etcd-pvc
  labels:
{{ include "openwhisk.label_boilerplate" . | indent 4 }}
spec:
{{- if not .Values.k8s.persistence.hasDefaultStorageClass }}
  storageClassName: {{ .Values.k8s.persistence.explicitStorageClass }}
{{- end }}
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: {{ .Values.etcd.persistence.size }}
{{- end }}