docs: cleanup README
Signed-off-by: Mathew Wicks <[email protected]>
thesuperzapper committed Apr 13, 2022
1 parent a1bc9af commit 57a3bce
Showing 35 changed files with 2,574 additions and 1,828 deletions.
12 changes: 6 additions & 6 deletions charts/airflow/CHANGELOG.md
@@ -105,11 +105,11 @@ TBD
> 🟨 __NOTES__ 🟨
>
- > - You can now use Secrets and ConfigMaps to define your `airflow.{users,connections,pools,variables}`, see the docs:
- > - [How to create airflow users?](https://github.com/airflow-helm/charts/tree/main/charts/airflow#how-to-create-airflow-users)
- > - [How to create airflow connections?](https://github.com/airflow-helm/charts/tree/main/charts/airflow#how-to-create-airflow-connections)
- > - [How to create airflow variables?](https://github.com/airflow-helm/charts/tree/main/charts/airflow#how-to-create-airflow-variables)
- > - [How to create airflow pools?](https://github.com/airflow-helm/charts/tree/main/charts/airflow#how-to-create-airflow-pools)
+ > - You may now use Secrets and ConfigMaps to define your `airflow.{users,connections,pools,variables}`:
+ > - [How to manage airflow users?](docs/faq/security/airflow-users.md)
+ > - [How to manage airflow connections?](docs/faq/dags/airflow-connections.md)
+ > - [How to manage airflow variables?](docs/faq/dags/airflow-variables.md)
+ > - [How to manage airflow pools?](docs/faq/dags/airflow-pools.md)
### Added
- allow referencing Secrets/ConfigMaps in `airflow.{users,connections,pools,variables}` ([#281](https://github.com/airflow-helm/charts/pull/281))
@@ -262,7 +262,7 @@ TBD
- native support for [Airflow 2.0's HA scheduler](https://airflow.apache.org/docs/apache-airflow/stable/scheduler.html#running-more-than-one-scheduler), see the new `scheduler.replicas` value
- significantly improved git-sync system by moving to [kubernetes/git-sync](https://github.com/kubernetes/git-sync)
- significantly improved pip installs by moving to an init-container
- - added a [guide for integrating airflow with your "Microsoft AD" or "OAUTH"](README.md#how-to-authenticate-airflow-users-with-ldapoauth)
+ - added docs for [How to integrate airflow with LDAP or OAUTH?](docs/faq/security/ldap-oauth.md)
- general cleanup of almost every helm file
- significant docs/README rewrite

2 changes: 1 addition & 1 deletion charts/airflow/CONTRIBUTING.md
@@ -72,7 +72,7 @@ Most non-patch changes will require documentation updates.

If you __ADD a value__:
- ensure the value has a descriptive docstring in `values.yaml`
- - ensure the value is listed under `Values Reference` in [README.md](README.md#values-reference)
+ - ensure the value is listed under `Helm Values` in [README.md](README.md#helm-values)
- Note, only directly include the value if it's a top-level value like `airflow.level_1`, otherwise only include `airflow.level_1.*`

If you __bump the version__:
1,955 changes: 135 additions & 1,820 deletions charts/airflow/README.md

Large diffs are not rendered by default.

1 change: 0 additions & 1 deletion charts/airflow/UPGRADE.md

This file was deleted.

119 changes: 119 additions & 0 deletions charts/airflow/docs/faq/configuration/airflow-configs.md
@@ -0,0 +1,119 @@
[🔗 Return to `Table of Contents` for more FAQ topics 🔗](https://github.com/airflow-helm/charts/tree/main/charts/airflow#frequently-asked-questions)

> Note, this page was written for the [`User-Community Airflow Helm Chart`](https://github.com/airflow-helm/charts/tree/main/charts/airflow)
# How to set airflow configs?

## airflow.cfg

While we don't expose the `airflow.cfg` file directly, you may use [environment variables](https://airflow.apache.org/docs/stable/howto/set-config.html) to set Airflow configs.

The `airflow.config` value makes this easier. Each key-value pair is set as an environment variable on every airflow Pod, for example:

```yaml
airflow:
  config:
    ## security
    AIRFLOW__WEBSERVER__EXPOSE_CONFIG: "False"

    ## dags
    AIRFLOW__CORE__LOAD_EXAMPLES: "False"
    AIRFLOW__SCHEDULER__DAG_DIR_LIST_INTERVAL: "30"

    ## email
    AIRFLOW__EMAIL__EMAIL_BACKEND: "airflow.utils.email.send_email_smtp"
    AIRFLOW__SMTP__SMTP_HOST: "smtpmail.example.com"
    AIRFLOW__SMTP__SMTP_MAIL_FROM: "[email protected]"
    AIRFLOW__SMTP__SMTP_PORT: "25"
    AIRFLOW__SMTP__SMTP_SSL: "False"
    AIRFLOW__SMTP__SMTP_STARTTLS: "False"

    ## domain used in airflow emails
    AIRFLOW__WEBSERVER__BASE_URL: "http://airflow.example.com"

    ## other environment variables
    HTTP_PROXY: "http://proxy.example.com:8080"
```
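
Airflow reads any config option from an environment variable named `AIRFLOW__{SECTION}__{KEY}` (upper-case, with double underscores). As a small sketch of how `airflow.cfg` entries translate into `airflow.config` keys, the two options below are chosen only for illustration and their values are assumptions, not recommendations:

```yaml
airflow:
  config:
    ## equivalent to `parallelism = 16` under `[core]` in airflow.cfg
    AIRFLOW__CORE__PARALLELISM: "16"

    ## equivalent to `page_size = 50` under `[webserver]` in airflow.cfg
    AIRFLOW__WEBSERVER__PAGE_SIZE: "50"
```

Values are quoted so that they reach the Pod as strings, matching the examples above.
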
> 🟦 __Tip__ 🟦
>
> To store sensitive configs in Kubernetes secrets, you may use the `airflow.extraEnv` value.
>
> For example, to set `AIRFLOW__CORE__FERNET_KEY` from a Secret called `airflow-fernet-key` containing a key called `value`:
>
> ```yaml
> airflow:
>   extraEnv:
>     - name: AIRFLOW__CORE__FERNET_KEY
>       valueFrom:
>         secretKeyRef:
>           name: airflow-fernet-key
>           key: value
> ```
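
As a rough sketch, the Secret referenced in the tip above could be defined as follows. The names `airflow-fernet-key` and `value` come from the tip; the placeholder string is an assumption and must be replaced with your own Fernet key. The Secret needs to exist in the same namespace as the airflow Pods.

```yaml
## hypothetical manifest, for illustration only
apiVersion: v1
kind: Secret
metadata:
  name: airflow-fernet-key
type: Opaque
stringData:
  ## placeholder, replace with your own Fernet key
  value: "REPLACE_WITH_YOUR_FERNET_KEY"
```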

## webserver_config.py

We expose the `web.webserverConfig.*` values to define your Flask-AppBuilder `webserver_config.py` file.

For example, a minimal `webserver_config.py` file that uses [`AUTH_DB`](https://flask-appbuilder.readthedocs.io/en/latest/security.html#authentication-database):

```yaml
web:
  webserverConfig:
    ## the full content of the `webserver_config.py` file, as a string
    stringOverride: |
      from airflow import configuration as conf
      from flask_appbuilder.security.manager import AUTH_DB

      # the SQLAlchemy connection string
      SQLALCHEMY_DATABASE_URI = conf.get('core', 'SQL_ALCHEMY_CONN')

      # use embedded DB for auth
      AUTH_TYPE = AUTH_DB

    ## the name of an existing Secret containing a `webserver_config.py` key
    ## NOTE: if set, takes precedence over `web.webserverConfig.stringOverride`
    #existingSecret: "my-airflow-webserver-config"
```
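
If you prefer `web.webserverConfig.existingSecret`, the Secret might look something like the sketch below. The name `my-airflow-webserver-config` is taken from the commented-out value above, and the file content simply repeats the `AUTH_DB` example; treat the manifest as an illustration rather than a required layout.

```yaml
## hypothetical manifest, for illustration only
apiVersion: v1
kind: Secret
metadata:
  name: my-airflow-webserver-config
type: Opaque
stringData:
  ## the key must be called `webserver_config.py`
  webserver_config.py: |
    from airflow import configuration as conf
    from flask_appbuilder.security.manager import AUTH_DB

    # the SQLAlchemy connection string
    SQLALCHEMY_DATABASE_URI = conf.get('core', 'SQL_ALCHEMY_CONN')

    # use embedded DB for auth
    AUTH_TYPE = AUTH_DB
```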

> 🟦 __Tip__ 🟦
>
> We also provide more detailed docs on [how to integrate airflow with LDAP or OAUTH](../security/ldap-oauth.md).

## airflow_local_settings.py

We expose the `airflow.localSettings.*` values to define your `airflow_local_settings.py` file.

For example, an `airflow_local_settings.py` file that sets a [cluster policy](https://airflow.apache.org/docs/apache-airflow/stable/concepts/cluster-policies.html) to reject dags with no tags:

```yaml
airflow:
  localSettings:
    ## the full content of the `airflow_local_settings.py` file, as a string
    stringOverride: |
      from airflow.models import DAG
      from airflow.exceptions import AirflowClusterPolicyViolation

      def dag_policy(dag: DAG):
          """Ensure that DAG has at least one tag"""
          if not dag.tags:
              raise AirflowClusterPolicyViolation(
                  f"DAG {dag.dag_id} has no tags. At least one tag required. File path: {dag.fileloc}"
              )

    ## the name of an existing Secret containing a `airflow_local_settings.py` key
    ## NOTE: if set, takes precedence over `airflow.localSettings.stringOverride`
    #existingSecret: "my-airflow-local-settings"
```

For example, an `airflow_local_settings.py` file that sets a custom `xcom_sidecar` container image for `KubernetesPodOperator`:

```yaml
airflow:
  localSettings:
    ## the full content of the `airflow_local_settings.py` file, as a string
    stringOverride: |
      # use a custom `xcom_sidecar` image for KubernetesPodOperator()
      from airflow.kubernetes.pod_generator import PodDefaults
      PodDefaults.SIDECAR_CONTAINER.image = "gcr.io/PROJECT-ID/custom-sidecar-image"
```
170 changes: 170 additions & 0 deletions charts/airflow/docs/faq/configuration/airflow-plugins.md
@@ -0,0 +1,170 @@
[🔗 Return to `Table of Contents` for more FAQ topics 🔗](https://github.com/airflow-helm/charts/tree/main/charts/airflow#frequently-asked-questions)

> Note, this page was written for the [`User-Community Airflow Helm Chart`](https://github.com/airflow-helm/charts/tree/main/charts/airflow)
# How to load airflow plugins?

There are multiple ways to load [airflow plugins](https://airflow.apache.org/docs/apache-airflow/stable/plugins.html) when using the chart.

## Option 1 - embedded into container image (recommended)

This chart uses the official [apache/airflow](https://hub.docker.com/r/apache/airflow) images, so you may extend the airflow container image to include your airflow plugins.

For example, here is a Dockerfile that extends `apache/airflow:2.1.4-python3.8` with custom plugins:

```dockerfile
FROM apache/airflow:2.1.4-python3.8

# plugin files can be copied under `/opt/airflow/plugins`,
# which is the default `AIRFLOW__CORE__PLUGINS_FOLDER`
# (where `./plugins` is relative to the docker build context)
COPY plugins/* /opt/airflow/plugins/

# plugins exposed as python packages can be installed with pip
RUN pip install --no-cache-dir \
    example==1.0.0
```

After building and tagging your Dockerfile as `MY_REPO:MY_TAG`, you may use it with the chart by specifying `airflow.image.*`:

```yaml
airflow:
  image:
    repository: MY_REPO
    tag: MY_TAG

    ## WARNING: even if set to "Always" do not reuse tag names, as containers only pull the latest image when restarting
    pullPolicy: IfNotPresent
```
## Option 2 - git-sync dags repo

> 🟥 __Warning__ 🟥
>
> With "Option 2", you must manually restart the webserver and scheduler pods for plugin changes to take effect.

If you are using git-sync to [load your DAG definitions](../dags/load-dag-definitions.md), you may also include your plugins in this repo.

For example, if your DAG git repo includes plugins under `./PATH/TO/PLUGINS`:

```yaml
airflow:
  config:
    ## NOTE: there is an extra `/repo/` in the path
    AIRFLOW__CORE__PLUGINS_FOLDER: /opt/airflow/dags/repo/PATH/TO/PLUGINS

dags:
  ## NOTE: this is the default value
  #path: /opt/airflow/dags

  gitSync:
    enabled: true
    repo: "[email protected]:USERNAME/REPOSITORY.git"
    branch: "master"
    revision: "HEAD"
    syncWait: 60
    sshSecret: "airflow-ssh-git-secret"
    sshSecretKey: "id_rsa"

    # "known_hosts" verification can be disabled by setting to ""
    sshKnownHosts: |-
      github.com ssh-rsa AAAAB3NzaC1yc2EAAAABIwAAAQEAq2A7hRGmdnm9tUDbO9IDSwBK6TbQa+PXYPCPy6rbTrTtw7PHkccKrpp0yVhp5HdEIcKr6pLlVDBfOLX9QUsyCOV0wzfjIJNlGEYsdlLJizHhbn2mUjvSAHQqZETYP81eFzLQNnPHt4EVVUh7VfDESU84KezmD5QlWpXLmvU31/yMf+Se8xhHTvKSCZIFImWwoG6mbUoWf9nzpIoaSjB+weqqUUmpaaasXVal72J+UX2B+2RPW3RcT0eOzQgqlJL3RKrTJvdsjE3JEAvGq3lGHSZXy28G3skua2SmVi/w4yCE6gbODqnTWlg7+wC604ydGXA8VJiS5ap43JXiUFFAaQ==
```
## Option 3 - persistent volume

> 🟥 __Warning__ 🟥
>
> With "Option 3", you must manually restart the webserver and scheduler pods for plugin changes to take effect.

You may load airflow plugins that are stored in a Kubernetes [Persistent Volume](https://kubernetes.io/docs/concepts/storage/persistent-volumes/) by using the `airflow.extraVolumeMounts` and `airflow.extraVolumes` values.

For example, to mount a PersistentVolumeClaim called `airflow-plugins` that contains airflow plugin files at its root:

```yaml
airflow:
  config:
    ## NOTE: this is the default value
    #AIRFLOW__CORE__PLUGINS_FOLDER: /opt/airflow/plugins

  extraVolumeMounts:
    - name: airflow-plugins
      mountPath: /opt/airflow/plugins
      ## NOTE: if plugin files are not at the root of the volume, you may set a subPath
      #subPath: "path/to/plugins"
      readOnly: true

  extraVolumes:
    - name: airflow-plugins
      persistentVolumeClaim:
        claimName: airflow-plugins
```
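
For reference, a PersistentVolumeClaim matching the `claimName` above might look like the sketch below. The access mode and storage size are assumptions for illustration; pick values that your cluster's storage class actually supports.

```yaml
## hypothetical manifest, for illustration only
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: airflow-plugins
spec:
  ## assumption: several Pods (possibly on different nodes) mount the volume,
  ## and plugin files are written to it from outside the chart
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      ## assumption: adjust to the size of your plugin files
      storage: 1Gi
```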

## Option 4 - ConfigMaps or Secrets

> 🟥 __Warning__ 🟥
>
> With "Option 4", you must manually restart the webserver and scheduler pods for plugin changes to take effect.

You may load airflow plugins that are stored in Kubernetes Secrets or ConfigMaps by using the `airflow.extraVolumeMounts` and `airflow.extraVolumes` values.

For example, to mount airflow plugin files from a ConfigMap called `airflow-plugins`:

```yaml
airflow:
  config:
    ## NOTE: this is the default value
    #AIRFLOW__CORE__PLUGINS_FOLDER: /opt/airflow/plugins

  extraVolumeMounts:
    - name: airflow-plugins
      mountPath: /opt/airflow/plugins
      readOnly: true

  extraVolumes:
    - name: airflow-plugins
      configMap:
        name: airflow-plugins
```
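
The same approach should work for a Secret by swapping the volume source. The sketch below assumes a Secret called `airflow-plugins` (a hypothetical name, reused here for symmetry) whose keys are the plugin file names:

```yaml
airflow:
  extraVolumeMounts:
    - name: airflow-plugins
      mountPath: /opt/airflow/plugins
      readOnly: true

  extraVolumes:
    - name: airflow-plugins
      secret:
        ## assumption: each key in this Secret is a plugin file, e.g. `my_airflow_plugin.py`
        secretName: airflow-plugins
```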

> 🟦 __Tip__ 🟦
>
> Your `airflow-plugins` ConfigMap might look something like this.
>
> ```yaml
> apiVersion: v1
> kind: ConfigMap
> metadata:
>   name: airflow-plugins
> data:
>   my_airflow_plugin.py: |
>     from airflow.plugins_manager import AirflowPlugin
>
>     class MyAirflowPlugin(AirflowPlugin):
>         name = "my_airflow_plugin"
>         ...
> ```

> 🟦 __Tip__ 🟦
>
> You may include the ConfigMap as an [extra manifest](../kubernetes/extra-manifests.md) of the chart using the `extraManifests` value.
>
> ```yaml
> extraManifests:
>   - |
>     apiVersion: v1
>     kind: ConfigMap
>     metadata:
>       name: airflow-plugins
>       labels:
>         app: {{ include "airflow.labels.app" . }}
>         chart: {{ include "airflow.labels.chart" . }}
>         release: {{ .Release.Name }}
>         heritage: {{ .Release.Service }}
>     data:
>       my_airflow_plugin.py: |
>         from airflow.plugins_manager import AirflowPlugin
>
>         class MyAirflowPlugin(AirflowPlugin):
>             name = "my_airflow_plugin"
>             ...
> ```
61 changes: 61 additions & 0 deletions charts/airflow/docs/faq/configuration/airflow-version.md
@@ -0,0 +1,61 @@
[🔗 Return to `Table of Contents` for more FAQ topics 🔗](https://github.com/airflow-helm/charts/tree/main/charts/airflow#frequently-asked-questions)

> Note, this page was written for the [`User-Community Airflow Helm Chart`](https://github.com/airflow-helm/charts/tree/main/charts/airflow)
# How to set the airflow version?

> 🟦 __Tip__ 🟦
>
> There is a default version (`airflow.image.tag`) of airflow shipped with each version of the chart; see the default [values.yaml](../../../values.yaml) for the current one.

> 🟦 __Tip__ 🟦
>
> Many versions of airflow are supported by the chart; please see the [Airflow Version Support](../../..#airflow-version-support) matrix.

## Airflow 2.X

For example, to use airflow `2.1.4`, with python `3.7`:

```yaml
airflow:
  image:
    repository: apache/airflow
    tag: 2.1.4-python3.7
```

## Airflow 1.10

> 🟥 __Warning__ 🟥
>
> To use an `airflow.image.tag` for Airflow `1.10.x`, you must set `airflow.legacyCommands` to `true`.

For example, to use airflow `1.10.15`, with python `3.8`:

```yaml
airflow:
  # WARNING: this must be "true" for airflow 1.10
  legacyCommands: true

  image:
    repository: apache/airflow
    tag: 1.10.15-python3.8
```

## Building a Custom Image

Airflow provides documentation on [building custom docker images](https://airflow.apache.org/docs/docker-stack/build.html); you may follow this process to create a custom image.

For example, after building and tagging your Dockerfile as `MY_REPO:MY_TAG`, you may use it with the chart by specifying `airflow.image.*`:

```yaml
airflow:
  # WARNING: this must be "true" for airflow 1.10
  #legacyCommands: true

  image:
    repository: MY_REPO
    tag: MY_TAG

    ## WARNING: even if set to "Always" do not reuse tag names, as containers only pull the latest image when restarting
    pullPolicy: IfNotPresent
```