Skip to content

Commit

Permalink
Add documentation about operator_arguments
Browse files Browse the repository at this point in the history
  • Loading branch information
tatiana committed Sep 6, 2023
1 parent 7f996a0 commit 2705510
Show file tree
Hide file tree
Showing 2 changed files with 91 additions and 0 deletions.
1 change: 1 addition & 0 deletions docs/configuration/index.rst
Original file line number Diff line number Diff line change
Expand Up @@ -19,4 +19,5 @@ Cosmos offers a number of configuration options to customize its behavior. For m
Scheduling <scheduling>
Testing Behavior <testing-behavior>
Selecting & Excluding <selecting-excluding>
Operator Args <operator-args>
Compiled SQL <compiled-sql>
90 changes: 90 additions & 0 deletions docs/configuration/operator-args.rst
Original file line number Diff line number Diff line change
@@ -0,0 +1,90 @@
.. _operator-args:

Operator arguments
==================

It is possible to pass arguments to Cosmos operators in two ways. Either by passing them when directly instantiating Cosmos Operators
or by defining the ``operator_args`` within a ``DbtDag`` or a ``DbtTaskGroup`` instance.
The value of ``operator_args`` should be a dictionary that will become the underlining operators' ``kwargs``.

Example of how to set Kubernetes-specific operator arguments:

.. code-block:: python
DbtDag(
# ...
operator_args={
"queue": "kubernetes",
"image": "dbt-jaffle-shop:1.0.0",
"image_pull_policy": "Always",
"get_logs": True,
"is_delete_operator_pod": False,
"namespace": "default",
},
execution_config=ExecutionConfig(
execution_mode=ExecutionMode.KUBERNETES,
),
)
Example of setting a Cosmos-specific operator argument:

.. code-block:: python
DbtDag(
# ...
operator_args={"dbt_cmd_global_flags": ["--cache-selected-only"]}
)
Summary of Cosmos-specific arguments
------------------------------------

dbt-related
...........

- ``append_env``: Expose the operating system environment variables to the ``dbt`` command. The default is ``False``.
- ``dbt_cmd_flags``: List of command flags to pass to ``dbt`` command, added after dbt subcommand
- ``dbt_cmd_global_flags``: List of ``dbt`` `global flags <https://docs.getdbt.com/reference/global-configs/about-global-configs>`_ to be passed to the ``dbt`` command, before the subcommand
- ``dbt_executable_path``: Path to dbt executable.
- ``env``: Declare, using a Python dictionary, values to be set as environment variables when running ``dbt`` commands.
- ``fail_fast``: ``dbt`` exits immediately if ``dbt`` fails to process a resource.
- ``models``: Specifies which nodes to include.
- ``no_version_check``: If set, skip ensuring ``dbt``'s version matches the one specified in the ``dbt_project.yml``.
- ``quiet``: run ``dbt`` in silent mode, only displaying its error logs.
- ``vars``: Supply variables to the project. This argument overrides variables defined in the ``dbt_project.yml``.
- ``warn_error``: convert ``dbt`` warnings into errors.

Airflow-related
...............

- ``cancel_query_on_kill``: If true, cancel any running queries when the task's ``on_kill()`` is executed.
- ``output_encoding``: Declare the character encoding to parse the ``dbt`` command output.
- ``skip_exit_code``: If the task exits with this exit code, leave the task in ``skipped`` state (default: 99).

Sample usage
............

.. code-block:: python
DbtTaskGroup(
# ...
operator_args={
"append_env": True,
"dbt_cmd_flags": ["--models", "stg_customers"],
"dbt_cmd_global_flags": ["--cache-selected-only"],
"dbt_executable_path": Path("/home/user/dbt"),
"env": {"MY_ENVVAR": "some-value"},
"fail_fast": True,
"no_version_check": True,
"quiet": True,
"vars": {
"start_time": "{{ data_interval_start.strftime('%Y%m%d%H%M%S') }}",
"end_time": "{{ data_interval_end.strftime('%Y%m%d%H%M%S') }}",
},
"warn_error": True,
"cancel_query_on_kill": False,
"output_enconding": "utf-8",
"skip_exit_code": 1,
}
)

0 comments on commit 2705510

Please sign in to comment.