Big design change (#4)
* Docstrings added

* compatibility table added

* name changed

* Big design change

* removed node group

* Additional changes

* updates

* try/catch added

* Updates

* Updates to code

* added pkg

* github workflow updated

* update

* updated
venkatajagannath authored Jul 18, 2024
1 parent 0410e04 commit e5526b4
Showing 32 changed files with 1,237 additions and 1,304 deletions.
15 changes: 9 additions & 6 deletions .astro-registry.yaml
@@ -1,15 +1,18 @@
package-name: astro-provider-ray

display-name: Kuberay
display-name: Ray

docs-url: https://github.com/astronomer/astro-provider-ray/blob/main/README.md

hooks:
- module: ray_provider.hooks.ray.RayHook

decorators:
- module: ray_provider.decorators.kuberay.ray_task
- module: ray_provider.decorators.ray.ray_task

operators:
- module: ray_provider.operators.kuberay.RayClusterOperator
- module: ray_provider.operators.kuberay.SubmitRayJob
- module: ray_provider.operators.ray.SetupRayCluster
- module: ray_provider.operators.ray.SubmitRayJob
- module: ray_provider.operators.ray.DeleteRayCluster

triggers:
- module: ray_provider.triggers.kuberay.RayJobTrigger
- module: ray_provider.triggers.ray.RayJobTrigger
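
For readers updating their own DAGs, the module moves recorded in this file can be summarised as a small lookup. The sketch below is only an illustration built from the paths listed in the diff above. The names `MODULE_RENAMES` and `migrate_import_path` are hypothetical and not part of the provider, and the helper covers only the module rename (`kuberay` to `ray`). It does not handle class changes: the diff removes `RayClusterOperator` and adds `SetupRayCluster` and `DeleteRayCluster`, which need a manual update.

```python
# Module paths taken from the .astro-registry.yaml diff above.
# The mapping and helper are illustrative only; they are not shipped with the provider.
MODULE_RENAMES = {
    "ray_provider.decorators.kuberay": "ray_provider.decorators.ray",
    "ray_provider.operators.kuberay": "ray_provider.operators.ray",
    "ray_provider.triggers.kuberay": "ray_provider.triggers.ray",
}


def migrate_import_path(dotted_path: str) -> str:
    """Return dotted_path with an old kuberay module prefix swapped for the new one."""
    for old, new in MODULE_RENAMES.items():
        if dotted_path == old or dotted_path.startswith(old + "."):
            return new + dotted_path[len(old):]
    # Paths that were not renamed (for example the hook module) pass through unchanged.
    return dotted_path
```

For example, `migrate_import_path("ray_provider.operators.kuberay.SubmitRayJob")` yields the new `ray_provider.operators.ray.SubmitRayJob` path, while the hook path is returned as-is.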
1 change: 0 additions & 1 deletion .gitignore
@@ -142,4 +142,3 @@ dmypy.json

# Cython debug symbols
cython_debug/

2 changes: 1 addition & 1 deletion .pre-commit-config.yaml
@@ -88,4 +88,4 @@ ci:
autoupdate_commit_msg: ⬆ [pre-commit.ci] pre-commit autoupdate
skip:
- mypy # build of https://github.com/pre-commit/mirrors-mypy:types-PyYAML,types-attrs,attrs,types-requests,
#types-python-dateutil,[email protected] for python@python3 exceeds tier max size 250MiB: 262.6MiB
#types-python-dateutil,[email protected] for python@python3 exceeds tier max size 250MiB: 262.6MiB
52 changes: 0 additions & 52 deletions CHANGELOG.rst
@@ -1,52 +0,0 @@
CHANGELOG
=========

1.0.0 (2024-06-03)
------------------

* Initial release, with the following decorators, operators, and triggers:

.. list-table::
:header-rows: 1

* - Decorator Class
- Import Path
- Example DAG

* - ``_RayDecoratedOperator``
- .. code-block:: python

from ray_provider.decorators.kuberay import ray_task
- N/A

.. list-table::
:header-rows: 1

* - Operator Class
- Import Path
- Example DAG

* - ``RayClusterOperator``
- .. code-block:: python

from ray_provider.operators.kuberay import RayClusterOperator
- Example DAG

* - ``SubmitRayJob``
- .. code-block:: python

from ray_provider.operators.kuberay import SubmitRayJob
- Example DAG

.. list-table::
:header-rows: 1

* - Trigger Class
- Import Path
- Example DAG

* - ``RayJobTrigger``
- .. code-block:: python

from ray_provider.triggers.kuberay import RayJobTrigger
- N/A
2 changes: 1 addition & 1 deletion CODEOWNERS
@@ -1 +1 @@
* @venkatajagannath
* @venkatajagannath @schnie @tatiana
166 changes: 0 additions & 166 deletions CONTRIBUTING.rst
@@ -1,166 +0,0 @@
Contributions
=============

Hi there! We're thrilled that you'd like to contribute to this project. Your help is essential for keeping it great.

Please note that this project is released with a `Contributor Code of Conduct <CODE_OF_CONDUCT.md>`_.
By participating in this project you agree to abide by its terms.


Issues, PRs & Discussions
-------------------------

If you have suggestions for how this project could be improved, or want to
report a bug, open an issue! We'd love all and any contributions. If you have questions, too, we'd love to hear them.

We'd also love PRs. If you're thinking of a large PR, we advise opening up an issue first to talk about it,
though! Look at the links below if you're not sure how to open a PR.

If you have other questions, use `GitHub Discussions <https://github.com/astronomer/astro-provider-ray/discussions/>`_.


Prepare PR
----------

1. Update the local sources to address the issue you are working on.

* Make sure your fork's main is synced with astro-provider-ray's main before you create a branch. See
`How to sync your fork <#how-to-sync-your-fork>`_ for details.

* Create a local branch for your development. Make sure to use latest
``astro-provider-ray/main`` as base for the branch. This allows you to easily compare
changes, have several changes that you work on at the same time and many more.

* Add necessary code and unit tests.

* Run the unit tests from the IDE or local virtualenv as you see fit.

* Ensure test coverage is above **90%** for each of the files that you are changing.

* Run and fix all the static checks. If you have
pre-commits installed, this step is automatically run while you are committing your code.
If not, you can do it manually via ``git add`` and then ``pre-commit run``.

2. Remember to keep your branches up to date with the ``main`` branch, squash commits, and
resolve all conflicts.

3. Re-run static code checks again.

4. Make sure your commit has a good title and description of the context of your change, enough
for the committer reviewing it to understand why you are proposing a change. Make sure to follow other
PR guidelines described in `pull request guidelines <#pull-request-guidelines>`_.
Create Pull Request!

Pull Request Guidelines
-----------------------

Before you submit a pull request (PR), check that it meets these guidelines:

- Include unit tests and example DAGs (wherever applicable) in your pull request.
It will help you make sure you do not break the build with your PR and that you help increase coverage.

- `Rebase your fork <http://stackoverflow.com/a/7244456/1110993>`__, and resolve all conflicts.

- When merging PRs, the committer will use **Squash and Merge**, which means your PR will be merged as one commit,
regardless of the number of commits in your PR.
During the review cycle, you can keep a commit history for easier review, but if you need to,
you can also squash all commits to reduce the maintenance burden during rebase.

- If your pull request adds functionality, make sure to update the docs as part
of the same PR. Doc string is often sufficient. Make sure to follow the
Sphinx compatible standards.

- Run tests locally before opening PR.

- Adhere to guidelines for commit messages described in this `article <http://chris.beams.io/posts/git-commit/>`__.
This makes the lives of those who come after you a lot easier.

Static code checks
------------------

We check our code quality via static code checks. The static code checks in astro-provider-ray are used to verify
that the code meets certain quality standards. All the static code checks can be run through pre-commit hooks.

Your code must pass all the static code checks in the CI in order to be eligible for Code Review.
The easiest way to make sure your code is good before pushing is to use pre-commit checks locally
as described in the static code checks documentation.

You can also run some static code checks via make command using available bash scripts.

.. code-block:: bash

    make run-static-checks

Pre-commit hooks
----------------

Pre-commit hooks help speed up your local development cycle and place less burden on the CI infrastructure.
Consider installing the pre-commit hooks as a necessary prerequisite.

The pre-commit hooks by default only check the files you are currently working on and make
them fast. Yet, these checks use exactly the same environment as the CI tests
use. So, you can be sure your modifications will also work for CI if they pass
pre-commit hooks.

We have integrated the fantastic `pre-commit <https://pre-commit.com>`__ framework
in our development workflow. To install and use it, you need at least Python 3.7 locally.


Installing pre-commit hooks
^^^^^^^^^^^^^^^^^^^^^^^^^^^

It is best to use pre-commit hooks when you have your local virtualenv or conda environment
for astro-provider-ray activated, since then pre-commit hooks and other dependencies are
automatically installed. You can also install the pre-commit hooks manually
using ``pip install``.

.. code-block:: bash

    pip install pre-commit

After installation, pre-commit hooks are run automatically when you commit the code and they will
only run on the files that you change during your commit, so they are usually pretty fast and do
not slow down your iteration speed on your changes. There are also ways to disable the ``pre-commits``
temporarily when you commit your code with the ``--no-verify`` switch, or to skip certain checks that you find
too disruptive to your local workflow.

Enabling pre-commit hooks
^^^^^^^^^^^^^^^^^^^^^^^^^

To turn on pre-commit checks for ``commit`` operations in git, enter:

.. code-block:: bash

    pre-commit install

To install the checks also for ``pre-push`` operations, enter:

.. code-block:: bash

    pre-commit install -t pre-push

For details on advanced usage of the install method, use:

.. code-block:: bash

    pre-commit install --help

Coding style and best practices
-------------------------------

Most of our coding style rules are enforced programmatically by flake8 and mypy (which are run automatically
on every pull request), but there are some rules that are not yet automated and are more Airflow specific or
semantic than style.

Testing
-------

All tests are inside ``./tests`` directory.

- Just run ``pytest filepath+filename`` to run the tests.


For more information, please see the contributing guide available `here <https://github.com/astronomer/astro-provider-ray/blob/main/CONTRIBUTING.rst>`_.
77 changes: 45 additions & 32 deletions README.md
@@ -8,7 +8,7 @@ This repository provides a set of tools for integrating Ray with Apache Airflow,
- **_RayDecoratedOperator**: This decorator allows you to submit a job to a Ray cluster. It simplifies the integration process by decorating your task functions to work seamlessly with Ray.

#### Operators
- **RayClusterOperator**: This operator sets up a Ray cluster. It requires access to the kubeconfig file, the Ray cluster specification, and the services specification. For an example, refer to the [example_dags](https://github.com/astronomer/astro-provider-ray/tree/main/ray_provider/example_dags) folder.

- **SubmitRayJob**: This operator is used to submit a job to a Ray cluster using a specified host name. It facilitates scheduling Ray jobs to execute at defined intervals.

#### Triggers
@@ -23,67 +23,80 @@ These operators have been tested with the below versions. They will most likely
| 3.11 | 2.9.0 | 2.23.0 |


### Compatibility

These operators have been tested with the below versions. They will most likely be compatible with future versions but have not yet been tested.

| Python Version | Airflow Version | Ray Version |
|----------------|-----------------|-------------|
| 3.11 | 2.9.0 | 2.23.0 |


### Example Usage

The provided `start_ray_cluster.py` script demonstrates how to configure and use the `RayClusterOperator` and `SubmitRayJob` operators within an Airflow DAG:
The provided `setup_teardown.py` script demonstrates how to configure and use the `SetupRayCluster`, `DeleteRayCluster` and the `SubmitRayJob` operators within an Airflow DAG:

```python
import os
from airflow import DAG
from datetime import datetime, timedelta
from ray_provider.operators.kuberay import RayClusterOperator, SubmitRayJob
from ray_provider.operators.ray import SetupRayCluster, DeleteRayCluster, SubmitRayJob

default_args = {
'owner': 'airflow',
'start_date': datetime(2024, 3, 26),
'retries': 1,
'retry_delay': timedelta(minutes=1),
"owner": "airflow",
"start_date": datetime(2024, 3, 26),
"retries": 1,
"retry_delay": timedelta(minutes=0),
}

CLUSTERNAME = 'RayCluster'
REGION = 'us-east-2'
K8SPEC = '/usr/local/airflow/dags/scripts/k8.yaml'
RAY_SPEC = '/usr/local/airflow/dags/scripts/ray.yaml'
RAY_SVC = '/usr/local/airflow/dags/scripts/ray-service.yaml'
RAY_RUNTIME_ENV = {"working_dir": '/usr/local/airflow/dags/ray_scripts'}
kubeconfig_directory = f"/tmp/airflow_kubeconfigs/{REGION}/{CLUSTERNAME}/"
os.makedirs(kubeconfig_directory, exist_ok=True) # Ensure the directory exists
KUBECONFIG_PATH = os.path.join(kubeconfig_directory, "kubeconfig.yaml")
CLUSTERNAME = "RayCluster"
REGION = "us-east-2"
K8SPEC = "/usr/local/airflow/dags/scripts/k8.yaml"
RAY_SPEC = "/usr/local/airflow/dags/scripts/ray.yaml"
RAY_SVC = "/usr/local/airflow/dags/scripts/ray-service.yaml"
RAY_RUNTIME_ENV = {"working_dir": "/usr/local/airflow/dags/ray_scripts"}

dag = DAG(
'start_ray_cluster',
"Setup_Teardown",
default_args=default_args,
description='Setup EKS cluster with eksctl and deploy KubeRay operator',
schedule_interval='@daily',
description="Setup Ray cluster and submit a job",
schedule_interval=None,
)

ray_cluster = RayClusterOperator(
task_id="RayClusterOperator",
cluster_name=CLUSTERNAME,
region=REGION,
ray_namespace="ray",
setup_cluster = SetupRayCluster(
task_id="SetupRayCluster",
conn_id="ray_conn",
ray_cluster_yaml=RAY_SPEC,
ray_svc_yaml=RAY_SVC,
kubeconfig=KUBECONFIG_PATH,
ray_gpu=False,
env={},
use_gpu=False,
dag=dag,
)

submit_ray_job = SubmitRayJob(
task_id="SubmitRayJob",
host="{{ task_instance.xcom_pull(task_ids='RayClusterOperator', key='dashboard') }}",
entrypoint='python script.py',
conn_id="ray_conn",
entrypoint="python script.py",
runtime_env=RAY_RUNTIME_ENV,
num_cpus=1,
num_gpus=0,
memory=0,
resources={},
xcom_task_key="SetupRayCluster.dashboard",
dag=dag,
)

delete_cluster = DeleteRayCluster(
task_id="DeleteRayCluster",
conn_id="ray_conn",
ray_cluster_yaml=RAY_SPEC,
ray_svc_yaml=RAY_SVC,
use_gpu=False,
dag=dag,
)

# Create Ray cluster and submit Ray job
ray_cluster >> submit_ray_job
# Create ray cluster and submit ray job
setup_cluster.as_setup() >> submit_ray_job >> delete_cluster.as_teardown()
setup_cluster >> delete_cluster
```
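
The new `SubmitRayJob` call passes `xcom_task_key="SetupRayCluster.dashboard"`, where the old one templated `xcom_pull(task_ids='RayClusterOperator', key='dashboard')`. One plausible reading is that the string has the form `<task_id>.<key>`, but this is an assumption that the diff does not confirm. Under that assumption, a hypothetical helper (`split_xcom_task_key` is not a provider function) could split the value like this:

```python
from typing import Tuple


def split_xcom_task_key(xcom_task_key: str) -> Tuple[str, str]:
    """Split '<task_id>.<key>' into (task_id, key).

    Hypothetical helper: the '<task_id>.<key>' format is an assumption
    inferred from the README example, not taken from the provider's code.
    """
    # rpartition splits on the last dot, so task ids that themselves
    # contain dots (for example tasks inside a TaskGroup) stay intact.
    task_id, sep, key = xcom_task_key.rpartition(".")
    if not sep or not task_id or not key:
        raise ValueError(f"expected '<task_id>.<key>', got {xcom_task_key!r}")
    return task_id, key
```

The resulting pair could then be handed to Airflow's `task_instance.xcom_pull(task_ids=task_id, key=key)`, which is the call the old README example used directly in its template.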

### Changelog
@@ -99,4 +112,4 @@ __________________

All contributions, bug reports, bug fixes, documentation improvements, enhancements are welcome.

A detailed overview on how to contribute can be found in the [Contributing Guide](https://github.com/astronomer/astro-provider-ray/blob/main/CONTRIBUTING.rst)
A detailed overview on how to contribute can be found in the [Contributing Guide](https://github.com/astronomer/astro-provider-ray/blob/main/CONTRIBUTING.rst)