Update documentation of hooks LIFO order #3013

Merged
merged 27 commits into from
Sep 18, 2023
Changes from 10 commits
Commits
9d4d1b0
Update release note
noklam Sep 6, 2023
6c079b3
Merge branch 'main' into noklam/document-the-lifo-order-1940
noklam Sep 6, 2023
2b2d51e
add placeholder
noklam Sep 6, 2023
cf6e755
update link
noklam Sep 6, 2023
2983609
add hook execution order
noklam Sep 7, 2023
8cca7e3
update template
noklam Sep 7, 2023
8c87790
Update template
noklam Sep 7, 2023
d274023
add pyproject.toml example
noklam Sep 7, 2023
37b0fc6
update plugin list
noklam Sep 7, 2023
3bcbd05
Merge branch 'main' into noklam/document-the-lifo-order-1940
noklam Sep 7, 2023
ead8abe
Introducing a weasel word
stichbury Sep 8, 2023
661a890
Further changes to Hooks docs
stichbury Sep 8, 2023
77ad03f
fix typo with review suggestions.
noklam Sep 8, 2023
66c1f1e
Apply suggestions from code review
noklam Sep 8, 2023
fcc8672
Add a new command `make language-lint` for doc lint. Trigger only man…
noklam Sep 8, 2023
5511c97
Merge branch 'noklam/document-the-lifo-order-1940' of https://github.…
noklam Sep 8, 2023
424c14f
fix makefile
noklam Sep 8, 2023
6a83c1b
fix lint
noklam Sep 8, 2023
7db3db2
Fix broken links
noklam Sep 8, 2023
f253436
Fix links
noklam Sep 8, 2023
930d1cf
Remove setup.py mention, reference it to 0.18.13 documentation
noklam Sep 8, 2023
18746e5
Merge branch 'main' into noklam/document-the-lifo-order-1940
noklam Sep 13, 2023
1ffb3b3
remove debug message
noklam Sep 14, 2023
38c5e3a
Fix release note
noklam Sep 14, 2023
4f45096
Apply suggestions from code review
noklam Sep 14, 2023
7460c59
Apply suggestions from code review
noklam Sep 14, 2023
1e4bce4
Merge branch 'main' into noklam/document-the-lifo-order-1940
noklam Sep 18, 2023
1 change: 1 addition & 0 deletions RELEASE.md
@@ -14,6 +14,7 @@
* Updated dataset factories to resolve nested catalog config properly.

## Documentation changes
* Added documentation to clarify the execution order of hooks.
noklam marked this conversation as resolved.
## Breaking changes to the API
## Upcoming deprecations for Kedro 0.19.0

47 changes: 33 additions & 14 deletions docs/source/extend_kedro/plugins.md
@@ -29,8 +29,15 @@ def to_json(metadata):
pipeline = pipelines["__default__"]
print(pipeline.to_json())
```
Starting from 0.18.14, Kedro switch to replace `setup.py` with `pyproject.toml`. The plugin need to provide the entry points in either file.
Contributor

Suggested change
Starting from 0.18.14, Kedro switch to replace `setup.py` with `pyproject.toml`. The plugin need to provide the entry points in either file.
From version 0.18.14, Kedro replaced `setup.py` with `pyproject.toml`. The plugin needs to provide entry points in either file.

Contributor

I'm confused by this -- can you add the entry point in either file? Should you, or shouldn't we make a recommendation not to use setup.py since we've replaced it with pyproject.toml?

Contributor Author

@stichbury We have moved to pyproject.toml if you created a starter with the new version, but there are projects created with older versions of Kedro that only upgraded Kedro and not the project template. Maybe it's too confusing and I should just keep the new one?


The plugin provides the following `entry_points` config in `setup.py`:
To use `pyproject.toml`, specify:
noklam marked this conversation as resolved.
```toml
[project.entry-points."kedro.project_commands"]
kedrojson = kedrojson.plugin.commands
Member

oops, typo

Contributor Author

I did test it with the plugin hooks 🥲 Maybe I messed it up when I copy-pasted it back to the doc. Thanks for spotting this.

```

To use `setup.py`, the plugin needs to provide the following `entry_points` config in `setup.py`:
noklam marked this conversation as resolved.

```python
setup(
@@ -81,6 +88,13 @@ setup(
)
```

Alternatively, to use `pyproject.toml`, you need to register the specifications as follows:

```toml
[project.entry-points."kedro.starters"]
starter = plugin.starters
Member

Shouldn't this be plugin:starters? (Same typo as above)

```
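
The two review comments on the `pyproject.toml` snippets appear to flag the same thing: the entry-point value uses a dot where the `setup.py` examples use a colon. As a minimal sketch of why that matters, the standard library's `importlib.metadata.EntryPoint` (Python 3.9+) can show how each form is parsed. The names `starter`, `plugin`, and `starters` are taken from the snippet above; nothing here imports a real plugin.

```python
from importlib.metadata import EntryPoint

# Values mirroring the snippet in this diff; no real plugin is imported.
colon_form = EntryPoint(name="starter", value="plugin:starters", group="kedro.starters")
dot_form = EntryPoint(name="starter", value="plugin.starters", group="kedro.starters")

# With a colon, the value splits into a module to import and an attribute to fetch.
print(colon_form.module, colon_form.attr)

# With only dots, the whole value is read as a module path and no attribute is set.
print(dot_form.module, dot_form.attr)
```

With the dotted form, loading the entry point would try to import a module called `plugin.starters` instead of reading the `starters` object from `plugin`, which is presumably not what the docs intend.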

After that you can use this starter with `kedro new --starter=test_plugin_starter`.

```{note}
@@ -127,7 +141,17 @@ We use the following command convention: `kedro <plugin-name> <command>`, with `

## Hooks

You can develop hook implementations and have them automatically registered to the project context when the plugin is installed. To enable this for your custom plugin, simply add the following entry in your `setup.py`:
You can develop hook implementations and have them automatically registered to the project context when the plugin is installed.

To enable this for your custom plugin, simply add the following entry in your `pyproject.toml`
noklam marked this conversation as resolved.

To use `pyproject.toml`, specify:
noklam marked this conversation as resolved.
```toml
[project.entry-points."kedro.hooks"]
plugin_name = "plugin_name.plugin:hooks"
```

If you prefer to use `setup.py`:

```python
setup(entry_points={"kedro.hooks": ["plugin_name = plugin_name.plugin:hooks"]})
@@ -162,6 +186,12 @@ You can also develop hook implementations to extend Kedro's CLI behaviour in you
setup(entry_points={"kedro.cli_hooks": ["plugin_name = plugin_name.plugin:cli_hooks"]})
```

Alternatively, to use `pyproject.toml`:
noklam marked this conversation as resolved.
```toml
[project.entry-points."kedro.cli_hooks"]
plugin_name = "plugin_name.plugin:cli_hooks"
```

where `plugin.py` is the module where you declare hook implementations:
noklam marked this conversation as resolved.

```python
@@ -204,28 +234,17 @@ connectors are implementations of the `AbstractDataset`

## Community-developed plugins

See the full list of plugins using the GitHub tag [kedro-plugin](https://github.com/topics/kedro-plugin).
See the full list of plugins using the GitHub repository [awesome-kedro](https://github.com/kedro-org/awesome-kedro?tab=readme-ov-file#kedro-plugins).
noklam marked this conversation as resolved.


```{note}
Your plugin needs to have an [Apache 2.0 compatible license](https://www.apache.org/legal/resolved.html#category-a) to be considered for this list.
```

- [Kedro-Pandas-Profiling](https://github.com/BrickFrog/kedro-pandas-profiling), by [Justin Malloy](https://github.com/BrickFrog), uses [Pandas Profiling](https://github.com/pandas-profiling/pandas-profiling) to profile datasets in the Kedro catalog
- [find-kedro](https://github.com/WaylonWalker/find-kedro), by [Waylon Walker](https://github.com/WaylonWalker), automatically constructs pipelines using `pytest`-style pattern matching
- [kedro-static-viz](https://github.com/WaylonWalker/kedro-static-viz), by [Waylon Walker](https://github.com/WaylonWalker), generates a static [Kedro-Viz](https://github.com/kedro-org/kedro-viz) site (HTML, CSS, JS)
- [steel-toes](https://github.com/WaylonWalker/steel-toes), by [Waylon Walker](https://github.com/WaylonWalker), prevents stepping on toes by automatically branching data paths
- [kedro-wings](https://github.com/tamsanh/kedro-wings), by [Tam-Sanh Nguyen](https://github.com/tamsanh), simplifies and speeds up pipeline creation by auto-generating catalog datasets
- [kedro-great](https://github.com/tamsanh/kedro-great), by [Tam-Sanh Nguyen](https://github.com/tamsanh), integrates Kedro with [Great Expectations](https://greatexpectations.io), enabling catalog-based expectation generation and data validation on pipeline run
- [Kedro-Accelerator](https://github.com/deepyaman/kedro-accelerator), by [Deepyaman Datta](https://github.com/deepyaman), speeds up pipelines by parallelizing I/O in the background
- [kedro-dataframe-dropin](https://github.com/mzjp2/kedro-dataframe-dropin), by [Zain Patel](https://github.com/mzjp2), lets you swap out pandas datasets for modin or RAPIDs equivalents for specialised use to speed up your workflows (e.g on GPUs)
- [kedro-mlflow](https://github.com/Galileo-Galilei/kedro-mlflow), by [Yolan Honoré-Rougé](https://github.com/galileo-galilei) and [Takieddine Kadiri](https://github.com/takikadiri), facilitates [MLflow](https://www.mlflow.org/) integration within a Kedro project. Its main features are modular configuration, automatic parameters tracking, datasets versioning, Kedro pipelines packaging and serving and automatic synchronization between training and inference pipelines for high reproducibility of machine learning experiments and ease of deployment. A tutorial is provided in the [kedro-mlflow-tutorial repo](https://github.com/Galileo-Galilei/kedro-mlflow-tutorial). You can find more information in the [kedro-mlflow documentation](https://kedro-mlflow.readthedocs.io/en/stable/).
- [Kedro-Neptune](https://github.com/neptune-ai/kedro-neptune), by [Jakub Czakon](https://github.com/jakubczakon) and [Rafał Jankowski](https://github.com/Raalsky), lets you have all the benefits of a nicely organized Kedro pipeline with Neptune: a powerful user interface built for ML metadata management. It lets you browse and filter pipeline executions, compare nodes and pipelines on metrics and parameters, and visualize pipeline metadata like learning curves, node outputs, and charts. For more information, tutorials and videos, go to the [Kedro-Neptune documentation](https://docs.neptune.ai/integrations-and-supported-tools/automation-pipelines/kedro).
- [kedro-dolt](https://www.dolthub.com/blog/2021-06-16-kedro-dolt-plugin/), by [Max Hoffman](https://github.com/max-hoffman) and [Oscar Batori](https://github.com/oscarbatori), allows you to expand the data versioning abilities of data scientists and engineers
- [kedro-kubeflow](https://github.com/getindata/kedro-kubeflow), by [GetInData](https://github.com/getindata), lets you run and schedule pipelines on Kubernetes clusters using [Kubeflow Pipelines](https://www.kubeflow.org/docs/components/pipelines/overview/)
- [kedro-airflow-k8s](https://github.com/getindata/kedro-airflow-k8s), by [GetInData](https://github.com/getindata), enables running a Kedro pipeline with Airflow on a Kubernetes cluster
- [kedro-vertexai](https://github.com/getindata/kedro-vertexai), by [GetInData](https://github.com/getindata), enables running a Kedro pipeline with Vertex AI Pipelines service
- [kedro-azureml](https://github.com/getindata/kedro-azureml), by [GetInData](https://github.com/getindata), enables running a Kedro pipeline with Azure ML Pipelines service
- [kedro-sagemaker](https://github.com/getindata/kedro-sagemaker), by [GetInData](https://github.com/getindata), enables running a Kedro pipeline with Amazon SageMaker service
- [kedro-partitioned](https://github.com/ProjetaAi/kedro-partitioned), by [Gabriel Daiha Alves](https://github.com/gabrieldaiha) and [Nickolas da Rocha Machado](https://github.com/nickolasrm), extends the functionality on processing partitioned data.
- [kedro-auto-catalog](https://github.com/WaylonWalker/kedro-auto-catalog), by [Waylon Walker](https://github.com/WaylonWalker) A configurable replacement for `kedro catalog create` that allows you to create default dataset types other than MemoryDataset.
15 changes: 15 additions & 0 deletions docs/source/hooks/introduction.md
@@ -113,6 +113,9 @@ Kedro also has auto-discovery enabled by default. This means that any installed
Auto-discovered Hooks will run *first*, followed by the ones specified in `settings.py`.
```

#### Auto-registered Hook with Plugin
You can auto-register a Hook (and make it pip-installable) by creating a [Kedro plugin](https://docs.kedro.org/en/stable/extend_kedro/plugins.html#hooks). Kedro provides the `kedro.hooks` entry point group for this purpose.


#### Disable auto-registered plugins' Hooks

@@ -126,6 +129,18 @@ DISABLE_HOOKS_FOR_PLUGINS = ("<plugin_name>",)

where `<plugin_name>` is the name of an installed plugin for which the auto-registered Hooks must be disabled.

### Hook Execution Order
Hooks follow a Last-In-First-Out (LIFO) order, which means the first registered hook will be executed last.

Hooks are registered in the following order:
1. Project hooks in `settings.py`. If you have `HOOKS = (hook_a, hook_b,)`, `hook_b` will be executed before `hook_a`.
2. Plugin hooks registered in `kedro.hooks`, which follow alphabetical order.

In general, hook execution order is not guaranteed and you shouldn't rely on it. If you need to make sure a hook is executed first or last, you can use the [`tryfirst` or `trylast` argument](https://pluggy.readthedocs.io/en/stable/index.html#call-time-order) for `hook_impl`.
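
The ordering rules above can be illustrated with a small toy model. This is only a sketch under stated assumptions: `ToyHookRegistry` and its methods are hypothetical names written for this example, and the code does not use or reproduce pluggy's or Kedro's actual implementation. It models "last registered runs first", with `tryfirst` and `trylast` moving an implementation to the front or back.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class _Registered:
    func: Callable[[], str]
    index: int     # position in registration order
    priority: int  # 1 = tryfirst, 0 = default, -1 = trylast


class ToyHookRegistry:
    """Toy model of LIFO hook dispatch (hypothetical, not pluggy's code)."""

    def __init__(self) -> None:
        self._impls: List[_Registered] = []

    def register(self, func, *, tryfirst=False, trylast=False) -> None:
        priority = 1 if tryfirst else (-1 if trylast else 0)
        self._impls.append(_Registered(func, len(self._impls), priority))

    def call_all(self) -> List[str]:
        # Higher priority runs earlier; within one priority level,
        # the most recently registered implementation runs first.
        ordered = sorted(
            self._impls, key=lambda r: (r.priority, r.index), reverse=True
        )
        return [r.func() for r in ordered]


registry = ToyHookRegistry()
registry.register(lambda: "hook_a")       # project hooks: HOOKS = (hook_a, hook_b)
registry.register(lambda: "hook_b")
registry.register(lambda: "plugin_hook")  # plugin hooks are registered afterwards
print(registry.call_all())  # ['plugin_hook', 'hook_b', 'hook_a']
```

In this model the plugin hook runs first because it was registered last, and `hook_b` runs before `hook_a`, matching the description above. Registering `hook_a` with `tryfirst=True` would move it ahead of the others.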

## Under the hood

Under the hood, we use [pytest's pluggy](https://pluggy.readthedocs.io/en/latest/) to implement Kedro's Hook mechanism. We recommend reading their documentation if you have more questions about the underlying implementation.

### Plugin Hooks
Plugin hooks are registered using [`importlib_metadata`'s `EntryPoints` API](https://docs.python.org/3/library/importlib.metadata.html).
@@ -27,5 +27,9 @@ docs = [
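
As a rough illustration of that API, the sketch below lists the names of installed entry points in a given group using only the standard library. The helper name `discover_entry_point_names` is hypothetical, and this is not how Kedro itself loads plugin hooks; it only shows what the entry-point lookup looks like. The result is sorted here purely to make the output deterministic.

```python
from importlib.metadata import entry_points
from typing import List


def discover_entry_point_names(group: str) -> List[str]:
    """Return the names of installed entry points in `group` (hypothetical helper)."""
    eps = entry_points()
    if hasattr(eps, "select"):
        # Newer Python versions expose a selectable collection.
        selected = eps.select(group=group)
    else:
        # Older versions return a plain dict keyed by group name.
        selected = eps.get(group, [])
    # Sorted only so that repeated runs print the same thing.
    return sorted(ep.name for ep in selected)


# Prints the plugin names registered under "kedro.hooks" in this environment,
# or an empty list if no such plugin is installed.
print(discover_entry_point_names("kedro.hooks"))
```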
"myst-parser~=0.17.2",
]

[project.entry-points."kedro.hooks"]

[tool.setuptools.dynamic]
dependencies = {file = "requirements.txt"}


@@ -5,6 +5,7 @@
# Instantiated project hooks.
# For example, after creating a hooks.py and defining a ProjectHooks class there, do
# from {{cookiecutter.python_package}}.hooks import ProjectHooks
# Hooks are executed in a Last-In-First-Out (LIFO) order.
# HOOKS = (ProjectHooks(),)

# Installed plugins for which to disable hook auto-registration.