add section on documentation and duck typing. #509

Open
wants to merge 4 commits into
base: main
45 changes: 42 additions & 3 deletions docs/source/_get-started/making_connections.rst
@@ -1,6 +1,9 @@
Connecting Nodes
================

Building a Connection
---------------------

..
**# TODO: Connect function and class based nodes**

@@ -13,7 +16,7 @@ We can connect these Nodes as follows:
.. code-block:: python

     generate_data = GenerateData(**kwargs)
-    process_data = ProcessData(data=load_data)
+    process_data = ProcessData(data=generate_data)

When running ``dvc repro``, the ``process_data.data`` attribute will be the loaded instance of ``GenerateData``.
The following connection has been established:
@@ -22,7 +25,7 @@ The following connection has been established:
:alt: mermaid diagram

In some cases it is useful to connect Node attributes instead of Nodes.
-In the above example the Node ``ProcessData`` has to know the correct attributes of ``GenerateData`` to e.g. access the ``data``.
+In the above example the Node ``ProcessData`` has to know the correct attributes of ``GenerateData`` to e.g. access the ``data`` attribute.
Therefore, one can also connect attributes of Nodes.
This is done by appending ``@`` and the attribute name to the Node.
With this, any attribute of any Node can be connected.
@@ -41,4 +44,40 @@ With this, any attribute of any Node can be connected.

     with zntrack.DiGraph() as graph:
         generate_data = GenerateData(**kwargs)
-        process_data = ProcessData(data=generate_data.data)
+        process_data = ProcessData(data=generate_data.data)

Write documentation
-------------------
ZnTrack Nodes are designed to be flexible and easy to connect.
This is achieved through the concept of `duck typing <https://en.wikipedia.org/wiki/Duck_typing>`_, which allows Nodes to be connected as long as they have the correct attributes and methods.
To help document the required attributes and methods, we suggest using ``typing.Protocol``.

For example, consider the Node ``ProcessData``, which expects an input with a ``data`` attribute of type ``np.ndarray``.
We can define such a ``typing.Protocol`` as follows:

.. code-block:: python

    import typing
    import numpy as np

    class HasData(typing.Protocol):
        """Protocol for Nodes with data attribute"""
        data: np.ndarray

Then, we can use this ``HasData`` protocol as a type hint for the ``input_data`` attribute of the ``ProcessData`` Node:

.. code-block:: python

    from zntrack import Node, zn

    class ProcessData(Node):
        input_data: HasData = zn.deps()

        def run(self):
            """Do something with 'self.input_data.data'."""
            print(self.input_data.data.shape)


This highlights that the ``ProcessData`` Node should work with any Node that has a ``data`` attribute of type ``np.ndarray``, whether that Node comes from the same package or from an entirely different library.
For a single attribute, one could also use ``np.ndarray`` directly as the type hint.
If more than one attribute is required, however, defining a ``typing.Protocol`` is worthwhile.
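
The case of more than one required attribute can be illustrated with a minimal sketch that uses only the standard library. All names in it (``HasDataAndLabels``, ``LocalGenerator``, ``ThirdPartyLoader``, ``MissingLabels``, ``summarize``) are hypothetical, plain Python classes stand in for ZnTrack Nodes, and lists replace ``np.ndarray`` to avoid the NumPy dependency. ``typing.runtime_checkable`` is used here only to make the structural match visible at runtime; it is an assumption of this sketch, not something ZnTrack is known to require.

```python
from typing import Dict, List, Protocol, runtime_checkable


@runtime_checkable
class HasDataAndLabels(Protocol):
    """Hypothetical protocol that requires two attributes."""

    data: List[float]
    labels: List[str]


class LocalGenerator:
    """Stand-in for a Node from the same package (not a real ZnTrack Node)."""

    def __init__(self) -> None:
        self.data = [1.0, 2.0, 3.0]
        self.labels = ["a", "b", "c"]


class ThirdPartyLoader:
    """Stand-in for a Node from another library; shares no base class."""

    def __init__(self) -> None:
        self.data = [4.0, 5.0]
        self.labels = ["x", "y"]


class MissingLabels:
    """Has ``data`` but no ``labels``, so it does not satisfy the protocol."""

    def __init__(self) -> None:
        self.data = [0.0]


def summarize(source: HasDataAndLabels) -> Dict[str, float]:
    """Pair each label with its value; works for any matching object."""
    return dict(zip(source.labels, source.data))


if __name__ == "__main__":
    for candidate in (LocalGenerator(), ThirdPartyLoader(), MissingLabels()):
        # The check is structural: only the presence of the attributes matters.
        print(type(candidate).__name__, isinstance(candidate, HasDataAndLabels))
```

Note that ``isinstance`` against a runtime-checkable protocol only verifies that the attributes exist, not that they have the annotated types; checking the types requires a static type checker.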
4 changes: 3 additions & 1 deletion examples/docs/01_Intro.ipynb
@@ -148,7 +148,9 @@
"metadata": {},
"source": [
"The ZnTrack class generates an `__init__` method for all `zn.params` and other inputs automatically.\n",
"When writing a custom `__init__` it is important to add `super().__init__(**kwargs)` for ZnTrack to work.\n",
"The generated `__init__()` code will call a method named `_post_init_()` if it exists.\n",
"You can use override this method to add custom initialization code.\n",
Contributor comment: I don't understand this sentence grammatically.

"Alternatively, when writing a custom `__init__` it is important to add `super().__init__(**kwargs)` for ZnTrack to work.\n",
Contributor comment: Perhaps I'm not familiar enough with data classes, but it is not clear what the advantage is of writing a custom _init_ over a _post_init_.

"```python\n",
"class RandomNumber(Node):\n",
" def __init__(self, maximum=None, **kwargs):\n",