Support of generic aiida plugins #20

agoscinski · 2024-10-16T20:57:42Z

Idea

AiiDA plugins define the inputs and outputs of their CalcJobs and WorkChains under specific names. For example, the arithmetic add CalcJob has the inputs x and y and the output sum. We therefore need to specify these ports (as AiiDA calls them) in the YAML file to create the workgraph. In the aiida-shell plugin we did not need to do this, for the reason given further below.
Each plugin defines an entry point, which we can use to load the corresponding CalcJob or WorkChain through the factories:

from aiida.plugins.factories import CalculationFactory
ArithmeticAddCalculation = CalculationFactory("core.arithmetic.add")
# Retrieve input ports
print(ArithmeticAddCalculation.spec().inputs)

So with these two additional pieces of information in the YAML file (the entry point and the port names) we can run almost arbitrary calculations from AiiDA plugins (including aiida-icon). We did not need the port names for aiida-shell because ShellJob dynamically creates its output ports from the outputs that are provided as inputs; we took advantage of this and use the names specified in the YAML file as output port names. For the input ports we also simplify the actual ports, which would be nodes and arguments (see code). The gist is that we treat aiida-shell differently, and we should continue to do so, because otherwise it becomes cumbersome to use.

YAML syntax

Here you find [an example to run arithmetic add](https://github.com/C2SM/ETHIOPIA/blob/plugins/tests/files/configs/test_config_small.yml). A snippet from it shows how it is used to define a workflow:

- adder1:
    inputs:
        - a:
            port_name: x
        - b:
            port_name: y
    outputs:
        - sum1: 
            port_name: sum
- adder2:
    inputs:
        - sum1:
            port_name: x
        - c:
            port_name: y
    outputs:
        - sum2:
            port_name: sum

Since the same data object can be bound to different ports in different tasks (sum1 is the sum output of adder1 and the x input of adder2), we need this information in the cycles.
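As a rough illustration of why the port name has to live next to each data reference, here is a minimal sketch that turns one task entry into a port-to-data mapping. The dict literal stands in for whatever a YAML loader would return for the adder2 entry, and `port_mapping` is a hypothetical helper, not a function from this PR.

```python
# Hypothetical parsed form of the `adder2` entry from the snippet above,
# i.e. what a YAML loader would typically hand back for it.
adder2 = {
    "inputs": [{"sum1": {"port_name": "x"}}, {"c": {"port_name": "y"}}],
    "outputs": [{"sum2": {"port_name": "sum"}}],
}


def port_mapping(entries):
    """Map each port name to the data name bound to it in a task entry."""
    mapping = {}
    for entry in entries:
        # Each list item is a single-key dict: {data_name: {"port_name": ...}}
        for data_name, spec in entry.items():
            port = spec["port_name"]
            if port in mapping:
                raise ValueError(f"port {port!r} is bound twice")
            mapping[port] = data_name
    return mapping


input_ports = port_mapping(adder2["inputs"])
output_ports = port_mapping(adder2["outputs"])
print(input_ports)   # {'x': 'sum1', 'y': 'c'}
print(output_ports)  # {'sum': 'sum2'}
```

Here sum1 ends up on port x, while in adder1 the same data object sits on the output port sum, so the binding cannot be stored on the data object itself.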

Definition of computer and code

We follow the AiiDA logic more closely for defining computer and code information: we just specify the label that was given when they were defined.

tasks:
  - adder1:
      plugin: core.arithmetic.add
      code: bash 
      computer: localhost

This has the strong advantage that we do not have to write our own logic to parse all the computer information; the user can create the computer beforehand with verdi, AiiDA's well-maintained CLI. It is in this PR because it was required for testing, but it should be separated out into a different PR.

Current state of the code

Currently the code in workgraph.py uses different functions for plugins that are not ShellJobs, and I am not sure whether this is smart. It is a tradeoff between code duplication and flexibility, and it requires a bit more thought and a decision on how we go forward with this.

agoscinski · 2024-10-17T15:32:47Z

Icon change of namelist

For ICON we need to adapt the behavior: we would rather change the namelist once and keep it constant over the calculation, so we can move it to the task definition. We could maybe make a calcjob out of the calcfunction https://github.com/aiida-icon/aiida-icon/blob/a982d8792006bf234fe79c18aa76fd2af7a3463f/src/aiida_icon/iconutils/masternml.py#L43-L51 that adapts the namelist, so that we can offer the user a simpler way to make arbitrary changes. We would then also use this calcjob to adapt the namelist for the inputs we can infer from the workflow (the date; the output of the last ICON run will not be passed in the AiiDA way but will instead trigger a call to this calcjob to update the namelist with the new file).

Naming of port_name

The port_name: maybe rename it to input_key or input_slot.

Specify computer and code

We discussed how to deal with computer and code. For the computer definition we stick with verdi, but for codes it might be useful to just pass the filepath, since we do not want the user to create a new code every time ICON is recompiled. How to create a label in this case is still an open question. We could hash the binary, but for ICON this can be 200 MB, which would need to be sent over the transport plugin. One proposal was to hash the filepath and use that as the label; it remained an open question whether we generate a new UUID, so that provenance is preserved also when the code is recompiled.
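To make the filepath-hash proposal concrete, here is a minimal sketch under assumed names: `code_label_from_path`, the `icon` prefix and the 12-character truncation are all hypothetical choices for illustration, not something decided in this discussion. Only the path string is hashed, so nothing the size of the binary has to be read or transferred.

```python
import hashlib
from pathlib import PurePosixPath


def code_label_from_path(executable, prefix="icon"):
    """Build a label from a hash of the executable's path, not its contents."""
    # Normalise the path string so trivial spelling differences
    # (e.g. a trailing slash) do not change the label.
    normalized = PurePosixPath(executable).as_posix()
    digest = hashlib.sha256(normalized.encode("utf-8")).hexdigest()
    # Truncation length is an arbitrary choice for readability.
    return f"{prefix}-{digest[:12]}"


# Hypothetical install locations, used only to show the behaviour.
label_a = code_label_from_path("/opt/icon/bin/icon")
label_b = code_label_from_path("/opt/icon/bin/icon")
label_c = code_label_from_path("/opt/icon-dev/bin/icon")
print(label_a == label_b, label_a == label_c)
```

The label stays the same when the binary at that path is recompiled, which is exactly why the provenance question above remains open with this scheme.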

leclairm · 2024-10-21T12:08:02Z

We would then also use this calcjob to adapt the namelist for the inputs we can infer from the workflow (the date; the output of the last ICON run will not be passed in the AiiDA way but will instead trigger a call to this calcjob to update the namelist with the new file).

Here, for input data, we have two choices: either we adapt the namelist with the valid absolute path to the corresponding data, or we leave a constant relative path in the namelist and symlink the actual data to that relative path in the working directory of the job.
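The second option could look roughly like the sketch below. `stage_input` and all file names are hypothetical, and a temporary directory stands in for the job's working directory; the point is only that the namelist can keep referring to the same relative path while the link target changes between runs.

```python
import tempfile
from pathlib import Path


def stage_input(workdir, relative_path, actual_data):
    """Symlink actual_data to workdir/relative_path, replacing an old link."""
    link = Path(workdir) / relative_path
    link.parent.mkdir(parents=True, exist_ok=True)
    if link.is_symlink():
        link.unlink()  # point the constant path at the new data
    link.symlink_to(actual_data)
    return link


with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical data file produced by a previous run.
    data = Path(tmp) / "store" / "restart_2024.nc"
    data.parent.mkdir()
    data.write_text("dummy")

    # The namelist would keep the constant entry "inputs/restart.nc".
    workdir = Path(tmp) / "job"
    link = stage_input(workdir, "inputs/restart.nc", data)
    staged_content = link.read_text()
    resolved_target = link.resolve() == data.resolve()

print(staged_content, resolved_target)
```

With the first option the namelist itself has to be rewritten for every run; with this one only the link is replaced.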

leclairm · 2024-10-22T11:51:35Z

src/wcflow/workgraph.py

+        self._add_aiida_task_nodes()
+        self._add_aiida_links()


Creating the sockets and the links all at once in _add_aiida_links() relies on the fact that the order in which unrolled objects are stored is compatible with the rank of the nodes in the graph. This is not guaranteed and should not be required from the user. For instance, in the yaml file, inverting the order of tasks in a cycle description or the order of cycles should not affect the final WorkGraph and its correctness.

For that to happen, like in the parser PoC, there's a need for an intermediate step where the sockets corresponding to task outputs are created, e.g. self._add_aiida_sockets, and only then add the links.

Or I insert the visualization graph before that and we can use it as the IR before translating it to a WorkGraph. So I'll definitely do the first part and insert it there. Then we can see if it just lives on the side or is part of the chain from config to WorkGraph (I'd be more comfortable with the latter, actually).

For that to happen, like in the parser PoC, there's a need for an intermediate step where the sockets corresponding to task outputs are created, e.g. self._add_aiida_sockets, and only then add the links.

Actually, it's even easier, just move the call to _link_output_to_task() from _add_aiida_links() to _add_aiida_task_nodes(). That should fix the whole issue.

Yes I think this is a fair point.
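The ordering concern raised in this thread can be illustrated with a small standalone sketch. It does not use the WorkGraph API; `build_links` and the task dicts are hypothetical stand-ins. All outputs are registered in a first pass and links are created in a second pass, so the order in which tasks are listed does not matter.

```python
# Hypothetical unrolled tasks; the consumer is listed before the producer
# on purpose, to mimic a user swapping the order in the YAML file.
tasks = [
    {"name": "adder2", "inputs": ["sum1", "c"], "outputs": ["sum2"]},
    {"name": "adder1", "inputs": ["a", "b"], "outputs": ["sum1"]},
]


def build_links(tasks):
    """Return (producer, data, consumer) links, independent of task order."""
    # Pass 1: register every output together with the task that produces it.
    producers = {}
    for task in tasks:
        for out in task["outputs"]:
            producers[out] = task["name"]
    # Pass 2: every producer is known now, so linking cannot miss one.
    links = []
    for task in tasks:
        for inp in task["inputs"]:
            if inp in producers:  # other inputs are plain available data
                links.append((producers[inp], inp, task["name"]))
    return links


links = build_links(tasks)
reversed_links = build_links(list(reversed(tasks)))
print(links)  # [('adder1', 'sum1', 'adder2')]
```

A single pass that links while iterating would skip the sum1 link in the first ordering, because adder1 has not been seen yet when adder2 is processed.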

- Add case for `RemoteData` to `_utils.py`
- Remove undefined `path` property from `core.py`
- Add computer as property to `Workflow` -> Name to be discussed, could be `host`. Also, could be specified in `root` task instead
- Bug fix for `port.valid_type` in `Workflow` to make it always iterable
- Add `test_config_small_icon_localhost.yaml` (still with hardcoded paths, but this is what I used for the ICON run)
- Clean up `test_config_small.yaml` by removing unused ICON-related stuff
- Add dependencies to `pyproject.toml`

agoscinski · 2024-12-04T13:56:40Z

We will not continue work on this for now.

agoscinski force-pushed the plugins branch 2 times, most recently from faebeaa to 6618dbb Compare October 16, 2024 21:06

agoscinski mentioned this pull request Oct 17, 2024

How exactly are file arguments passed to the executables? #8

Closed

Base automatically changed from workgraph-from-pydantic to main October 17, 2024 08:50

WIP: testing how to integrate plugins

68828bf

updates

1154ac9

agoscinski force-pushed the plugins branch from 6618dbb to 1154ac9 Compare October 21, 2024 12:08

leclairm reviewed Oct 22, 2024

agoscinski mentioned this pull request Oct 23, 2024

Specifying the change of namelist options in the ICON namelist file #27

Open

agoscinski and others added 2 commits October 23, 2024 16:57

merge shell and generic plugin logical path in creation of workgraph

9fe78c5

agoscinski closed this Dec 4, 2024
