PythonJob supports exit code output #300

Merged
merged 6 commits into from
Sep 11, 2024

@superstar54 superstar54 commented Sep 9, 2024

This PR allows a custom exit_code for PythonJob. It adds a built-in output socket, exit_code, which serves as a mechanism for error handling and status reporting during task execution. The exit_code is a dict with a status and a message. The status is an integer: 0 indicates successful completion, and any non-zero value signals that an error occurred.

How it Works:

When the function returns a dictionary with an exit_code key, the system automatically parses and uses this code to indicate the task's status. In the case of an error, the non-zero exit_code value helps identify the specific problem.
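As a rough illustration of the parsing step described above, here is a minimal, library-free sketch. It is not the aiida-workgraph parser: the `ExitCode` tuple and the `split_exit_code` helper are hypothetical names introduced only for this example, and the assumption that a missing exit_code key means status 0 is inferred from the description.

```python
from typing import Any, NamedTuple


class ExitCode(NamedTuple):
    """Hypothetical stand-in for an exit code: a status integer plus a message."""

    status: int = 0
    message: str = ""


def split_exit_code(result: dict[str, Any]) -> tuple[dict[str, Any], ExitCode]:
    """Separate the optional "exit_code" entry from the regular outputs."""
    outputs = dict(result)  # copy so the caller's dict is left untouched
    raw = outputs.pop("exit_code", None)
    if raw is None:
        # Assumption: no exit_code key is treated as success (status 0).
        return outputs, ExitCode()
    return outputs, ExitCode(int(raw["status"]), str(raw.get("message", "")))


outputs, code = split_exit_code(
    {"sum": -1, "exit_code": {"status": 1, "message": "Sum is negative"}}
)
print(outputs, code.status, code.message)
```

The regular outputs and the exit code come back separately, so a caller can store the outputs as usual and use the status only to decide whether the task finished successfully.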

Benefits of exit_code:

  1. Error Reporting:
    If the task encounters an error, the exit_code can communicate the reason. This is helpful during process inspection to determine why a task failed.

  2. Error Handling and Recovery:
    You can utilize exit_code to add specific error handlers for particular exit codes. This allows you to modify the task's parameters and restart it.
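The error-handling idea in point 2 can be sketched without any AiiDA machinery. The snippet below is only an illustration under stated assumptions: `run_with_handlers`, `make_y_non_negative`, and the handler registry keyed by exit status are hypothetical names, not the aiida-workgraph error-handler API, and `add` is a plain-function copy of the PR's example logic.

```python
from typing import Callable

# Hypothetical registry type: exit status -> function returning adjusted inputs.
Handler = Callable[[dict], dict]


def add(x: int, y: int) -> dict:
    """Plain-function version of the example task: fails when the sum is negative."""
    total = x + y
    if total < 0:
        return {"sum": total, "exit_code": {"status": 1, "message": "Sum is negative"}}
    return {"sum": total}


def make_y_non_negative(inputs: dict) -> dict:
    """Handler for status 1: replace y by its absolute value (enough for this example)."""
    return {**inputs, "y": abs(inputs["y"])}


def run_with_handlers(
    func: Callable[..., dict],
    inputs: dict,
    handlers: dict[int, Handler],
    max_retries: int = 3,
) -> dict:
    """Run func; on a handled non-zero status, adjust the inputs and run again."""
    result = func(**inputs)
    for _ in range(max_retries):
        status = result.get("exit_code", {}).get("status", 0)
        if status == 0 or status not in handlers:
            break  # success, or an error no handler is registered for
        inputs = handlers[status](inputs)
        result = func(**inputs)
    return result


result = run_with_handlers(add, {"x": 1, "y": -2}, {1: make_y_non_negative})
print(result)
```

Keying the handlers by status keeps each recovery strategy tied to one specific failure, and the retry limit prevents an ineffective handler from restarting the task indefinitely.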

Below is an example Python function that uses exit_code:

from aiida_workgraph import WorkGraph, task

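# Declare "sum" as an output; exit_code is a built-in output and needs no declaration.
@task.pythonjob(outputs=[{"name": "sum"}])
def add(x: int, y: int) -> int:
    total = x + y  # avoid shadowing the built-in sum()
    if total < 0:
        # A non-zero status marks the task as failed; the message explains why.
        exit_code = {"status": 1, "message": "Sum is negative"}
        return {"sum": total, "exit_code": exit_code}
    return {"sum": total}

wg = WorkGraph("test_PythonJob")
wg.add_task(add, name="add", x=1, y=-2)
wg.submit(wait=True)

# The exit status and message are read from the task's process node.
print("exit status: ", wg.tasks["add"].node.exit_status)
print("exit message: ", wg.tasks["add"].node.exit_message)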
>>> WorkGraph process created, PK: 146751
>>> exit status:  1
>>> exit message:  Sum is negative

In this example, the task failed with exit_code = 1 due to the condition Sum is negative, which is also reflected in the state message.

> verdi process show 146718
Property     Value
-----------  ------------------------------------
type         PythonJob<add>
state        Finished [1] Sum is negative
pk           146718
uuid         2ffe92ed-0634-4a02-95fe-c14ecb778f92
label        add
description
ctime        2024-09-09 13:21:14.818233+02:00
mtime        2024-09-09 13:21:18.138330+02:00
computer     [1] localhost

Here is another example, from a PW calculation:

> verdi process show 146793
Property     Value
-----------  --------------------------------------------------------------------------------
type         PythonJob<scf>
state        Finished [410] The electronic minimization cycle did not reach self-consistency.
pk           146793
uuid         5ed2228d-42ca-46f2-8e7f-d862c6991fba
label        scf

@superstar54 superstar54 linked an issue Sep 9, 2024 that may be closed by this pull request

codecov-commenter commented Sep 9, 2024

Codecov Report

Attention: Patch coverage is 79.54545% with 18 lines in your changes missing coverage. Please review.

Project coverage is 80.69%. Comparing base (5937b88) to head (45d1c0f).
Report is 66 commits behind head on main.

Files with missing lines                        Patch %   Lines
----------------------------------------------  --------  -------------
aiida_workgraph/tasks/pythonjob.py              71.79%    11 Missing ⚠️
tests/test_python.py                            77.27%    5 Missing ⚠️
aiida_workgraph/calculations/python_parser.py   80.00%    2 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #300      +/-   ##
==========================================
+ Coverage   75.75%   80.69%   +4.94%     
==========================================
  Files          70       66       -4     
  Lines        4615     4947     +332     
==========================================
+ Hits         3496     3992     +496     
+ Misses       1119      955     -164     
Flag         Coverage Δ
-----------  ---------------------------
python-3.11  80.61% <79.54%> (+4.94%) ⬆️
python-3.12  80.61% <79.54%> (?)
python-3.9   80.65% <79.54%> (+4.91%) ⬆️


1) serialize and deserialize the properties when saving and loading the task
2) update the outputs of the task when loading the task.
@superstar54 superstar54 merged commit 3b6d616 into main Sep 11, 2024
8 checks passed
@superstar54 superstar54 deleted the feature/pythonjob_exit_code branch September 11, 2024 08:28
GeigerJ2 pushed a commit to GeigerJ2/aiida-workgraph that referenced this pull request Sep 13, 2024
This PR adds a built-in output socket, `exit_code`, which serves as a mechanism for error handling and status reporting during task execution. The exit_code is an `aiida.engine.ExitCode` with a status and a message. The status is an integer: 0 indicates successful completion, and any non-zero value signals that an error occurred.

* Move all functions related to PythonJob to the PythonJob Task, so that serialization and deserialization are handled correctly for the PythonJob Task.
* Update the outputs of the task when loading the task in the error handler, so that the outputs can be used to update the inputs for the next run.
GeigerJ2 added a commit to GeigerJ2/aiida-workgraph that referenced this pull request Sep 13, 2024
agoscinski pushed a commit that referenced this pull request Sep 19, 2024
Development

Successfully merging this pull request may close these issues.

Supporting custom exit code for PythonJob.
2 participants