Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Process is only stored after it fires ProcessListener.on_process_finished event resulting in unfinished process in next step #6579

Open
agoscinski opened this issue Oct 4, 2024 · 0 comments
Assignees
Labels

Comments

@agoscinski
Copy link
Contributor

agoscinski commented Oct 4, 2024

Bug report from a question on discourse @t-reents https://aiida.discourse.group/t/workchain-continues-before-finishing-the-pervious-step/472 also observed by @superstar54 in the workgraph development

Describe the bug

Already well described here https://aiida.discourse.group/t/workchain-continues-before-finishing-the-pervious-step/472

Here my results from investigating it: The workchain starts the next step before the finished process is stored in the database and thus loads the process before the process state was updated to Finished resulting in the nonzero exit code (is_finished_ok checks if the process state is running).

So on the plumpy side the event is already fired before the process is stored in the database. A simplified backtrace of
the process that fires the event (I guess broadcasting) that it finishes, and by that continuing the next process before it updated its process_state:
In the aiida.engine.processes.process.Process.on_entered function

super().on_entered(from_state)

the parent method is invoked that is in plumpy
https://github.com/aiidateam/plumpy/blob/b3837fc9dbf7dc5aca0785e93b94cf5b89d04a91/src/plumpy/processes.py#L701
This invokes much later
https://github.com/aiidateam/plumpy/blob/b3837fc9dbf7dc5aca0785e93b94cf5b89d04a91/src/plumpy/processes.py#L837-L840

        self._fire_event(ProcessListener.on_process_finished, self.future().result())

that broadcasts the event to all processes resulting in the next process being continued. The update of the process state to Finished in the database (the object's process state has been updated but not in the database!) happens later in the aiida.engine.processes.process.Process.on_enteredfunction

self.node.set_process_state(self._state.LABEL) # type: ignore[arg-type]

I will try to switch the events and see what it breaks.

Environment

I think it happens on all AiiDA version (tried newest 2.6.2 and 2.2) and all backends, since this looks like a bug in the engine.

Supplementary

Backtrace log of up to the _fire_event backtrace.log

agoscinski added a commit to agoscinski/aiida-core that referenced this issue Oct 4, 2024
…e broadcasting that process event has changed

Processes start to broadcast their event before they update their
process status in the database. This can cause issues if the next
process directly tries to access the last process state retrieving it
from the database while it has not been updated in the database.
@agoscinski agoscinski self-assigned this Oct 18, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

1 participant