2 approaches to the async problem

1. The comprehensive one, which would enable all the jobs to be actual Slurm jobs with dependencies that can be submitted ahead of time (see the sketch after this list). This would allow:
   - all the current configurations to run async
   - users to write arbitrary scripts and turn them automatically into such jobs
2. The targeted approach, where we only focus on some jobs of interest. This would have the following limitations:
   - only jobs that explicitly submit a corresponding Slurm job can run async
   - this also implies that any job in an async config must be implemented this way, including user-provided jobs
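To make the "submitted ahead of time" idea concrete, here is a minimal sketch of chaining Slurm jobs through `--dependency=afterok`, assuming `sbatch` is on the path; the `submit` helper and the script names are illustrative, not part of the current code base.

```python
import subprocess

def submit(script, dependencies=None):
    """Submit a batch script and return its Slurm job id."""
    cmd = ["sbatch", "--parsable"]  # --parsable makes sbatch print only the id
    if dependencies:
        cmd.append("--dependency=afterok:" + ":".join(dependencies))
    cmd.append(script)
    return subprocess.run(
        cmd, capture_output=True, text=True, check=True
    ).stdout.strip()

# Both jobs are queued immediately; Slurm holds the second
# until the first completes successfully.
prep_id = submit("prepare_data.job")
icon_id = submit("icon.job", dependencies=[prep_id])
```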
Implications
Although approach 1 seems more appealing, it implies a lot of refactoring, as it contradicts many design choices. In particular, the current structure assumes that:
- jobs have access to a single running Python interpreter and its memory
- jobs run sequentially
In order to generate Slurm jobs out of any of these jobs, we'd need to ensure that:
- the job's Python module is transferred to the working directory along with all its imported modules
- the configuration objects are dumped to files in that working directory

all of this knowing that some of the jobs currently act one after the other in the same directory... A sketch of what this staging could look like follows.
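Here is a hypothetical sketch of that staging, assuming the configuration object is picklable; `stage_job` and all file names are made up for this example, and transferring the module's own imports (the hard part) is glossed over.

```python
import pickle
import shutil
from pathlib import Path

def stage_job(job_module_path, cfg, workdir):
    """Stage one job so a fresh interpreter in a Slurm job can run it."""
    workdir = Path(workdir)
    workdir.mkdir(parents=True, exist_ok=True)
    module = Path(job_module_path).stem

    # 1. transfer the job module to the working directory
    #    (its imported helper modules would need the same treatment)
    shutil.copy(job_module_path, workdir)

    # 2. dump the configuration object to a file in that directory
    with open(workdir / "cfg.pkl", "wb") as f:
        pickle.dump(cfg, f)

    # 3. write a tiny runner that reloads both in a new interpreter
    (workdir / "run_job.py").write_text(
        "import pickle\n"
        f"import {module} as job\n"
        "with open('cfg.pkl', 'rb') as f:\n"
        "    cfg = pickle.load(f)\n"
        "job.main(cfg)\n"
    )

    # 4. write the batch script that Slurm will execute
    script = workdir / "job.sh"
    script.write_text(
        "#!/bin/bash\n"
        f"#SBATCH --job-name={module}\n"
        f"cd {workdir}\n"
        "python run_job.py\n"
    )
    return script
```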
Proposed roadmap

In conclusion, here is the proposed roadmap:
1. Target the jobs of interest, namely `icon` and `prepare_data`, and make their `main` function return the job id(s) that they submitted.
2. Implement the dependency mechanism in `run_chain`, with an error when trying to run async with jobs not ready for it (see the sketch below).
3. Later: make as many jobs async as possible so that configurations other than `icon` can be made async. Most of the work is in `prepare_data`: it will need to be either broken into pieces or equipped with error messages for all the pieces not ready for async yet.
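A minimal sketch of how points 1 and 2 could fit together, assuming each async-ready job's `main` accepts a `dependencies` argument and returns the id(s) of the Slurm job(s) it submitted; the exact signatures are assumptions, not the current API.

```python
def run_chain(jobs, cfg, is_async=True):
    """Run the chain, threading Slurm job ids through as dependencies."""
    dep_ids = []
    for job in jobs:
        if is_async:
            # An async-ready job submits its Slurm job(s) with a
            # --dependency on dep_ids and returns the new id(s).
            ids = job.main(cfg, dependencies=dep_ids)
            if not ids:
                raise RuntimeError(
                    f"job '{job.__name__}' is not ready for async execution"
                )
            dep_ids = ids  # the next job waits on these
        else:
            job.main(cfg)  # current, sequential behaviour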