ENH: Parallel mode for monte-carlo simulations #619

brunosorban · 2024-06-09T13:27:25Z

This pull request adds the option to run simulations in parallel to the MonteCarlo class. The feature uses a context manager named MonteCarloManager to centralize all workers and shared objects, ensuring proper termination of the sub-processes.

A second feature is the possibility to export (close to) all simulation inputs and outputs to an .h5 file. The file can be inspected with HDFView or similar software. Since this is a less common file format, a reader method and a structure for post-processing multiple simulations were also added under rocketpy/stochastic/post_processing. A cache handles the data manipulation and returns a 3D numpy array with all simulations, whose shape is (simulation_index, time_index, column). The column axis is reserved for vector data, where x, y and z, for example, may be stored under the same entry. For example, cache.read_inputs('motors/thrust_source') returns both time and thrust.
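To make the (simulation_index, time_index, column) layout concrete, here is a minimal sketch that uses plain nested lists as a stand-in for the 3D numpy array. The numbers and the `column_of` helper are made up for illustration and are not part of this PR.

```python
# Stand-in for the 3D array: data[simulation_index][time_index][column].
# For an entry such as 'motors/thrust_source' the columns would be (time, thrust).
data = [
    [[0.0, 0.0], [0.5, 1500.0], [1.0, 1450.0]],  # simulation 0
    [[0.0, 0.0], [0.5, 1520.0], [1.0, 1430.0]],  # simulation 1
]


def column_of(array_3d, column):
    """Return one column for every simulation, shaped (simulation, time)."""
    return [[sample[column] for sample in sim] for sim in array_3d]


thrust = column_of(data, 1)  # with a numpy array this would be data[:, :, 1]
peak_per_sim = [max(sim) for sim in thrust]
print(peak_per_sim)  # [1500.0, 1520.0]
```

Keeping time and thrust in the same entry means a single read returns both axes needed to plot or integrate the curve for every simulation.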

Pull request type

Code changes (bugfix, features)

Checklist

Tests for the changes have been added (if needed)
Docs have been reviewed and added / updated
Lint (black rocketpy/ tests/) has passed locally
All tests (pytest tests -m slow --runslow) have passed locally
CHANGELOG.md has been updated (if relevant)

Current behavior

Currently, Monte Carlo simulations must run serially and all outputs are saved to a txt file.

New behavior

The montecarlo simulations may now be executed in parallel and all outputs may be exported to a txt or an h5 file, saving some key data or everything.

Breaking change

Yes
No

Additional information

None

brunosorban · 2024-06-09T19:08:24Z

Benchmark of the results. A machine with 6 cores (12 threads) was used.

phmbressan

Amazing feature, as the results show the MonteCarlo class has great potential for parallelization.

The only blocking issue I see with this PR is the serialization code. It still does not support all of rocketpy's features and requires a lot of maintenance and updates on our end.

Do you see any other option for performing the serialization of inputs?

Gui-FernandesBR · 2024-06-18T10:47:56Z

Amazing feature, as the results show the MonteCarlo class has great potential for parallelization.

The only blocking issue I see with this PR is the serialization code. It still does not support all of rocketpy's features and requires a lot of maintenance and updates on our end.

Do you see any other option for performing the serialization of inputs?

@phmbressan we should make all the classes JSON serializable; it's an open issue at #522. In the meantime, maybe we could still use the _encoders module to serialize inputs.

I agree with you that implementing flight class serialization within this PR may create maintenance issues for us. The simplest solution would be to delete the flightv1_serializer (and similar) functions.
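As a rough illustration of what "JSON serializable classes" could look like, the sketch below uses the standard library's `json.JSONEncoder` hook. The `ToDictEncoder` class, the `to_dict` convention and the `Motor` stand-in are assumptions made for this example; they do not describe how rocketpy's `_encoders` module actually works.

```python
import json


class ToDictEncoder(json.JSONEncoder):
    """Fall back to an object's to_dict() when json cannot encode it."""

    def default(self, o):
        if hasattr(o, "to_dict"):
            return o.to_dict()
        return super().default(o)  # raises TypeError for unknown types


class Motor:  # hypothetical stand-in, not rocketpy's motor class
    def __init__(self, burn_time, total_impulse):
        self.burn_time = burn_time
        self.total_impulse = total_impulse

    def to_dict(self):
        return {"burn_time": self.burn_time, "total_impulse": self.total_impulse}


# One JSON line per simulation input, written through the custom encoder.
line = json.dumps({"motor": Motor(3.9, 6026.35)}, cls=ToDictEncoder)
print(line)
```

With a convention like this, each class owns its own serialization, so the Monte Carlo code would not need a dedicated serializer per class.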

rocketpy/simulation/monte_carlo.py

Gui-FernandesBR

@phmbressan really good modifications to this PR. Great work.

Before merging, please run 1000 simulations so the example is better illustrated in the documentation.


phmbressan · 2024-08-23T15:36:57Z

I have pushed a fix for the file-writing issue when running on Windows (more precisely, when processes are started in spawn mode). I tested it on a Windows machine and it ran correctly, but I invite reviewers to test it on other OS configurations as well.

Issues solved by this PR:

MonteCarlo simulations have a parallel mode;
Both the simulation execution and data saving are executed in parallel (producer - consumer);
There are performance gains on large simulations;
The serial simulations can be executed in the same fashion and the outputs of both modes are compatible.
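The producer - consumer arrangement mentioned in the list above can be sketched as follows. This is a simplified stand-in that uses threads and `queue.Queue` instead of the processes used by the PR, and the `producer`, `consumer` and `run` names and the fake "apogee" result are hypothetical.

```python
import queue
import threading

SENTINEL = None  # tells the consumer that no more results are coming


def producer(sim_indices, results):
    """Run (fake) simulations and hand each result to the consumer."""
    for idx in sim_indices:
        results.put({"index": idx, "apogee": 1000.0 + idx})  # placeholder result


def consumer(results, sink):
    """Single writer: drains the queue so producers never touch the file."""
    while True:
        item = results.get()
        if item is SENTINEL:
            break
        sink.append(item)  # a real consumer would write a line to disk


def run(n_sims, n_producers):
    results, sink = queue.Queue(), []
    writer = threading.Thread(target=consumer, args=(results, sink))
    workers = [
        # Producer k handles simulations k, k + n_producers, k + 2 * n_producers, ...
        threading.Thread(target=producer, args=(range(k, n_sims, n_producers), results))
        for k in range(n_producers)
    ]
    writer.start()
    for w in workers:
        w.start()
    for w in workers:
        w.join()
    results.put(SENTINEL)  # all producers are done, let the consumer exit
    writer.join()
    return sink
```

Routing every write through one consumer avoids interleaved lines in the output file, at the cost of one extra worker that does no simulation.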

Points of Improvement:

Soft interrupts of parallel simulations (e.g. an exception or Ctrl-C) are only effective on Linux. Spawned processes (Windows) currently stop hard.
On Windows, the Jupyter notebook will not show the status update prints (running the simulations in a terminal is fine). This seems to be an OS-level change to standard output that is not easily solved.

Some of these points could become issues of the repository. Stating them here for proper PR documentation.

Future Considerations:

Starting with Python 3.14, fork is no longer the default start method on Linux, so no OS will default to fork anymore. We could make RocketPy keep fork as the start method on Linux if this hurts performance too much;
The Python GIL is expected to be removed some years from now (PEP 703), which could bring performance benefits, since threads are generally faster to start than processes.
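One way to keep the start method independent of the interpreter default is to request it explicitly through a multiprocessing context. The sketch below only uses standard library calls; the `pick_start_method` and `make_context` helpers are hypothetical names, not RocketPy functions.

```python
import multiprocessing as mp


def pick_start_method(preferred="fork"):
    """Return `preferred` if this platform supports it, else fall back to 'spawn'."""
    available = mp.get_all_start_methods()
    return preferred if preferred in available else "spawn"


def make_context(preferred="fork"):
    """Build a context so the choice does not depend on the interpreter default."""
    return mp.get_context(pick_start_method(preferred))


if __name__ == "__main__":
    ctx = make_context()
    # Processes and queues created from ctx use the selected start method.
    print(ctx.get_start_method())
```

Using a context instead of `multiprocessing.set_start_method` keeps the choice local to the library and does not change the global setting of the user's program.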

Gui-FernandesBR · 2024-08-28T04:36:43Z

@phmbressan I like the way this PR was refactored. Many thanks for your effort.

Please fix the pylint errors and solve all the open conversations in this PR so we can approve and merge it onto develop!

Optionally, try to rebase the PR to get the latest commits from develop.

MateusStano · 2024-08-30T09:45:28Z

rocketpy/simulation/monte_carlo.py

+        if n_workers is None or n_workers > os.cpu_count():
+            n_workers = os.cpu_count()
+
+        if n_workers < 2:
+            raise ValueError("Number of workers must be at least 2 for parallel mode.")


We should print the number of workers being used with _SimMonitor.reprint here.
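The worker-count check in the diff above can be isolated as a small pure function, which makes the clamping rule easy to test. This is a sketch: the `resolve_n_workers` name and the `cpu_count` parameter are hypothetical, and a plain `print` stands in for `_SimMonitor.reprint`, whose signature is not shown in this thread.

```python
import os


def resolve_n_workers(n_workers=None, cpu_count=None):
    """Clamp the requested worker count to the CPU count, requiring at least 2."""
    if cpu_count is None:
        cpu_count = os.cpu_count() or 1  # os.cpu_count() may return None
    if n_workers is None or n_workers > cpu_count:
        n_workers = cpu_count
    if n_workers < 2:
        raise ValueError("Number of workers must be at least 2 for parallel mode.")
    return n_workers


if __name__ == "__main__":
    try:
        print(f"Running in parallel with {resolve_n_workers()} workers.")
    except ValueError as error:
        print(error)  # e.g. on a single-core machine
```

Note that on a machine reporting a single CPU the function raises even when `n_workers` is left as None, because the clamp happens before the minimum check.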

MateusStano · 2024-08-30T10:07:48Z

rocketpy/simulation/monte_carlo.py

+
+            sim_consumer.start()
+
+            for seed in seeds:


Very minor, but I think the consumer should start after the producers

MateusStano · 2024-08-30T10:16:09Z

rocketpy/simulation/monte_carlo.py

+                )
+                processes.append(sim_producer)
+
+            for sim_producer in processes:
+                sim_producer.start()
+


There is an extra for loop here

Suggested change

-                )
-                processes.append(sim_producer)
-
-            for sim_producer in processes:
-                sim_producer.start()
+                )
+                processes.append(sim_producer)
+                sim_producer.start()

MateusStano · 2024-08-30T11:02:54Z

rocketpy/simulation/monte_carlo.py

+            while sim_monitor.keep_simulating():
+                sim_idx = sim_monitor.increment() - 1
+
+                self.environment._set_stochastic(seed)
+                self.rocket._set_stochastic(seed)
+                self.flight._set_stochastic(seed)
+


Every single iteration needs to be re-seeded?

If this was done before the while loop, wouldn't it be enough?
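The difference between seeding once and re-seeding on every iteration can be shown with the standard library generator. This is only an illustration of the general point, assuming a plain `random.Random` stream; it does not reproduce what `_set_stochastic` does internally, and both function names are hypothetical.

```python
import random


def draws_seeded_once(seed, n):
    """Seed before the loop: each iteration advances the same stream."""
    rng = random.Random(seed)
    return [rng.gauss(0.0, 1.0) for _ in range(n)]


def draws_reseeded_each_iteration(seed, n):
    """Re-seed inside the loop with the same seed: the stream restarts every time."""
    out = []
    for _ in range(n):
        rng = random.Random(seed)
        out.append(rng.gauss(0.0, 1.0))
    return out


once = draws_seeded_once(42, 5)
again = draws_reseeded_each_iteration(42, 5)
print(len(set(once)), len(set(again)))  # distinct values in each list
```

Seeding once per worker is enough for reproducibility, and re-seeding with an unchanged seed inside the loop would repeat the same draw in every iteration.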

MateusStano · 2024-08-30T11:24:38Z

rocketpy/simulation/monte_carlo.py

@@ -253,114 +491,52 @@ def __run_single_simulation(self, input_file, output_file):
            ]
            for item in d.items()
        )
+        inputs_dict["idx"] = sim_idx


Suggested change

-        inputs_dict["idx"] = sim_idx
+        inputs_dict["index"] = sim_idx

For clarity on the files

MateusStano · 2024-08-30T11:26:51Z

rocketpy/simulation/monte_carlo.py

+        outputs_dict = {
+            export_item: getattr(monte_carlo_flight, export_item)
+            for export_item in self.export_list
+        }


Suggested change

         outputs_dict = {
             export_item: getattr(monte_carlo_flight, export_item)
             for export_item in self.export_list
         }
+        outputs_dict["index"] = sim_idx

Really useful to have index on both input and output
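A shared index is what lets inputs and outputs be matched after a parallel run, where workers may finish out of order. The sketch below assumes one JSON object per line in each file; the `pair_by_index` helper and the "mass" and "apogee" fields are hypothetical examples.

```python
import json


def pair_by_index(input_lines, output_lines):
    """Join input and output records on their shared "index" key."""
    inputs = {rec["index"]: rec for rec in map(json.loads, input_lines)}
    pairs = {}
    for rec in map(json.loads, output_lines):
        pairs[rec["index"]] = (inputs[rec["index"]], rec)
    return pairs


# Parallel workers may finish out of order, so line order is not reliable.
input_lines = ['{"index": 1, "mass": 15.2}', '{"index": 0, "mass": 14.9}']
output_lines = ['{"index": 0, "apogee": 3010.0}', '{"index": 1, "apogee": 2950.0}']
pairs = pair_by_index(input_lines, output_lines)
print(pairs[0])
```

Without the index in both files, the only link between an input line and its result would be the line position, which is not guaranteed to agree across the two files.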

Gui-FernandesBR · 2024-09-08T23:22:37Z

Converted to draft until you solve the remaining issues, especially the random number generation problem, @phmbressan.


Lucas-Prates · 2024-12-18T13:17:34Z

I believe this PR is ready again for another round of review. These are the changes since the previous review:

@phmbressan has done some great work further simplifying and optimizing the parallel structure, and a sim_consumer process is no longer needed;
@phmbressan and I fixed the random number generator bug. The solution consisted of resetting all stochastic structures inside the StochasticRocket and their positions. The simplest solution we found, without changing things that go directly to either Rocket or Flight, is implemented in the methods _set_stochastic and __reset_components of StochasticRocket, so please take a closer look at both;
a very minor fix in some of the methods of Components; please make sure the changes make sense.

Overall, it seems that the time per iteration is even faster now, at least by my local measurements. @phmbressan might want to complement the information provided here, he knows this PR much better than I do!

Please, make sure to take a careful look at the Monte Carlo .input file to check that there is indeed no dependency on the generated random variables.

Lucas-Prates · 2024-12-18T15:05:31Z

Another important issue: I currently can not interrupt the MonteCarlo.simulate method smoothly when it is run in parallel, all attempts lead to killing the notebook 😨 ! Would be great to check if the same is happening in your own machines.
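A common pattern for smooth interruption is a shared stop flag that workers check between simulations. The sketch below is a thread-based stand-in using `threading.Event`, not the process-based code of this PR, and the `worker` and `run_interruptible` names are hypothetical.

```python
import threading


def worker(stop_event, counter, lock, n_total):
    """Claim simulation indices until done or until a stop is requested."""
    while not stop_event.is_set():
        with lock:
            if counter["next"] >= n_total:
                return
            counter["next"] += 1
        # ... one simulation would run here ...


def run_interruptible(n_total, n_workers=2, stop_event=None):
    """Run workers and return how many simulations were claimed."""
    stop_event = stop_event or threading.Event()
    lock, counter = threading.Lock(), {"next": 0}
    threads = [
        threading.Thread(target=worker, args=(stop_event, counter, lock, n_total))
        for _ in range(n_workers)
    ]
    for t in threads:
        t.start()
    try:
        for t in threads:
            t.join()
    except KeyboardInterrupt:
        stop_event.set()  # ask workers to stop after their current simulation
        for t in threads:
            t.join()
    return counter["next"]
```

With processes, the same idea would need a flag that is shared across process boundaries, and the behavior of Ctrl-C differs between start methods, which is consistent with the platform differences reported in this thread.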

Gui-FernandesBR · 2024-12-20T06:52:20Z

rocketpy/rocket/components.py

This is a basic class, can we add unit tests to cover the modified lines?

Gui-FernandesBR · 2024-12-20T06:53:45Z

.github/CITATION.cff

looks odd to have this file in the git tree... I think we should rebase the branch to develop

brunosorban added 18 commits May 4, 2024 14:17

Basic paralllel structure added

2d5ff8d

added counter

6fbe0f7

Working version with shared objects

2927448

Write mode added

1b50e94

Enable both export modes for serial and parallel

46f5f00

Style changes

6ea6ef8

Added post-processing scripts

be32a75

using queue to manage simulations

1146e20

one lock per file

175a025

Added append logic to h5 file

9cef636

Enabled number of workers control

d57e436

Added central post-processing script

1fe04e1

Updated example notebook

5a6547d

removed test file

75bc96b

Removed dev files

918cbe0

Updated append mode

b3dcfc6

removed unsused file

ee06b9d

Added documentation

38a29b1

brunosorban requested a review from phmbressan June 9, 2024 13:27

brunosorban requested a review from a team as a code owner June 9, 2024 13:27

brunosorban changed the title ~~Parallel mode for monte-carlo simulations~~ ENH: Parallel mode for monte-carlo simulations Jun 9, 2024

brunosorban added 4 commits June 11, 2024 13:37

Centralized simulation control in SimCounter

98ce6ba

Updated start time

2b8dc4b

Working 2 way semaphore

d421a83

Added cpu limit

ceb1832

Gui-FernandesBR requested a review from MateusStano June 13, 2024 23:27

phmbressan requested changes Jun 18, 2024

Gui-FernandesBR reviewed Jun 18, 2024

rocketpy/simulation/monte_carlo.py

Gui-FernandesBR reviewed Aug 22, 2024

phmbressan added 5 commits August 23, 2024 10:17

MNT: improve process ordering for spawned workers.

d22c957

DOC: run 1000 MonteCarlo simulations for better documentatiion example.

6dab002

MNT: solve review comments on docstrings and code comments.

d07fcc2

Merge remote-tracking branch 'origin/develop' into enh/parallel_montecarlo

6fa90b7

DOC: improve docstrings regarding number of workers.

e40a871

MateusStano reviewed Aug 30, 2024

phmbressan added 5 commits September 5, 2024 16:00

MNT: add index to outputs of monte carlo.

4246809

MNT: avoid unnecessary reseedings on parallel monte carlo.

9f7325c

MNT: simplify process start up syntax.

1baedf6

MNT: add number of workers in parallel mode to prints.

df07955

MNT: fix pylint messages on file handling.

8082710

Gui-FernandesBR assigned phmbressan Sep 8, 2024

Gui-FernandesBR marked this pull request as draft September 8, 2024 23:21

Gui-FernandesBR linked an issue Dec 8, 2024 that may be closed by this pull request

ENH: Monte Carlo Analysis Enhancements #269

Open

17 tasks

MNT: Simplify Monte Carlo parallel export structure.

00d9d02

phmbressan force-pushed the enh/parallel_montecarlo branch from 53ba8ed to 00d9d02 Compare December 16, 2024 21:17

phmbressan and others added 2 commits December 16, 2024 22:35

Merge remote-tracking branch 'origin/develop' into enh/parallel_montecarlo

ebf6bd0

BUG: fixing random number generator bug in StochasticRocket and issues inside methods of Components

0009c24

Lucas-Prates requested review from phmbressan, MateusStano and Gui-FernandesBR December 18, 2024 13:01

Lucas-Prates marked this pull request as ready for review December 18, 2024 13:17

Gui-FernandesBR reviewed Dec 20, 2024

rocketpy/rocket/components.py

Gui-FernandesBR reviewed Dec 20, 2024

.github/CITATION.cff
