ENH: Implementing Multivariate Rejection Sampling (MRS) in RocketPy #738

Draft
wants to merge 3 commits into
base: develop

Conversation

Lucas-Prates
Contributor

@Lucas-Prates Lucas-Prates commented Nov 28, 2024

Pull request type

  • Code changes (bugfix, features)

Checklist

  • Tests for the changes have been added (if needed)
  • Docs have been reviewed and added / updated
  • Lint (black rocketpy/ tests/) has passed locally
  • All tests (pytest tests -m slow --runslow) have passed locally
  • CHANGELOG.md has been updated (if relevant)
  • RST documentation

New behavior

This PR implements the MRS requested in #162 and described in the RocketPy paper.

Breaking change

  • No

Additional information

It may be a good idea to use setters for the case where the user modifies distribution_dict. Otherwise, the sample method can return incorrect results, because some attributes are computed only once, during initialization.
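
A minimal sketch of the setter idea, assuming a hypothetical `Sampler` class and a hypothetical derived attribute `_variable_names` (neither is taken from the PR's code). The setter recomputes the derived attribute every time `distribution_dict` is reassigned, so it cannot go stale:

```python
class Sampler:
    """Hypothetical stand-in for the MRS class; not RocketPy's actual API."""

    def __init__(self, distribution_dict):
        # Assigning here goes through the setter below.
        self.distribution_dict = distribution_dict

    @property
    def distribution_dict(self):
        return self._distribution_dict

    @distribution_dict.setter
    def distribution_dict(self, value):
        # Store a copy and refresh everything derived from the dictionary.
        self._distribution_dict = dict(value)
        self._variable_names = sorted(self._distribution_dict)


sampler = Sampler({"mass": None})
sampler.distribution_dict = {"mass": None, "wind_x": None}
```

Note that a setter only runs on reassignment. Mutating the returned dictionary in place (for example `sampler.distribution_dict["x"] = ...`) would still bypass it.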

@Lucas-Prates Lucas-Prates requested a review from a team as a code owner November 28, 2024 21:36
@Lucas-Prates Lucas-Prates added the Enhancement and Monte Carlo labels Nov 28, 2024
@Lucas-Prates Lucas-Prates linked an issue Nov 28, 2024 that may be closed by this pull request
@Lucas-Prates Lucas-Prates marked this pull request as draft November 28, 2024 21:38

codecov bot commented Nov 28, 2024

Codecov Report

Attention: Patch coverage is 17.30769% with 86 lines in your changes missing coverage. Please review.

Project coverage is 75.92%. Comparing base (83aa20e) to head (3a3c9e6).
Report is 9 commits behind head on develop.

Files with missing lines Patch % Lines
...ketpy/simulation/multivariate_rejection_sampler.py 17.30% 86 Missing ⚠️
Additional details and impacted files
@@             Coverage Diff             @@
##           develop     #738      +/-   ##
===========================================
- Coverage    76.42%   75.92%   -0.50%     
===========================================
  Files           95       97       +2     
  Lines        11090    11198     +108     
===========================================
+ Hits          8475     8502      +27     
- Misses        2615     2696      +81     


@Lucas-Prates
Contributor Author

This PR is ready for a first "Design Review." I would like your opinion on whether this implementation matches how you think the user should use the MRS.

Although an implementation as a function seems natural, I implemented it as a class because:

  1. the function would be humongous;
  2. if the user wants to resample several times, the Monte Carlo data is read from the hard drive only once.

It currently works as follows:

  1. Input: monte carlo filepath prefix, mrs filepath prefix, distribution dictionary;
  2. Load input and output data from a monte carlo simulation into memory (python objects - lists of jsons);
  3. To avoid having to read data twice, while loading, precompute some important properties required in the sampler algorithm;
  4. Select and save iteratively accepted samples;
  5. Output: files are saved in the same "scheme" as the MonteCarlo simulation.
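
The selection step (4) can be illustrated with a generic rejection-resampling sketch. This is not the code in this PR: the function names, the sample layout (a list of dicts), the variable name `mass`, and the assumption that the varied inputs are independent are all illustrative. Each existing sample is kept with probability proportional to the ratio between the new and the original density, using the largest observed ratio as the bound:

```python
import math
import random


def normal_pdf(x, mean, std):
    """Density of a normal distribution at x."""
    return math.exp(-0.5 * ((x - mean) / std) ** 2) / (std * math.sqrt(2 * math.pi))


def density_ratio(sample, distributions):
    """Product of new_pdf / old_pdf over all varied inputs (assumes independence)."""
    ratio = 1.0
    for name, (old_pdf, new_pdf) in distributions.items():
        ratio *= new_pdf(sample[name]) / old_pdf(sample[name])
    return ratio


def rejection_resample(samples, distributions, rng):
    """Return the subset of samples accepted under the new distributions."""
    ratios = [density_ratio(s, distributions) for s in samples]
    sup_ratio = max(ratios)  # empirical bound on the density ratio
    return [s for s, r in zip(samples, ratios) if rng.random() < r / sup_ratio]


rng = random.Random(0)
# Stand-in for inputs loaded from a Monte Carlo run: mass ~ N(10, 1).
original = [{"mass": rng.gauss(10.0, 1.0)} for _ in range(10_000)]
# Hypothetical new distribution for the same input: mass ~ N(10.5, 0.8).
distributions = {
    "mass": (
        lambda x: normal_pdf(x, 10.0, 1.0),
        lambda x: normal_pdf(x, 10.5, 0.8),
    )
}
accepted = rejection_resample(original, distributions, rng)
```

Because only existing samples are filtered, no new simulation has to run. The new distribution must be narrower than or well covered by the original one, otherwise the ratio bound grows and few samples are accepted.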

I provided a quick and dirty notebook, which will be removed, just to show how the class is being used at the moment.

Member

@Gui-FernandesBR Gui-FernandesBR left a comment

I don't have much time, so I'll be brief.

  • The class dies after you sample the data once, which makes having a class pointless.
  • Instead, I'd remove the sample dictionary from the arguments.
  • First, read the data during initialization. Then use a function to set the variables you are going to allow to be varied. This way you can already anticipate which variables may be varied.
  • We want almost instant results for a MonteCarlo simulation after using MRS.
  • Another thing is that the user must supply the original pdf. Could we possibly estimate this from the data? (imagine 70k)
  • Finally, plotting is crucial for MRS, or even tables. We'd love to see more on that later.

rocketpy/simulation/multivariate_rejection_sampler.py
f"the monte carlo input file {input_filename}!"
) from e

input_file.close()
Collaborator

This line should be inside a finally block, I believe. The same applies to the method load_output.
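
A rough sketch of that suggestion, assuming a hypothetical helper `load_json_lines` and a file with one JSON object per line (the actual file layout and error message in the PR may differ). The `finally` block guarantees the file is closed even when parsing fails:

```python
import json


def load_json_lines(filename):
    """Read one JSON object per line; hypothetical helper, not RocketPy code."""
    records = []
    input_file = open(filename, "r", encoding="utf-8")
    try:
        for line in input_file:
            if line.strip():  # skip blank lines
                records.append(json.loads(line))
    except json.JSONDecodeError as e:
        # Illustrative message only; not the one used in the PR.
        raise ValueError(f"Invalid JSON in {filename}") from e
    finally:
        # Runs on success and on failure alike.
        input_file.close()
    return records
```

A `with open(...) as input_file:` block gives the same guarantee with less code and is usually the more idiomatic choice.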

"""Loads input information from monte carlo in a SampleInformation
object
"""
input_filename = f"{self.monte_carlo_filepath}.inputs.txt"
Collaborator

If possible, it is better practice to handle paths with the pathlib standard library. Check out the parallel monte carlo PR as an example of handling the filename in the MonteCarlo class.
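
For illustration, a small sketch of building the file names with `pathlib`. The helper name `monte_carlo_files` is hypothetical, and the `.outputs.txt` suffix is an assumption that mirrors the `.inputs.txt` suffix seen in the snippet above:

```python
from pathlib import Path


def monte_carlo_files(filepath_prefix):
    """Return (inputs, outputs) paths for a Monte Carlo filepath prefix."""
    prefix = Path(filepath_prefix)
    # Append to the name rather than using with_suffix, so a prefix that
    # already contains dots (e.g. "run.v2") is not truncated.
    inputs = prefix.parent / f"{prefix.name}.inputs.txt"
    outputs = prefix.parent / f"{prefix.name}.outputs.txt"  # assumed suffix
    return inputs, outputs


inputs, outputs = monte_carlo_files("results/run_1")
```

The returned `Path` objects can be passed directly to `open` and offer checks such as `inputs.exists()` before loading.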

Labels
Enhancement, Monte Carlo
Projects
Status: Backlog
Development

Successfully merging this pull request may close these issues.

ENH: Implement MRS method on RocketPy!
3 participants