Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Run batch simulations over set of startup params in yaml file #125

Merged
merged 32 commits into from
Jan 31, 2023

Conversation

andermi
Copy link
Collaborator

@andermi andermi commented Jan 24, 2023

  1. Takes an input yaml file formatted like the example:
seed: 42
duration: 5
door_state: ['closed', 'open']
scale_factor: [0.5, 0.625, 0.75, 0.875, 1.0, 1.125, 1.25, 1.375, 1.4]
  1. Generates a matrix of tests
sim_params_yaml: seed parameter not yet supported
Generated 18 simulation runs:
Seed, Duration, Door State, Scale Factor
42, 5.0, closed, 0.5
42, 5.0, open, 0.5
42, 5.0, closed, 0.625
42, 5.0, open, 0.625
42, 5.0, closed, 0.75
42, 5.0, open, 0.75
42, 5.0, closed, 0.875
42, 5.0, open, 0.875
42, 5.0, closed, 1.0
42, 5.0, open, 1.0
42, 5.0, closed, 1.125
42, 5.0, open, 1.125
42, 5.0, closed, 1.25
42, 5.0, open, 1.25
42, 5.0, closed, 1.375
42, 5.0, open, 1.375
42, 5.0, closed, 1.4
42, 5.0, open, 1.4
  1. Creates results directory, runs sim which creates a bagfile per run and writes the list of params to a log

NOTE: Needs latest updates to ros2 branch in ros_gz repo: gazebosim/ros_gz#355
To run example batch sim params yaml (located in buoy_gazebo/scripts/example_sim_params.yaml):

$ ros2 launch buoy_gazebo mbari_wec_batch.launch.py sim_params_yaml:=../install/buoy_gazebo/share/buoy_gazebo/example_sim_params.yaml

$ ls in the created results directory (batch_results_<timestamp>) will show bagfiles and a log have been created.

A yaml file can also be provided as an argument to the launch file

@andermi andermi marked this pull request as ready for review January 24, 2023 19:17
Signed-off-by: Michael Anderson <[email protected]>
Signed-off-by: Michael Anderson <[email protected]>
Signed-off-by: Michael Anderson <[email protected]>
@andermi andermi linked an issue Jan 24, 2023 that may be closed by this pull request
7 tasks
Comment on lines 37 to 46
if 'open' in door_state:
mu_zz = 3000.0

# Heave cone
heave_total_mass = 817
trefoil_mass = 10
trefoil_mass = 20
try:
mu_zz
except NameError:
mu_zz = 10000.0 # not defined so default
Copy link
Collaborator Author

@andermi andermi Jan 24, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FYI, I set mu_zz value for added mass depending on the door state, but this PR doesn't have the added mass implemented yet. So mu_zz is currently unused but ready when #115 is merged (or I can base this PR from that one)

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, mu_zz in the .sdf.em is in the "add-addedmass-to-sdf" branch and PR, but we can't really merge that until the garden debians are available with the handling of the phantom weight of the added mass terms. Also, there is one other parameter that depends upon the doors open/closed state. That's Z_WW and changes a bit with open/closed, shown here: https://osrf.github.io/buoy_entrypoint/pr-preview/pr-17/theory/ That has not yet been parametrized in a .sdf file anywhere b/c we need the viscous drag branch/pull-request to be completed (goal of this friday...)

Copy link
Collaborator Author

@andermi andermi Jan 25, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@hamilton8415 I now set z_ww as well (commit d990b32) but also unused until #115. I made a TODO to update

Signed-off-by: Michael Anderson <[email protected]>
@andermi
Copy link
Collaborator Author

andermi commented Jan 24, 2023

Still having issues with #117

@quarkytale
Copy link
Contributor

quarkytale commented Jan 24, 2023

Correct me if I'm wrong but I noticed its always the (push) workflow that fails and not the (pull_request)

Signed-off-by: Michael Anderson <[email protected]>
@andermi
Copy link
Collaborator Author

andermi commented Jan 24, 2023

So, I actually do think it's just timing out from the CI being slow. The error message was a bit confusing to me, but I think after trying myself to kill the process, the wait on server run blocks the program from exiting (and has more prints afterward) and makes it hard to tell why it really failed.

@andermi
Copy link
Collaborator Author

andermi commented Jan 24, 2023

Either something is getting stuck, or the machines are even slower than usual lately. I only have a single test running right now and it took more than 20 minutes to pass.

@andermi
Copy link
Collaborator Author

andermi commented Jan 24, 2023

noticed its always the (push)

yeah, seems like it. I wonder if those are slower machines than the PR?

Signed-off-by: Michael Anderson <[email protected]>
Signed-off-by: Michael Anderson <[email protected]>
@andermi
Copy link
Collaborator Author

andermi commented Jan 25, 2023

I've traced it down to getting stuck on a random (not the same one every time) future.wait() from a command service call. I'll try to see if the service is hanging.

Signed-off-by: Michael Anderson <[email protected]>
andermi added a commit that referenced this pull request Jan 25, 2023
Signed-off-by: Michael Anderson <[email protected]>
Signed-off-by: Michael Anderson <[email protected]>
@hamilton8415
Copy link
Collaborator

hamilton8415 commented Jan 26, 2023

Regarding the batch file work, nice job. Looking like a good approach and I like it. Below is a number of items for consideration and discussion. I think it's easy to list off a ton of things for something like this and we should consider the value of each as we go to keep things from getting too complicated. But, here goes...

  1. As you know, there are a few other parameters to include, and the tricky one is the sea state because there are a few ways to characterize it. Simplest is Wave Height (Hs) and Peak Period (Tp), this pair of numbers could describe a single frequency wave, or characterize a specific spectrum (such as the pierson-moskowitz spectrum). In either case, one is unlikely to want to run a batch of jobs with a range of peak periods at each wave-height for instance. Rather it's the pair that needs to be specified and considered as a single dimension of the resulting matrix of runs. For instance, one may specify (Hs,Tp) as [(1.0, 8), (2.0,9), (3.9,10), ...] resulting in three runs. In the past I've made the distinction between a monochromatic wave and a spectrum of waves by specifying the peak period as negative for monochromatic waves. A bit course, but effective! :) In fact, at the moment this is how the peak period specification in the .sdf is being interpreted in the hamilton8415/fs_hydrodynamics branch, but this can be changed easily. Finally, a feature that isn't in the code yet so is only mentioned here is that one may want to specify a unique spectrum, as a collection of a dozen or so (freq,energy) points describing the spectrum. Not sure how to do that compactly but maybe the yaml syntax has some help for all of this?
  2. It'd be great if we could specify the time-step size and sim duration, instead of number of iterations.
  3. The .log file that shows the specifics of each run is really nice, I "tail -f"'d it as the batch of runs was going and could see it was filling in the lines as they progressed. It might be worth considering including the start time of each job, and perhaps more importantly, listing which directory contains the output for that collection of parameters.
  4. Related to number 3, it may be worth a simple file in the results directory that indicates which parameters were used for that run. Obviously the data there from the ordering of the log file and creation number, but it may be good to make the output directories self-describing. Also, when I re-ran a changed .yaml file in the same directory, a new .log file was created with a date as expected, but the output directory (i.e. "rosbag2_batch_sim_x" didn't over-write the old names, until the "_x" part was unique. Seemingly one might not want to over-write old runs so perhaps we need to think about the naming of those directories.
  5. As we implement the .csv log files, the simulator will generate a directory full of txt files that match what the buoy does. Would be really great if those got stashed here along with the other outputs when running this way. More to do to figure out how that will work.
  6. I am not sure we need a default .yaml file, perhaps better to require that always be an argument to this script, but more thought required.
  7. Finally (yeah right!), I think we can think about this batch file as running a batch of simulated buoys. Of course, one may then run separate ros-enabled code that interacts with the simulator. Point of discussion here is if we want the management of those executables to be done totally separately, or to also be started by this process. Pros and Cons, and I'm not sure of the complications it introduces, but one could consider external controller executables as another dimension in the run matrix, and of course, if not specified, none are started...
  8. Finally again, as discussed we may need a way to specify the a random seed to control if the batch of runs is using the same inputs, or not. I'll put that functionality in the .sdf parameters for the wave-interaction plugin, unless there's a different way.

Good stuff here.

@andermi
Copy link
Collaborator Author

andermi commented Jan 26, 2023

  1. yaml doesn't support tuples, but we can have an array of Hs and another of Tp that need to be the same size and taken as pairs. I can handle the pairing in python. Also, we can keep growing the number of params with future batch processing PRs as more params become available or as we think of them.
  • 2. I can do that but I'm hoping not to have to template the mbari_wec.sdf world. @quarkytale @mabelzhang might know if there's a command line option that overrides what's in the sdf for the physics step. For now I might just leave our physics step constant and just let the user specify time in seconds (which I translate to the command line option --iterations)
  1. see 4
  • 4. I can certainly include the timestamp in the log. Also, the log file I generate lists the params used in each run, and I was using the Run Index to identify which rosbag went with which params. I had envisioned running the batch launch file from a new main results directory for every batch of experiments (each yaml file) rather than worrying about special naming. In my example in the PR description I refer to this main results dir as batch_sim_example but I could certainly do it another way.
  1. agreed!
  • 6. I can switch it to not use a default.
  1. This is one reason I used a launch file as it can include or be included in any other launch file with a set of external nodes or have the other nodes just running the whole time in the background on some other terminal. I have been mulling over a way to specify which nodes and configuration of those external control nodes, but I feel like it could get a bit messy and I don't know what external nodes someone might want to write. I feel like it's best to leave that up to the user to include their nodes.
  • 8. I was planning to put a single random seed in the yaml to pass to gz-sim via command line --seed. Your code would access this from API within gz-sim that the plugins have access to.

Signed-off-by: Michael Anderson <[email protected]>
Signed-off-by: Michael Anderson <[email protected]>
…mestamp everything; put rosbag names in log

Signed-off-by: Michael Anderson <[email protected]>
@andermi
Copy link
Collaborator Author

andermi commented Jan 27, 2023

@hamilton8415 check latest :)

Signed-off-by: Michael Anderson <[email protected]>
Signed-off-by: Michael Anderson <[email protected]>
@hamilton8415
Copy link
Collaborator

Looking good, works nicely. It is of course not lost on me that the seed is chosen as 42 in the example. The only way I could break it is to not specify a seed in the yaml. We may want to allow that and have it indicate a new random seed on each run. Not sure about the implications of the --seed CLI options, presumably there is a default behavior if the seed isn't specified, or I'd imagine we can find a way...

@andermi
Copy link
Collaborator Author

andermi commented Jan 28, 2023

Not sure about the implications of the --seed CLI options

Per slack: @mabelzhang will work on backporting the --seed functionality

@andermi
Copy link
Collaborator Author

andermi commented Jan 30, 2023

@hamilton8415 I added some error checking and defaults for params. I think this is a good stopping point for this PR (assuming the latest commit works for you), and we can add more params in future PRs as they get implemented.

@hamilton8415
Copy link
Collaborator

Ran fine for me, I agree we should merge this at this point and re-visit it when we have wave-heights and similar to set. On mine, there's a ton of INFO that comes out with the runs, perhaps we want to eventually explore a quiet mode that just outputs where it is in the progression as a default.

@andermi
Copy link
Collaborator Author

andermi commented Jan 31, 2023 via email

@andermi andermi enabled auto-merge (squash) January 31, 2023 21:23
@andermi andermi merged commit bc7e914 into main Jan 31, 2023
@andermi andermi deleted the andermi/batch_sim branch January 31, 2023 21:41
@andermi andermi mentioned this pull request Feb 27, 2023
7 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Parameterization at startup
4 participants