[Feature Request] Allow verbosity to be controlled or save? #43

Closed
emstruong opened this issue Nov 29, 2024 · 9 comments

Comments

@emstruong

ret <- runSimulation(design=dsub, replications=replications, seed=seed,
                     verbose=FALSE, save_details=save_details,
                     parallel=parallel, cl=cl,
                     control=control, save=FALSE, ...)

It'd be nice if either the verbose or the save argument could be configured for monitoring jobs. Sometimes there are issues that only happen with job arrays and not with single jobs run through runSimulation().

Verbosity can be directed to files via >> and 2>>, and that's nice...
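
For readers unfamiliar with the redirection operators mentioned above, here is a minimal sketch of appending standard output and standard error to separate files. The run_job function and the log file names are hypothetical stand-ins for a real simulation call (for example an Rscript invocation); they are not part of SimDesign.

```shell
#!/bin/bash
# Hypothetical stand-in for a simulation run:
# one line goes to stdout, one line goes to stderr.
run_job() {
  echo "replication finished"
  echo "warning: sampler diverged" >&2
}

# Temporary directory so the sketch does not touch real log files
log_dir="$(mktemp -d)"

# >> appends stdout and 2>> appends stderr, so output from earlier
# runs is preserved rather than overwritten
run_job >> "${log_dir}/job.out" 2>> "${log_dir}/job.err"
run_job >> "${log_dir}/job.out" 2>> "${log_dir}/job.err"
```

Keeping the two streams in separate files makes it easier to scan the error log on its own when only a few array jobs misbehave.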

@philchalmers
Owner

Not quite sure I follow. You can certainly pipe the console IO information, but runSimulation() suppresses a good amount of printed info as it can get quite noisy. Do you have a particular use-case in mind?

@emstruong
Author

emstruong commented Nov 29, 2024

In my particular use-case, the success and correctness of the Analyze function is very sensitive to both the peripheral computational environment and the focal statistical problem.

So I've found being able to pipe both >> and 2>> to various files very helpful for diagnosing my simulation.

I anticipate that this will automatically be a relevant issue anytime someone is using a library whose chain of dependencies is very very deep (e.g., brms relies on cmdstanr relies on processx and cmdstan relies on C++ things relies on I/O latency and other things).

@philchalmers
Owner

> In my particular use-case, the success and correctness of the Analyze function is very sensitive to both the peripheral computational environment and the focal statistical problem.

That's a huge issue. I'm not really sure you want to be managing this on a per-replication basis. Is there some way to make the implementation more defensive and anticipate the incoming problems? From my end of the runSimulation() wrapper, filtering through all possible information is quite tricky ...

> So I've found being able to pipe both >> and 2>> to various files very helpful for diagnosing my simulation.

You should be able to do this on SLURM distributions via the .out files. Maybe that's sufficient? I'm not sure.

> I anticipate that this will automatically be a relevant issue anytime someone is using a library whose chain of dependencies is very very deep (e.g., brms relies on cmdstanr relies on processx and cmdstan relies on C++ things relies on I/O latency and other things).

Possibly true, though again this seems like a problem that requires careful consideration beforehand. Nevertheless, if you have a brief example to work with where piping the output is useful, I'd be happy to take a look and see what can be done.

@emstruong
Author

> That's a huge issue. I'm not really sure you want to be managing this on a per-replication basis. Is there some way to make the implementation more defensive and anticipate the incoming problems? From my end of the runSimulation() wrapper, filtering through all possible information is quite tricky ...

It is a huge issue and, AFAICT, through a lot of fine-tuning of the computational environment, I've minimized the rate of errors as much as I possibly could. Most of the time, it never happens. For future readers, I detail my attempts here: stan-dev/cmdstanr#1044 (comment). For Compute Canada users in particular, I also isolated the R packages onto the relatively faster scratch folder, as opposed to storing them on the very slow home.

> You should be able to do this on SLURM distributions via the .out files. Maybe that's sufficient? I'm not sure.

...You can? ...How? Well anyway, what I was hoping is that through runArraySimulation() I could adjust the verbose and maybe even the save arguments in runSimulation(). Is that not something we can do easily?

> Possibly true, though again this seems like a problem that requires careful consideration beforehand. Nevertheless, if you have a brief example to work with where piping the output is useful, I'd be happy to take a look and see what can be done.

I'll see if I can get a brief example... Really trying to finish the simulation and analysing the data before the holidays...

@philchalmers
Owner

In my examples I suppress the .out files as I find them noisy and uninformative, but you may find them useful in this case. The SBATCH directive I use is:

#SBATCH --output=/dev/null    ## (optional) delete .out files

From the documentation:

-o, --output=<filename_pattern>

Instruct Slurm to connect the batch script's standard output directly to the file name specified in 
the "filename pattern". By default both standard output and standard error are directed to the 
same file. For job arrays, the default file name is "slurm-%A_%a.out", "%A" is replaced by the job 
ID and "%a" with the array index. For other jobs, the default file name is "slurm-%j.out", where 
the "%j" is replaced by the job ID. See the filename pattern section below for filename specification options.
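
As a rough illustration of the documentation excerpt above, a job-array script that keeps its output instead of discarding it might look like the sketch below. The job name, array range, logs/ directory, and R script name are all hypothetical; the logs/ directory is assumed to exist before submission. The `%A` and `%a` patterns and the SLURM_ARRAY_TASK_ID variable come from Slurm itself.

```shell
#!/bin/bash
#SBATCH --job-name=sim_array          # hypothetical job name
#SBATCH --array=1-10                  # hypothetical array range
#SBATCH --output=logs/sim-%A_%a.out   # stdout; %A = job ID, %a = array index
#SBATCH --error=logs/sim-%A_%a.err    # stderr in a separate file per task

# Slurm sets SLURM_ARRAY_TASK_ID for each array task; fall back to 1
# so the script can also be run outside Slurm for a quick local check
task_id="${SLURM_ARRAY_TASK_ID:-1}"
echo "starting array task ${task_id}"

# Hypothetical call to the simulation script for this array index:
# Rscript --vanilla mySimulation.R "${task_id}"
```

Omitting the `--error` line sends standard error to the same file as standard output, which matches the default behaviour described in the excerpt.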

@philchalmers
Owner

Looks like there isn't going to be movement on this for a bit, and I don't think this is something that SimDesign can manage anyway (I assume system configurations and compilers are behaving lawfully.....). Closing for now as the issue is active elsewhere.

@emstruong
Author

Hi Phil,

Sorry about the delay--while the SBATCH output is interesting and not something I knew about before, I think it's different from the SimDesign output.

I'd still like to be able to view the output from runSimulation() if it's possible... Right now, the default output from runArraySimulation() is different from runSimulation() because of the difference in the default value of the verbose argument.

Whether this will actually solve my current issue or not is one thing, but I think in general it'd be nice for users to be able to view whatever output or messages they'd like to see. What's the harm in allowing verbose to be passed through runArraySimulation()?

@philchalmers
Owner

So you're just looking for an argument like runArraySimulation(..., verbose=TRUE)? Sure, there's no real harm in that, but unlike runSimulation() I think this should be set as FALSE for interactive sessions (generally for testing) and TRUE for non-interactive sessions given the nature of the distribution jobs. That would at least make .out files potentially more useful, though again I'm skeptical given how much information that package allows through to the console. I'll push this momentarily.

@emstruong
Author

> So you're just looking for an argument like runArraySimulation(..., verbose=TRUE)? Sure, there's no real harm in that, but unlike runSimulation() I think this should be set as FALSE for interactive sessions (generally for testing) and TRUE for non-interactive sessions given the nature of the distribution jobs. That would at least make .out files potentially more useful, though again I'm skeptical given how much information that package allows through to the console. I'll push this momentarily.

Oh yes, it should definitely be FALSE by default, but I think it's potentially handy if you really need to monitor things... Thanks!
