[Feature request]: HPC Resource Estimation Tooling #445
Label
batch, enhancement
Priority Label
medium priority
Is your feature request related to a problem? Please describe.
Users often have a difficult time estimating the HPC resources (time/CPU/memory) required for a batch job. There are useful rules of thumb that have been documented, but time and memory in particular are very dependent on the model itself. CPU is less so, since there are usually constraints from the inference method that determine it.
Is your feature request related to a new application, scenario round, pathogen? Please describe.
No response
Describe the solution you'd like
Add an `--estimate` flag to the future `flepimop batch` command (described in GH-440) that will submit a small number of sample jobs with varied underlying parameters (for example, if running `flepimop batch calibrate`, then chains and samples would be varied). The time and peak memory usage of those jobs would then be measured, a multivariate linear regression fit to the measurements, and the 95% prediction interval taken as the bounds on time and memory for a job of a larger size. An example might look like:

```shell
# --estimate=time is specifically for time, but could leave empty for time & memory
$ flepimop batch --estimate=time --vary=chains:2,4,6,8 --vary=iterations:100,200,300,400
```
I don't particularly like the `--vary` syntax as written above, so I'm open to suggestions. Also need to think about how the results will be outputted. I think the answer to that, as well as the exact syntax to use, will become more clear/natural after implementing GH-440, so it's not worth thinking too hard about at the moment. Blocked by GH-440.