MultiSize
The difference in throughput between a slow processor (e.g. an Android device that runs infrequently) and a fast processor (e.g. a GPU that's always on) can be a factor of 1,000 or more. Having a single job size can therefore present problems:
- If the size is small, hosts with GPUs get huge numbers of jobs. This causes performance problems on the client and a high DB load on the server.
- If the size is large, slow hosts can't get jobs, or they get jobs that take weeks to finish.
To address this, BOINC provides a mechanism that tries to send large jobs to fast devices and small jobs to slow devices.
A multi-size application has a set of N size classes, 0 ... N-1. Each job belongs to a size class. Jobs of size class i are smaller than those of size class i+1. You decide how many size classes to have, and how large the jobs of a given size class are.
A size_census script periodically computes statistics about the "effective speed" of devices for each multi-size app, where effective speed is the device speed times host availability. In particular, it computes and maintains the boundaries of the N quantiles.
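The quantile computation can be pictured with the following minimal sketch. It is not the code of size_census; the function names, and the choice of taking the boundary at index count*i/N of the sorted speeds, are assumptions made for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Effective speed as defined above: device speed scaled by availability.
// flops: device speed; availability: fraction of time the host is available.
double effective_speed(double flops, double availability) {
    return flops * availability;
}

// Hypothetical helper: return the N-1 interior boundaries that split the
// given effective speeds into N groups of roughly equal population.
// boundaries[i] separates size class i from size class i+1.
std::vector<double> quantile_boundaries(std::vector<double> speeds, int n_classes) {
    assert(n_classes > 0);
    std::vector<double> boundaries;
    if (speeds.empty()) return boundaries;
    std::sort(speeds.begin(), speeds.end());
    for (int i = 1; i < n_classes; i++) {
        // i < n_classes, so idx is always a valid index
        std::size_t idx = speeds.size() * i / n_classes;
        boundaries.push_back(speeds[idx]);
    }
    return boundaries;
}
```

With six devices of effective speed 1 through 6 and three size classes, this yields the boundaries 3 and 5.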
When a host requests work for a particular device, the scheduler computes the device's quantile for each multi-size application. It preferentially sends the device jobs of the corresponding size class. If it must send jobs of a different size class, it prefers smaller classes.
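As a rough illustration of this selection rule, the sketch below maps an effective speed to a size class and ranks candidate jobs. Both functions are hypothetical and are not taken from the scheduler source. Ranking nearer smaller classes ahead of farther ones is an assumption; the text above only says that smaller classes are preferred.

```cpp
#include <cassert>
#include <vector>

// Map an effective speed to a size class, given N-1 ascending quantile
// boundaries. A speed below boundaries[0] falls in class 0, and so on.
int size_class_for_speed(double speed, const std::vector<double>& boundaries) {
    int cls = 0;
    for (double b : boundaries) {
        if (speed < b) break;
        cls++;
    }
    return cls;
}

// Rank a job of class job_class for a device whose preferred class is want.
// Lower is better: exact match first, then smaller classes (nearest first,
// by assumption), then larger classes.
int preference_rank(int job_class, int want, int n_classes) {
    assert(n_classes > 0 && want >= 0 && want < n_classes);
    if (job_class == want) return 0;
    if (job_class < want) return want - job_class;
    // Offset by n_classes so every larger class ranks after every smaller one.
    return n_classes + (job_class - want);
}
```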
To make an app multi-size, set the n_size_classes field of its database entry. Currently this must be done manually, e.g.
update app set n_size_classes=3 where id=14;
Set the size class of jobs as you create them. From C++:
...
wu.size_class = 2;
ret = create_work(wu, ...);
From scripts or command line:
create_work ... --size_class 2
Don't forget to set wu.rsc_fpops_est and wu.rsc_fpops_bound to match the actual size of each job as well.
You may want your work generator to maintain a supply of jobs of each size class. To find the number of unsent jobs of a given size class, use
int count_unsent_results(int&, int appid, int size_class);
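A work generator might use this to decide how many jobs of each size class to create on each pass. The sketch below is self-contained, so it takes the counting function as a parameter instead of calling the BOINC library directly. jobs_to_create, CountFn and target are hypothetical names, and the convention that the count is returned through the int& argument with a zero return value on success is an assumption based on the signature above.

```cpp
#include <cassert>
#include <functional>
#include <vector>

// Stand-in for count_unsent_results(n, appid, size_class): stores the number
// of unsent jobs of the given class in n and returns 0 on success (assumed).
using CountFn = std::function<int(int size_class, int& n)>;

// For each size class, return how many new jobs are needed to bring the
// number of unsent jobs up to `target`.
std::vector<int> jobs_to_create(int n_size_classes, int target, const CountFn& count_unsent) {
    assert(n_size_classes > 0 && target >= 0);
    std::vector<int> needed(n_size_classes, 0);
    for (int sc = 0; sc < n_size_classes; sc++) {
        int n = 0;
        if (count_unsent(sc, n) != 0) continue;  // count failed: skip this class for now
        if (n < target) needed[sc] = target - n;
    }
    return needed;
}
```

In a real work generator the callback would wrap count_unsent_results for the app's ID, and each needed job would be created with wu.size_class set to the class index.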
The script size_census.php computes effective speed statistics for multi-size apps, and writes them to flat files (named size_census_APPNAME) in the project directory. Arrange to run it periodically by putting the following in your config.xml:
<task>
<cmd>run_in_ops size_census.php</cmd>
<output>size_census.out</output>
<period>24 hour</period>
</task>
If you run the script with the --all_apps option, it will compute the statistics of all apps, not just multi-size ones. This is useful when you're getting things set up.
For each multi-size app, you must run a size_regulator daemon, which regulates the flow of jobs into the shared-memory job cache so that the cache doesn't get clogged with jobs of a single size:
<daemon>
<cmd>size_regulator --app_name uppercase --lo 10 --hi 30 --sleep_time 10</cmd>
<output>size_regulator_uppercase.out</output>
<pid_file>size_regulator_uppercase.pid</pid_file>
<disabled>1</disabled>
</daemon>
The command-line options of size_regulator are:
--app_name: name of the application
--lo: keep at least this many jobs of each size class in cache
--hi: keep at most this many jobs of each size class in cache
--sleep_time: sleep this long if nothing to do
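One way to read the low and high limits is as a pair of watermarks: a size class is left alone until its count falls below the low limit, and is then topped up to the high limit. The sketch below illustrates that reading only. It is an assumption about how the two limits interact, not a description of size_regulator's actual code, and the function names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Given the current number of jobs of each size class, return how many more
// jobs of each class to release. A class is refilled up to `hi` only once it
// has dropped below `lo`.
std::vector<int> jobs_to_release(const std::vector<int>& current, int lo, int hi) {
    assert(lo >= 0 && lo <= hi);
    std::vector<int> release(current.size(), 0);
    for (std::size_t i = 0; i < current.size(); i++) {
        if (current[i] < lo) release[i] = hi - current[i];
    }
    return release;
}

// True if no class needs more jobs; the caller would then sleep.
bool nothing_to_do(const std::vector<int>& release) {
    for (int r : release) {
        if (r > 0) return false;
    }
    return true;
}
```

With limits of 10 and 30, classes holding 4, 10 and 35 jobs would get 26, 0 and 0 more jobs respectively.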
The remaining options correspond to those of feeder; use the same ones for both.
To use this feature you must include the following in your config.xml:
<job_size_matching>1</job_size_matching>