Utilities
JobSpec provides a couple of functions which allow plugins to easily group input or output files and to keep track of status for each group. This plugin shows how those functions are used.
def get_input_file_attributes(self, skip_ready=False)
This method returns a dictionary of input file attributes. Each key is the logical file name (LFN) of an input file and each value is a dictionary of file attributes (fsize, guid, checksum, scope, dataset, and endpoint). If skip_ready is set to True, files that are already in ready state are ignored. File status is explained in the next section.
def set_groups_to_files(self, id_map)
This method sets group information on files. id_map is a dictionary of the form {identifier_string: {'lfns': [LFN, ...], 'groupStatus': status_string}}. identifier_string identifies the file group that contains the files listed in lfns, and can be an arbitrary string. status_string can also be an arbitrary string, but a group is ignored in the file->group lookup once its status_string is set to 'failed'.
def update_group_status_in_files(self, identifier_string, status_string)
This method updates the status of the group. status_string is explained in the set_groups_to_files method.
def get_groups_of_input_files(self, skip_ready=False)
This method returns a dictionary of {identifier_string: group information dictionary} for input files. If skip_ready is set to True, the method returns only the groups of input files that are not in ready state, which can be useful in the check_status method of preparator plugins. The keys of the group information dictionary are groupStatus and groupUpdateTime, which are updated whenever the set_groups_to_files or update_group_status_in_files method is called.
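The dictionary shapes described above can be sketched in plain Python. This is a minimal illustration, not Harvester code: `build_id_map`, `groups_with_status`, the `group-N` identifiers, the `'transferring'` status, and the file names are all hypothetical, and the sample `groups` dictionary only mimics the documented return shape of get_groups_of_input_files.

```python
def build_id_map(lfns, chunk_size, initial_status="transferring"):
    """Chunk LFNs into groups and return an id_map in the documented shape.

    The identifier and status strings are arbitrary; the names used here
    are assumptions made for the example.
    """
    id_map = {}
    for start in range(0, len(lfns), chunk_size):
        identifier = "group-{0}".format(start // chunk_size)
        id_map[identifier] = {
            "lfns": lfns[start:start + chunk_size],
            "groupStatus": initial_status,
        }
    return id_map


def groups_with_status(groups, status):
    """Return the identifiers of groups whose groupStatus equals `status`."""
    return sorted(
        identifier
        for identifier, info in groups.items()
        if info["groupStatus"] == status
    )


lfns = ["file_a.root", "file_b.root", "file_c.root"]  # hypothetical LFNs
id_map = build_id_map(lfns, chunk_size=2)

# Hand-written stand-in for what get_groups_of_input_files might return.
groups = {
    "group-0": {"groupStatus": "transferring", "groupUpdateTime": None},
    "group-1": {"groupStatus": "failed", "groupUpdateTime": None},
}
still_transferring = groups_with_status(groups, "transferring")
```

In a plugin, `id_map` would be passed to set_groups_to_files on the job spec, and a later check_status call could inspect the result of get_groups_of_input_files(skip_ready=True) in the same way as the `groups` stand-in.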
If multiple jobs are fetched and they use the same input files, the preparator triggers stage-in only for the first job and keeps the others on hold until the input files are successfully transferred. First, the file status is set to to_prepare for the first job and preparing for the other jobs. Once the check_status method of the preparator plugin returns True for a job, the file status is changed to ready. If the file status changes from preparing, the file inherits the grouping information of the first job, which is explained in the above section.
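The status flow for shared input files can be pictured with a toy model. It is only one reading of the paragraph above and does not reproduce Harvester's internals: the function names, the per-job dictionaries, and the job and group identifiers are hypothetical.

```python
def initial_status(job_ids):
    """The first job triggers stage-in (to_prepare); the others wait (preparing)."""
    return {
        job_id: ("to_prepare" if index == 0 else "preparing")
        for index, job_id in enumerate(job_ids)
    }


def mark_ready(status_by_job, groups_by_job, first_job):
    """Model the transition after check_status has returned True.

    Jobs that were waiting in `preparing` inherit the grouping information
    of the first job before their file status becomes `ready`.
    """
    for job_id, status in status_by_job.items():
        if status == "preparing":
            groups_by_job[job_id] = dict(groups_by_job.get(first_job, {}))
        status_by_job[job_id] = "ready"


status_by_job = initial_status(["job1", "job2"])  # hypothetical job IDs
groups_by_job = {"job1": {"group-0": {"groupStatus": "done"}}}
mark_ready(status_by_job, groups_by_job, first_job="job1")
```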
The following parameters are available to optimize logging in etc/panda/panda_common.cfg.
Name | Description |
---|---|
log_level | Logging level. See python doc. Can be CRITICAL, ERROR, WARNING, INFO, DEBUG (default), or NOTSET. |
rotating_policy | Policy for log rotation. Can be time, size, or none. time : rotation at certain timed intervals, size : rotation at a predetermined size, none : no rotation (default). |
rotating_backup_count | How many old log files should be saved. Effective unless rotating_policy=none. 1 by default. |
rotating_max_size | Rotation happens when the file size (in MB) is about to be exceeded. Effective only when rotating_policy=size. 1024 by default. |
rotating_interval | Rotation interval in hours. Effective only when rotating_policy=time. 24 by default. |
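To make the meaning of the rotation parameters concrete, the sketch below maps them onto handlers from Python's standard logging module. It is an illustration of the semantics in the table, assuming a one-to-one mapping; it is not how Harvester itself builds its loggers, and `make_handler`, the logger name, and the log file path are hypothetical.

```python
import logging
import logging.handlers
import os
import tempfile


def make_handler(filename, rotating_policy="none", rotating_backup_count=1,
                 rotating_max_size=1024, rotating_interval=24):
    """Build a stdlib handler that mirrors the documented options."""
    if rotating_policy == "size":
        # rotating_max_size is given in MB; the handler expects bytes.
        return logging.handlers.RotatingFileHandler(
            filename,
            maxBytes=rotating_max_size * 1024 * 1024,
            backupCount=rotating_backup_count,
            delay=True,  # do not open the file until the first record
        )
    if rotating_policy == "time":
        # rotating_interval is given in hours.
        return logging.handlers.TimedRotatingFileHandler(
            filename,
            when="h",
            interval=rotating_interval,
            backupCount=rotating_backup_count,
            delay=True,
        )
    # rotating_policy == "none": a plain file handler without rotation.
    return logging.FileHandler(filename, delay=True)


log_path = os.path.join(tempfile.gettempdir(), "harvester_example.log")
handler = make_handler(log_path, rotating_policy="size")

logger = logging.getLogger("harvester_example")
logger.setLevel(logging.DEBUG)  # corresponds to log_level = DEBUG
logger.addHandler(handler)
```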
Harvester supports statistic, deterministic, or thread-aware profiling. Statistic profiling is done with Python's standard profilers, while deterministic or thread-aware profiling is done with the pprofile package, which needs to be installed using pip:
$ pip install pprofile
Harvester is launched with a profiler if the --profiler_output option is given to master.py. The option specifies the filename where the profiler results are dumped. If profiling is in the deterministic or thread-aware mode and the filename starts with "cachegrind.out", the results are written in the callgrind profile format, which allows the file to be browsed with kcachegrind. If profiling is in the statistic mode, the dump file can be analyzed using Python's standard pstats package. Profiling is in the statistic mode by default, and the mode can be changed with the --profile_mode option: "d" for the deterministic mode and "t" for the thread-aware mode. Detailed explanations of the profiling modes are available on the pprofile page. Note that the dump file is produced only when harvester is properly terminated with the USR2 or TERM signal, i.e.,
$ kill -USR2 `cat $PWD/tmp.pid`
or
$ kill `cat $PWD/tmp.pid`
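A dump file produced in the statistic mode can then be inspected with pstats. The sketch below is self-contained, so it first generates a stand-in dump with cProfile; with Harvester, `dump_path` would instead be the filename passed to --profiler_output. The `busy_work` function and the file name are hypothetical.

```python
import cProfile
import os
import pstats
import tempfile


def busy_work(n):
    """Placeholder workload so the example has something to profile."""
    return sum(i * i for i in range(n))


# Stand-in for the file that Harvester writes on termination.
dump_path = os.path.join(tempfile.mkdtemp(), "harvester_profile.dump")
profiler = cProfile.Profile()
profiler.runcall(busy_work, 10000)
profiler.dump_stats(dump_path)

# Load the dump and print the ten entries with the largest cumulative time.
stats = pstats.Stats(dump_path)
stats.strip_dirs().sort_stats("cumulative").print_stats(10)
```

Sorting by "cumulative" highlights the call chains that dominate run time; sorting by "tottime" instead shows the functions that are expensive on their own.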