Utilities

File grouping for file transfers

JobSpec and DBInterface provide a couple of functions which allow plugins to easily group input or output files and to keep track of status for each group. This plugin shows how JobSpec functions are used, and another plugin shows how DBInterface functions are used.

JobSpec Methods

def get_input_file_attributes(self, skip_ready=False)

This method returns a dictionary of input file attributes. The key of the dictionary is the logical file name (LFN) of the input file and the value is a dictionary of file attributes (fsize, guid, checksum, scope, dataset, attemptNr and endpoint). attemptNr show how many times the file was tried for the action such as checking and preparing. If skip_ready is set to True, files are ignored if they are already in ready state. Concerning file status see the next section.

def get_output_file_specs(self, skip_done=False)

This method returns a list of output FileSpecs. If skip_done is set to True, files are ignored if they are already finished or failed. FileSpec.attemptNr shows how many times the file was tried for the action such as checking and staging out.

def set_groups_to_files(self, id_map)

To set group information to files. id_map is a dictionary of {identifier_string: {'lfns': [LFN, ...], 'groupStatus': status_string}. Identifier_string is the identifier of the file group, which contains files with the lfns, and can be an arbitrary string. Status_string can also be an arbitrary string, but groups are ignored for the file->group lookup once the status_string is set to 'failed'.

def update_group_status_in_files(self, identifier_string, status_string)

This method updates status of the group. Status_string is explained in the set_groups_to_files method.

def get_groups_of_input_files(self, skip_ready=False)

To get a dictionary of {identifier_string: a dictionary of the group information} for input files. If skip_ready is set to True, the method returns groups of the input files which are not in ready state, which could be useful in the check_status method of preparator plugins. Keys of the group information dictionary are groupStatus and groupUpdateTime which are updated when the set_groups_to_files or update_group_status_in_files method is called.

def get_groups_of_output_files(self)

This methods works for output files as get_groups_of_input_files(). It doesn't take an extra argument since finished or failed files are automatically removed.

DBInterface Methods

Note that plugins can access the DB through the self.dbInterface member which is automatically installed by plugin_factory when plugins are instantiated.

def get_files_with_group_id(self, identifier_string)

This method returns the list of FileSpecs which have the same group identifier and may belong to different jobs.

def set_file_group(self, file_specs, identifier_string, status_string)

This method sets group information (identifier and status) to files.

def get_object_lock(self, object_name, lock_interval)

This method locks an object for lock_interval sec. This could be useful for plugins to take an action exclusively in multi-threading environment.

def release_object_lock(self, object_name)

This method releases the lock for an object, so that another thread can take the exclusive action next.

Protection against double input file transfers

If multiple jobs are fetched and they use the same input files, preparatory triggers stage-in only for the first job while keeping the others on hold until input files are successfully transferred. First, file status is set to to_prepare for the first job and preparing doe the other jobs. Once the check_status method of preparator plugin returns True for a job, file status is changed to ready. If the file status changes from preparing it inherits the grouping information, which is explained in the above section, of the first job.

Logging

The following parameters are available to optimize logging in etc/panda/panda_common.cfg.

Name	Description
log_level	Logging level. See python doc. Can be CRITICAL, ERROR, WARNING, INFO, DEBUG (default), or NOTSET.
rotating_policy	Policy for log rotation. Can be time, size, or none. time : rotation at certain timed intervals, size : rotation at a predetermined size, none : no rotation (default).
rotating_backup_count	How many old log files should be saved. Effective unless rotating_policy=none. 1 by default.
rotating_max_size	Rotation happens when the file size (in MB) is about to be exceeded. Effective only when rotating_policy=size. 1024 by default.
rotating_interval	Rotation interval in hours. Effective only when rotating_policy=time. 24 by default.

Profiling

Harvester support statistic, deterministic, or thread-aware profiling. Statistic profiling is done with python's standard profilers, while deterministic or thread-aware profiling is done with the pprofile package which needs to be installed using pip:

$ pip install pprofile

Harvester is launched with a profiler if the --profiler_output option is given to master.py. The option specifies the filename where the results of the profiler are dumped. If profiling is in the deterministic or thread-aware mode and the filename starts with "cachegrind.out", the results are written in the callgrind profile format which allows the file to be browsed with kcachegrind. If profiling is in the statistic mode, the dumo file can be analyzed using python's standard pstats package. Profiling is in the statistic mode by default, and can be changed with the --profile_mode option, "d" for the deterministic mode and "t" for the thread-aware mode. You can find detailed explanations about profiling modes in the pprofile's page. Note that the dump file is produced only when harvester is properly terminated with the USR2 or TERM signal, i.e.,

$ kill -USR2 `cat $PWD/tmp.pid`

or

$ kill `cat $PWD/tmp.pid`

Error propagation

Payloads

Payloads can report errors by adding 'pilotErrorCode', 'pilotErrorDiag', 'exeErrorCode', and 'exeErrorDiag' in the workerAttributesFile, which is explained in this section, or something equivalent for each messenger plugin.

Plugins

Plugins can report errors in their methods by calling

from pandaharvester.harvestercore.pilot_errors import PilotErrors
jobSpec.set_pilot_error(PilotErrors.ERR_XYZ, diagnostic_message)

if those methods take JobSpec or a lost of JobSpecs, or

workSpec.set_pilot_error(PilotErrors.ERR_XYZ, diagnostic_message)

if those methods take WorkSpec or a lost of WorkSpecs, where ERR_XYZ is an error code defined in the PilotErrors class and diagnostic_message is a string of a diagnostic message. If a pilot error is set to a WorkSpec, all JobSpecs associated to the WorkSpec inherit the pilot error unless they set own pilot errors.

Home

Getting started
Installation and configuration
Testing and running
Debugging
Work with Middleware
Admin FAQ

Developer pages
Code structure
DB structure
DB-schema-changes
State and sequence diagrams
Plugin API specifications
Agents and Plugins descriptions
Plugin utilities
Workflows supported by harvester
Developer Q&A
Release

Development guides
Development workflow
Tagging

Production & commissioning
Scale up submission
Condor experiences
Commissioning on the grid
Production servers
Service monitoring
Auto Queue Configuration with CRIC
SSH+RPC middleware setup

Kubernetes section
Kubernetes setup
X509 credentials
AWS setup
GKE setup
CERN setup
CVMFS installation
Generic service accounts

Advanced payloads
Horovod integration

Provide feedback

Saved searches

Use saved searches to filter your results more quickly