Developer Q&A
Harvester has a multi-threaded architecture. Each thread executes a plugin's functions synchronously. Plugins can spawn background processes and/or threads to process tasks asynchronously, but each plugin is responsible for limiting the number of background processes or threads it creates.
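One way a plugin might keep its background threads under control is a fixed-size thread pool. The sketch below is only an illustration: `BoundedAsyncPlugin`, `submit_tasks` and `_process` are hypothetical names, not part of the Harvester plugin API.

```python
from concurrent.futures import ThreadPoolExecutor


class BoundedAsyncPlugin:
    """Hypothetical plugin; the class and method names are assumptions."""

    def __init__(self, max_workers=4):
        # The pool size caps how many background threads this plugin can use.
        self._pool = ThreadPoolExecutor(max_workers=max_workers)

    def submit_tasks(self, tasks):
        # Called synchronously by the agent thread; returns right away
        # with one future per task while the pool works in the background.
        return [self._pool.submit(self._process, task) for task in tasks]

    def _process(self, task):
        # Placeholder for the real asynchronous work.
        return ("done", task)

    def shutdown(self):
        # Wait for outstanding tasks and release the threads.
        self._pool.shutdown(wait=True)
```

With `max_workers=2`, three submitted tasks are processed by at most two threads at a time; the third waits in the pool's queue rather than spawning a new thread.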
Each worker is identified by a unique identifier in the batch system, such as a batch job ID or a Condor job ID. Plugins act on workers through that identifier.
Harvester's propagator agents send heartbeats every 30 minutes for running jobs, and immediately for finished or failed jobs.
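The heartbeat rule can be written as a small decision function. This is a minimal sketch, not the propagator's actual code: `heartbeat_due`, the status strings and the handling of other states are assumptions made for illustration.

```python
from datetime import datetime, timedelta

RUNNING_INTERVAL = timedelta(minutes=30)  # interval for running jobs
FINAL_STATES = {"finished", "failed"}     # reported immediately


def heartbeat_due(status, last_sent, now):
    """Return True if a heartbeat should be sent now (hypothetical helper)."""
    if status in FINAL_STATES:
        # Final states are reported at once; the caller is assumed to stop
        # tracking the job after that report.
        return True
    if status == "running":
        # First heartbeat, or at least 30 minutes since the previous one.
        return last_sent is None or now - last_sent >= RUNNING_INTERVAL
    # Other states are not covered by the text; this sketch sends nothing.
    return False
```

For example, a running job whose last heartbeat went out 10 minutes ago is skipped, while a failed job is reported regardless of when it last sent a heartbeat.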
Harvester uses external components as libraries, i.e. in the same process and the same memory space.
Normally, pilots kill themselves once they receive the kill command from PanDA through heartbeats. However, even if pilots stop sending heartbeats, Harvester will be able to get the list of stuck pilots from PanDA and kill them directly, using condor_rm etc.
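For the HTCondor case, the direct kill amounts to passing the batch identifiers of the stuck pilots to `condor_rm`. The sketch below only builds the command line; `build_condor_rm_command` is a hypothetical helper, and the way the list of stuck pilots is obtained from PanDA is not shown.

```python
def build_condor_rm_command(batch_ids):
    """Build a condor_rm argument list for the given job IDs (hypothetical helper)."""
    ids = [str(batch_id) for batch_id in batch_ids]
    if not ids:
        # Nothing to kill, so there is no command to run.
        return None
    # condor_rm accepts one or more job IDs in cluster.proc form.
    return ["condor_rm"] + ids
```

The returned list could then be handed to `subprocess.run` on a host where the HTCondor client tools are installed.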
Each Harvester instance will have a unique identifier. Config files for Harvester instances are stored on PanDA. When an instance starts, it downloads its config file using that identifier. The config file contains the list of PanDA queues (PQs) that the instance serves.
It is possible to have multiple Harvester instances per PQ. For example, PanDA can dynamically set the queue depth in each Harvester instance. The simplest solution would be to set the queue depth to 1000 when only one instance is running, and reduce it to 500 when another instance comes up for the same PQ.
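The 1000 and 500 figures correspond to splitting a total depth evenly across the active instances. A minimal sketch of that arithmetic follows; the even split and the name `queue_depth_per_instance` are assumptions, since the actual policy would be decided on the PanDA side.

```python
def queue_depth_per_instance(total_depth, n_instances):
    """Split a PQ's total queue depth evenly across instances (assumed policy)."""
    if n_instances < 1:
        raise ValueError("at least one instance is required")
    # Integer division keeps the sum at or below the total depth.
    return total_depth // n_instances
```

With a total depth of 1000, one instance gets 1000 and two instances get 500 each.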
In the pull-model workflow, the status alone would ultimately be enough, since the pilot reports other information directly to PanDA. In the push-model workflow, all the information that the pilot reports would be desirable.
Job attributes are stored in a CLOB field in the Harvester DB. The field contains a dictionary, so it is easy to add new attributes.
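The benefit of a dictionary in a single text field is that new attributes need no schema change. The sketch below illustrates this with an in-memory SQLite table; the table name, column names and the JSON serialization are assumptions for illustration, not the actual Harvester schema.

```python
import json
import sqlite3

# In-memory stand-in for the Harvester DB; a TEXT column plays the CLOB role.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE job_table (job_id INTEGER PRIMARY KEY, job_attributes TEXT)"
)


def save_attributes(conn, job_id, attrs):
    # Serialize the whole dictionary into the single text field.
    conn.execute(
        "INSERT OR REPLACE INTO job_table (job_id, job_attributes) VALUES (?, ?)",
        (job_id, json.dumps(attrs)),
    )


def load_attributes(conn, job_id):
    row = conn.execute(
        "SELECT job_attributes FROM job_table WHERE job_id = ?", (job_id,)
    ).fetchone()
    return json.loads(row[0]) if row else {}


# Adding a new attribute is a dictionary update, not a schema migration.
save_attributes(conn, 1, {"status": "running"})
attrs = load_attributes(conn, 1)
attrs["nEvents"] = 100  # hypothetical new attribute
save_attributes(conn, 1, attrs)
```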
The idea is to periodically upload the contents of the Harvester DB to Oracle. There will be a full or slimmed mirror table of the Harvester DB in Oracle, and BigPandaMon will show views on that table.
See the dummy plugins, such as DummySubmitter and DummyMonitor.