DB structure

Fernando Barreiro edited this page Nov 6, 2017 · 29 revisions

Local tables in SQLite

DB ER diagram

SQLite DB tables are generated in harvestercore/db_proxy.py. The structure of each table is defined in the corresponding *_spec.py file. The definitions below are the initial ones and will evolve over time, so please consult the relevant spec file for the up-to-date structure.

job_table

'PandaID:integer primary key',
'taskID:integer',
'attemptNr:integer',
'status:text',
'subStatus:text',
'currentPriority:integer',
'computingSite:text',
'creationTime:timestamp',
'modificationTime:timestamp',
'stateChangeTime:timestamp',
'startTime:timestamp',
'endTime:timestamp',
'nCore:integer',
'jobParams:blob',
'jobAttributes:blob',
'hasOutFile:integer',
'metaData:blob',
'outputFilesToReport:blob',
'lockedBy:text',
'propagatorLock:text',
'propagatorTime:timestamp',
'preparatorTime:timestamp',
'submitterTime:timestamp',
'stagerLock:text',
'stagerTime:timestamp',
'zipPerMB:integer'             
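
The 'name:type' strings listed above map naturally onto SQLite column definitions. The following is a minimal sketch of that mapping, not the actual code in harvestercore/db_proxy.py: the helper `make_create_statement` and the constant `JOB_ATTRIBUTES` are hypothetical names, and only a subset of the job_table attributes is used.

```python
import sqlite3

# Subset of the job_table attributes listed above, in 'name:type' form.
JOB_ATTRIBUTES = (
    'PandaID:integer primary key',
    'taskID:integer',
    'status:text',
    'creationTime:timestamp',
    'jobParams:blob',
)


def make_create_statement(table_name, attributes):
    """Build a CREATE TABLE statement from 'name:type' strings (hypothetical helper)."""
    columns = []
    for attr in attributes:
        # Split on the first colon only; the remainder is the type plus any constraint.
        name, sql_type = attr.split(':', 1)
        columns.append('{0} {1}'.format(name, sql_type))
    return 'CREATE TABLE IF NOT EXISTS {0} ({1})'.format(table_name, ', '.join(columns))


conn = sqlite3.connect(':memory:')
conn.execute(make_create_statement('job_table', JOB_ATTRIBUTES))

# Ask SQLite which columns it created; the name is the second field of each row.
column_names = [row[1] for row in conn.execute('PRAGMA table_info(job_table)')]
print(column_names)
```

Keeping the column list as plain strings means that the same list can drive both table creation and attribute handling in the spec class, which is presumably why the spec files use this form.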

work_table

'workerID:integer',
'batchID:text',
'mapType:text',
'queueName:text',
'status:text',
'hasJob:integer',
'workParams:blob',
'workAttributes:blob',
'eventsRequestParams:blob',
'eventsRequest:integer',
'computingSite:text',
'creationTime:timestamp',
'submitTime:timestamp',
'startTime:timestamp',
'endTime:timestamp',
'nCore:integer',
'walltime:timestamp',
'accessPoint:text',
'modificationTime:timestamp',
'stateChangeTime:timestamp',
'eventFeedTime:timestamp',
'lockedBy:text',
'postProcessed:integer'

jw_table (job worker)

'PandaID:integer',
'workerID:integer',
'relationType:text'
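
Since jw_table holds only PandaID and workerID pairs, finding the workers associated with a job requires a join through it. The sketch below assumes a cut-down version of the three tables and made-up rows; the function `workers_for_job` and the relationType value 'example' are placeholders for illustration, not values taken from harvester.

```python
import sqlite3

conn = sqlite3.connect(':memory:')
# Only the columns needed for the join; the real tables have many more.
conn.executescript("""
    CREATE TABLE job_table (PandaID integer primary key, status text);
    CREATE TABLE work_table (workerID integer, batchID text, status text);
    CREATE TABLE jw_table (PandaID integer, workerID integer, relationType text);
""")
# Made-up rows for illustration only.
conn.execute("INSERT INTO job_table VALUES (1001, 'running')")
conn.executemany("INSERT INTO work_table VALUES (?, ?, ?)",
                 [(1, 'batch-a', 'running'), (2, 'batch-b', 'submitted')])
conn.executemany("INSERT INTO jw_table VALUES (?, ?, ?)",
                 [(1001, 1, 'example'), (1001, 2, 'example')])


def workers_for_job(connection, panda_id):
    """Return (workerID, batchID, status) for every worker linked to a job."""
    sql = ("SELECT w.workerID, w.batchID, w.status "
           "FROM jw_table jw JOIN work_table w ON w.workerID = jw.workerID "
           "WHERE jw.PandaID = ? ORDER BY w.workerID")
    return connection.execute(sql, (panda_id,)).fetchall()


workers = workers_for_job(conn, 1001)
print(workers)
```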

file_table

'fileID:integer primary key',
'PandaID:integer',
'taskID:integer',
'lfn:text',
'status:text',
'fsize:integer',
'chksum:text',
'path:text',
'fileType:text',
'eventRangeID:text',
'modificationTime:timestamp',
'fileAttributes:blob',
'isZip:integer',
'zipFileID:integer',
'objstoreID:integer'

cache_table

'mainKey:text',
'subKey:text',
'data:blob',
'lastUpdate:timestamp'

event_table

'eventRangeID:text',
'PandaID:integer',
'eventStatus:text',
'coreCount:integer',
'cpuConsumptionTime:integer',
'subStatus:text',
'fileID:integer'

seq_table

'numberName:text',
'curVal:integer'
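
A table with a name and a current value is a common way to emulate sequences in SQLite. The sketch below shows one way such a counter could be read and advanced inside a single transaction. It is an assumption about usage, not harvester's own logic: the function `next_value`, the counter name 'example_counter', and the convention that curVal holds the next value to hand out are all hypothetical.

```python
import sqlite3

# isolation_level=None puts the connection in autocommit mode,
# so transactions are opened and closed explicitly below.
conn = sqlite3.connect(':memory:', isolation_level=None)
conn.execute('CREATE TABLE seq_table (numberName text, curVal integer)')
conn.execute("INSERT INTO seq_table VALUES ('example_counter', 1)")


def next_value(connection, number_name):
    """Return the current value of a counter and advance it by one."""
    # BEGIN IMMEDIATE takes the write lock up front, so no other writer
    # can change the counter between the SELECT and the UPDATE.
    connection.execute('BEGIN IMMEDIATE')
    try:
        row = connection.execute(
            'SELECT curVal FROM seq_table WHERE numberName = ?',
            (number_name,)).fetchone()
        if row is None:
            raise KeyError(number_name)
        connection.execute(
            'UPDATE seq_table SET curVal = curVal + 1 WHERE numberName = ?',
            (number_name,))
        connection.execute('COMMIT')
    except Exception:
        connection.execute('ROLLBACK')
        raise
    return row[0]


first = next_value(conn, 'example_counter')
second = next_value(conn, 'example_counter')
print(first, second)
```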

pq_table (panda queue)

'queueName:text',
'nQueueLimitJob:integer',
'nQueueLimitWorker:integer',
'maxWorkers:integer',
'jobFetchTime:timestamp',
'submitTime:timestamp'

command_table

'command_id:integer primary key',
'command:text',
'receiver:text',
'params:blob',
'ack_requested:integer',
'processed:integer'
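
The receiver and processed columns suggest a simple local queue: a component picks up the commands addressed to it and flags them once handled. The sketch below illustrates that pattern under the assumption that processed is a 0/1 flag; the function `take_pending_commands`, the command strings, and the receiver names are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(':memory:')
conn.execute('CREATE TABLE command_table (command_id integer primary key, '
             'command text, receiver text, params blob, '
             'ack_requested integer, processed integer)')
# Made-up commands and receivers for illustration only.
conn.executemany('INSERT INTO command_table VALUES (?, ?, ?, ?, ?, ?)', [
    (1, 'EXAMPLE_COMMAND_A', 'example_receiver', None, 1, 0),
    (2, 'EXAMPLE_COMMAND_B', 'other_receiver', None, 0, 0),
    (3, 'EXAMPLE_COMMAND_C', 'example_receiver', None, 0, 1),
])


def take_pending_commands(connection, receiver):
    """Return unprocessed commands for a receiver and flag them as processed."""
    # The connection context manager commits on success and rolls back on error.
    with connection:
        rows = connection.execute(
            'SELECT command_id, command FROM command_table '
            'WHERE receiver = ? AND processed = 0 ORDER BY command_id',
            (receiver,)).fetchall()
        connection.executemany(
            'UPDATE command_table SET processed = 1 WHERE command_id = ?',
            [(row[0],) for row in rows])
    return rows


pending = take_pending_commands(conn, 'example_receiver')
print(pending)
```

Calling the function a second time for the same receiver returns an empty list, because the first call already flagged the matching rows.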

Central tables in Oracle

Harvester_Instances

The table to register harvester instances.

| Name | Type | Description |
| --- | --- | --- |
| HARVESTER_ID | NOT NULL VARCHAR2(50) | Identifier of the harvester instance |
| DESCRIPTION | VARCHAR2(200) | Description for the instance |
| STARTTIME | DATE | Start time |
| OWNER | VARCHAR2(100) | Owner of the instance |
| HOSTNAME | VARCHAR2(100) | Hostname where the instance is running |

Harvester_Workers

The table for workers submitted by harvesters.

| Name | Type | Description |
| --- | --- | --- |
| HARVESTERID | NOT NULL VARCHAR2(50) | Identifier of the harvester instance |
| WORKERID | NOT NULL NUMBER(11) | Identifier of the worker |
| BATCHID | VARCHAR2(80) | Unique ID in the batch system |
| NODEID | VARCHAR2(80) | Identifier of the node, such as hostname or IP address |
| QUEUENAME | VARCHAR2(80) | Name of the batch queue |
| STATUS | NOT NULL VARCHAR2(80) | Worker status |
| COMPUTINGSITE | VARCHAR2(128) | Panda Queue name |
| SUBMITTIME | DATE | Set when the worker is submitted |
| STARTTIME | DATE | Set when the worker gets CPUs |
| ENDTIME | DATE | Set when the worker is terminated |
| NCORE | NUMBER(6) | The number of cores the worker uses |
| ERRORCODE | NUMBER(7) | Error code, if any |
| LASTUPDATE | NOT NULL DATE | Set when the record is updated |
| STDOUT | VARCHAR2(250) | URL for stdout |
| STDERR | VARCHAR2(250) | URL for stderr |
| BATCHLOG | VARCHAR2(250) | URL for batch log |

Harvester_Jobs_Workers_Relation

The table for the relationship between jobs and workers.

| Name | Type | Description |
| --- | --- | --- |
| HARVESTERID | NOT NULL VARCHAR2(50) | Identifier of the harvester instance |
| WORKERID | NOT NULL NUMBER(11) | Identifier of the worker |
| PANDAID | NOT NULL NUMBER(11) | Job ID in PanDA |
| LASTUPDATE | NOT NULL DATE | Set when the record is updated |

Harvester_Worker_Stats

The table for realtime statistics of harvester workers.

| Name | Type | Description |
| --- | --- | --- |
| HARVESTER_ID | NOT NULL VARCHAR2(50) | Identifier of the harvester instance |
| COMPUTINGSITE | NOT NULL VARCHAR2(128) | Panda Queue name |
| RESOURCETYPE | NOT NULL VARCHAR2(56) | Resource type |
| STATUS | NOT NULL VARCHAR2(80) | Worker status |
| N_WORKERS | NUMBER(7) | Number of workers |
| LASTUPDATE | NOT NULL DATE | Last update time |

Harvester_Commands

Command queue for harvester.

| Name | Type | Description |
| --- | --- | --- |
| COMMAND_ID | NOT NULL NUMBER(10) | Command ID, used as the primary key |
| COMMAND | VARCHAR2(200) | Command string |
| HARVESTER_ID | VARCHAR2(50) | Identifier of the target harvester instance |
| ACK_REQUESTED | NUMBER(1) | 0 or 1, depending on whether the PanDA server expects an acknowledgement |
| CREATION_DATE | DATE | Timestamp when the command was generated |
| STATUS | VARCHAR2(32) | Status of the call: new, retrieved, acknowledged... |
| STATUS_DATE | DATE | Timestamp when the status last changed |
| PARAMS | CLOB | Parameters for the call |

Harvester_Command_Lock

Exclusive locks for sending commands to harvester.

| Name | Type | Description |
| --- | --- | --- |
| HARVESTER_ID | NOT NULL VARCHAR2(50) | Identifier of the target harvester instance |
| COMPUTINGSITE | NOT NULL VARCHAR2(128) | Panda Queue name |
| RESOURCETYPE | NOT NULL VARCHAR2(56) | Resource type |
| COMMAND | NOT NULL VARCHAR2(200) | Command string |
| LOCKEDTIME | DATE | Timestamp when the command is locked |
| LOCKEDBY | VARCHAR2(40) | Name of the process that locks the command |