
DB structure

Tadashi Maeno edited this page Mar 27, 2017 · 29 revisions

Local tables in SQLite

DB ER diagram

SQLite DB tables are generated in harvestercore/db_proxy.py. The structure of each table is defined in the corresponding *_spec.py file. The initial definitions are listed below, but they will evolve over time, so please consult the relevant spec file for the up-to-date structure.

[Figure: DB ER diagram]

job_table

'PandaID:integer primary key',
'taskID:integer',
'attemptNr:integer',
'status:text',
'subStatus:text',
'currentPriority:integer',
'computingSite:text',
'creationTime:timestamp',
'modificationTime:timestamp',
'stateChangeTime:timestamp',
'startTime:timestamp',
'endTime:timestamp',
'nCore:integer',
'jobParams:blob',
'jobAttributes:blob',
'hasOutFile:integer',
'metaData:blob',
'outputFilesToReport:blob',
'lockedBy:text',
'propagatorLock:text',
'propagatorTime:timestamp',
'preparatorTime:timestamp',
'submitterTime:timestamp',
'stagerLock:text',
'stagerTime:timestamp',
'zipPerMB:integer'             
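
The attribute lists on this page follow a 'name:type' convention. As a minimal sketch of how such a list could be turned into a SQLite table, the snippet below splits each entry on the first colon and assembles a CREATE TABLE statement. The helper name make_create_table_sql is hypothetical and only a subset of the job_table attributes is used; the actual logic in harvestercore/db_proxy.py may differ.

```python
import sqlite3

# A few entries copied from the job_table list above (subset, for brevity).
JOB_ATTRIBUTES = [
    'PandaID:integer primary key',
    'taskID:integer',
    'status:text',
    'creationTime:timestamp',
    'jobParams:blob',
]


def make_create_table_sql(table_name, attributes):
    """Hypothetical helper: build CREATE TABLE from 'name:type' strings."""
    columns = []
    for attr in attributes:
        # Split on the first colon only: the left part is the column name,
        # the right part is the SQL type plus any constraints.
        name, sql_type = attr.split(':', 1)
        columns.append('{0} {1}'.format(name, sql_type))
    return 'CREATE TABLE IF NOT EXISTS {0} ({1})'.format(
        table_name, ', '.join(columns))


create_sql = make_create_table_sql('job_table', JOB_ATTRIBUTES)

# Create the table in an in-memory database and read back its column names.
conn = sqlite3.connect(':memory:')
conn.execute(create_sql)
column_names = [row[1] for row in conn.execute('PRAGMA table_info(job_table)')]
conn.close()
```

Splitting on the first colon only keeps constraints such as "primary key" attached to the type, so they pass through to SQLite unchanged.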

work_table

'workerID:integer',
'batchID:text',
'mapType:text',
'queueName:text',
'status:text',
'hasJob:integer',
'workParams:blob',
'workAttributes:blob',
'eventsRequestParams:blob',
'eventsRequest:integer',
'computingSite:text',
'creationTime:timestamp',
'submitTime:timestamp',
'startTime:timestamp',
'endTime:timestamp',
'nCore:integer',
'walltime:timestamp',
'accessPoint:text',
'modificationTime:timestamp',
'stateChangeTime:timestamp',
'eventFeedTime:timestamp',
'lockedBy:text',
'postProcessed:integer'

jw_table (job worker)

'PandaID:integer',
'workerID:integer',
'relationType:text'
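
Since jw_table holds PandaID and workerID pairs, it can be used to look up the workers associated with a job. The following is a sketch under stated assumptions: the tables are reduced to a few of the columns listed on this page, the rows are made up, and both the helper workers_for_job and the relationType value 'example' are placeholders rather than values taken from Harvester.

```python
import sqlite3

conn = sqlite3.connect(':memory:')
# Reduced versions of the three tables, using only columns listed on this page.
conn.executescript("""
    CREATE TABLE job_table (PandaID integer primary key, status text);
    CREATE TABLE work_table (workerID integer, batchID text, status text);
    CREATE TABLE jw_table (PandaID integer, workerID integer, relationType text);
""")

# Made-up rows, purely for illustration.
conn.execute("INSERT INTO job_table VALUES (1001, 'running')")
conn.executemany(
    "INSERT INTO work_table VALUES (?, ?, ?)",
    [(1, 'batch-a', 'running'), (2, 'batch-b', 'submitted')])
conn.executemany(
    "INSERT INTO jw_table VALUES (?, ?, ?)",
    [(1001, 1, 'example'), (1001, 2, 'example')])


def workers_for_job(connection, panda_id):
    """Hypothetical helper: return (workerID, batchID, status) for one job."""
    sql = ("SELECT w.workerID, w.batchID, w.status "
           "FROM jw_table jw "
           "JOIN work_table w ON w.workerID = jw.workerID "
           "WHERE jw.PandaID = ? "
           "ORDER BY w.workerID")
    return connection.execute(sql, (panda_id,)).fetchall()


workers = workers_for_job(conn, 1001)
```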

file_table

'fileID:integer primary key',
'PandaID:integer',
'taskID:integer',
'lfn:text',
'status:text',
'fsize:integer',
'chksum:text',
'path:text',
'fileType:text',
'eventRangeID:text',
'modificationTime:timestamp',
'fileAttributes:blob',
'isZip:integer',
'zipFileID:integer',
'objstoreID:integer'

cache_table

'mainKey:text',
'subKey:text',
'data:blob',
'lastUpdate:timestamp'
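
The columns of cache_table suggest a store keyed by mainKey and subKey. The sketch below shows one way such a table could be written and read. It assumes that data is stored as JSON-encoded bytes and that lastUpdate holds an ISO 8601 string; neither is confirmed by this page, and the helpers put_cache and get_cache are hypothetical.

```python
import datetime
import json
import sqlite3

conn = sqlite3.connect(':memory:')
conn.execute('CREATE TABLE cache_table '
             '(mainKey text, subKey text, data blob, lastUpdate timestamp)')


def put_cache(connection, main_key, sub_key, value):
    """Hypothetical helper: replace the entry stored under (mainKey, subKey)."""
    # Assumption: the blob holds JSON-encoded bytes.
    payload = json.dumps(value).encode('utf-8')
    now = datetime.datetime.now(datetime.timezone.utc).isoformat()
    with connection:  # commit on success, roll back on error
        connection.execute(
            'DELETE FROM cache_table WHERE mainKey = ? AND subKey = ?',
            (main_key, sub_key))
        connection.execute(
            'INSERT INTO cache_table VALUES (?, ?, ?, ?)',
            (main_key, sub_key, payload, now))


def get_cache(connection, main_key, sub_key):
    """Hypothetical helper: return the decoded value, or None if absent."""
    row = connection.execute(
        'SELECT data FROM cache_table WHERE mainKey = ? AND subKey = ?',
        (main_key, sub_key)).fetchone()
    if row is None:
        return None
    return json.loads(row[0].decode('utf-8'))


put_cache(conn, 'example_main', 'example_sub', {'answer': 1})
put_cache(conn, 'example_main', 'example_sub', {'answer': 2})
cached = get_cache(conn, 'example_main', 'example_sub')
```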

event_table

'eventRangeID:text',
'PandaID:integer',
'eventStatus:text',
'coreCount:integer',
'cpuConsumptionTime:integer',
'subStatus:text',
'fileID:integer'

seq_table

'numberName:text',
'curVal:integer'
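
A table with a name and a current value can serve as a named counter. The sketch below shows one plausible usage pattern, assuming curVal is incremented each time a new number is requested; the function next_value and the counter name 'example_counter' are hypothetical, and the real handling in harvestercore/db_proxy.py may differ.

```python
import sqlite3

conn = sqlite3.connect(':memory:')
conn.execute('CREATE TABLE seq_table (numberName text, curVal integer)')
conn.execute("INSERT INTO seq_table VALUES ('example_counter', 0)")
conn.commit()


def next_value(connection, number_name):
    """Hypothetical helper: increment the named counter and return it."""
    # The connection context manager commits on success and rolls back
    # if an exception is raised inside the block.
    with connection:
        cur = connection.execute(
            'UPDATE seq_table SET curVal = curVal + 1 WHERE numberName = ?',
            (number_name,))
        if cur.rowcount != 1:
            raise KeyError(number_name)
        row = connection.execute(
            'SELECT curVal FROM seq_table WHERE numberName = ?',
            (number_name,)).fetchone()
    return row[0]


first = next_value(conn, 'example_counter')
second = next_value(conn, 'example_counter')
```

Doing the update and the read inside one transaction keeps the pair consistent for a single connection; with several processes sharing the file, SQLite's locking behaviour would also need to be considered.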

pq_table (panda queue)

'queueName:text',
'nQueueLimitJob:integer',
'nQueueLimitWorker:integer',
'maxWorkers:integer',
'jobFetchTime:timestamp',
'submitTime:timestamp'

Central tables in Oracle

Harvester_Instances

The table to register harvester instances.

| Name | Type | Description |
| --- | --- | --- |
| HARVESTERID | NOT NULL VARCHAR2(50) | Identifier of the harvester instance |
| DESCRIPTION | VARCHAR2(200) | Description for the instance |

Harvester_Workers

The table for workers submitted by harvesters.

| Name | Type | Description |
| --- | --- | --- |
| HARVESTERID | NOT NULL VARCHAR2(50) | Identifier of the harvester instance |
| WORKERID | NOT NULL NUMBER(11) | Identifier of the worker |
| BATCHID | VARCHAR2(80) | Unique ID in the batch system |
| NODEID | VARCHAR2(80) | Identifier of the node, such as hostname, IP, etc. |
| QUEUENAME | VARCHAR2(80) | Name of the batch queue |
| STATUS | NOT NULL VARCHAR2(80) | Worker status |
| COMPUTINGSITE | VARCHAR2(128) | Panda Queue name |
| SUBMITTIME | DATE | Set when the worker is submitted |
| STARTTIME | DATE | Set when the worker gets CPUs |
| ENDTIME | DATE | Set when the worker is terminated |
| NCORE | NUMBER(6) | The number of cores the worker uses |
| ERRORCODE | NUMBER(7) | Error code, if any |
| LASTUPDATE | NOT NULL DATE | Set when the record is updated |
| STDOUT | VARCHAR2(250) | URL for stdout |
| STDERR | VARCHAR2(250) | URL for stderr |
| BATCHLOG | VARCHAR2(250) | URL for batch log |

Harvester_Jobs_Workers_Relation

The table for relationships between jobs and workers.

| Name | Type | Description |
| --- | --- | --- |
| HARVESTERID | NOT NULL VARCHAR2(50) | Identifier of the harvester instance |
| WORKERID | NOT NULL NUMBER(11) | Identifier of the worker |
| PANDAID | NOT NULL NUMBER(11) | Job ID in PanDA |
| LASTUPDATE | NOT NULL DATE | Set when the record is updated |