DB structure - PanDAWMS/panda-harvester GitHub Wiki
Local tables in SQLite
DB ER diagram
SQLite DB tables are generated in the harvestercore/db_proxy.py. Individual table structure is defined in each *_spec.py file. The initial definitions below, but will evolve over time, so please consult the relevant spec file for the up to date structure.

job_table
'PandaID:integer primary key',
'taskID:integer',
'attemptNr:integer',
'status:text',
'subStatus:text',
'currentPriority:integer',
'computingSite:text',
'creationTime:timestamp',
'modificationTime:timestamp',
'stateChangeTime:timestamp',
'startTime:timestamp',
'endTime:timestamp',
'nCore:integer',
'jobParams:blob',
'jobAttributes:blob',
'hasOutFile:integer',
'metaData:blob',
'outputFilesToReport:blob',
'lockedBy:text',
'propagatorLock:text',
'propagatorTime:timestamp',
'preparatorTime:timestamp',
'submitterTime:timestamp',
'stagerLock:text',
'stagerTime:timestamp',
'zipPerMB:integer'
work_table
'workerID:integer',
'batchID:text',
'mapType:text',
'queueName:text',
'status:text',
'hasJob:integer',
'workParams:blob',
'workAttributes:blob',
'eventsRequestParams:blob',
'eventsRequest:integer',
'computingSite:text',
'creationTime:timestamp',
'submitTime:timestamp',
'startTime:timestamp',
'endTime:timestamp',
'nCore:integer',
'walltime:timestamp',
'accessPoint:text',
'modificationTime:timestamp',
'stateChangeTime:timestamp',
'eventFeedTime:timestamp',
'lockedBy:text',
'postProcessed:integer'
jw_table (job worker)
'PandaID:integer',
'workerID:integer',
'relationType:text'
file_table
'fileID:integer primary key',
'PandaID:integer',
'taskID:integer',
'lfn:text',
'status:text',
'fsize:integer',
'chksum:text',
'path:text',
'fileType:text',
'eventRangeID:text',
'modificationTime:timestamp',
'fileAttributes:blob',
'isZip:integer',
'zipFileID:integer',
'objstoreID:integer'
cache_table
'mainKey:text',
'subKey:text',
'data:blob',
'lastUpdate:timestamp'
event_table
'eventRangeID:text',
'PandaID:integer',
'eventStatus:text',
'coreCount:integer',
'cpuConsumptionTime:integer',
'subStatus:text',
'fileID:integer'
seq_table
'numberName:text',
'curVal:integer'
pq_table (panda queue)
'queueName:text',
'nQueueLimitJob:integer',
'nQueueLimitWorker:integer',
'maxWorkers:integer',
'jobFetchTime:timestamp',
'submitTime:timestamp'
command_table
'command_id:integer primary key',
'command:text',
'receiver:text',
'params:blob',
'ack_requested:integer',
'processed:integer'
Central tables in Oracle
Harvester_Instances
The table to register harvester instances.
| Name | Type | Description |
|---|---|---|
| HARVESTER_ID | NOT NULL VARCHAR2(50) | Identifier of the harvester instance |
| DESCRIPTION | VARCHAR2(200) | Description for the instance |
| STARTTIME | DATE | Start time |
| OWNER | VARCHAR2(100) | Owner of the instance |
| HOSTNAME | VARCHAR2(100) | Hostname where the instance is running |
| LASTUPDATE | DATE | Set when the record is updated |
| SW_VERSION | VARCHAR2(50) | Software version |
| COMMIT_STAMP | VARCHAR2(100) | Commit stamp in the repository |
Harvester_Workers
The table for workers submitted by harvesters.
| Name | Type | Description |
|---|---|---|
| HARVESTERID | NOT NULL VARCHAR2(50) | Identifier of the harvester instance |
| WORKERID | NOT NULL NUMBER(11) | Identifier of the worker |
| BATCHID | VARCHAR2(80) | Unique ID in the batch system |
| NODEID | VARCHAR2(80) | Identifier of the node, such as hostname, IP, etc |
| QUEUENAME | VARCHAR2(80) | Name of the batch queue |
| STATUS | NOT NULL VARCHAR2(80) | Worker status |
| COMPUTINGSITE | VARCHAR2(128) | Panda Queue name |
| SUBMITTIME | DATE | Set when the worker is submitted |
| STARTTIME | DATE | Set when the worker gets CPUs |
| ENDTIME | DATE | Set when the worker is terminated |
| NCORE | NUMBER(6) | The number of cores the worker use |
| ERRORCODE | NUMBER(7) | Error code in any |
| LASTUPDATE | NOT NULL DATE | Set when the record is updated |
| STDOUT | VARCHAR2(250) | URL for stdout |
| STDERR | VARCHAR2(250) | URL for stderr |
| BATCHLOG | VARCHAR2(250) | URL for batch log |
| RESOURCETYPE | VARCHAR2(56) | Resource type |
| NATIVEEXITCODE | NUMBER(7) | Exit code in the underlying system |
| NATIVESTATUS | VARCHAR2(80) | Status in the underlying system |
| DIAGMESSAGE | VARCHAR2(500) | Error diagnostics |
| COMPUTINGELEMENT | VARCHAR2(128) | Gateway of the batch system |
| NJOBS | NUMBER(6) | The number of associated jobs |
| SUBMISSIONHOST | VARCHAR2(128) | The host name of submission node |
Harvester_Rel_Jobs_Workers
The table for relationship between jobs and workers.
| Name | Type | Description |
|---|---|---|
| HARVESTERID | NOT NULL VARCHAR2(50) | Identifier of the harvester instance |
| WORKERID | NOT NULL NUMBER(11) | Identifier of the worker |
| PANDAID | NOT NULL NUMBER(11) | Job ID in PanDA |
| LASTUPDATE | NOT NULL DATE | Set when the record is updated |
Harvester_Worker_Stats
The table for realtime statistics of harvester workers.
| Name | Type | Description |
|---|---|---|
| HARVESTER_ID | NOT NULL VARCHAR2(50) | Identifier of the harvester instance |
| COMPUTINGSITE | NOT NULL VARCHAR2(128) | Panda Queue name |
| RESOURCETYPE | NOT NULL VARCHAR2(56) | Resource type |
| STATUS | NOT NULL VARCHAR2(80) | Worker status |
| N_WORKERS | NUMBER(7) | Number of workers |
| LASTUPDATE | NOT NULL DATE | Last update time |
Harvester_Commands
Command queue for harvester.
| Name | Type | Description |
|---|---|---|
| COMMAND_ID | NOT NULL NUMBER(10) | Command ID for primary key |
| COMMAND | VARCHAR2(200) | Command string |
| HARVESTER_ID | VARCHAR2(50) | Identifier of the target harvester instance |
| ACK_REQUESTED | NUMBER(1) | 0/1, depending on whether panda server expects and acknowledgement |
| CREATION_DATE | DATE | Timestamp when the command was generated |
| STATUS | VARCHAR2(32) | Status of the call: new, retrieved, acknowledged... |
| STATUS_DATE | DATE | Timestamp when the status last changed |
| PARAMS | CLOB | Parameters for the call |
Harvester_Command_Lock
Exclusive locks to send commands to harvester.
| Name | Type | Description |
|---|---|---|
| HARVESTER_ID | NOT NULL VARCHAR2(50) | Identifier of the target harvester instance |
| COMPUTINGSITE | NOT NULL VARCHAR2(128) | Panda Queue name |
| RESOUCETYPE | NOT NULL VARCHAR2(56) | Resource type |
| COMMAND | NOT NULL VARCHAR2(200) | Command string |
| LOCKEDTIME | DATE | Timestamp when the command is locked |
| LOCKEDBY | VARCHAR2(40) | Process name which locks the command |
Harvester_Dialogs
Dialog messages from harvester instances.
| Name | Type | Description |
|---|---|---|
| HARVESTER_ID | NOT NULL VARCHAR2(50) | Identifier of the target harvester instance |
| DIAGID | NOT NULL NUMBER(11) | Serial number of the message |
| MODULENAME | VARCHAR2(100) | The module name which sent the message |
| IDENTIFIER | VARCHAR2(100) | Identifier of the message if any |
| CREATIONTIME | DATE | Timestamp when the message is sent |
| MESSAGELEVEL | VARCHAR2(10) | Message level |
| DIAGMESSAGE | VARCHAR2(500) | Dialog message |
Harvester_Slots
The table defines the number of available slots at Panda Queues mainly for workload provisioning.
| Name | Type | Description |
|---|---|---|
| PANDAQUEUENAME | NOT NULL VARCHAR2(128) | Panda Queue name |
| GSHARE | VARCHAR2(32) | Global share |
| RESOURCETYPE | VARCHAR2(56) | Resource type |
| NUMSLOTS | NOT NULL NUMBER(11) | The number of slots |
| MODIFICATIONTIME | NOT NULL DATE | When the record was updated |
| EXPIRATIONTIME | DATE | When the record expires |