DBSBuffer - dmwm/WMCore GitHub Wiki

The DBSBuffer package implements a buffer where the permanent files produced by the WMAgent are stored before being published to DBS and PhEDEx. This includes packaging files into blocks and associating them with the metadata required by DBS, such as PSet and software information.

The current implementation uses DBS3Buffer, although it works for uploads to both DBS2 and DBS3. The buffer tables hold roughly the same information as the file tables in WMBS. They are:

  • dbsbuffer_dataset: This table holds the basic information about the datasets produced by the WMAgent. It has these columns:

    • path: The dataset name.
    • processing_ver: The processing version number, an integer.
    • acquisition_era: The acquisition era for the dataset.
    • valid_status: Status of the dataset, it can be PRODUCTION or VALID.
    • global_tag: Global tag that was used to produce the dataset.
    • parent: Parent dataset name.
  • dbsbuffer_dataset_subscription: This table holds the information about PhEDEx subscriptions for the datasets. This includes the site, custodiality, priority, type of subscription (i.e. move or replica) and whether the request is auto-approved. It also records whether the subscription request has been created.

  • dbsbuffer_algo: The algo information represents the software that was used to produce the dataset, although some of the fields are never used. The information stored for an algo object is the following:

    • app_name: The name of the executable that produced the file, almost always cmsRun.
    • app_ver: The version of the software, in this case the CMSSW version.
    • app_fam: The output module that the files belong to, usually just Merged.
  • dbsbuffer_algo_dataset_assoc: This table stores the relations between algos and datasets. It is kept separately because the relation is many-to-many.

  • dbsbuffer_workflow: This table keeps limited information about the workflows that produced the files stored in the buffer: the name of the workflow and the task. Additionally, it stores the options for closing the blocks produced by the workflow.

  • dbsbuffer_block: Files are aggregated into blocks by the WMAgent before being published to DBS2/3. This table holds the information about such blocks. The columns are:

    • blockname: The name of the block, e.g. /data/set/path#UUID
    • location: Location of the block. It points to a storage element in the dbsbuffer_location table.
    • create_time: Time when the block was first created, i.e. when the first file was added to it.
    • status: Upload status to DBS2, the possible states are:
      • Open: The block is new and has not met the criteria for closing.
      • Pending: An open block has met the criteria for closing and is waiting for the WMAgent to close it.
      • Closed: The block has been completely injected into PhEDEx and is marked as closed in those catalogs. (DBS upload status is not checked; the file status contains that information.)
  • dbsbuffer_file: It contains the same file information as stored in wmbs_file, with some additional columns to keep track of the DBS/PhEDEx injection process:

    • dataset_algo: Reference to the dataset and algo that the file belongs to.
    • block_id: Id of the block in the dbsbuffer_block table.
    • status: Status of the file injection to DBS. The options are:
      • NOTUPLOADED: The file is new in the buffer.
      • GLOBAL: The file is registered in DBSBuffer as a parent file, which should already be in DBS.
      • InDBS: The file has been uploaded to DBS.
    • in_phedex: Indicates whether the file is already registered in PhEDEx.
    • workflow: Id of the workflow that produced this file in dbsbuffer_workflow.
  • dbsbuffer_file_parent: It stores the parentage relationship in the same way as wmbs_file_parent.

  • dbsbuffer_file_runlumi_map: It stores the run and lumi sections in the same way as wmbs_file_runlumi_map.

  • dbsbuffer_checksum_type: It stores the checksum types in the same way as wmbs_checksum_type.

  • dbsbuffer_file_checksums: It stores the checksum of the files in the buffer in the same way as wmbs_file_checksums.

  • dbsbuffer_location: It stores the names of the storage elements that are used in the buffer to determine the location of the files.

  • dbsbuffer_file_location: This table stores the association between files and storage elements from the dbsbuffer_location table.
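
To make the relation between blocks, files and locations more concrete, here is a minimal sketch of three of the tables above in an in-memory SQLite database. It is not the WMCore schema. The table names, status values and the blockname, location, create_time, block_id, status and in_phedex columns come from the descriptions above. The column types, the id and lfn columns, the se_name column, the default values and the helper function names are assumptions made for illustration only.

```python
import sqlite3

# Simplified, hypothetical DDL. Column types, keys, defaults and the
# "id", "lfn" and "se_name" columns are assumptions, not the real schema.
SCHEMA = """
CREATE TABLE dbsbuffer_location (
    id      INTEGER PRIMARY KEY,
    se_name TEXT UNIQUE NOT NULL
);
CREATE TABLE dbsbuffer_block (
    id          INTEGER PRIMARY KEY,
    blockname   TEXT UNIQUE NOT NULL,
    location    INTEGER NOT NULL REFERENCES dbsbuffer_location(id),
    create_time INTEGER,
    status      TEXT NOT NULL DEFAULT 'Open'
                CHECK (status IN ('Open', 'Pending', 'Closed'))
);
CREATE TABLE dbsbuffer_file (
    id        INTEGER PRIMARY KEY,
    lfn       TEXT UNIQUE NOT NULL,
    block_id  INTEGER REFERENCES dbsbuffer_block(id),
    status    TEXT NOT NULL DEFAULT 'NOTUPLOADED'
              CHECK (status IN ('NOTUPLOADED', 'GLOBAL', 'InDBS')),
    in_phedex INTEGER NOT NULL DEFAULT 0
);
"""


def build_buffer():
    """Create an empty in-memory database with the sketch schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    return conn


def files_pending_upload(conn, blockname):
    """Return the LFNs in a block whose status is still NOTUPLOADED."""
    rows = conn.execute(
        """SELECT f.lfn
             FROM dbsbuffer_file f
             JOIN dbsbuffer_block b ON f.block_id = b.id
            WHERE b.blockname = ? AND f.status = 'NOTUPLOADED'
            ORDER BY f.lfn""",
        (blockname,),
    ).fetchall()
    return [row[0] for row in rows]


def demo():
    """Populate the sketch with one location, one block and two files."""
    conn = build_buffer()
    conn.execute(
        "INSERT INTO dbsbuffer_location (id, se_name) VALUES (1, 'se.example.org')"
    )
    conn.execute(
        "INSERT INTO dbsbuffer_block (id, blockname, location, create_time) "
        "VALUES (1, '/data/set/path#UUID', 1, 0)"
    )
    conn.executemany(
        "INSERT INTO dbsbuffer_file (lfn, block_id, status) VALUES (?, 1, ?)",
        [("/store/a.root", "NOTUPLOADED"), ("/store/b.root", "InDBS")],
    )
    return conn
```

Calling files_pending_upload(demo(), '/data/set/path#UUID') lists only the file that has not been uploaded yet. The CHECK constraints are just one way to express the status values listed above; the real tables may enforce them differently or not at all.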