DevGuide ExportFormat - VizierDB/vizier-scala GitHub Wiki
Vizier uses two distinct data representations: Its standard catalog model, with the entities defined in info.vizierdb.catalog._ and stored via JDBC, and a more portable export format, with entities defined in info.vizierdb.export._. This document details the latter. If you aren't already familiar with the catalog format, read this first.
Export Version 1
Identifiers (Project-local vs Export-local)
The following text makes a distinction between project-local identifiers (identifiers of objects encoded in the catalog format) and export-local identifiers (identifiers of objects in the exported file).
Export identifiers are entirely local to the export file. An identifier can be any unique ASCII-encoded string. When JSON encoded, identifiers MUST be strings. Vizier's existing exporter re-uses the project-local identifiers for entities. All entities are assigned fresh identifiers on import.
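The import-side consequence of this rule can be sketched in a few lines of Python. This is only an illustration, not Vizier's importer: the function name `remap_identifiers`, the use of a counter as the source of fresh identifiers, and the choice to build one mapping per entity kind are all assumptions made for the example.

```python
import itertools

def remap_identifiers(export_ids, fresh_ids):
    """Map each export-local identifier to a fresh identifier.

    `fresh_ids` is any iterator of new identifiers (hypothetical; a real
    importer would obtain them from its catalog).
    """
    mapping = {}
    for export_id in export_ids:
        # Export identifiers must be ASCII strings.
        if not isinstance(export_id, str) or not export_id.isascii():
            raise ValueError(f"not an ASCII string identifier: {export_id!r}")
        # They must also be unique within the export file.
        if export_id in mapping:
            raise ValueError(f"duplicate export-local identifier: {export_id!r}")
        mapping[export_id] = next(fresh_ids)
    return mapping

# Example: three export-local branch identifiers get fresh values on import.
branch_ids = remap_identifiers(["1", "2", "main"], itertools.count(100))
```

Every reference elsewhere in the export (for example a `sourceBranch` value) would then be translated through the same mapping before being stored.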
Terminology: Cell, Module, and Command
Export format version 1 is intended to be backwards compatible with the Python implementation of Vizier. The Python version uses a slightly different catalog model, so some of the terminology differs. Most notably:
- What Vizier-Python and the export format call a Module is what Vizier-Scala calls a Cell
- What Vizier-Python and the export format call a Command is what Vizier-Scala calls a Module
The .vizier file format
A .vizier file is a gzip-compressed tar archive (commonly referred to as a tarball). Running tar -zxvf my_export.vizier should extract it. The contents of this archive are as follows:
- version.txt: A text file containing exactly the text 1.
- fs: A directory containing FILE-type artifacts. Each file's name is the export-local artifact identifier.
- project.json: A text file containing a single JSON object with the schema described below.
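As a rough illustration of the archive layout, the following Python sketch opens a .vizier file with the standard library and pulls out the three parts. The function name `read_vizier_export` is hypothetical, and two tolerances are assumptions of this sketch rather than part of the format: member names may or may not carry a leading `./`, and surrounding whitespace in version.txt is stripped before comparison.

```python
import json
import tarfile

def read_vizier_export(path):
    """Return (version, project, file_ids) read from a .vizier tarball."""
    with tarfile.open(path, mode="r:gz") as archive:
        # Index members by name; assumption: some tar tools prefix names with "./".
        members = {}
        for member in archive.getmembers():
            name = member.name[2:] if member.name.startswith("./") else member.name
            members[name] = member

        # version.txt holds the export format version (assumption: whitespace tolerated).
        raw_version = archive.extractfile(members["version.txt"]).read()
        version = raw_version.decode("ascii").strip()
        if version != "1":
            raise ValueError(f"unsupported export version: {version!r}")

        # project.json holds a single JSON object.
        project = json.load(archive.extractfile(members["project.json"]))

        # Regular files under fs/ are named by export-local artifact identifier.
        file_ids = sorted(
            name[len("fs/"):]
            for name, member in members.items()
            if name.startswith("fs/") and member.isfile()
        )
    return version, project, file_ids
```

Reading members directly from the archive avoids unpacking untrusted paths onto disk, which is why the sketch never calls `extractall`.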
The project.json schema
The project.json file contains a single object with fields as follows:
- properties: Array of Property Objects with schema:
  - key: The identifier of a property
  - value: The value of the identified property
- defaultBranch: The export-local identifier of the Branch object that is the active branch.
- files: Array of FileSummary Objects with schema:
  - id: The export-local identifier of a file. This corresponds to a file in the fs directory of the tarball.
  - name: The human-readable name of the file
  - mimetype: The MIME type of the file
- modules: Dictionary of Module Objects. As noted above, this is what Vizier-Scala calls a Cell. The key is the export-local identifier of the Module ("Cell" in vizier-scala). Module objects have schema:
  - id: The export-local identifier of the Module ("Cell" in vizier-scala)
  - state: One of the following integers (typically 4):
    - 0: PENDING
    - 1: RUNNING
    - 2: CANCELLED
    - 3: ERROR
    - 4: DONE
    - 5: FROZEN
  - command: A Command object with the following schema:
    - id: The export-local identifier of the Command ("Module" in vizier-scala). Multiple Modules ("Cells" in vizier-scala) may have the same Command ("Module" in vizier-scala). If multiple Command ("Module" in vizier-scala) objects appear with the same id, the remaining fields MUST be identical as well.
    - packageId: The name of the package.
    - commandId: The name of the command.
    - arguments: An array of Arguments Objects that encode the module arguments. Arguments objects have schema:
      - id: The identifier of a command argument
      - value: The value of a command argument
    - revisionOfId: The export-local identifier of the Command ("Module" in vizier-scala) that this command was derived from in a prior revision of a workflow.
    - properties: Ignored by the importer.
  - text: A human-readable description of the command. Retained for backwards compatibility; the current implementation of Vizier ignores this field.
  - timestamps: A Timestamps object with the following schema:
    - createdAt: An ISO-8601 formatted datetime string indicating the time the Module ("Cell" in vizier-scala) was originally created.
    - startedAt: An ISO-8601 formatted datetime string indicating the time the Module ("Cell" in vizier-scala) started execution, or null if it has not been started yet.
    - finishedAt: An ISO-8601 formatted datetime string indicating the time the Module ("Cell" in vizier-scala) finished execution, or null if it has not been started yet or if it is still running.
    - lastModifiedAt: An ISO-8601 formatted datetime string indicating the last time the Module ("Cell" in vizier-scala) was modified.
- branches: Array of Branch Objects with the following schema:
  - id: The export-local identifier for the Branch object
  - createdAt: An ISO-8601 formatted datetime string indicating the time the branch was originally created.
  - lastModifiedAt: An ISO-8601 formatted datetime string indicating the last time the branch was modified.
  - sourceBranch: An optional export-local identifier for the Branch object from which this branch was derived. If defined, sourceWorkflow must also be defined.
  - sourceWorkflow: An optional export-local identifier for the Workflow object from which this branch's head was derived. If defined, sourceBranch must also be defined, and sourceWorkflow must be a workflow in the branch identified by sourceBranch.
  - isDefault: True if this branch is the project's defaultBranch
  - properties: Array of Property Objects with schema:
    - key: The identifier of a property
    - value: The value of the identified property
  - workflows: Array of Workflow objects with schema:
    - id: The export-local identifier for this workflow.
    - createdAt: An ISO-8601 formatted datetime string indicating the time the workflow was originally created.
    - action: A string identifying the action that created the workflow. Must be one of the strings:
      - create: This is the first workflow of a branch.
      - append: The workflow was derived by appending a module to a prior workflow.
      - delete: The workflow was derived by deleting a module from a prior workflow.
      - insert: The workflow was derived by inserting a module at a specified position in a prior workflow.
      - update: The workflow was derived by modifying a module from a prior workflow.
      - freeze: The workflow was derived by freezing one or more modules from a prior workflow.
    - packageId: If action is append, insert, or update, the package name of the command inserted.
    - commandId: If action is append, insert, or update, the command name of the command inserted.
    - actionModule: If action is append, insert, or update, the export-local module identifier of the affected module in the new workflow. If action is delete, the export-local identifier of the affected module in the prior workflow.
    - modules: Array of export-local Module ("Cell" in vizier-scala) identifiers. Each Module must be defined at $.modules.{identifier} in this file.
- createdAt: An ISO-8601 formatted datetime string indicating the time the project was originally created.
- lastModifiedAt: An ISO-8601 formatted datetime string indicating the last time the project was modified.
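Several of the fields above refer to one another, so a consumer may want to check those references before importing. The Python sketch below is one possible way to do that and is not taken from Vizier itself. The function name `check_references` is hypothetical, and the sketch assumes that each Branch object carries its own workflows array; adjust the lookup if your files place workflows elsewhere.

```python
# Integer codes for the Module "state" field, as listed in the schema.
MODULE_STATES = {0: "PENDING", 1: "RUNNING", 2: "CANCELLED",
                 3: "ERROR", 4: "DONE", 5: "FROZEN"}

# Allowed values of the Workflow "action" field.
WORKFLOW_ACTIONS = {"create", "append", "delete", "insert", "update", "freeze"}

def check_references(project):
    """Return a list of human-readable problems; empty if none were found."""
    problems = []
    modules = project.get("modules", {})
    branches = project.get("branches", [])
    branch_ids = {branch["id"] for branch in branches}

    # defaultBranch must name one of the exported branches.
    if project.get("defaultBranch") not in branch_ids:
        problems.append("defaultBranch does not name a branch")

    for key, module in modules.items():
        # The dictionary key and the module's own id are the same identifier.
        if module.get("id") != key:
            problems.append(f"module {key}: key and id differ")
        if module.get("state") not in MODULE_STATES:
            problems.append(f"module {key}: unknown state {module.get('state')!r}")

    for branch in branches:
        # Assumption: workflows are nested inside each branch object.
        for workflow in branch.get("workflows", []):
            if workflow.get("action") not in WORKFLOW_ACTIONS:
                problems.append(f"workflow {workflow['id']}: unknown action")
            # Every listed module must be defined at $.modules.{identifier}.
            for module_id in workflow.get("modules", []):
                if module_id not in modules:
                    problems.append(
                        f"workflow {workflow['id']}: undefined module {module_id}")
    return problems

# A deliberately broken example: workflow w1 lists a module that is not defined.
example = {
    "defaultBranch": "b1",
    "modules": {"m1": {"id": "m1", "state": 4}},
    "branches": [{"id": "b1", "workflows": [
        {"id": "w1", "action": "create", "modules": ["m1", "m2"]}]}],
}
example_problems = check_references(example)
```

Collecting problems into a list, rather than raising on the first one, lets a caller report everything wrong with a file in a single pass.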