Data sets - npdc/portal GitHub Wiki

Data set description format

Data sets are described using the GCMD DIF format.

Identifiers

The main identifier in the database is the numeric data set id combined with the data set version. For external use, a UUID is generated, which is also used in the canonical URL of a page. A UUID therefore points to a specific version of a data set (or project or publication).

Retrieving data set descriptions in different formats

HTML

The main data set page is retrieved using either {domain}/dataset/{dataset_id}[/{version}] or {domain}/{uuid}. The version segment in the first form is optional.
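The two URL patterns can be sketched as small helper functions. This is only an illustration of the patterns given above: the function names, the example domain, and the example values are hypothetical and not part of the portal.

```python
from typing import Optional


def dataset_page_url(domain: str, dataset_id: int, version: Optional[int] = None) -> str:
    """Build {domain}/dataset/{dataset_id}[/{version}] (hypothetical helper)."""
    url = f"{domain.rstrip('/')}/dataset/{dataset_id}"
    if version is not None:
        # The version segment is optional in the documented pattern.
        url += f"/{version}"
    return url


def uuid_page_url(domain: str, uuid: str) -> str:
    """Build {domain}/{uuid} (hypothetical helper)."""
    return f"{domain.rstrip('/')}/{uuid}"


# Example values are placeholders, not real identifiers.
print(dataset_page_url("https://example.org", 42))
print(dataset_page_url("https://example.org", 42, 3))
print(uuid_page_url("https://example.org", "123e4567-e89b-12d3-a456-426614174000"))
```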

JSON-LD

The JSON-LD data set description is embedded in the HTML version of the page. This lets crawlers extract the structured data set description directly from the page, and it allows the data set to appear in Google Dataset Search.
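As a rough sketch of how a client could pull the embedded description out of a page, the following uses only the Python standard library. It assumes the JSON-LD sits in a `<script type="application/ld+json">` element, which is the common convention but is not confirmed here for the portal; the class and function names are hypothetical.

```python
import json
from html.parser import HTMLParser


class JsonLdExtractor(HTMLParser):
    """Collects the parsed contents of application/ld+json script elements."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self._buffer = []
        self.documents = []

    def handle_starttag(self, tag, attrs):
        script_type = (dict(attrs).get("type") or "").lower()
        if tag == "script" and script_type == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self._in_jsonld = False
            self.documents.append(json.loads("".join(self._buffer)))


def extract_json_ld(html: str) -> list:
    """Return every JSON-LD document found in the given HTML text."""
    parser = JsonLdExtractor()
    parser.feed(html)
    parser.close()
    return parser.documents


# Made-up page content, used only to exercise the extractor.
sample = (
    '<html><head><script type="application/ld+json">'
    '{"@type": "Dataset", "name": "example"}'
    "</script></head><body></body></html>"
)
print(extract_json_ld(sample))
```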

GCMD DIF

To get the data set description in the GCMD DIF format, append .xml to the URL of the data set description. This returns the GCMD DIF as XML that can be pasted directly into docbuilder.
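A minimal sketch of fetching the DIF record, assuming the .xml suffix is appended directly to the data set page URL. The helper names are hypothetical, and the example URL is a placeholder rather than a real portal address.

```python
import urllib.request
import xml.etree.ElementTree as ET


def dif_url(page_url: str) -> str:
    """Append .xml to a data set page URL (hypothetical helper)."""
    return page_url.rstrip("/") + ".xml"


def fetch_dif(page_url: str, timeout: float = 10.0) -> ET.Element:
    """Download the DIF XML for a data set page and return its root element."""
    with urllib.request.urlopen(dif_url(page_url), timeout=timeout) as response:
        return ET.fromstring(response.read())


# Placeholder URL; replace the domain and id with real values before fetching.
print(dif_url("https://example.org/dataset/42"))
```

Calling `fetch_dif` requires network access to a running portal instance, so only the URL construction is exercised above.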