Chapter 2: Recommendations for the deployment of mobilityDCAT-AP

Authors: Peter Lubrich, Ben Witsch, Jasper Beernaerts, Sara de Oliveira, Mario Scrocca, Lina Molinas Comet

This chapter serves as an operational manual to guide NAP operators in the implementation phase of mobilityDCAT-AP.

1. How should I implement mobilityDCAT-AP?

As part of the information model for the metadata registry of the NAP system, the implementation might be a prerequisite for newly developed NAPs (e.g., via a specification sheet for an IT deployer) or an upgrade topic for existing NAPs (e.g., as a change request for an IT deployer). Please note that such a metadata database might be expanded with NAP-specific elements which are not defined by mobilityDCAT-AP. However, a uniform syntax of the entire database should be ensured. A common way to do so is to use NAP-specific namespaces (e.g., “ext:”).
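The namespace approach can be sketched as follows. This is a minimal illustration, not a prescribed implementation: the `ext` namespace IRI, the dataset IRI and the property `internalReviewStatus` are hypothetical, and the `mobilitydcatap` namespace IRI is an assumption that should be checked against the specification.

```python
import json

# Namespace prefixes. The "ext" IRI is a hypothetical NAP-specific namespace;
# the "mobilitydcatap" IRI is an assumption to be checked against the spec.
CONTEXT = {
    "dcat": "http://www.w3.org/ns/dcat#",
    "dct": "http://purl.org/dc/terms/",
    "mobilitydcatap": "https://w3id.org/mobilitydcat-ap#",
    "ext": "https://nap.example.org/ns/ext#",
}


def build_dataset_entry(dataset_iri, title, nap_specific):
    """Return a JSON-LD dict mixing standard and NAP-specific properties."""
    entry = {
        "@context": CONTEXT,
        "@id": dataset_iri,
        "@type": "dcat:Dataset",
        "dct:title": title,
    }
    # Every NAP-specific element gets the "ext:" prefix, so it cannot
    # collide with a property defined by mobilityDCAT-AP.
    for name, value in nap_specific.items():
        entry[f"ext:{name}"] = value
    return entry


entry = build_dataset_entry(
    "https://nap.example.org/dataset/42",       # hypothetical dataset IRI
    "Parking occupancy",
    {"internalReviewStatus": "approved"},       # hypothetical NAP element
)
print(json.dumps(entry, indent=2))
```

Because all elements share one context and one prefix convention, the whole record keeps a uniform syntax while the NAP-specific parts remain clearly distinguishable.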

2. When should I implement mobilityDCAT-AP?

There are two main reasons to implement mobilityDCAT-AP as quickly as possible:

  • to ensure harmonization and interoperability among NAPs
  • to avoid having to re-contact data providers to ask them to fill out additional metadata elements (required by mobilityDCAT-AP) after their initial (non-conformant) data provision

However, critical implementation points are: (a) the compatibility of the NAP system with mobilityDCAT-AP, which can be established via an IT deployment, see question 1 above; and (b) the adaptation of existing metadata entries towards mobilityDCAT-AP, see questions 5 and 6 below.

3. Does mobilityDCAT-AP constitute the entire data model of the NAP metadata registry on a NAP?

No, NAP deployers can adopt metadata elements from mobilityDCAT-AP and/or add proprietary metadata elements. For example, a NAP deployer could make an optional mobilityDCAT-AP element mandatory. Or they could introduce an element that has not been defined by mobilityDCAT-AP, see also question xxx.

4. How do the mobilityDCAT-AP definitions affect the user experience on a data portal?

Mostly via user interactions regarding the entry, search and lookup of dataset definitions (metadata) on a data portal system. Accordingly, many of the mobilityDCAT-AP elements will appear as elements of corresponding GUIs on the portal, such as “Create Dataset”, “Search for Dataset”, “Dataset Details” or similar. Thus, many properties from mobilityDCAT-AP (most likely those under the classes Dataset and Distribution) will appear in such GUIs, either with identical names (“Dataset title”) or with synonymous names (“Dataset name”).

An example mapping of elements of a GUI to elements of mobilityDCAT-AP is shown next. The following figure shows a GUI called "Data offer details" for a real dataset on the German NAP. This GUI pops up when a data seeker finds a dataset on the NAP registry and clicks on the button "Offer Details". The GUI reveals various metadata about this dataset, which have been provided by the corresponding publisher on the NAP. In the figure, the GUI is overlaid with green rectangles which refer to elements of mobilityDCAT-AP. In most cases, properties from the class "dcat:Dataset" are used. Some metadata do not have any rectangles, which means that the NAP system has some additional metadata beyond the mobilityDCAT-AP definitions.

Moreover, the controlled vocabularies used for dataset and/or distribution properties (e.g. "mobilitydcatap:TransportMode") might be used to allow harmonised filtering over the available datasets, to allow a user to quickly find interesting datasets in a similar manner for all NAPs.

5. How can I adopt mobilityDCAT-AP on my data portal, when there are already metadata entries which are not conformant with mobilityDCAT-AP?

This depends on how the existing metadata entries are formatted. There are three constellations:

  • If the metadata are already in a DCAT-AP-compatible format, it is easy to adapt to mobilityDCAT-AP via mappings and conversions, as mobilityDCAT-AP reuses a large part of the DCAT-AP data model. Note that some common metadata software, such as CKAN, allows for DCAT-AP export, which might be taken as a baseline.
  • If the metadata are provided according to the Coordinated Metadata Catalogue (CMC), a mapping table to mobilityDCAT-AP is provided, see the Annex B of the mobilityDCAT-AP specification.
  • In the case of other or proprietary metadata formats, it is advised to carefully study the metadata elements proposed by mobilityDCAT-AP and try to map and convert the existing metadata towards the syntax and semantics of the mobilityDCAT-AP elements.

When talking about mapping and converting, different automated, semi-automated or manual methods might be applied. The method should be chosen based on effort and benefits. For example, an automated mapping of a “free-text keyword” towards the “theme” property of mobilityDCAT-AP might be fast but error-prone in terms of semantics; on the other hand, a manual, keyword-to-keyword mapping will take some time but is likely to provide the best semantic match.
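A semi-automated approach could combine both methods: a hand-curated mapping table handles known keywords automatically, while unknown keywords are queued for manual review. The sketch below assumes such a table exists; the theme IRIs are hypothetical placeholders, not values from the mobilityDCAT-AP vocabularies.

```python
# Hand-curated keyword-to-theme table. The theme IRIs are hypothetical
# placeholders; real values come from the mobilityDCAT-AP vocabularies.
KEYWORD_TO_THEME = {
    "parking": "https://example.org/theme/parking",
    "car park": "https://example.org/theme/parking",
    "roadworks": "https://example.org/theme/road-works",
}


def map_keywords(keywords):
    """Split free-text keywords into mapped themes and a manual review list."""
    themes, needs_review = set(), []
    for keyword in keywords:
        # Normalise case and surrounding whitespace before the lookup.
        theme = KEYWORD_TO_THEME.get(keyword.strip().lower())
        if theme is None:
            needs_review.append(keyword)   # no safe match: a human decides
        else:
            themes.add(theme)
    return sorted(themes), needs_review


themes, needs_review = map_keywords(["Parking", " car park ", "weather"])
print(themes)        # themes found via the table
print(needs_review)  # keywords left for manual mapping
```

Keeping unmatched keywords in a separate list avoids guessing a theme, which is where a purely automated mapping tends to introduce semantic errors.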

6. How do I map existing metadata towards mobilityDCAT-AP?

Once the internal metadata system of a NAP is adapted to mobilityDCAT-AP, the values populating the metadata fields should be mapped towards the new structure. We recommend that the NAP system does most of the mapping by itself. However, an external metadata provider (or data publisher) might be consulted when there is missing or ambiguous metadata.

In such cases, we advise using a metadata update mechanism. Many NAP systems offer a feature like “Update my metadata” or similar for registered metadata providers, and some NAPs even require metadata providers to confirm their metadata entries regularly. Any missing information could thus be collected during the metadata update or confirmation process, resulting in a metadata entry according to mobilityDCAT-AP.

Interim solutions might be needed until the missing metadata information is added. We recommend providing proxy information like “Information will be provided later” in the metadata entry, so the person reading the metadata is aware. An alternative is to work with default values.

Please consider any ambiguous situations if a metadata field is not populated: did the metadata provider leave this field out on purpose? Or did the system not allow for the population of this field (e.g., because the mapping of existing metadata is not yet done)? Try to make this clear to the metadata reader.
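One way to make the reason for an empty field explicit is to track a status per field. The following is a minimal sketch under assumed conventions: the status names, the `mapped_fields` set and the example entry are hypothetical and not part of mobilityDCAT-AP.

```python
from enum import Enum


class FieldStatus(Enum):
    PROVIDED = "provided"
    NOT_YET_MAPPED = "not yet mapped"
    LEFT_EMPTY_BY_PROVIDER = "left empty by provider"


PROXY_TEXT = "Information will be provided later"


def describe_field(entry, field, mapped_fields):
    """Return (display value, status) for one metadata field.

    mapped_fields is the set of fields the NAP has already migrated;
    a field outside this set could not have been populated yet.
    """
    value = entry.get(field)
    if value:
        return value, FieldStatus.PROVIDED
    if field not in mapped_fields:
        # The system did not yet allow this field: show proxy information.
        return PROXY_TEXT, FieldStatus.NOT_YET_MAPPED
    # The field was available, but the provider left it empty.
    return None, FieldStatus.LEFT_EMPTY_BY_PROVIDER


entry = {"dct:title": "Parking occupancy", "dct:description": ""}
mapped = {"dct:title", "dct:description"}
print(describe_field(entry, "dct:title", mapped))
print(describe_field(entry, "dct:description", mapped))
print(describe_field(entry, "dct:accrualPeriodicity", mapped))
```

A portal GUI could render the status next to the field, so the metadata reader can tell a deliberate omission from a pending migration.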

7. How do I handle metadata which is imported from some other portals via harvesting?

Many NAPs indeed harvest metadata from other portals. Such metadata needs to be handled differently from metadata which is natively created by a NAP user. Again, some mapping and conversion are most likely needed for harvested metadata, see questions 5 and 6. However, these processes can be eased when the metadata entry on the original portal is as compatible as possible with mobilityDCAT-AP. Some agreements between the original portal and the NAP are recommended, e.g., the original portal might restrict the usage of keywords describing a dataset, provide a mapping table of such keywords towards native elements of mobilityDCAT-AP, and do some automated mapping during the harvesting process. In an ideal situation, however, the other portal provides a mobilityDCAT-AP feed that the NAP can directly harvest without any incompatibilities.

8. How do I ensure a co-existence between mobilityDCAT-AP metadata and INSPIRE/geoDCAT-AP metadata?

mobilityDCAT-AP has a lot in common with geoDCAT-AP: all default model elements from DCAT-AP v2.0.1 are reused in both extensions, and even some additions from geoDCAT-AP v2.0.0 are reused as additions in mobilityDCAT-AP v1.0.0. However, there might be issues when exchanging metadata between geoDCAT-AP and mobilityDCAT-AP in the following case: one of the extensions has a mandatory element which is optional (or non-existent) in the other extension. In such cases, some additional mapping or proxy information (e.g., “information unknown”) related to such missing metadata is needed.
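The problematic case can be detected by comparing the obligation levels of both profiles. The sketch below is illustrative only: the two obligation tables are hypothetical examples and must be replaced with the levels stated in the respective specifications.

```python
def proxy_needed(source_profile, target_profile):
    """List properties that are mandatory in the target profile but are
    optional, recommended or absent in the source profile."""
    return sorted(
        prop
        for prop, level in target_profile.items()
        if level == "mandatory" and source_profile.get(prop) != "mandatory"
    )


# Illustrative obligation levels only; check both specifications
# before relying on any of these values.
GEO_LIKE = {
    "dct:title": "mandatory",
    "dct:spatial": "recommended",
}
MOBILITY_LIKE = {
    "dct:title": "mandatory",
    "dct:spatial": "mandatory",
    "mobilitydcatap:mobilityTheme": "mandatory",
}

gaps = proxy_needed(GEO_LIKE, MOBILITY_LIKE)
print(gaps)  # properties that need mapping or proxy information
```

Running the comparison in both directions shows which elements need additional mapping or proxy information for each exchange direction.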

9. Are all elements from the mobilityDCAT-AP data model exposed to external users of the NAP, such as data seekers?

No. Even if all elements from mobilityDCAT-AP are meant to be exchanged, they do not need to be explicitly visible to external users. For example, the information about the contact point for a specific dataset may contain an email address. The NAP deployer may stipulate that such contact information remains restricted or even anonymous. Thus, the NAP deployer might impose visibility constraints regarding the exposure of such information to external users. However, mobilityDCAT-AP does not prescribe visibility constraints.
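A visibility constraint can be applied as a filter between the stored metadata and what is shown to a user. This is a minimal sketch assuming a hypothetical NAP policy that hides the contact point from external users; the policy set and the example entry are not defined by mobilityDCAT-AP.

```python
import copy

# Hypothetical NAP policy: properties hidden from external users.
RESTRICTED_FOR_EXTERNAL = {"dcat:contactPoint"}


def view_for(entry, is_external):
    """Return a copy of the metadata entry as a given user may see it."""
    if not is_external:
        return copy.deepcopy(entry)
    # External users get every property except the restricted ones.
    return {
        prop: copy.deepcopy(value)
        for prop, value in entry.items()
        if prop not in RESTRICTED_FOR_EXTERNAL
    }


stored = {
    "dct:title": "Parking occupancy",
    "dcat:contactPoint": {"email": "data@nap.example.org"},  # hypothetical
}
print(view_for(stored, is_external=True))
print(view_for(stored, is_external=False))
```

The stored entry stays complete, so the full metadata can still be exchanged where the NAP deployer permits it.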

10. What does a mobilityDCAT-AP-compliant data structure look like?

The data structure is important when querying and exporting metadata, as hosted on the data portal, in a machine-readable format (see the section on functionalities).

Corresponding data should be formatted following the serialization files, as published by mobilityDCAT-AP on the release page. These files are provided in RDF/XML, Turtle, and JSON-LD syntax.

Following the definitions of mandatory classes in the mobilityDCAT-AP data model, a fully compliant data structure should contain the information on one instance of the class Catalogue, at least one instance of the class Catalogue Record, and instances of the corresponding classes Dataset and Distribution. These four classes correspond to the four central classes of the mobilityDCAT-AP data model, as noted in the model overview.

In practice, the files should correspond to the content of a query from an NAP system, e.g., when only one dataset is queried, the information under the corresponding instances of the Dataset and Distribution classes should be communicated. In contrast, when the entire metadata registry is queried, all information from all instances of all four classes should be communicated.
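The two query scopes can be sketched in JSON-LD as follows. This is a structural illustration only: it uses a handful of DCAT properties, does not populate all mandatory properties, and the input dictionary layout, the IRIs and the `/record` suffix for record IRIs are hypothetical.

```python
import json

CONTEXT = {
    "dcat": "http://www.w3.org/ns/dcat#",
    "dct": "http://purl.org/dc/terms/",
    "foaf": "http://xmlns.com/foaf/0.1/",
}


def dataset_node(ds):
    """Build the Dataset node with its nested Distribution nodes."""
    return {
        "@id": ds["iri"],
        "@type": "dcat:Dataset",
        "dct:title": ds["title"],
        "dcat:distribution": [
            {
                "@id": dist["iri"],
                "@type": "dcat:Distribution",
                "dcat:accessURL": {"@id": dist["access_url"]},
            }
            for dist in ds["distributions"]
        ],
    }


def export_dataset(ds):
    """Single-dataset query: only Dataset and Distribution information."""
    return {"@context": CONTEXT, **dataset_node(ds)}


def export_registry(catalog_iri, datasets):
    """Full-registry query: Catalogue, Catalogue Records, Datasets, Distributions."""
    return {
        "@context": CONTEXT,
        "@id": catalog_iri,
        "@type": "dcat:Catalog",
        "dcat:record": [
            {
                "@id": ds["iri"] + "/record",  # hypothetical record IRI scheme
                "@type": "dcat:CatalogRecord",
                "foaf:primaryTopic": {"@id": ds["iri"]},
            }
            for ds in datasets
        ],
        "dcat:dataset": [dataset_node(ds) for ds in datasets],
    }


ds = {
    "iri": "https://nap.example.org/dataset/42",
    "title": "Parking occupancy",
    "distributions": [
        {
            "iri": "https://nap.example.org/distribution/1",
            "access_url": "https://nap.example.org/data/42.json",
        }
    ],
}
single = export_dataset(ds)
registry = export_registry("https://nap.example.org/catalog", [ds])
print(json.dumps(registry, indent=2))
```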

11. What is the minimum population of mobilityDCAT-AP-compliant data?

As stated above, a file compliant with mobilityDCAT-AP should contain all the classes defined as mandatory by the mobilityDCAT-AP specification. For all these classes, any mandatory properties must also be populated. These mandatory properties can be identified in the Reference of Classes and Properties.

The minimum class/property populations are also visualised as a UML diagram (see section 2.2, mobilityDCAT-AP Overview) and illustrated with example files (see Appendix C, mobilityDCAT-AP example populations).

12. How can I validate if my data is mobilityDCAT-AP-compliant?

The SHACL shape file provided by mobilityDCAT-AP allows an automated validation of test files. A validation using this SHACL file checks whether the data elements in a test file satisfy all constraints (minimum and/or maximum cardinality, expected datatype, etc.). It also checks that the values of certain properties conform to the specified Controlled Vocabularies. Moreover, the SHACL shape file indicates which data elements are mandatory, recommended, or optional.
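To illustrate the kind of checks involved, the following plain-Python sketch mimics a minimum/maximum cardinality check and a controlled-vocabulary check. It is not SHACL and does not replace a SHACL validator; the constraint table and the vocabulary IRIs are hypothetical, and real constraints come from the SHACL files.

```python
# Hypothetical constraints; the real ones are defined in the SHACL files.
CONSTRAINTS = {
    "dct:title": {"min": 1, "max": None, "allowed": None},
    "mobilitydcatap:transportMode": {
        "min": 0,
        "max": None,
        "allowed": {
            "https://example.org/mode/bus",   # placeholder vocabulary values
            "https://example.org/mode/car",
        },
    },
}


def check_entry(entry, constraints):
    """Return a list of human-readable constraint violations.

    entry maps each property name to a list of values.
    """
    problems = []
    for prop, rule in constraints.items():
        values = entry.get(prop, [])
        if len(values) < rule["min"]:
            problems.append(
                f"{prop}: expected at least {rule['min']} value(s), "
                f"found {len(values)}"
            )
        if rule["max"] is not None and len(values) > rule["max"]:
            problems.append(
                f"{prop}: expected at most {rule['max']} value(s), "
                f"found {len(values)}"
            )
        if rule["allowed"] is not None:
            for value in values:
                if value not in rule["allowed"]:
                    problems.append(
                        f"{prop}: {value!r} is not in the controlled vocabulary"
                    )
    return problems


bad = {"mobilitydcatap:transportMode": ["https://example.org/mode/boat"]}
good = {
    "dct:title": ["Parking occupancy"],
    "mobilitydcatap:transportMode": ["https://example.org/mode/bus"],
}
for problem in check_entry(bad, CONSTRAINTS):
    print(problem)
```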

Note that not all properties defined by mobilityDCAT-AP are covered by the above-linked SHACL file. This is because this SHACL file only deals with the additions of mobilityDCAT-AP, compared to DCAT-AP. Thus, a proper validation should be run against both the SHACL file for mobilityDCAT-AP and the SHACL file for DCAT-AP.

The SHACL file for DCAT-AP is linked within the SHACL file for mobilityDCAT-AP as follows:

 <http://w3id.org/mobilitydcat-ap#> a owl:Ontology , adms:Asset ;
   owl:imports <https://raw.githubusercontent.com/SEMICeu/DCAT-AP/master/releases/2.0.1/dcat-ap_2.0.1_shacl_shapes.ttl> ;

13. How can I validate my data against the mobilityDCAT-AP SHACL shape file?

First, one should differentiate between the following concepts:

  • Data Graph: refers to the file in which an RDF graph is represented; such a graph contains information about a certain topic, in our case data about mobility coming from the existing NAPs.

  • Shapes Graph: refers to the SHACL shapes file in which constraints in SHACL are expressed as RDF. In our case, this is the mobilityDCAT-AP SHACL shape file.

Once we have both files, we can use some existing tools online, such as SHACL-VALIDATOR, SHACL-PLAY or SHACL-PLAYGROUND. The last one is also available as a library accessible here.

It is enough to provide the data and shape graphs, and then one gets warning or error messages, depending on the severity of the validation of the constraints.

14. What should I do if my NAP is already using a Controlled Vocabulary that is different to the Vocabularies defined by mobilityDCAT-AP?

Some existing NAPs use native Controlled Vocabularies, due to legacy systems, e.g., based on the Coordinated Metadata Catalogue. These native Vocabularies might be different to the official Controlled Vocabularies of mobilityDCAT-AP. For example, there might be differently named values, differently clustered values, or additional values. An alignment of existing metadata entries to the official mobilityDCAT-AP Vocabularies might be challenging. There might also be plausible reasons why a NAP system wants to use a different Controlled Vocabulary.

As a solution, alternative (such as nation-specific) vocabularies are allowed as “variants” of the mobilityDCAT-AP vocabularies. For such cases, the NAP operators are asked to send a proposal of an alternative Vocabulary to the author team of mobilityDCAT-AP. The author team will do the following:

  • Check if the alternative Vocabulary is affiliated with one of the official Vocabularies.
  • If the check is positive:
      • The author team will produce an RDF-compliant representation of the alternative Vocabulary.
      • The author team will publish this alternative as a variant of an official Vocabulary (e.g., there might be a variant of https://w3id.org/mobilitydcat-ap/mobility-theme/1.0.0 like this: https://w3id.org/mobilitydcat-ap/mobility-theme/AT_Variant_1.0.0).
      • For each value under the variant, references will be made to the official vocabulary using the SKOS language (e.g., via properties “skos:semanticRelation” etc.).

This way, the alternative vocabulary will be hosted in the same way as any official vocabulary (under the mobilityDCAT-AP vocabulary repository), allowing RDF retrieval, parsing between vocabularies, etc.
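Such SKOS references allow a variant value to be resolved to an official value. The sketch below assumes that the variant uses SKOS mapping properties such as skos:exactMatch or skos:broadMatch (sub-properties of skos:semanticRelation); which properties the author team actually uses is not stated here, and all concept IRIs are hypothetical placeholders.

```python
# Hypothetical variant concepts with SKOS mapping links to official concepts.
VARIANT_LINKS = {
    "https://example.org/variant/parkplatz": {
        "skos:exactMatch": "https://example.org/official/parking",
    },
    "https://example.org/variant/park-and-ride": {
        "skos:broadMatch": "https://example.org/official/parking",
    },
}

# Prefer the closest semantic match when several links exist.
PREFERENCE = ("skos:exactMatch", "skos:closeMatch", "skos:broadMatch")


def to_official(variant_iri):
    """Return (official IRI, relation used), or (None, None) if unmapped."""
    links = VARIANT_LINKS.get(variant_iri, {})
    for relation in PREFERENCE:
        if relation in links:
            return links[relation], relation
    return None, None


print(to_official("https://example.org/variant/parkplatz"))
print(to_official("https://example.org/variant/park-and-ride"))
print(to_official("https://example.org/variant/unknown"))
```

Returning the relation alongside the official value lets a consuming NAP decide whether a broader match is precise enough for harmonised filtering.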

Alternatively, a different Controlled Vocabulary can also be hosted on the NAP system. (Remember that in the RDF world, it doesn’t matter where an RDF vocabulary is hosted, as long as the IRIs are persistent and referenceable.) Still, any alternative Controlled Vocabularies should be sent to the mobilityDCAT-AP author team for review.