Dataset Classification: Orgs Types Groups Tags - ckan/ckan GitHub Wiki
CKAN has an abundance of methods to classify datasets. This wiki page attempts to highlight the differences between each method. Maybe one day this page will grow into a proper part of the CKAN docs.
Organizations are for people who are publishing data. They control who can view, edit, create and publish datasets.
Organizations are one of the default facets in the dataset search pages. Also, the organization dataset search at `/organization/{name}` shows all dataset types for the given organization.
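As a rough illustration of the same organization filter outside the web UI, the sketch below builds a `package_search` URL for CKAN's Action API. The site URL and organization name are hypothetical, and the `organization` filter field is assumed to match the default search index, so check it against your CKAN version. The helper only builds the URL; it does not contact a site.

```python
from urllib.parse import urlencode


def org_search_url(site_url: str, org_name: str, rows: int = 10) -> str:
    """Build a package_search URL restricted to one organization."""
    params = {
        # Assumption: "organization" is the indexed field holding the org name.
        "fq": f"organization:{org_name}",
        "rows": rows,
    }
    base = site_url.rstrip("/")
    return f"{base}/api/3/action/package_search?{urlencode(params)}"


# Hypothetical site and organization, for illustration only.
url = org_search_url("https://demo.example.org/", "national-weather-office")
print(url)
```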
Dataset types are for when you want to have different types of datasets that have different schemas.
URLs are automatically added for searching different dataset types. `/dataset` is the default type and search page, showing only "dataset"-type datasets. A new dataset type, e.g. "application", would automatically be given the search page `/application`, showing only that type of dataset.
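To make the relationship between a dataset's type and its search page concrete, here is a minimal sketch of the payload one might send to the `package_create` action, together with the matching search path. The "application" type, dataset name and organization are hypothetical, and the type is assumed to have been registered by an extension already. Nothing here talks to a CKAN site.

```python
from typing import Optional


def new_dataset_payload(
    name: str,
    dataset_type: str = "dataset",
    owner_org: Optional[str] = None,
) -> dict:
    """Build a package_create payload for a dataset of the given type."""
    payload = {"name": name, "type": dataset_type}
    if owner_org is not None:
        # owner_org ties the dataset to an organization (0 or 1 per dataset).
        payload["owner_org"] = owner_org
    return payload


def search_page(dataset_type: str = "dataset") -> str:
    """Return the search page path CKAN adds for a dataset type."""
    return f"/{dataset_type}"


# Hypothetical names, for illustration only.
payload = new_dataset_payload("route-planner", "application", "transport-agency")
print(payload, search_page(payload["type"]))
```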
Groups are for people who are consuming datasets. Use groups when you want to collect datasets together under a theme, e.g. climate data. You don't want just anyone to be able to add datasets to your carefully curated climate group, so only users who are members of the group are allowed to add datasets to it or remove them from it.
Unlike with organizations, being a member of a group doesn't give you permission to create or edit the datasets in the group. Groups are about collecting existing datasets together; they're not about publishing datasets.
Groups are one of the default facets in the dataset search pages. Also, the group dataset search at `/group/{name}` shows all dataset types for the given group.
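Adding an existing dataset to a group can also be done through the Action API. The sketch below prepares (but does not send) a `member_create` request. The site URL, token, group and dataset names are hypothetical, and the `capacity` value of `"public"` is an assumption to verify against your CKAN version. The call would only succeed for a user who is allowed to edit the group.

```python
import json
from urllib.request import Request


def add_dataset_to_group_request(
    site_url: str, api_token: str, group: str, dataset: str
) -> Request:
    """Prepare a member_create request that adds a dataset to a group."""
    body = {
        "id": group,               # group id or name
        "object": dataset,         # dataset id or name
        "object_type": "package",  # datasets are "packages" in the API
        "capacity": "public",      # assumption: value used for dataset members
    }
    return Request(
        f"{site_url.rstrip('/')}/api/3/action/member_create",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json", "Authorization": api_token},
        method="POST",
    )


# Hypothetical values; pass the request to urllib.request.urlopen to send it.
req = add_dataset_to_group_request(
    "https://demo.example.org", "my-api-token", "climate", "sea-level-2020"
)
print(req.get_method(), req.full_url)
```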
Different group types are supported by writing a CKAN extension. This extension will need to provide a controller or some other way of actually creating non-default group types.
Tag vocabularies are just for when you want to add a custom field to the dataset schema, e.g. "Genre", and you want that field to be a drop-down list with a fixed set of possible values. There's an API for adding values to and removing values from the list.
Tag vocabularies must be added to facets by a CKAN extension. There is no default interface for showing only datasets with a tag from a tag vocabulary.
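The API for managing a vocabulary's values can be sketched as three Action API calls: `vocabulary_create`, `tag_create` and `tag_delete`. The helpers below only build `(action, payload)` pairs; the "Genre" vocabulary and its values are hypothetical, and the parameter names should be checked against the Action API reference for your CKAN version.

```python
from typing import Iterable, Tuple


def create_vocabulary(name: str, values: Iterable[str]) -> Tuple[str, dict]:
    """Payload for creating a vocabulary with its initial allowed values."""
    return "vocabulary_create", {
        "name": name,
        "tags": [{"name": value} for value in values],
    }


def add_value(vocabulary: str, value: str) -> Tuple[str, dict]:
    """Payload for adding one allowed value to an existing vocabulary."""
    return "tag_create", {"name": value, "vocabulary_id": vocabulary}


def remove_value(vocabulary: str, value: str) -> Tuple[str, dict]:
    """Payload for removing one allowed value from a vocabulary."""
    return "tag_delete", {"id": value, "vocabulary_id": vocabulary}


# Hypothetical vocabulary, for illustration only.
calls = [
    create_vocabulary("Genre", ["jazz", "rock"]),
    add_value("Genre", "folk"),
    remove_value("Genre", "rock"),
]
for action, payload in calls:
    print(f"/api/3/action/{action}", payload)
```

Each pair would be POSTed as JSON to `/api/3/action/{action}` by a user with sufficient permissions.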
Tags are completely free-form. Tags are one of the default facets in the dataset search pages.
method | effect | number/dataset | types/dataset | can change | extra info | custom plugin req'd |
---|---|---|---|---|---|---|
organization | dataset editing permissions | 0 or 1 | 1 | sysadmin only? | yes | no |
dataset type | dataset schema | 1 | 1 | sysadmin only? | no | yes |
group | none | 0+ | 1+* | group editors | yes | *yes for >1 type |
tag vocabulary | none | 0+ | 1+ | dataset editors | no | yes |
tag | none | 0+ | 1 | dataset editors | no | no |
- method: the dataset classification method chosen
- effect: effects of the options selected
- number/dataset: the number of options allowed per dataset
- types/dataset: the number of groups of options available
- can change: who may modify options for a dataset
- extra info: extra information may be stored with each option
- custom plugin req'd: custom CKAN extension code required