Open Data Principles - ODIQueensland/ocd-standards GitHub Wiki
Principles
We try to make conforming to the standard as easy as possible.
- As few required fields as possible.
- These are interoperability standards, intended only for transforming already-collected data.
- Match common field names as much as possible.
- Field names must be 10 characters or fewer, due to Shapefile attribute limitations.
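The 10-character limit on field names is easy to check automatically before publishing. The following is a minimal sketch, not part of the standard itself: the function name `overlong_field_names` and the sample header are hypothetical and chosen only for illustration.

```python
MAX_FIELD_NAME_LENGTH = 10  # limit stated in the principles above


def overlong_field_names(field_names, limit=MAX_FIELD_NAME_LENGTH):
    """Return the field names that are longer than `limit` characters."""
    return [name for name in field_names if len(name) > limit]


# Hypothetical header row from a data set about to be published.
header = ["name", "lat", "lon", "description", "last_updated"]

for name in overlong_field_names(header):
    print(f"Field name too long for Shapefile export: {name!r} ({len(name)} characters)")
```

Any name the check reports would need to be shortened (for example `description` to `desc`) before the data can be exported to a Shapefile without the name being truncated.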
General guidelines
- Don’t release a data set without ensuring you have selected a suitable license to publish it under
- Don’t create filtered or cubed data sets unless the original data is also published. Summarised data sets are often useful for only a single purpose and greatly limit what consumers can do with the data
- Avoid creating hierarchical data sets in a single table. Instead, create multiple data sets which cross reference each other
- Don’t assume the consumer will understand your coding system. Always include a reference if the field name isn’t enough to clearly determine the purpose of the data
- All date fields should be provided as YYYY-MM-DD
- Additional fields can (and should) be provided, but should be included after recommended fields.
- Numeric values should be provided as a single numeric value ("1.3"). Don't include a range ("1.2 - 1.4"), nor units ("1.3m").
- Spatial data should be presented as raw latitude/longitude (EPSG:4326), not eastings and northings (projected coordinates).
- Spatial data should be provided as GeoJSON and, if it is point data, also in CSV-geo-au format.
- Each field should contain a consistent type of data or be empty. For example, don’t have a field which is sometimes a string and sometimes a date
- Avoid creating lists of values in a single field
- Don’t be inconsistent in your treatment of similar data sets. Use the same field names and data types wherever possible
- Don’t include generic data which has little or no meaning when information is missing. For example, if you don’t have a description don’t put “No description” in the data set
- Unknown information should be indicated as:
  - CSV: empty value (two consecutive commas)
  - GeoJSON: no attribute (preferred), or ""
  - Shapefile: no attribute (preferred), or ""
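Several of the guidelines above (dates as YYYY-MM-DD, single numeric values without ranges or units, empty values instead of placeholder text) can be checked mechanically. The sketch below shows one possible way to do this with the Python standard library. It is an illustration only: the field names, the sample rows, and the `PLACEHOLDERS` list are assumptions and not part of the standard.

```python
import csv
import io
import re
from datetime import datetime

# Hypothetical placeholder strings that should be left empty instead.
PLACEHOLDERS = {"no description", "n/a", "unknown", "none"}


def is_iso_date(value):
    """True if value is a real calendar date written as YYYY-MM-DD."""
    if not re.fullmatch(r"\d{4}-\d{2}-\d{2}", value):
        return False
    try:
        datetime.strptime(value, "%Y-%m-%d")
    except ValueError:
        return False
    return True


def is_single_number(value):
    """True for one plain number such as "1.3"; False for ranges or units."""
    return re.fullmatch(r"-?\d+(\.\d+)?", value) is not None


def check_row(row, date_fields, numeric_fields):
    """Return a list of human-readable problems found in one row."""
    problems = []
    for field, value in row.items():
        if not value:
            continue  # an empty value is the correct way to say "unknown"
        if value.strip().lower() in PLACEHOLDERS:
            problems.append(f"{field}: placeholder {value!r}, leave empty instead")
        elif field in date_fields and not is_iso_date(value):
            problems.append(f"{field}: {value!r} is not a YYYY-MM-DD date")
        elif field in numeric_fields and not is_single_number(value):
            problems.append(f"{field}: {value!r} is not a single numeric value")
    return problems


# Hypothetical sample data: the first row follows the guidelines, the second does not.
sample = io.StringIO(
    "name,height,inspected,desc\n"
    "Oak,1.3,2015-03-02,\n"
    "Elm,1.2 - 1.4,02/03/2015,No description\n"
)

# Data rows start on line 2 of the file, after the header.
for line_number, row in enumerate(csv.DictReader(sample), start=2):
    for problem in check_row(row, date_fields={"inspected"}, numeric_fields={"height"}):
        print(f"line {line_number}: {problem}")
```

The first data row passes because its unknown description is simply left empty. The second row is reported three times: once for the range in `height`, once for the date format in `inspected`, and once for the placeholder text in `desc`.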
For more advice, including licensing, please see the Open Council Data Toolkit.
Participate
These standards are maintained in a GitHub repository that anyone can contribute to.
You can help create and refine these standards by:
- participating on the Open Council Data Google Group
- raising and commenting on issues on the GitHub issue tracker
- joining the fortnightly Open Council Data video Meetup (email Will McIntosh)
- coming to a weekly Open Knowledge Melbourne meetup
- uploading your data to data.gov.au (get started here)