Open Data APIs for Input Data - OpenData-tu/documentation GitHub Wiki

Open Data APIs

| Version | Date | Modified by | Summary of changes |
| --- | --- | --- | --- |
| 0.4 | 2017-07-05 | Nico Tasche | Added validation schema |
| 0.3 | 2017-06-06 | Nico Tasche | Updated to new data format version |
| pre0.3 | 2017-06-06 | Nico Tasche | Revised version of the example; the rest follows next week, after my seminar presentation |
| 0.2a | 2017-05-29 | Andres Ardila | Added comments/suggestions + minor formatting & spelling fixes to original text |
| 0.2 | 2017-05-24 | Nico Tasche | Started describing data model for worker service output 0.2 |
| 0.1 | 2017-05-16 | Nico Tasche | Initial version |

1. Preface

This document describes the data format produced by the importers.

2. Data Model

Our data model is a shallow hierarchy of three tiers; deeper nesting is not allowed.

Tier 1:

Tier 1 fields are required for every entry in the database, except where marked optional.

| Field | Value |
| --- | --- |
| source_id | Name of the data provider, as supplied during the setup process |
| device | Name of the device, for example the name of a weather station |
| timestamp | Time at which the data was saved in this database |
| timestamp_record | Timestamp from the data source stating when the measurement was taken |
| location | Location as an object with lat and lon explicitly named |
| sensors | Object containing all sensors within that device |
| extras | Object containing any information the source deems necessary for an entry |
| height | (Optional) Height in meters above sea level |
| license | License; the field is mandatory, even when it is empty or unknown |

Tier 2:

Tier 2 fields form a collection of measurements; each one is optional.

| Field | Value |
| --- | --- |
| temperature | Main temperature object |
| temperature_1 | Second temperature object |
| humidity | Main humidity object |

Tier 3:

Tier 3 fields contain the actual data

Sensors

| Field | Value |
| --- | --- |
| sensor | Name of the sensor. Sensors may be physical devices, computational methods, or a laboratory setup |
| observation_value | The observed value |
| optional types | Any kind of optional type, such as sensor quality. Optional types must be consistent across all data points of one data source, as defined in the metadata database |

extras

The extras object may contain additional, source-specific information; see the sample further down.
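To make the three tiers concrete, the sketch below assembles one entry as a nested Python dictionary. The helper names `make_reading` and `make_entry` are hypothetical and not part of the project; the field names follow the tables above, and the values are borrowed from the Luftdaten.info sample. Taking `timestamp` from the local clock is an assumption made purely for illustration.

```python
import json
from datetime import datetime, timezone


def make_reading(sensor, observation_value, **optional):
    """Tier 3: one measurement (sensor name, value, any optional types)."""
    reading = {"sensor": sensor, "observation_value": observation_value}
    reading.update(optional)
    return reading


def make_entry(source_id, device, timestamp_record, lat, lon, sensors,
               license="", extras=None, height=None):
    """Tier 1: one database entry; `sensors` maps Tier 2 names to readings."""
    entry = {
        "source_id": source_id,
        "device": device,
        # Assumption: the save time is read from the clock at build time.
        "timestamp": datetime.now(timezone.utc).isoformat(timespec="milliseconds"),
        "timestamp_record": timestamp_record,
        "location": {"lat": lat, "lon": lon},
        "sensors": sensors,
        "extras": extras or {},
        "license": license,  # mandatory, even when empty or unknown
    }
    if height is not None:  # height is the only optional Tier 1 field
        entry["height"] = height
    return entry


entry = make_entry(
    source_id="luftdaten_info",
    device="141",
    timestamp_record="2017-06-06T00:02:10",
    lat=48.779,
    lon=9.160,
    sensors={
        "temperature": make_reading("BME280", 17.62),
        "humidity": make_reading("BME280", 76.34),
    },
    extras={"location": "65"},
)
print(json.dumps(entry, indent=2))
```

Keeping the Tier 2 names as dictionary keys means a second temperature sensor is added as `temperature_1` rather than as a deeper level of nesting.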

Schema

{
  "$schema": "http://json-schema.org/draft-04/schema#",
  "title": "Data Source",
  "description": "A Data Source for Open Sensor Data from the CP project at TU Berlin. ",
  "type": "object",
  "properties": {
    "source_id": {
      "type": "string"
    },
    "device": {
      "type": "string"
    },
    "timestamp": {
      "type": "string",
      "format": "date-time"
    },
    "timestamp_record": {
      "type": "string",
      "format": "date-time"
    },
    "location": {
      "type": "object",
      "properties": {
        "lat": {
          "type": "number",
          "exclusiveMaximum": true,
          "exclusiveMinimum": true,
          "maximum": 90,
          "minimum": -90
        },
        "lon": {
          "type": "number",
          "exclusiveMaximum": true,
          "exclusiveMinimum": true,
          "maximum": 180,
          "minimum": -180
        }
      },
      "required": ["lat", "lon"]
    },
    "license": {
      "type": "string"
    },
    "sensors": {
      "type": "object",
      "additionalProperties": {
        "type": "object",
        "properties": {
          "sensor": {
            "type": "string"
          },
          "observation_value": {
            "type": "number"
          }
        }
      }
    }
  },
  "required": ["timestamp", "timestamp_record","sensors", "location", "license"]
}
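
As a rough illustration of what the schema above is meant to enforce, the following sketch hand-checks a subset of its rules in plain Python: the required fields, the exclusive lat/lon bounds, and the types inside each sensor object. It is not a JSON Schema validator and not the project's implementation; `check_entry` and `is_number` are hypothetical names, and the sample timestamps are illustrative.

```python
REQUIRED = ("timestamp", "timestamp_record", "sensors", "location", "license")


def is_number(value):
    # JSON numbers become int or float in Python; bool is excluded on purpose.
    return isinstance(value, (int, float)) and not isinstance(value, bool)


def check_entry(entry):
    """Return a list of problems; an empty list means every check passed."""
    problems = ["missing required field: " + name
                for name in REQUIRED if name not in entry]

    if "location" in entry:
        location = entry["location"]
        if not isinstance(location, dict):
            problems.append("location must be an object")
        else:
            for key, limit in (("lat", 90), ("lon", 180)):
                value = location.get(key)
                if not is_number(value):
                    problems.append("location." + key + " must be a number")
                elif not -limit < value < limit:  # both bounds are exclusive
                    problems.append("location." + key + " out of range")

    if "sensors" in entry:
        sensors = entry["sensors"]
        if not isinstance(sensors, dict):
            problems.append("sensors must be an object")
        else:
            for name, reading in sensors.items():
                if not isinstance(reading, dict):
                    problems.append("sensors." + name + " must be an object")
                    continue
                if not isinstance(reading.get("sensor", ""), str):
                    problems.append("sensors." + name + ".sensor must be a string")
                if not is_number(reading.get("observation_value", 0)):
                    problems.append(
                        "sensors." + name + ".observation_value must be a number")
    return problems


entry = {
    "timestamp": "2017-06-06T00:02:10Z",
    "timestamp_record": "2017-06-06T00:02:10Z",
    "location": {"lat": 48.779, "lon": 9.160},
    "license": "",
    "sensors": {"temperature": {"sensor": "BME280", "observation_value": 17.62}},
}
print(check_entry(entry))  # prints []
```

Returning a list of problems instead of raising on the first one lets an importer log everything that is wrong with a record in a single pass.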
Sample with Luftdaten.info
{
    "source_id": "luftdaten_info",
    "device": "141",
    "timestamp": "2017-06-06T00:02:10",
    "timestamp_record": "this is gonna be the timestamp, when the data is included into the database",
    "location":{
        "lat": 48.779,
        "lon": 9.160
    },
    "license": "gotta find out",
    "sensors": {
        "temperature": {
            "sensor": "BME280",
            "observation_value": 17.62
        },
        "humidity": {
            "sensor": "BME280",
            "observation_value": 76.34
        }
    },
    "extras": {
        "location": "65"
    }
}
TODO:

Open questions:
* Do we allow more than one sensor with the same kind of data per data point?
* Lat/lon are usually (in GeoJSON and many APIs) represented as a pair of floats or doubles in a Cartesian plane, e.g. `location: [13.3242, 52.2345]`. Note that this is in reverse order: the pair is [lon, lat], i.e. the point (x, y) on the surface of the earth, not lat/lon. Adopting it could require less transformation/integration, since this format is already commonly used. (AA)
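
Should both representations need to coexist, converting between them is small. The sketch below assumes the `{"lat": ..., "lon": ...}` object from the data model on one side and a GeoJSON-style `[lon, lat]` position on the other; both function names are hypothetical.

```python
def to_geojson_position(location):
    """{"lat": ..., "lon": ...} -> [lon, lat] (GeoJSON order: x first, then y)."""
    return [location["lon"], location["lat"]]


def from_geojson_position(position):
    """[lon, lat, ...] -> {"lat": ..., "lon": ...} as used in this data model."""
    # Indexing instead of unpacking tolerates an optional third (altitude) element.
    return {"lat": position[1], "lon": position[0]}


print(to_geojson_position({"lat": 52.2345, "lon": 13.3242}))  # [13.3242, 52.2345]
```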

3. Units

General

Every dataset must have its units converted to the predefined units. SI units are used wherever possible.
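
The list of predefined units is not given in this document, so the following is only a sketch of how an importer might normalise values before storing them. The table `CONVERTERS`, the unit labels, and the choice of degrees Celsius as the target unit for temperature are all assumptions made for illustration.

```python
# Hypothetical table: (measurement, unit reported by the source) -> function
# converting to the assumed target unit (degrees Celsius for temperature).
CONVERTERS = {
    ("temperature", "celsius"): lambda v: v,
    ("temperature", "kelvin"): lambda v: v - 273.15,
    ("temperature", "fahrenheit"): lambda v: (v - 32.0) * 5.0 / 9.0,
}


def to_predefined_unit(measurement, source_unit, value):
    """Convert one raw value; unknown combinations raise instead of passing through."""
    try:
        convert = CONVERTERS[(measurement, source_unit)]
    except KeyError:
        raise ValueError(
            "no conversion for " + measurement + " in " + source_unit) from None
    return convert(value)


print(to_predefined_unit("temperature", "fahrenheit", 212.0))  # 100.0
```

Raising on an unknown unit, rather than storing the raw value, keeps unconverted data out of the database.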

Time

The below can be written more concisely to simply say that time must be an ISO 8601 compliant datetime format (preferably one with time zone information included) (AA)

Internally, time is stored in UTC as milliseconds since the epoch. To save a time, use the following format:

YYYY-MM-DDThh:mm:ss.fffZ

If date precision is sufficient, the time part (hh:mm:ss.fff) may be omitted.

If necessary, YYYY-MM-DDThh:mm:ss.fff+02:00 can be used to supply a time from a different time zone. This example saves the time in CEST (Central European Summer Time, MESZ). It is converted to UTC internally.
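
The conversion described above can be sketched with Python's standard library. The function name `to_epoch_millis` is hypothetical, and treating values without an offset (including date-only values) as UTC is an assumption, since the text does not say how they are interpreted.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def to_epoch_millis(text):
    """Parse an ISO 8601 string and return UTC milliseconds since the epoch."""
    # datetime.fromisoformat accepts a trailing "Z" only from Python 3.11 on,
    # so it is rewritten to an explicit offset first.
    if text.endswith("Z"):
        text = text[:-1] + "+00:00"
    parsed = datetime.fromisoformat(text)
    if parsed.tzinfo is None:
        # Assumption: values without an offset (including date-only) are UTC.
        parsed = parsed.replace(tzinfo=timezone.utc)
    # Integer division of timedeltas avoids floating-point rounding.
    return (parsed - EPOCH) // timedelta(milliseconds=1)


print(to_epoch_millis("2017-06-06T00:02:10.000Z"))       # 1496707330000
print(to_epoch_millis("2017-06-06T02:02:10.000+02:00"))  # 1496707330000
```

Both calls denote the same instant, once in UTC and once in CEST, so they yield the same stored value.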