Open Data APIs for Input Data - OpenData-tu/documentation GitHub Wiki
Open Data APIs
Version | Date | Modified by | Summary of changes |
---|---|---|---|
0.4 | 2017-07-05 | Nico Tasche | added validation schema |
0.3 | 2017-06-06 | Nico Tasche | updated to new dataformat version |
pre0.3 | 2017-06-06 | Nico Tasche | revised version of the example; the rest follows next week, after my seminar presentation |
0.2a | 2017-05-29 | Andres Ardila | Added comments/suggestions + minor formatting & spelling fixes to original text |
0.2 | 2017-05-24 | Nico Tasche | started describing datamodel for worker service output 0.2 |
0.1 | 2017-05-16 | Nico Tasche | initial version |
1. Preface
This document describes the data format produced by the importers.
2. Data Model
Our data model is a shallow hierarchy of three tiers; deeper trees are not allowed.
Tier 1:
Tier 1 fields are necessary for each entry in the database.
Field | Value |
---|---|
source_id | name of the data provider, as provided during the setup process |
device | name of the device, for example the name of a weather station |
timestamp | time when the data was saved in this database |
timestamp_record | timestamp from the data source stating when the measurement was taken |
location | location as an object, with lat and lon explicitly named |
sensors | object containing all sensors within that device |
extras | object containing all information the source deems necessary for one entry |
height (optional) | height in meters above sea level |
license | license is mandatory, even when it is empty or unknown |
Tier 2:
Tier 2 fields contain a collection of measurements; each one is optional.
Field | Value |
---|---|
temperature | main temperature object |
temperature_1 | second temperature object |
humidity | main humidity object |
Tier 3:
Tier 3 fields contain the actual data.
Sensors
Field | Value |
---|---|
sensor | sensors may be physical devices, computational methods, or a laboratory setup |
observation_value | the observed value |
optional types | any kind of optional type, such as sensor quality. Optional types must be consistent across all data points of one data source, as defined in the metadata database |
extras
May contain additional information; see the sample.
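The three tiers can be sketched in Python as nested dictionaries. This is only an illustration: the helper names `make_sensor` and `make_record` are hypothetical and not part of any importer, and the assignment of `timestamp` (time of saving) and `timestamp_record` (time of measurement) follows the Tier 1 table above.

```python
from datetime import datetime, timezone


def make_sensor(sensor, observation_value, **optional):
    """Tier 3: one measurement object; `optional` carries source-specific types."""
    return {"sensor": sensor, "observation_value": observation_value, **optional}


def make_record(source_id, device, timestamp_record, lat, lon, sensors,
                license="", extras=None, height=None):
    """Tier 1: one database entry; `sensors` is the Tier 2 mapping."""
    record = {
        "source_id": source_id,
        "device": device,
        # Assumption: the importer stamps the time of saving itself, in UTC.
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "timestamp_record": timestamp_record,
        "location": {"lat": lat, "lon": lon},
        "sensors": sensors,
        "extras": extras or {},
        # Mandatory even when empty or unknown.
        "license": license,
    }
    if height is not None:  # height is the only optional Tier 1 field
        record["height"] = height
    return record


record = make_record(
    "luftdaten_info", "141", "2017-06-06T00:02:10Z", 48.779, 9.160,
    sensors={
        "temperature": make_sensor("BME280", 17.62),
        "humidity": make_sensor("BME280", 76.34),
    },
    extras={"location": "65"},
)
```

Keeping Tier 2 as a mapping from measurement name to Tier 3 object means a second temperature reading is simply another key such as `temperature_1`.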
Schema
{
  "$schema": "http://json-schema.org/schema#",
  "title": "Data Source",
  "description": "A Data Source for Open Sensor Data from the CP project at TU Berlin.",
  "type": "object",
  "properties": {
    "source_id": {
      "type": "string"
    },
    "device": {
      "type": "string"
    },
    "timestamp": {
      "type": "string",
      "format": "date-time"
    },
    "timestamp_record": {
      "type": "string",
      "format": "date-time"
    },
    "location": {
      "type": "object",
      "properties": {
        "lat": {
          "type": "number",
          "exclusiveMaximum": true,
          "exclusiveMinimum": true,
          "maximum": 90,
          "minimum": -90
        },
        "lon": {
          "type": "number",
          "exclusiveMaximum": true,
          "exclusiveMinimum": true,
          "maximum": 180,
          "minimum": -180
        }
      },
      "required": ["lat", "lon"]
    },
    "license": {
      "type": "string"
    },
    "sensors": {
      "type": "object",
      "additionalProperties": {
        "type": "object",
        "properties": {
          "sensor": {
            "type": "string"
          },
          "observation_value": {
            "type": "number"
          }
        }
      }
    }
  },
  "required": ["timestamp", "timestamp_record", "sensors", "location", "license"]
}
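To illustrate what the schema enforces, here is a minimal standard-library sketch that checks only the required fields and the exclusive lat/lon bounds. The function name `check_record` is hypothetical, and the sketch deliberately ignores the `date-time` format and the sensor objects; in practice a full JSON Schema validator would normally be used instead.

```python
# Required top-level fields, copied from the schema's "required" list.
REQUIRED = ["timestamp", "timestamp_record", "sensors", "location", "license"]


def check_record(record):
    """Return a list of problems; an empty list means these checks passed."""
    problems = [f"missing field: {name}" for name in REQUIRED if name not in record]
    if "location" in record:
        location = record["location"]
        if not isinstance(location, dict):
            problems.append("location must be an object")
        else:
            for key, limit in (("lat", 90), ("lon", 180)):
                value = location.get(key)
                # bool is a subclass of int in Python, so exclude it explicitly.
                if isinstance(value, bool) or not isinstance(value, (int, float)):
                    problems.append(f"location.{key} must be a number")
                # Bounds are exclusive, as in the schema.
                elif not -limit < value < limit:
                    problems.append(f"location.{key} out of range")
    return problems


ok = {
    "timestamp": "2017-06-06T00:02:10Z",
    "timestamp_record": "2017-06-06T00:02:10Z",
    "location": {"lat": 48.779, "lon": 9.160},
    "license": "",
    "sensors": {"temperature": {"sensor": "BME280", "observation_value": 17.62}},
}
print(check_record(ok))                                     # no problems expected
print(check_record({"location": {"lat": 90, "lon": 9.16}}))  # several problems
```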
Sample with luftdaten.info
{
  "source_id": "luftdaten_info",
  "device": "141",
  "timestamp": "2017-06-06T00:02:10",
  "timestamp_record": "this is gonna be the timestamp, when the data is included into the database",
  "location": {
    "lat": 48.779,
    "lon": 9.160
  },
  "license": "gotta find out",
  "sensors": {
    "temperature": {
      "sensor": "BME280",
      "observation_value": 17.62
    },
    "humidity": {
      "sensor": "BME280",
      "observation_value": 76.34
    }
  },
  "extras": {
    "location": "65"
  }
}
TODO:
Open questions:
* Do we allow more than one sensor with the same kind of data per data point?
* Lat/lon are usually (in GeoJSON and many APIs) represented as a pair of floats or doubles in a Cartesian plane, e.g. `location: [13.3242, 52.2345]`. Note that this is in reverse order: it represents the point (x, y) on the surface of the earth, i.e. lon/lat and not lat/lon. This could potentially require less transformation/integration, since this format is already commonly used. (AA)
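For reference, converting between the current `location` object and the GeoJSON-style pair discussed in the open question is a small transformation. The following is a sketch only; the function names are hypothetical.

```python
def to_geojson_position(location):
    """Convert {"lat": ..., "lon": ...} to a GeoJSON-style [lon, lat] pair."""
    return [location["lon"], location["lat"]]


def from_geojson_position(position):
    """Convert a [lon, lat] pair back to the explicit lat/lon object."""
    lon, lat = position[0], position[1]
    return {"lat": lat, "lon": lon}


print(to_geojson_position({"lat": 52.2345, "lon": 13.3242}))
```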
3. Units
General
Every dataset must convert its units to the predefined units. Where possible, SI units are used.
Time
The below can be written more concisely: time must be in an ISO 8601 compliant datetime format (preferably one that includes time zone information). (AA)
Internally, time is stored in UTC as milliseconds since the epoch. To save a time, use the following format:
`YYYY-MM-DDThh:mm:ss.fffZ`
If the date alone is sufficient, `hh:mm:ss.fff` does not need to be supplied.
If necessary, `YYYY-MM-DDThh:mm:ss.fff+02:00` can be used to supply a time from a different time zone. This example saves the time in CEST (MESZ, Central European Summer Time). It is converted to UTC internally.
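The conversion to the internal representation can be sketched with Python's standard library as follows. The function name `to_epoch_millis` is hypothetical, and treating a value without any offset as UTC is an assumption made for this example, not something the format above specifies.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def to_epoch_millis(value):
    """Parse an ISO 8601 string and return UTC milliseconds since the epoch."""
    # datetime.fromisoformat() only accepts a trailing "Z" from Python 3.11 on,
    # so normalize it to an explicit offset first.
    if value.endswith("Z"):
        value = value[:-1] + "+00:00"
    parsed = datetime.fromisoformat(value)
    if parsed.tzinfo is None:
        # Assumption: values without an offset are already in UTC.
        parsed = parsed.replace(tzinfo=timezone.utc)
    # Integer division of timedeltas avoids floating-point rounding.
    return (parsed - EPOCH) // timedelta(milliseconds=1)


print(to_epoch_millis("2017-06-06T00:02:10.000Z"))
print(to_epoch_millis("2017-06-06T02:02:10.000+02:00"))  # same instant, given in CEST
print(to_epoch_millis("2017-06-06"))                     # date only: midnight
```

Both of the first two calls denote the same instant, so they yield the same number of milliseconds.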