Outbreak Time Series Specification Overview - JohnTigue/idots GitHub Wiki

This is the home page for information about The Outbreak Time Series Specification. The primer is a good place to start learning. Clearly, this is a work in progress...

Introduction

The data model described by The Outbreak Time Series Specification (the Spec) is intentionally simple and specific. The Spec quantifies only the population-level information required to represent infectious disease outbreaks on the World Wide Web as first-class objects. The Spec includes a specific web-native serialization for transferring the information over the web and file systems. The information is sufficient for the exchange of outbreak information between governments, NGOs, researchers, etc. The information can also drive spatio-temporal visualizations of epidemics.

Personally Identifiable Information (PII) is explicitly out of scope. The Spec is not intended to address things like line lists nor contact tracing; doing so would involve a much greater level of complexity, technically and politically.

The core data serialization is based upon the W3C's CSV on the Web (CSVW). CSVW fits the definition of "web-native" as used here. CSVW serializations are mainly CSV files plus a bit of JSON for metadata and schema information. The CSVW Working Group defined algorithmic ways of translating CSVW JSON+CSV information into pure JSON and XML (RDF), so systems which need such formats can also consume Spec-compliant documents.
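
As a rough illustration of what "CSV plus a bit of JSON" looks like, the sketch below pairs a small CSV document with a CSVW-style metadata object and turns the rows into plain JSON objects. The column names (`date`, `location`, `cases`, `deaths`), the file name `outbreak.csv`, the sample values, and the `rowsToJson` helper are all hypothetical and chosen only for illustration; they are not taken from the Spec. The parsing is deliberately simplified (no quoted fields) and is not the full CSVW conversion algorithm.

```javascript
// Hypothetical CSVW metadata describing an outbreak time series table.
// "@context", "url", "tableSchema", "columns", "name", "titles" and "datatype"
// are CSVW terms; the column names themselves are assumptions, not the Spec's.
const metadata = {
  "@context": "http://www.w3.org/ns/csvw",
  url: "outbreak.csv",
  tableSchema: {
    columns: [
      { name: "date", titles: "Date", datatype: "date" },
      { name: "location", titles: "Location", datatype: "string" },
      { name: "cases", titles: "Cases", datatype: "integer" },
      { name: "deaths", titles: "Deaths", datatype: "integer" },
    ],
  },
};

// Made-up sample rows, standing in for the contents of outbreak.csv.
const csvText = [
  "Date,Location,Cases,Deaths",
  "2014-09-01,Exampleland,10,4",
  "2014-09-08,Exampleland,17,6",
].join("\n");

// Simplified conversion: one JSON object per data row, keyed by column name.
// Assumes a single header row and no quoted or escaped fields.
function rowsToJson(csv, meta) {
  const columns = meta.tableSchema.columns;
  const [, ...lines] = csv.trim().split(/\r?\n/); // drop the header row
  return lines.map((line) => {
    const cells = line.split(",");
    const record = {};
    columns.forEach((col, i) => {
      const raw = cells[i];
      // Only integers are converted here; other datatypes stay as strings.
      record[col.name] =
        col.datatype === "integer" ? Number.parseInt(raw, 10) : raw;
    });
    return record;
  });
}

const records = rowsToJson(csvText, metadata);
console.log(JSON.stringify(records, null, 2));
```

Keeping the schema in a separate JSON object means the CSV itself stays a plain, spreadsheet-friendly file, while consumers that want typed JSON can derive it from the two pieces together.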

Designed to enable easy adoption

The Spec has been designed to be intentionally simple in order to enable small, quick wins while staying out of any privacy concern quagmires (as much as possible). Yet just this small bit of structure might have a big payoff quickly.

The Spec defines a novel, specific way of structuring outbreak data. Nonetheless, the Spec has also been intentionally designed to work as well as possible with existing CSVs already deployed on the web. CSVW can be used to "up translate" existing CSVs on the Web; see the W3C's CSV on the Web: A Primer:

CSV on the Web is designed to enable you to reuse the same schema when publishing multiple CSV files, even if those files are created by different organisations and therefore reside in different places... Experience shows that publishers of data in CSV files often use their own headings for the columns... The titles property allows you to provide multiple alternative titles that people may use in an array.

In this way it is hoped that existing agencies will be able to "adopt" the Spec with great ease in the short term, while in the long term more and more agencies will intentionally publish outbreak data that is fully conformant with the Spec. This is a design strategy to improve the diffusion of innovation.
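
The `titles` mechanism quoted above can be sketched as a header-normalization step. This is a minimal sketch, not part of the Spec: the column names, the alternative titles, and the `normalizeHeader` helper are hypothetical, and title matching is simplified to a case-insensitive comparison.

```javascript
// Hypothetical shared schema: each column lists the alternative titles
// that different publishers might use for it in their CSV header rows.
const schema = {
  columns: [
    { name: "date", titles: ["Date", "Report date"] },
    { name: "location", titles: ["Location", "Country"] },
    { name: "cases", titles: ["Cases", "Total cases"] },
  ],
};

// Map each heading in a publisher's header row to the schema's column name.
// Headings that match no known title map to null.
function normalizeHeader(header, tableSchema) {
  return header.map((title) => {
    const wanted = title.trim().toLowerCase();
    const match = tableSchema.columns.find((col) =>
      col.titles.some((t) => t.toLowerCase() === wanted)
    );
    return match ? match.name : null;
  });
}

// Two made-up agencies using different headings for the same columns.
const agencyA = normalizeHeader(["Date", "Country", "Total cases"], schema);
const agencyB = normalizeHeader(["Report date", "Location", "Cases"], schema);
console.log(agencyA, agencyB);
```

Both header rows resolve to the same column names, so one schema can describe CSV files that were published independently with different headings.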

Documents

Relevant documents include the following.

  • Outbreak Time Series Specification Primer: an informal introduction to the Specification. The tutorial is non-normative but it is very informative. This is the place to go to quickly wrap your head around the Spec and the type of thing the Spec enables.
  • Outbreak Time Series Specification: the core document. The Spec is a tightly worded technical document and is therefore intentionally terse. It is not the place to start learning; it is the place where the buck stops.
  • Outbreak Time Series Specification Requirements: the reqs are not as formally written as the Specification but they enumerate features of what a solution to the problem at hand should encompass.

Code

Omolumeter is a client web app written in JavaScript which can read and visualize Outbreak Time Series Specification compliant documents.

Implementations conformant to this specification may make outbreak time series information available in many ways. Examples include: a single CSV file on a USB stick, XML attachments to blog posts, XML markup namespaced into an Atom feed (which is actually how a global outbreak monitoring network should be implemented, but more on that later), or a set of HTTP URLs which can be used to fetch the information as JSON. For a recommendation on how to handle the last case ("a web service API"), see Outbreak Time Series Web Service API Design Recommendations.

outbreak_time_series_reader is a JavaScript module which reads all three serializations defined in the Outbreak Time Series Specification. The module runs on Node.js-based servers and in web browsers (via browserify), and is compatible with various popular frameworks, for example AngularJS. See outbreak_time_series_reader module for details.

Data

To code against a realization of the web service API recommendations, there is a GitHub repository which maintains an example set of HTTP URLs providing GETable data on the 2014 Ebola Outbreak in West Africa; see Ebola 2014 Outbreak in West Africa data. Also maintained is a set of CSV files which provide the same information and which are conformant to this Spec.

The Omolumeter client web app is focused on reading data according to the Spec, but it also reads existing well-known data feeds. For example, the Omolumeter can read the situation reports from the WHO, e.g. WHO's Ebola data (CSV & JSON).