Home - gtfierro/pundat GitHub Wiki

Pundat is an archiver intended for use with BOSSWAVE.

Pundat is short for "Punctuated Data Access Archiver".

Archiving Data

Pundat is a subscription-based archiver, in contrast to the push-based models used by Giles and sMAP. Instead of waiting for data to be pushed to it, Pundat uses Archive Requests to subscribe to the URIs on which data is published.

Archive Requests are special messages that tell the archiver which URIs to archive and how to archive them (how to interpret their data structs, where metadata comes from, etc.).

Currently, the most convenient way to work with Archive Requests is through the savepoint tool.

How to use Archive Requests

Accessing Data

There are two ways to interact with the archiver:

  • SQL-like query language (docs contain the structure of the returned messages)
  • API requests (supported by archiver library, not yet by BOSSWAVE interface)

The API and language allow for:

  • querying raw timeseries data
  • querying statistical summaries of data (via BtrDB)
  • querying metadata
  • finding timeseries streams using metadata

How to run queries

Punctuated Data Access

PunDat uses BOSSWAVE DOTs to determine access to historical timeseries data.

As of 60c84d50156, PunDat also masks access to metadata. First, the unaltered metadata query is run and its results are buffered locally. Then, for each document in the results, PunDat attempts to build a chain from the requesting VK to the URI of the associated document. If the chain exists, the document is returned; otherwise, the document is filtered out. Note that this check uses only existing access DOTs, not archival DOTs.

This approach is appropriate for most cases, but it isn't fully "correct" because it still allows inherited metadata to be queried and seen. Additionally, distinct queries do not currently have any DOT protection.
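
The metadata masking steps described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not PunDat's actual code: the `Document` type, the `ChainChecker` function type, and `FilterByChain` are hypothetical names introduced here, and the chain check is a toy lookup table standing in for real BOSSWAVE DOT chain building.

```go
package main

import "fmt"

// Document is a hypothetical stand-in for one metadata document
// returned by the unaltered query.
type Document struct {
	URI      string
	Metadata map[string]string
}

// ChainChecker is a hypothetical hook that reports whether a DOT chain
// can be built from the requesting VK to the given URI.
type ChainChecker func(vk, uri string) bool

// FilterByChain walks the locally buffered results and keeps only the
// documents for which a chain exists; all other documents are filtered.
func FilterByChain(vk string, buffered []Document, hasChain ChainChecker) []Document {
	permitted := make([]Document, 0, len(buffered))
	for _, doc := range buffered {
		if hasChain(vk, doc.URI) {
			permitted = append(permitted, doc)
		}
	}
	return permitted
}

func main() {
	// Results of the unaltered metadata query, buffered locally.
	buffered := []Document{
		{URI: "example/building/sensor1", Metadata: map[string]string{"Room": "410"}},
		{URI: "example/private/sensor2", Metadata: map[string]string{"Room": "420"}},
	}

	// Toy access table (assumption): a real check would resolve DOTs.
	granted := map[string]bool{"example/building/sensor1": true}
	hasChain := func(vk, uri string) bool {
		return vk == "vk-alice" && granted[uri]
	}

	for _, doc := range FilterByChain("vk-alice", buffered, hasChain) {
		fmt.Println(doc.URI) // prints "example/building/sensor1"
	}
}
```

Because the filter runs per document after the query has already executed, it only hides whole documents. That matches the limitation noted above: inherited metadata attached to a permitted document is still visible.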