time formats - acfr/comma GitHub Wiki

formats

comma uses boost::posix_time::ptime (in c++) and numpy.datetime64 (in python) to represent times.
Both represent time as a 64-bit integer count of microseconds since the epoch (19700101T000000.000000).

all comma applications read and write times in the following formats:

  • ascii: ISO-8601 format; YYYYMMDDThhmmss[.ffffff]
  • binary: 64-bit integer; number of microseconds since epoch
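The relationship between the two formats can be sketched in python using only the standard library. This is a minimal illustration, not comma's implementation: the helper names iso_to_microseconds and microseconds_to_iso are hypothetical, all times are treated as UTC, and special values are not handled.

```python
from datetime import datetime, timedelta

# Naive epoch; this sketch assumes every timestamp is UTC.
EPOCH = datetime(1970, 1, 1)


def iso_to_microseconds(text):
    """Parse YYYYMMDDThhmmss[.ffffff] into microseconds since epoch."""
    fmt = "%Y%m%dT%H%M%S.%f" if "." in text else "%Y%m%dT%H%M%S"
    delta = datetime.strptime(text, fmt) - EPOCH
    # Integer arithmetic only, to avoid floating point rounding.
    return (delta.days * 86400 + delta.seconds) * 1_000_000 + delta.microseconds


def microseconds_to_iso(us):
    """Format microseconds since epoch as YYYYMMDDThhmmss[.ffffff]."""
    t = EPOCH + timedelta(microseconds=us)
    base = t.strftime("%Y%m%dT%H%M%S")
    # Emit the fractional part only when it is non-zero (a choice of this sketch).
    return base if t.microsecond == 0 else f"{base}.{t.microsecond:06d}"


print(iso_to_microseconds("20160101T000000"))  # 1451606400000000
print(microseconds_to_iso(1_500_000))          # 19700101T000001.500000
```

For actual conversions on csv streams, csv-time remains the tool to use.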

use csv-time to convert ISO-8601 to another time format.

special values

  • infinite time
      • comma ascii: "+infinity"
      • comma binary: int64::max()
      • in boost: int64::max() - 1
      • in python: not implemented
  • negative infinite time
      • comma ascii: "-infinity"
      • comma binary: int64::min() + 1
      • in boost: int64::min()
      • in python: not implemented
  • invalid time
      • comma ascii: "not-a-date-time"
      • comma binary: int64::min()
      • in python: int64::min()
      • in boost: int64::max()

These binary representations were chosen so that not-a-date-time matches numpy.datetime64('NaT') in python.
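The comma binary encodings listed above can be written down as a small lookup table. The sketch below uses python's struct module; the names SPECIAL, pack_time and unpack_time are hypothetical, and little-endian byte order is an assumption made for illustration (this page does not state the byte order).

```python
import struct

INT64_MAX = 2**63 - 1
INT64_MIN = -(2**63)

# comma binary values for the special times, as listed on this page.
SPECIAL = {
    "+infinity": INT64_MAX,
    "-infinity": INT64_MIN + 1,
    "not-a-date-time": INT64_MIN,
}


def pack_time(value):
    """Pack a special-value name or an integer microsecond count as 8 bytes."""
    if isinstance(value, str):
        value = SPECIAL[value]  # KeyError for an unknown name
    # "<q" is a little-endian signed 64-bit integer (byte order is an assumption).
    return struct.pack("<q", value)


def unpack_time(buf):
    """Return the special-value name if the bytes encode one, else the integer."""
    (value,) = struct.unpack("<q", buf)
    for name, encoded in SPECIAL.items():
        if encoded == value:
            return name
    return value


print(unpack_time(pack_time("+infinity")))  # +infinity
print(unpack_time(pack_time(0)))            # 0
```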

ranges

ascii: 14000101T000000.000000 .. 99991231T235959.999999, -infinity, +infinity, not-a-date-time
binary: 19011213T204553.000000 .. 20380119T031408.999999, -infinity, +infinity, not-a-date-time

limitations

parsing times in c++ applications

When parsing an ISO time string, boost requires the year to be in the range 1400..9999. Empty or invalid time strings are parsed as not-a-date-time.
When reading from binary, boost internally casts to an int32 to represent seconds since epoch. Because of this, times after 20380119T031408 cannot be represented; they overflow instead.
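The arithmetic behind the int32 limit can be checked with a short python sketch. It only shows which whole-second counts fit in a signed 32-bit integer and does not reproduce boost's internals; fits_int32_seconds is a hypothetical helper. Plain int32 arithmetic lands within a second of the limits quoted on this page, and the quoted values remain the reference for comma's actual behaviour.

```python
from datetime import datetime, timedelta

EPOCH = datetime(1970, 1, 1)
INT32_MIN, INT32_MAX = -(2**31), 2**31 - 1


def fits_int32_seconds(us):
    """True if the whole seconds of a microsecond count fit in an int32."""
    # Floor division, so negative counts round towards -infinity.
    return INT32_MIN <= us // 1_000_000 <= INT32_MAX


# Dates corresponding to the int32 bounds on seconds since epoch.
earliest = EPOCH + timedelta(seconds=INT32_MIN)
latest = EPOCH + timedelta(seconds=INT32_MAX)

print(earliest.strftime("%Y%m%dT%H%M%S"))  # 19011213T204552
print(latest.strftime("%Y%m%dT%H%M%S"))    # 20380119T031407
print(fits_int32_seconds(2**31 * 1_000_000))  # False
```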

infinite times in python applications

python's numpy.datetime64 does not have a representation for -infinity or +infinity, so comma maps these to the earliest and latest times representable in a datetime64. python applications such as csv-eval can perform comparisons on these times but do not support arithmetic operations (e.g. time-shifting) on infinite times.
Use csv-time-delay for these operations.

examples

find times which are within a time interval:

$ echo 20160101T000000 | csv-select --fields=t "t;greater=-infinity" "t;less=20170101T000000"
20160101T000000

find time intervals which include a query time (use +/- infinity to specify an open interval):

$ echo 20160101T000000,+infinity | csv-select --fields=f,t "f;less=20161019T160000" "t;greater=20161019T160000"
20160101T000000,+infinity

shift a time interval by one day:

$ echo 20150101T000000,20150102T000000 | csv-time-delay --fields=from, 86400 | csv-time-delay --fields=,to 86400
20150102T000000,20150103T000000