Orcasound S3 HLS archives - orcasound/orcadata GitHub Wiki

Orcasound HLS Archives

Around 2-3 years of recordings from Orcasound hydrophones are currently archived on AWS S3 in HLS format, organized as described here. The timestamp of the earliest data for each node is:

  • Oct 10, 2018, 14:36:14 (UTC-07:00) for Bush Point (Whidbey Island), aka bush_point
  • Oct 31, 2018, 16:17:19 (UTC-07:00) for Orcasound Lab (Haro Strait), aka orcasound_lab
  • Sep 11, 2019, 13:11:21 (UTC-07:00) for Port Townsend (Marine Science Center), aka port_townsend

NOTE: S3 is blob (object) storage, not a true filesystem. The namespace below the bucket level is therefore flat, unlike a filesystem, where hierarchical subfolders can be listed and accessed efficiently.
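
To make the note above concrete, here is a minimal Python sketch of how "folders" can be emulated over a flat list of object keys by filtering on a leading string and grouping on a delimiter. This is roughly what an object store does when a listing request carries a prefix and a delimiter. The helper name list_level and the sample key list are hypothetical, and the "/"-joined key shape is an assumption modelled on the node names used on this page.

```python
def list_level(keys, prefix="", delimiter="/"):
    """Emulate one 'directory level' over a flat list of object keys.

    Returns (objects, common_prefixes): the keys sitting directly under
    `prefix`, and the distinct 'subfolder' prefixes one delimiter deeper.
    """
    objects, common_prefixes = [], set()
    for key in keys:
        if not key.startswith(prefix):
            continue  # filtering is a plain leading-string match
        rest = key[len(prefix):]
        head, sep, _ = rest.partition(delimiter)
        if sep:
            # The key lives deeper down: report only the rolled-up prefix.
            common_prefixes.add(prefix + head + delimiter)
        else:
            objects.append(key)
    return objects, sorted(common_prefixes)


# Hypothetical flat key list, for illustration only.
keys = [
    "rpi_orcasound_lab/latest.txt",
    "rpi_orcasound_lab/hls/1541027406/live.m3u8",
    "rpi_orcasound_lab/hls/1541027406/live000.ts",
    "rpi_orcasound_lab/hls/1541061134/live.m3u8",
    "rpi_bush_point/latest.txt",
]

objects, folders = list_level(keys, prefix="rpi_orcasound_lab/")
print(objects)  # keys directly under the node prefix
print(folders)  # rolled-up "subfolder" prefixes
```

Nothing in the key list is nested. The "subfolders" exist only because the listing groups keys that share a leading string.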

Prefix Hierarchy

S3 still lets users list and filter objects by "prefix", i.e. search for and return only the paths that start with X. The prefix hierarchy currently in use is:

S3 bucket: streaming-orcasound-net

Prefix hierarchy:
    rpi_bush_point
    rpi_port_townsend
    rpi_orcasound_lab
        latest.txt
        hls
            1541061134
            1541027406
                live.m3u8
                live000.ts
                live001.ts
                ...
                live296/7.ts
            ...

Below is a screenshot of what that looks like with an admin view:

Screenshot of the Orcasound S3 page showing the prefix hierarchy

  • Each hydrophone node, e.g. rpi_orcasound_lab, has multiple "subfolder" prefixes (1541061134, etc.), each named with a Unix timestamp, i.e. seconds since 1970-01-01 UTC (1541061134 is 2018-11-01 08:32:14 UTC).
  • Each "subfolder" prefix holds a 3-hour recording in HLS format (see below).
  • The latest.txt file at the root of each node's prefix contains the "subfolder" prefix that is currently streaming in real time.
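
The conventions in the list above can be sketched in Python as follows. This is an illustration under stated assumptions, not the project's own tooling: it assumes the "subfolder" prefixes are Unix epoch seconds, that object keys are joined with "/" as <node>/hls/<timestamp>/<file>, that latest.txt holds only the timestamp string, and that the caller passes in a boto3-style S3 client. All helper names are hypothetical.

```python
from datetime import datetime, timezone

BUCKET = "streaming-orcasound-net"


def prefix_to_utc(timestamp_prefix):
    """Interpret a prefix such as '1541061134' as epoch seconds (assumed)."""
    return datetime.fromtimestamp(int(timestamp_prefix), tz=timezone.utc)


def playlist_key(node, timestamp_prefix):
    """Build the assumed object key of the HLS playlist for one recording."""
    return f"{node}/hls/{timestamp_prefix}/live.m3u8"


def list_recordings(s3_client, node, bucket=BUCKET):
    """Return the timestamp prefixes found under <node>/hls/, oldest first.

    `s3_client` is any object exposing boto3's
    get_paginator("list_objects_v2") interface.
    """
    base = f"{node}/hls/"
    paginator = s3_client.get_paginator("list_objects_v2")
    found = []
    for page in paginator.paginate(Bucket=bucket, Prefix=base, Delimiter="/"):
        for entry in page.get("CommonPrefixes", []):
            # entry["Prefix"] looks like "<node>/hls/1541061134/"
            found.append(entry["Prefix"][len(base):].rstrip("/"))
    # Ten-digit epoch strings sort chronologically as plain text.
    return sorted(found)


def latest_recording(s3_client, node, bucket=BUCKET):
    """Read <node>/latest.txt and return the prefix it names (assumed plain text)."""
    response = s3_client.get_object(Bucket=bucket, Key=f"{node}/latest.txt")
    return response["Body"].read().decode("utf-8").strip()


print(prefix_to_utc("1541061134").isoformat())  # 2018-11-01T08:32:14+00:00
print(playlist_key("rpi_orcasound_lab", "1541027406"))
```

With a client created by boto3.client("s3"), list_recordings(client, "rpi_orcasound_lab") would return the available recordings for that node. Whether the bucket allows anonymous reads is not stated on this page, so credentials may be required.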

HLS Format

TODO