Orca test data - orcasound/orcadata GitHub Wiki

Do not use for training!

To enable meaningful comparison of the performance of orca-specific models, this page presents open test sets that are specific to killer whale signals. The highest priority of the AI for Orcas project is classifying signals from the endangered Southern Resident Killer Whales (SRKWs), so test sets for their signals are listed first, starting with calls (with placeholders for whistles and clicks). Orcasound is also interested in classifiers for other common signals in the Salish Sea, so test sets for other species and sources, such as Bigg's killer whales, are listed afterward.

Southern Resident Killer Whale test sets

Calls

  • High signal:noise ratio
    • 27 Sep 2017 -- 1/2? hour of data from the Orcasound Lab node
      • 2017 listener log
      • Labeled first in Pod.Cast by Scott, Akash, and Prakruti
      • Labels verified by Scott in Audacity
        • Labels
        • Audio data
        • Metadata
  • Intermediate signal:noise ratio
    • 05 Jul 2019 -- 1/2 hour of data from the Orcasound Lab node labeled in Audacity by Scott
      • Labels:
        • only calls
          • AWS CLI access via aws --no-sign-request s3 cp s3://acoustic-sandbox/labeled-data/classification/killer-whales/southern-residents/20190705/orcasound-lab/test-only/OS_7_05_2019_08_24_00_labels-SV_200210_only_calls.txt .
        • other signals -- each label spans two rows: row N holds the start and end times plus a label ("call", a specific stereotyped call ID, or "?" for a probable but not 100% certain call); row N+1 starts with \ and then gives the lower and upper frequency bounds.
          • AWS CLI access via aws --no-sign-request s3 cp s3://acoustic-sandbox/labeled-data/classification/killer-whales/southern-residents/20190705/orcasound-lab/test-only/OS_7_05_2019_08_24_00_labels-SV_200210_other_signals.txt .
      • Audio data -- in WAV format
        • AWS CLI access via aws --no-sign-request s3 cp s3://acoustic-sandbox/labeled-data/classification/killer-whales/southern-residents/20190705/orcasound-lab/test-only/OS_7_05_2019_08_24_00_.wav .
      • Metadata
  • Low signal:noise ratio
    • 14 Nov 2019 -- 2.5 hours of data from the Port Townsend node labeled in Audacity by Scott
      • Labels
      • Audio data
      • Metadata
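The two-row label layout described for the 05 Jul 2019 "other signals" file (times and label in row N, a row N+1 that starts with \ and carries the frequency bounds) can be read with a short parser. The following is a minimal sketch, not an official Orcasound tool: it assumes whitespace- or tab-separated fields, times in seconds, and frequencies in Hz, none of which the page states explicitly. The `Label` class, the `parse_labels` function, and the sample values are hypothetical and only illustrate the layout.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Label:
    """One labeled interval; names and units are assumptions for illustration."""
    start_s: float
    end_s: float
    text: str
    low_hz: Optional[float] = None
    high_hz: Optional[float] = None


def parse_labels(content: str) -> List[Label]:
    """Parse label text where row N is 'start end label' and an optional
    row N+1 begins with a backslash followed by lower and upper frequency bounds."""
    labels: List[Label] = []
    for raw in content.splitlines():
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith("\\"):
            # Frequency row: attach the bounds to the most recent label, if any.
            bounds = line[1:].split()
            if labels and len(bounds) >= 2:
                labels[-1].low_hz = float(bounds[0])
                labels[-1].high_hz = float(bounds[1])
            continue
        # Time row: split into at most three parts so label text may contain spaces.
        parts = line.split(None, 2)
        text = parts[2] if len(parts) > 2 else ""
        labels.append(Label(float(parts[0]), float(parts[1]), text))
    return labels


# Made-up sample rows (not taken from the real label files).
sample = "1.5\t2.25\tcall\n\\\t500.0\t8000.0\n3.0\t3.5\t?\n\\\t1000.0\t6000.0\n"
labels = parse_labels(sample)

# Separate confident labels from those marked "?" (probable but uncertain calls).
confident = [lab for lab in labels if lab.text != "?"]
uncertain = [lab for lab in labels if lab.text == "?"]
```

Keeping the "?" labels in a separate list lets an evaluation report results with and without the uncertain calls, rather than silently counting them as positives or negatives.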

Labels for SRKW call type

Whistles

Clicks

Bigg's killer whale (aka West Coast transient) test sets

Northeast Pacific humpback whale test sets

Alaskan resident killer whale test sets