Orca ML resources - orcasound/orcadata GitHub Wiki

Here you'll find a list of efforts to develop machine learning models related to orca signals, followed by some catch-all sections with other resources at the bottom.

Machine learning efforts and accomplishments:

A roughly chronological list of open efforts to develop orca-related machine learning algorithms:

  • Erika's talk (fall 2018 start; ML4All: Apr 29, 2019)
    • "ML models overfit to the training set and are unable to generalize to other sources of data..."
    • We need users to help us label data by adding markers of orca sounds to the stream or archived files.
  • Val's recent contributions
  • Abhishek and Jesse (2019 Google Summer of Code project on Bigg's & Alaskan resident killer whales... & humpbacks)
    • OrcaCNN data - data samples & student recruitment
    • OrcaCNN project - working repository
    • AK orca descriptions
    • 300 field recordings with pod ID from the Gulf of Alaska collected by Dan Olsen of North Gulf Oceanic Society in Homer
    • Pod-specific call catalog available for pod inference
    • Dan's main goal: better autodetection of killer whales vs. humpbacks vs. boat noise, improving on PAMGuard whistle & moan results
    • Secondary goals: distinguish ecotypes; find all occurrences of one specific call type from one pod.
  • Hackathon steps forward with Orcasound guidance or data
  • Canadian SRKW ML efforts
    • Ocean Networks Canada (ONC)
      • 2020: Kristen Kanes plans to publish KW training data set via Science Data
      • SRKW call category standardization work group may begin to meet quarterly in 2021?
    • Department of Fisheries and Oceans (DFO)
      • 2019 DFO/Google/Rainforest Connection SRKW ML activities (internal, closed-source with Rainforest Connection as of 3/2020)
      • 2020 DFO/Meridian ML activities (planned, open-source)
        • Initial remote-only meetings in fall 2020
        • 2-year post-doc starts Jan 2021
  • Other

Short descriptions of your contributions to the Orcadata repository:

You may also create a page with more details about your classifier, data labeling efforts, etc.

Links to other bioacoustic and/or ML resources or algorithms that Orcasound might leverage

Good precedents in building synergy between human and machine learning, especially with real-time audio data

  • BirdNET - live bird call sound and classifier (browser-based, works best in Firefox, not Chrome)
  • Listen to the Deep (LIDO) - real-time underwater sound & spectrogram with ML classifiers (Flash-based)

Algorithm deployment guidance

  • For real-time Orcasound data (link to TBD guidance in various Orcasound repositories?)
  • For archived Orcasound data (link to GSoC repo?)