Harvest Workflows - QutEcoacoustics/baw-server GitHub Wiki
Our new harvester APIs have been designed to work in two ways: streaming mode or batch mode.
Streaming mode is intended for use by automated sensors that send audio back to the workbench as they collect it. It is the stricter form of the harvest modes and does not support corrective actions or free-form arrangements of files. Any files that fail validation are simply ignored.
Batch mode is intended for direct use by users. It supports corrective actions by users when validations fail, and allows for a free-form upload of files.
A harvest model is a state machine: its status moves between the stages listed below. Some transitions happen automatically; others must be triggered manually by a client.
status | streaming | description |
---|---|---|
:new_harvest | true/false | Initial state used only server side |
:uploading | true/false | A software-defined SFTP service is created and enabled for uploads |
:scanning | false | SFTP connection is disabled, our workers scan for files |
:metadata_extraction | false | SFTP connection is disabled, our workers validate files |
:metadata_review | false | Users can review the state of the harvest. Corrective actions can be taken. The harvest can be transitioned back to :uploading or :metadata_extraction. |
:processing | false | The final ingestion phase |
:complete | false | A final state. Once a harvest enters this phase it cannot transition to any other phase. The SFTP connection is deleted and can never be reactivated. |
stateDiagram-v2
state choice <<choice>>
new_harvest --> choice:⏭️
choice --> uploading: if !streaming ⏭️
choice --> uploading_streaming: if streaming ⏭️
uploading --> scanning: 🔡
scanning --> metadata_extraction: ⏭️
metadata_extraction --> metadata_review: ⏭️
metadata_review --> uploading: 🔡
metadata_review --> metadata_extraction: 🔡
metadata_review --> processing: 🔡
processing --> complete: ⏭️
uploading_streaming : uploading
uploading_streaming --> complete: 🔡
[*] --> complete: ❗abort
- ⏭️ denotes an automatic transition
- 🔡 denotes a manual transition that must be done by a client with a PATCH request that modifies status
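The transitions in the diagram can be written down as a lookup table, which a client might use to decide which status changes it is allowed to request. The following is a minimal sketch, not the server's implementation: the names `TRANSITIONS`, `allowed_manual_targets`, and `can_patch_status` are hypothetical, and the abort path is left out because the source does not describe how it is triggered.

```python
# Transition table transcribed from the state diagram.
# Keys are (current_status, streaming); values map a target status to the
# kind of transition. All names here are illustrative, not part of the API.
AUTO, MANUAL = "auto", "manual"

TRANSITIONS = {
    ("new_harvest", False): {"uploading": AUTO},
    ("new_harvest", True): {"uploading": AUTO},
    ("uploading", False): {"scanning": MANUAL},
    ("uploading", True): {"complete": MANUAL},
    ("scanning", False): {"metadata_extraction": AUTO},
    ("metadata_extraction", False): {"metadata_review": AUTO},
    ("metadata_review", False): {
        "uploading": MANUAL,
        "metadata_extraction": MANUAL,
        "processing": MANUAL,
    },
    ("processing", False): {"complete": AUTO},
    # "complete" has no entry: it is a final state.
}


def allowed_manual_targets(status: str, streaming: bool) -> list[str]:
    """Statuses a client may request with a PATCH from the given state."""
    targets = TRANSITIONS.get((status, streaming), {})
    return sorted(t for t, kind in targets.items() if kind == MANUAL)


def can_patch_status(status: str, streaming: bool, target: str) -> bool:
    """True if moving from `status` to `target` is a manual transition."""
    return target in allowed_manual_targets(status, streaming)
```

A client could call `can_patch_status` before sending a PATCH, so that buttons for transitions the diagram does not allow are never offered to the user.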
Streaming workflow:
- (new_harvest) Create a new harvest
  POST /projects/:projectId/harvests { "harvest": { "streaming": true } }
- (uploading) Upload files using SFTPGO login details from previous request
  - New directories cannot be made. Files must be uploaded into existing sub-directories.
  - During upload process, show harvest report to user
  - During upload process, show harvested audio files to user
- (uploading) Complete the harvest
  PATCH /projects/:projectId/harvests/:harvestId { "harvest": { "status": "complete" } }
- (complete) Show final statistics from the harvest report
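The two HTTP requests in the streaming steps above can be sketched as plain request descriptions. This is only an illustration: the helper names `create_harvest` and `set_status` are hypothetical, no network call is made, and authentication and the shape of the server's response are not covered by the source, so they are left out.

```python
import json


def create_harvest(project_id: int, streaming: bool) -> tuple[str, str, str]:
    """Return (method, path, JSON body) for creating a harvest."""
    body = {"harvest": {"streaming": streaming}}
    return ("POST", f"/projects/{project_id}/harvests", json.dumps(body))


def set_status(project_id: int, harvest_id: int, status: str) -> tuple[str, str, str]:
    """Return (method, path, JSON body) for a manual status transition."""
    body = {"harvest": {"status": status}}
    path = f"/projects/{project_id}/harvests/{harvest_id}"
    return ("PATCH", path, json.dumps(body))


# Streaming workflow: create the harvest, upload over SFTP (not shown),
# then complete it. The ids 1 and 42 are made-up example values.
start = create_harvest(project_id=1, streaming=True)
finish = set_status(project_id=1, harvest_id=42, status="complete")
```

The same `set_status` helper covers every manual transition, since each one is a PATCH that changes only the status field.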
Batch workflow:
- (new_harvest) Create a new harvest
  POST /projects/:projectId/harvests { "harvest": { "streaming": false } }
- (uploading) Upload files using SFTPGO login details from previous request
  - During upload process, show harvest report to user
- (uploading) Transition to the scanning stage. Metadata extraction follows automatically.
  PATCH /projects/:projectId/harvests/:harvestId { "harvest": { "status": "scanning" } }
- (scanning) Poll for updates. The harvest will transition to the next state automatically.
  - During scanning process, show harvest report to user
- (metadata_extraction) Poll for updates. The harvest will transition to the next state automatically.
  - During extraction process, show harvest report to user
- (metadata_review) Review changes. Three courses of action are available:
  - Change a directory mapping to add metadata to files
    - (metadata_review) Fix file mappings
      PATCH /projects/:projectId/harvests/:harvestId { "harvest": { "mappings": [ ...harvest.mappings, { "site_id": 1, "path": "path/to/folder", "utc_offset": "+10:00", "recursive": true } ] } }
    - Then transition back to the metadata extraction stage
      ⤴️ PATCH /projects/:projectId/harvests/:harvestId { "harvest": { "status": "metadata_extraction" } }
  - Files need to be changed or rearranged
    - Transition back to the uploading stage
      ⤴️ PATCH /projects/:projectId/harvests/:harvestId { "harvest": { "status": "uploading" } }
  - Ready to advance
    - Any fixable errors that have not yet been fixed will be processed as failures
    - Any non-fixable errors will be processed as failures
    - Any files that have no errors should be harvested successfully
    - Transition to the processing stage
      PATCH /projects/:projectId/harvests/:harvestId { "harvest": { "status": "processing" } }
- (processing) Poll for updates. The harvest will transition to the next state automatically.
  - During processing, show harvest report to user
  - During processing, show harvested audio files to user
- (complete) Show final statistics
  - Show harvest report to user
  - Show harvested audio files to user
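Two parts of the batch steps above lend themselves to a short sketch: polling while the harvest is in an automatically advancing state, and building the mappings body that appends a new mapping to the existing ones. Everything named here is an assumption for illustration. `fetch_status` stands in for whatever code reads the status from `GET /projects/:projectId/harvests/:harvestId` (the response shape is not documented in this page), and the poll interval and limit are arbitrary.

```python
import time
from typing import Callable

# States the batch workflow polls through; they advance without client action.
AUTOMATIC_STATES = {"scanning", "metadata_extraction", "processing"}


def poll_until_manual(
    fetch_status: Callable[[], str],
    interval_seconds: float = 5.0,
    max_polls: int = 1000,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll until the harvest leaves the automatic states, then return its status."""
    for _ in range(max_polls):
        status = fetch_status()
        if status not in AUTOMATIC_STATES:
            return status
        sleep(interval_seconds)
    raise TimeoutError("harvest did not leave an automatic state in time")


def with_added_mapping(
    mappings: list[dict], site_id: int, path: str, utc_offset: str, recursive: bool = True
) -> dict:
    """Build a PATCH body that keeps the existing mappings and appends one more."""
    new_mapping = {
        "site_id": site_id,
        "path": path,
        "utc_offset": utc_offset,
        "recursive": recursive,
    }
    return {"harvest": {"mappings": [*mappings, new_mapping]}}
```

Passing `sleep` and `fetch_status` as parameters keeps the loop testable without a server; a real client would supply a function that performs the GET request.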
GET /projects/:projectId/harvests/:harvestId
POST /audio_recordings/filter
{
"filter": {
"harvests.id": { "eq": harvest.id }
}
}
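In the filter body above, `harvest.id` is a placeholder rather than valid JSON. A minimal sketch of substituting a real id, assuming a hypothetical helper name `harvest_recordings_filter`, could look like this:

```python
import json


def harvest_recordings_filter(harvest_id: int) -> str:
    """Return the JSON filter body selecting audio recordings from one harvest."""
    return json.dumps({"filter": {"harvests.id": {"eq": harvest_id}}})


# 42 is a made-up harvest id used only for illustration.
example_body = harvest_recordings_filter(42)
```

The resulting string is what a client would send as the request body to `/audio_recordings/filter`.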