ACT Usage - MeirellesLab/AzureCustomTasks GitHub Wiki

What ACT DOES and DOESN'T do:

Does:

  • Centralize all configuration parameters in a JSON file: ACT manages the creation of the Azure Batch resources (Pool, Job, and Tasks) from a single file, so you do not need to make separate programmatic calls to the Microsoft Azure API for each configuration. This makes it fast and easy to run your routines.
  • Include the most important Microsoft Azure Batch Service features: ACT provides simple access to many important commands, which speeds up the use of those features.
  • Automatically mount storage containers to compute nodes: The pool configuration can specify the storage containers to be mounted in the compute nodes, and Tasks can read inputs from them and write outputs to them.
  • Multiple options to define the Task input: Inputs can be blobs in a storage container or strings in a file. They can also be filtered by existing Tasks and/or by existing output, which lets you tailor the creation of new Tasks to your exact needs.
  • Custom function to calculate required computing slots for Tasks: The configuration file can define a function that calculates the number of computing slots required by each Task, based on the input name and/or size. This helps optimize resource management and reduce costs.
  • Built-in debug: Shows execution details (inputs, outputs, scripts, and Task commands) before and during the run.
  • Built-in custom log: Helps keep track of the Task events and profile the execution of the Task’s commands.
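
As an illustration of the slot calculation mentioned in the list above, the following Python sketch maps an input's size to a number of computing slots. The function name, the 512 MiB threshold, and the cap of 4 slots are all hypothetical. How ACT expects this function to be written in the configuration file is covered in the configuration section of this wiki and may differ from this sketch.

```python
def required_slots(input_name: str, input_size: int,
                   bytes_per_slot: int = 512 * 1024 * 1024,
                   max_slots: int = 4) -> int:
    """Hypothetical rule: one slot per started 512 MiB of input, capped at max_slots.

    input_name is accepted so a rule could also depend on the name
    (for example, on a file extension); this sketch only uses the size.
    """
    if input_size <= 0:
        # Unknown or empty inputs get the minimum of one slot.
        return 1
    # Integer ceiling division avoids floating-point rounding.
    slots = (input_size + bytes_per_slot - 1) // bytes_per_slot
    return max(1, min(slots, max_slots))
```

With a rule like this, small inputs share a compute node while large inputs reserve more of it, which is the kind of trade-off the custom function is meant to express.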

Doesn't:

  • Create Storage containers: ACT reads and writes blobs to existing containers, but it does not manage any other storage operations.
  • Automatically upload input to storage: ACT expects the input blobs to already be in the defined input container. If you have numerous inputs to upload, you can either include a setup step in your Task's own code that uploads them before your analysis runs, or upload them beforehand with the Azure Portal or another application such as AzCopy. For the first option, use the --input argument with an input file that lists the current location of each input, so that the Task can access it from the cloud.
  • Automatically upload scripts to storage or create applications: ACT expects your tasks.command parameter to call only existing applications, or scripts in the container defined by the storage.scripts.container parameter, which are copied to the Task.
  • Test your script before running: If your script has a syntax or execution error, the error will surface only after the Tasks have been created and started. By then all Batch resources have been created and files transferred, and they incur ongoing costs until all resources are cleaned up. To prevent unnecessary expenditure, test your code on a small number of low-capacity VMs before setting up the real scenario.
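
Because ACT does not check your script, a cheap local check before any Batch resources are created can catch the simplest class of errors. The sketch below assumes the script is written in Python and only detects syntax errors. It does not run the script, so execution errors still require a small trial run as recommended above. The function name is hypothetical and is not part of ACT.

```python
import sys


def has_valid_syntax(script_path: str) -> bool:
    """Return True if the Python file at script_path parses without a SyntaxError."""
    with open(script_path, encoding="utf-8") as handle:
        source = handle.read()
    try:
        # compile() only parses the source; nothing in the script is executed.
        compile(source, script_path, "exec")
    except SyntaxError as err:
        print(f"Syntax error in {script_path}, line {err.lineno}: {err.msg}",
              file=sys.stderr)
        return False
    return True
```

For shell scripts, `bash -n script.sh` performs a comparable parse-only check.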

ACT Arguments

This section explains the arguments accepted by the ACT program. All arguments are optional. When an argument that has a default value is omitted, the default described below is used.

Azure Custom Tasks - ACT v1.0

Usage:

python3 azure_custom_tasks.py [-j JSON] [-i INPUT] [-xslcedrwfyvh] [-sI] [-sO] [-sS] [-sT] [-dI]
  • -h or --help : Show the help message and exit.
  • -j or --json : Use the specified JSON file as the configuration file. This file must be in the .json format and contain all the required configuration strings. If you do not provide this parameter, ACT looks for a file named config.json in the current working directory. If the JSON file does not exist, the program exits with an error message. You cannot run ACT without a configuration file. We explain this file and all its parameters in the corresponding section of this wiki.
  • -i or --input : Use the strings in the INPUT file as inputs for the Tasks. Each line of this file is expected to contain one input description with up to three comma-separated fields: (1) the input string itself, (2) the input size, and (3) the required computing slots for this input. Only the first field is required; the other two are optional and default to 0 and 1, respectively. We explain this in more detail in Example 2.
  • -x or --exec : Start the execution of the Batch Service Pool, Job, and Tasks with the parameters specified in the configuration file. This is the default behavior of ACT if you do not supply any argument other than -j and -i.
  • -s or --show : Show the debug information about the current execution: inputs, outputs, scripts, and Tasks' commands.
  • -sI or --show-inputs : Show the corresponding blobs from the configured input.
  • -sO or --show-outputs : Show the corresponding blobs from the configured output.
  • -sS or --show-scripts : Show the corresponding blobs from the configured scripts.
  • -sT or --show-tasks : Show the Tasks' commandLine for each Task.
  • -dI or --delete-inputs : Delete the corresponding blobs from the configured input.
  • -l or --list : List Tasks by their states.
  • -c or --count : Count Tasks by their states.
  • -d or --disable : Disable the current Job and all associated Tasks, returning the Tasks that are running to the end of the execution queue. New Tasks cannot be added while the Job is disabled.
  • -e or --enable : Enable the current Job and all associated Tasks, restarting the allocation of Tasks to the execution queue.
  • -r or --reactivate : Reactivate all failed Tasks to re-queue them.
  • -w or --wait : Wait for all Tasks to complete while showing the current progress.
  • -f or --free : Terminate the batch and free its resources (deleting all Pools, Jobs, and Tasks from the Batch Account).
  • -y or --yes : Use together with --free to confirm the deletion without being prompted for confirmation.
  • -v or --version : Show ACT version number and exit.
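
To make the INPUT file format described for -i more concrete, here is a minimal Python sketch of how such a file could be read, applying the defaults of 0 for the size and 1 for the slots. It reflects one reading of the description above and is not ACT's own parser; the names TaskInput, parse_input_line, and parse_input_text, the whitespace handling, and the sample file names are assumptions made for illustration.

```python
from typing import List, NamedTuple


class TaskInput(NamedTuple):
    name: str        # (1) the input string itself (required)
    size: int = 0    # (2) the input size (optional, default 0)
    slots: int = 1   # (3) the required computing slots (optional, default 1)


def parse_input_line(line: str) -> TaskInput:
    """Parse one 'name[,size[,slots]]' line into a TaskInput."""
    fields = [field.strip() for field in line.split(",")]
    name = fields[0]
    if not name:
        raise ValueError(f"missing input string in line: {line!r}")
    size = int(fields[1]) if len(fields) > 1 and fields[1] else 0
    slots = int(fields[2]) if len(fields) > 2 and fields[2] else 1
    return TaskInput(name, size, slots)


def parse_input_text(text: str) -> List[TaskInput]:
    """Parse a whole INPUT file's contents, skipping blank lines."""
    return [parse_input_line(line) for line in text.splitlines() if line.strip()]


# Hypothetical INPUT file contents: one input per line.
EXAMPLE = """\
sample_001.fastq
sample_002.fastq,1048576
sample_003.fastq,2097152,2
"""

for task_input in parse_input_text(EXAMPLE):
    print(task_input)
```

A file like this, saved for instance as inputs.txt (a hypothetical name), would be passed to ACT together with the configuration file: python3 azure_custom_tasks.py -j config.json -i inputs.txt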