Hello World Example

ACT's Hello World example consists of two files: config.json and inputs.csv. In the following lines, we explain the execution of this example and the contents of each of these files in more detail.

Before running this example, you must set up the credentials in config.json and create the storage output container (mydata).
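If you prefer to create the container programmatically, a minimal sketch using the azure-storage-blob Python package could look like the one below. The account URL and SAS token placeholders are assumptions, use the same values you place in config.json, and the SAS token needs permission to create containers.

from azure.storage.blob import ContainerClient

# Placeholders: use the same storage account and SAS token as in config.json.
account_url = "https://<YOUR_STORAGE_ACCOUNT_NAME>.blob.core.windows.net"
sas_token = "<YOUR_STORAGE_SAS_TOKEN>"

# Create the output container used by this example.
container = ContainerClient(account_url, container_name="mydata", credential=sas_token)
container.create_container()  # raises ResourceExistsError if the container already exists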

Then, run the example from the AzureCustomTasks root folder using the commands below:

# To preview the execution before sending it to the cloud
python3 src/act/azure_custom_tasks.py -j examples/helloworld/config.json -i examples/helloworld/inputs.csv -s

# To run the command
python3 src/act/azure_custom_tasks.py -j examples/helloworld/config.json -i examples/helloworld/inputs.csv

Here, you can take a look at the entire config.json file content:

{
  "batch": {
    "accountName":"PLACE_YOUR_BATCH_ACCOUNT_NAME_HERE",
    "accountKey":"PLACE_YOUR_BATCH_ACCOUNT_KEY_HERE",
    "accountUrl":"PLACE_YOUR_BATCH_ACCOUNT_URL_HERE"
  },
  "pool": {
    "id":"PoolHelloWorld",
    "dedicatedNodeCount":1,
    "lowPriorityNodeCount":0,
    "taskSlotsPerNode":2,
    "vmSize":"Standard_A1_v2",
    "vmConfiguration": {
        "imageReference": {
            "publisher": "canonical",
            "offer": "0001-com-ubuntu-server-focal",
            "sku": "20_04-lts",
            "version": "latest"
        },
        "nodeAgentSKUId": "batch.node.ubuntu 20.04"
    },
    "useEphemeralOSDisk":false,
    "nodeStorageContainers": {
      "mount":false
    },
    "nodeAutoScale": {
      "include":false
    },
    "applications": {
      "include":false
    },
    "startupTask": {
      "include":false
    }
  },
  "job": {
    "id":"MyJobHelloWorld"
  },
  "tasks": {
    "addCollectionStep":10,
    "inputs": {
      "areBlobsInInputStorage":false
    },
    "resources": {
      "automaticInputsUpload":false,
      "automaticScriptsUpload":false
    },
    "logs": {
      "automaticUpload":true,
      "destinationPath":"logs/helloworld/",
      "pattern":"../std*"
    },
    "outputs": {
      "automaticUpload":false
    },
    "command":"bash -c \"echo -n 'Hello world from the ACT Hello world example! on ' && echo ",
    "commandSuffix":"\"",
    "retryCount":0,
    "retentionTimeInMinutes":1000
  },
  "storage": {
    "accountName":"PLACE_YOUR_STORAGE_ACCOUNT_NAME_HERE",
    "accountDomain":"PLACE_YOUR_STORAGE_ACCOUNT_DOMAIN_HERE",
    "accountSASToken":"PLACE_YOUR_STORAGE_SAS_TOKEN_HERE",
    "scripts": {
      "container":"mydata",
      "blobPrefix":"scripts/"
    },
    "input": {
      "container":"mydata",
      "path":"inputs/",
      "blobPrefix":""
    },
    "output": {
      "container":"mydata",
      "path":"output/",
      "blobPrefix":""
    }
  },
  "cleanup": {
    "timeoutInMinutes":10
  }
}

This config file has the minimal parameters required to run ACT. Most features are set to false, meaning they will not be used.

Here, we create a pool with a single VM node (pool."dedicatedNodeCount":1) of the smallest type (pool."vmSize":"Standard_A1_v2"), able to run up to 2 tasks concurrently (pool."taskSlotsPerNode":2), using the Ubuntu Server 20.04 image (pool."vmConfiguration").

The inputs are set to not come from the input storage container (tasks.inputs."areBlobsInInputStorage":false), so ACT must receive an input file (-i) at execution time.

In this example, the contents of inputs.csv are:

HELLO WORLD 1
HELLO WORLD 2
HELLO WORLD 3

Each line in this file represents the input string that will be passed to one Task.

For each Task, ACT will execute the command from the parameter tasks."command", appended with the Task's input enclosed in single quotes, followed by the tasks."commandSuffix" parameter if it exists.

When creating your own commands, try to follow the recommended practices to wrap Linux commands in Azure Batch.

In this example, for the first input, the Task commandLine will be:

bash -c "echo -n 'Hello world from the ACT Hello world example! on ' && echo  'HELLO WORLD 1' "

Each output will be placed in the Task log files (stdout.log and stderr.log, matched by tasks.logs."pattern":"../std*"), which will be automatically uploaded (tasks.logs."automaticUpload":true) to the output container (storage.output."container":"mydata") under the folder 'logs/helloworld/' (tasks.logs."destinationPath":"logs/helloworld/").

The expected content of each stdout.log file is:

Hello world from the ACT Hello world example! on HELLO WORLD 1

Hello world from the ACT Hello world example! on HELLO WORLD 2

Hello world from the ACT Hello world example! on HELLO WORLD 3
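Once the job finishes, you can check the uploaded logs directly in the output container. Below is a minimal sketch using the azure-storage-blob Python package; the account URL and SAS token placeholders are assumptions, use the same credentials as in config.json.

from azure.storage.blob import ContainerClient

# Placeholders: use the same storage account and SAS token as in config.json.
account_url = "https://<YOUR_STORAGE_ACCOUNT_NAME>.blob.core.windows.net"
sas_token = "<YOUR_STORAGE_SAS_TOKEN>"

container = ContainerClient(account_url, container_name="mydata", credential=sas_token)

# List and print every log uploaded to logs/helloworld/
for blob in container.list_blobs(name_starts_with="logs/helloworld/"):
    print("---", blob.name)
    print(container.download_blob(blob.name).readall().decode())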