Azure - multiparty/cardinal GitHub Wiki

This is a guide to setting up Cardinal as an orchestrator on Azure.

  1. Azure Kubernetes Service - Creating the cluster

    • Log in to your Azure account and navigate to your directory. If you are using the root account, you should not have any permissions issues for the remaining steps. For IAM accounts, please make sure you have the "Contributor" and "User Access Administrator" roles.
    • Create a resource group for your cluster under "Resource Groups".
    • Create a cluster under "Kubernetes Services" - the default settings will work.
    • Once the cluster has been created, navigate to the "Virtual Machine Scale Sets" page, where you should see that a virtual machine scale set has been created for your cluster. Follow the Azure instructions for enabling a managed identity on a virtual machine scale set to give it a Managed Identity. This will grant the cluster nodes permission to access Azure services.
    • Once a Managed Identity has been enabled for your virtual machine scale set, give it the appropriate permissions:
      • Go to the details page of your virtual machine scale set again.
      • Click "Identity" under "Settings" on the side menu and then click the "Azure role assignments" button.
      • Add 3 roles:
        • Scope: "Resource Group"; Subscription: {subscription where your cluster is}; Resource group: {the node group resource group (starts with MC)}; Role: "Owner"
        • Scope: "Storage"; Subscription: {subscription where your cluster is}; Resource: {your storage account}; Role: "Storage Queue Data Contributor"
        • Scope: "Storage"; Subscription: {subscription where your cluster is}; Resource: {your storage account}; Role: "Storage Blob Data Contributor"
    • Navigate to the "Resource Groups" page again. You should see a new resource group has been created for your cluster nodes as well. The name will have the form MC_{cluster_resource_group_name}_{cluster_name}_{cluster_location_code}. Please make note of this resource group name.
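
The portal steps above can also be approximated from the Azure CLI. The following is a minimal sketch rather than the project's official procedure: the helper names (node_resource_group, lookup_node_resource_group, grant_role) and their arguments are hypothetical, and it assumes the az CLI is installed and already logged in. The first helper only composes the name from the pattern described above; the second asks Azure for the actual value.

```shell
#!/usr/bin/env bash
# Sketch only: helper names and arguments are illustrative, not part of cardinal.

# Compose the node resource group name from the pattern
# MC_{cluster_resource_group_name}_{cluster_name}_{cluster_location_code}.
node_resource_group() {
  local cluster_rg="$1" cluster_name="$2" location="$3"
  printf 'MC_%s_%s_%s\n' "$cluster_rg" "$cluster_name" "$location"
}

# Ask Azure for the node resource group instead of composing it by hand.
lookup_node_resource_group() {
  local cluster_rg="$1" cluster_name="$2"
  az aks show --resource-group "$cluster_rg" --name "$cluster_name" \
    --query nodeResourceGroup --output tsv
}

# Grant one role to the scale set's managed identity at a given scope.
#   principal_id: object ID of the managed identity
#   role:         e.g. "Owner" or "Storage Blob Data Contributor"
#   scope:        full Azure resource ID of the resource group or storage account
grant_role() {
  local principal_id="$1" role="$2" scope="$3"
  az role assignment create --assignee "$principal_id" --role "$role" --scope "$scope"
}
```

For example, node_resource_group my-rg my-cluster eastus prints MC_my-rg_my-cluster_eastus, which is the value to note down for step 2.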
  2. Docker

    NOTE: For setup in Azure, you can keep the cardinal repo on your local machine. We will be building a Docker image for cardinal locally and pushing it to an image repository. This is different from setup in Google Cloud.

    • Install Docker Desktop onto your machine and make sure it is running.
    • Make a Dockerhub account if you don't have one already.
    • Clone the cardinal repo onto your machine and change directories into it.

    git clone https://github.com/multiparty/cardinal.git

    cd cardinal

    • Move the Dockerfile from the aks_docker directory to the main cardinal directory (either in your file system or in your terminal with the following line).

    mv aks_docker/Dockerfile .

    • Edit the following fields in the Dockerfile:
      • Set SUB_ID to the ID of the Azure subscription that your cluster is a part of.
      • Set RESOURCE_GROUP_NAME to the node resource group name that you noted down from step 1.
      • Set LOCATION to the location code of your cluster.
      • Set STORAGE_ACCT to the URL of the storage account where your datasets are located.
      • Set CONGREGATION to the image URI of the Congregation image you would like to use. See note below about Congregation images.
      • Set CHAMBERLAIN to the IP address or hostname of your deployed chamberlain server.
      • Set PROFILE to 'true' or 'false' depending on whether you would like this deployment to save profiling information for each workflow.
      • Enter the AWS credentials for the account where your cardinal results will be sent.
      • Set DESTINATION_BUCKET to the name of the bucket in the above AWS account where your results will be sent.
      • If you intend to start multiple clusters (i.e. one for each party) in this Azure directory, you will need to edit, build, and push this Dockerfile for each one under a different tag. Please see below for a template Dockerfile.
    • In the terminal, run docker build -t {name_of_your_Dockerhub_repo}/cardinal:{tag} . where tag can be any string that helps you identify what version of this image you are working with or building. (e.g. hicsail/cardinal:aks-east1)
    • Run docker push {name_of_your_Dockerhub_repo}/cardinal:{tag} to push the image you just made to your Dockerhub repository.
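
When one image is needed per party, the build and push commands above can be wrapped in a small helper. This is a sketch under assumptions: image_ref and build_and_push are hypothetical names, and the helper must be run from the top-level cardinal directory after the Dockerfile has been edited for that party.

```shell
#!/usr/bin/env bash
# Sketch only: image_ref and build_and_push are illustrative helper names.

# Compose the image reference {name_of_your_Dockerhub_repo}/cardinal:{tag}.
image_ref() {
  local repo="$1" tag="$2"
  printf '%s/cardinal:%s\n' "$repo" "$tag"
}

# Build the image from the current directory, then push it to Dockerhub.
# The push is skipped if the build fails.
build_and_push() {
  local ref
  ref="$(image_ref "$1" "$2")"
  docker build -t "$ref" . && docker push "$ref"
}
```

For example, build_and_push hicsail aks-east1 would build and push hicsail/cardinal:aks-east1.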
  3. Kubernetes - Deploying your Cardinal image(s)

    • Install the Azure CLI locally.
    • Please locate the file cardinal-depl.yaml in the top-level directory of the cardinal repo. Edit line 19 in the file to point to the image URI of the image you pushed in step 2 (e.g. docker.io/hicsail/cardinal:aks-east1). You may also edit any lines having to do with the deployment and service names if you wish, provided they are consistent across the file.
    • In the terminal, use az login to authenticate to your Azure account. If you have multiple subscriptions, you will want to run az account set --subscription "{subscription_name}" or you will need to include the --subscription option for future az calls.
    • Run az aks get-credentials --resource-group {your_resource_group} --name {your_AKS_cluster} to add the cluster configuration to your Kubernetes config file. This command should also switch your current context to your Azure cluster. Test that this worked by running kubectl get pod - you should not see anything as we haven't deployed anything to this cluster yet!
    • Make sure you are still in the top-level directory of the cardinal repo and run kubectl apply -f role_def.yaml and kubectl apply -f role_binding_def.yaml to give the cluster full Kubernetes API permissions. You should only have to apply these files once per cluster.
    • Make sure you are still in the top-level directory of the cardinal repo and run kubectl apply -f cardinal-depl.yaml. This will create a Deployment of cardinal and a LoadBalancer Service with an external IP address that you can reach.
    • You can find the external IP address of the Service either in the terminal using kubectl get service or in the Azure console at Kubernetes services > {your cluster} > Services and ingresses (on sidebar). In the browser, if you navigate to that IP with the port 5000 and you see a page with just the word "homepage" on it, that means the cardinal server is up and running.
    • NOTE: After completing these steps once, you do not need to repeat all of them for every deployment. To redeploy, run kubectl delete -f cardinal-depl.yaml to delete your existing deployment and then apply the new one with kubectl apply -f cardinal-depl.yaml.
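
The redeploy and health-check steps above could be scripted roughly as follows. This is a hedged sketch: the helper names are hypothetical, the Service name passed to service_external_ip must match whatever cardinal-depl.yaml defines, and it assumes kubectl is already pointed at the AKS cluster and that curl is installed.

```shell
#!/usr/bin/env bash
# Sketch only: helper names are illustrative, not part of cardinal.

# URL of the cardinal server for a given external IP (port 5000).
cardinal_url() {
  printf 'http://%s:5000/\n' "$1"
}

# Read the external IP of a LoadBalancer Service by name.
service_external_ip() {
  kubectl get service "$1" -o jsonpath='{.status.loadBalancer.ingress[0].ip}'
}

# Replace the existing deployment with the current cardinal-depl.yaml.
# Run from the top-level directory of the cardinal repo.
redeploy() {
  kubectl delete -f cardinal-depl.yaml --ignore-not-found
  kubectl apply -f cardinal-depl.yaml
}

# Succeeds if the home page responds and contains the word "homepage".
check_cardinal() {
  curl --silent --fail --max-time 10 "$(cardinal_url "$1")" | grep -q 'homepage'
}
```

A typical sequence would be redeploy, then service_external_ip with your Service name, then check_cardinal with the IP it returns. The external IP can take a short while to be assigned after the Service is created.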

You are now set up to run workflows using Cardinal! You can begin by trying to send a submit request from chamberlain.

Congregation

If you do not know of a publicly available Congregation Docker image that you would like to use, you are welcome to build, push, and use your own. Please see the repository catechism to see what the current configuration of the pushed Congregation images is. There you will see directories labeled by which backend is being used in that particular image. Inside each directory will be a push_pull.py script, a bash script, and a Dockerfile that specifies which libraries to clone to the image and then runs the bash script. To make a new Congregation image, simply copy the file structure of an existing one, make the changes you would like to make, and then build and push the image in the same way that we built and pushed the cardinal images.
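
The copy-then-build workflow described above might look like the sketch below. Everything named here is an assumption: new_congregation_dir and build_and_push_congregation are hypothetical helpers, the directory paths and image reference are supplied by you, and the sketch assumes each backend directory can serve as its own Docker build context.

```shell
#!/usr/bin/env bash
# Sketch only: helper names are illustrative, not part of catechism.

# Copy an existing backend directory as the starting point for a new image.
# Refuses to overwrite an existing destination.
new_congregation_dir() {
  local existing="$1" new="$2"
  [ -d "$existing" ] || { echo "no such directory: $existing" >&2; return 1; }
  [ ! -e "$new" ] || { echo "already exists: $new" >&2; return 1; }
  cp -R "$existing" "$new"
}

# Build the image using the given directory as build context, then push it.
#   dir: directory containing the Dockerfile, push_pull.py, and bash script
#   ref: image reference of your choosing, e.g. {your_repo}/{image_name}:{tag}
build_and_push_congregation() {
  local dir="$1" ref="$2"
  docker build -t "$ref" "$dir" && docker push "$ref"
}
```

After copying, edit the files in the new directory before building, and use the resulting image URI as the CONGREGATION value in the cardinal Dockerfile.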

Dockerfile

Below is a template Dockerfile for deploying cardinal on Azure AKS:

FROM python:3.7

RUN mkdir /cardinal
WORKDIR /cardinal
ADD . /cardinal/
RUN pip install --upgrade pip
RUN pip install -r requirements.txt

#------------------#
# CHANGE AS NEEDED #
#------------------#
# Environment information
ENV PORT=5000
ENV CLOUD_PROVIDER="AKS"
ENV INFRA="AZURE"
ENV SUB_ID="{subscription ID}"
ENV RESOURCE_GROUP_NAME="{NODE resource group name}"
ENV LOCATION="{cluster location}"
ENV STORAGE_ACCT="{storage account where the datasets accessible by this cluster are located}"
ENV CONGREGATION="{image URI of congregation image you would like to use}"

# profile flag - set to "true" to receive profiling timestamps with each workflow
ENV PROFILE="false"

# Information for curia
ENV AWS_REGION="{region of public-read results bucket}"
ENV AWS_ACCESS_KEY_ID="{access key for owner account of bucket}"
ENV AWS_SECRET_ACCESS_KEY="{secret access key for owner account of bucket}"
ENV DESTINATION_BUCKET="{name of public-read results bucket}"

CMD ["python", "/cardinal/wsgi.py"]
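
If you maintain one Dockerfile per party, filling in the placeholder values by hand can be error-prone. The helper below is a minimal sketch of one way to script it: set_dockerfile_env is a hypothetical name, and it assumes each variable sits on its own line beginning with ENV KEY=, as in the template above, and that the value contains no double quotes or backslashes.

```shell
#!/usr/bin/env bash
# Sketch only: set_dockerfile_env is an illustrative helper, not part of cardinal.

# Rewrite the line `ENV KEY=...` in a Dockerfile to `ENV KEY="VALUE"`.
# Lines that do not start with `ENV KEY=` are left unchanged.
set_dockerfile_env() {
  local file="$1" key="$2" value="$3" tmp rc
  tmp="$(mktemp)" || return 1
  awk -v k="$key" -v v="$value" '
    index($0, "ENV " k "=") == 1 { print "ENV " k "=\"" v "\""; next }
    { print }
  ' "$file" > "$tmp" && cat "$tmp" > "$file"
  rc=$?
  rm -f "$tmp"
  return "$rc"
}
```

For example, set_dockerfile_env Dockerfile LOCATION eastus rewrites the LOCATION line to ENV LOCATION="eastus" and leaves every other line as it was.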