Feature: Scheduler Labels Others - sonchang/cattle GitHub Wiki

Overview

Why?

From the standpoint of building/deploying/scaling an application consisting of a bunch of micro-services, we want to allow users fine grained control over the deployment of the containers running these micro-services. Often these containers have very specific deployment requirements in order to inter-operate with each other (for example, port requirements, volume requirements, etc...)

Considerations

  • Handle existing container dependencies

    • Port conflicts
    • Shared volumes: --volumes-from=dependency
    • Links: --link=dependency:alias
    • Shared network stack: --net=container:dependency
  • Try to be compatible with docker swarm's specifications for filtering

  • Try to provide capabilities somewhat equivalent to fleet and kubernetes

Rancher labels for affinity rules

io.rancher.scheduler.constraint:XYZ{ne}ABC (hard rule: XYZ must not equal ABC)

io.rancher.scheduler.constraint:XYZ{eq~}DEF (soft rule: XYZ should equal DEF)

Labels:

  • io.rancher.scheduler.affinity:container=foo (where foo is the name or UUID of a container)

  • io.rancher.scheduler.affinity:container_ne=foo,bar

  • io.rancher.scheduler.affinity:container_soft=foo,bar

  • io.rancher.scheduler.affinity:container_soft_ne=foo,bar

  • io.rancher.scheduler.affinity:host_label=XYZ=ABC (where XYZ and ABC are key/value pairs for a host label)

  • io.rancher.scheduler.affinity:host_label_ne=XYZ=ABC

  • io.rancher.scheduler.affinity:host_label_soft=XYZ=ABC,AAA=BBB

  • io.rancher.scheduler.affinity:host_label_soft_ne=XYZ=ABC

  • io.rancher.scheduler.affinity:container_label=XYZ=ABC (where XYZ and ABC are key/value pairs for a container label)

  • io.rancher.scheduler.affinity:container_label_ne=XYZ=ABC

  • io.rancher.scheduler.affinity:container_label_soft=XYZ=ABC

  • io.rancher.scheduler.affinity:container_label_soft_ne=XYZ=ABC
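A scheduler consuming these labels has to strip the key down to its base affinity type plus the soft/negation modifiers. A minimal parsing sketch (the function name and return shape are hypothetical, not Cattle's actual implementation):

```python
# Hypothetical sketch: split an io.rancher.scheduler.affinity label key
# into (base type, soft?, negated?). Not Cattle's real code.
PREFIX = "io.rancher.scheduler.affinity:"

def parse_affinity_key(key):
    if not key.startswith(PREFIX):
        return None
    field = key[len(PREFIX):]          # e.g. "container_soft_ne"
    negated = field.endswith("_ne")
    if negated:
        field = field[:-len("_ne")]
    soft = field.endswith("_soft")
    if soft:
        field = field[:-len("_soft")]
    return field, soft, negated        # e.g. ("container", True, True)
```

The label value (the container name, or the host/container key=value pair) is then interpreted according to the base type.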

We'll continue to support swarm's environment variables for specifying affinity rules as well.

  • Container constraints (where foo is the name or UUID of a container):

    • affinity:container==foo (goes to a host having 'foo')
    • affinity:container!=foo (goes to a host that does not have 'foo')
    • affinity:container==~foo
    • affinity:container!=~foo
  • Host constraints:

    • constraint:label_key==bar (goes to host with label_key=bar)
    • constraint:label_key!=bar
    • constraint:label_key==~bar
    • constraint:label_key!=~bar
  • Rancher environment variables

    • affinity:container_label:key==value (goes to a host running a container with label key=value)
    • affinity:container_label:key!=value
    • affinity:container_label:key==~value
    • affinity:container_label:key!=~value
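These expressions share a small grammar: a name, a `==` or `!=` operator, an optional `~` marking the rule as soft, and a value. A hedged parsing sketch (illustrative only, not swarm's or cattle's actual parser):

```python
# Sketch: split a swarm-style expression such as "container==~foo"
# into (name, negated?, soft?, value).
import re

EXPR = re.compile(r"^([^=!]+)(==|!=)(~?)(.+)$")

def parse_expression(expr):
    m = EXPR.match(expr)
    if m is None:
        raise ValueError("not a constraint/affinity expression: %s" % expr)
    name, op, tilde, value = m.groups()
    return name, op == "!=", tilde == "~", value
```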

Service scenarios

TODO: Most likely these rules will actually use ==~ instead of == (and !=~ instead of !=) to avoid cases where no valid allocation can be found.

Service rule: A: affinity:container==B (where A=3, B=1)

  • A1: affinity:container==B
  • A2: affinity:container==B
  • A3: affinity:container==B

Service rule: A: affinity:container==B (where A=3, B=2)

  • Cycle through however many B instances are available (and likewise if B's scale is larger than A's):
    • A1: affinity:container==B1
    • A2: affinity:container==B2
    • A3: affinity:container==B1
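The "cycle through the available B's" behaviour above is a round-robin assignment. A sketch of that mapping (names are illustrative):

```python
# Sketch: assign each new A instance to a B instance round-robin,
# reproducing the A1->B1, A2->B2, A3->B1 pattern above.
def assign_targets(a_scale, b_instances):
    return {"A%d" % (i + 1): b_instances[i % len(b_instances)]
            for i in range(a_scale)}
```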

Service rule: A: affinity:container!=A (where scale of A is 3)

  • Initially A1 can land on any host:
    • A1: affinity:container!=A1
    • A1: affinity:container!=A2
    • A1: affinity:container!=A3
  • A2 shouldn't land on A1
    • A2: affinity:container!=A1
    • A2: affinity:container!=A2
    • A2: affinity:container!=A3
  • A3 shouldn't land on A1 or A2
    • A3: affinity:container!=A1
    • A3: affinity:container!=A2
    • A3: affinity:container!=A3

Service rule: A: affinity:container!=B (where A=3, B=2)

  • A1: affinity:container!=B1
  • A1: affinity:container!=B2
  • A2: affinity:container!=B1
  • A2: affinity:container!=B2
  • A3: affinity:container!=B1
  • A3: affinity:container!=B2
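Taken together, hard anti-affinity reduces to a host filter: a host is eligible for the new instance only if none of the containers named in its != rules are already running there. A simplified sketch, modeling hosts as a name-to-container-set mapping (not Cattle's actual allocator):

```python
def eligible_hosts(hosts, avoid):
    """hosts: dict of host name -> set of container names on that host.
    avoid: container names the new instance must not be colocated with."""
    return [h for h, containers in hosts.items()
            if not (containers & set(avoid))]
```

For the A=3, B=2 scenario above, a new A instance carrying `affinity:container!=B1` and `affinity:container!=B2` would only be offered hosts running neither B1 nor B2.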

Functional design

Labels/tagging versus clustering

Question: Should we use cattle's existing clustering framework for labeling?

I don't think we should do this for multiple reasons:

  • Currently, a cluster is also considered as a host. This means it has an entirely different deployment mechanism for containers. We choose one specific 'host' or 'cluster' to deploy to. On the other hand, with labeling, there's much greater flexibility in terms of how the label is treated including wildcarding or regular expressions. Also, treating a cluster as a label is limiting the labeling to just hosts as opposed to including containers.
  • Labels are currently publicly exposed within docker. Similarly, clusters are publicly exposed within cattle. Intermixing the two would cause much confusion.

Ultimately, labeling/tagging should be an entirely different axis from clustering.

Environment variable definition

Swarm type constraints

-e constraint:{label_key==label_value} where label_key can be a custom value or one of the following standard ones sourced from docker info: storagedriver, executiondriver, kernelversion, operatingsystem

Note: Currently the labels above only apply to host/docker-daemon labels, so we'll have cattle's scheduler do the same, except act on our own host labels.

-e affinity:container=={other_container_name or id}

-e affinity:image=={image_name or id}

Note: Soft affinities are specified using '~'.

Note: globbing support?

Additional research: Docker's GET /info endpoint returns the information gathered by the docker info command, so this data should be obtainable without first deploying a container.

Potentially phase 2: Support for Go regular expressions.
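If globbing is supported, matching a host-label constraint reduces to a pattern comparison against the host's labels. Python's `fnmatch` is enough to sketch the idea; the real scheduler would use its own pattern engine:

```python
import fnmatch

def host_matches(host_labels, key, pattern, negated=False):
    """Check one constraint, e.g. region==us-* against a host's labels."""
    value = host_labels.get(key, "")
    matched = fnmatch.fnmatch(value, pattern)
    return matched != negated
```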

Fleet / systemd considerations

| Option Name | Description |
|---|---|
| MachineID | Require the unit be scheduled to the machine identified by the given string. |
| MachineOf | Limit eligible machines to the one that hosts a specific unit. |
| MachineMetadata | Limit eligible machines to those with this specific metadata. |
| Conflicts | Prevent a unit from being collocated with other units using glob-matching on the other unit names. |
| Global | Schedule this unit on all agents in the cluster. A unit is considered invalid if options other than MachineMetadata are provided alongside Global=true. |
  • MachineID: accomplished via matching of tag/label of host
  • MachineOf: accomplished via matching of tag/label of container
  • MachineMetadata: accomplished via matching a tag/label shared by a set of hosts
  • Conflicts: Negative matches of above

Kubernetes type scheduling

Labels support (same as swarm's)

Question: Are there additional capabilities that label variable substitutions provide that can't be accomplished by glob-matching / regex?

Implementation

New models

  • 'label' with fields 'name', 'type'

    • 'name': name of the label
    • 'type': TBD whether we should keep this or not
  • 'instancelabelmap'

  • 'hostlabelmap'

  • labels for images?

Basically, this allows you to associate label(s) with instances or hosts.

Public API

'container': New actions 'addlabel', 'removelabel'

'host': New actions 'addlabel', 'removelabel'

'label': Simplified process lifecycle: create/remove

Appendix

Sample host information:

{
  "fields": {
    "reportedUuid": "fb605b0b-f213-412f-bed6-234a89a0e2fe",
    "type": "host",
    "physicalHostUuid": "7cb89135-6030-48ec-a009-aad41844ee3e",
    "info": {
      "osInfo": {
        "versionDescription": "trusty",
        "kernelVersion": "3.18.5-tinycore64",
        "distribution": "Ubuntu",
        "version": "14.04",
        "dockerVersion": "Docker version 1.5.0, build a8a31ef"
      },
      "cpuInfo": {
        "count": 8,
        "cpuCoresPercentages": [
          0.21,
          0.565,
          0.904,
          0.347,
          0.826,
          1.27,
          0.065,
          0.289
        ],
        "loadAvg": [
          0.02,
          0.08,
          0.07
        ],
        "mhz": 2210.338,
        "modelName": "Intel(R) Core(TM) i7-4770HQ CPU @ 2.20GHz"
      },
      "memoryInfo": {
        "memTotal": 2007.906,
        "swapTotal": 1434.145,
        "cached": 185.699,
        "swapCached": 0.0,
        "swapFree": 1434.145,
        "memAvailable": 1635.586,
        "memFree": 1678.313,
        "inactive": 119.238,
        "active": 156.703,
        "buffers": 34.305
      },
      "diskInfo": {
        "mountPoints": {
          "/dev/sda1": {
            "total": 18603.41,
            "free": 12721.297,
            "used": 5882.113,
            "percentUsed": 31.62
          }
        }
      }
    }
  }
}

Sample Kubernetes replication controller config:

id: nginx-controller
apiVersion: v1beta1
kind: ReplicationController
desiredState:
  replicas: 2
  # replicaSelector identifies the set of Pods that this
  # replicaController is responsible for managing
  replicaSelector:
    name: nginx
  # podTemplate defines the 'cookie cutter' used for creating
  # new pods when necessary
  podTemplate:
    desiredState:
      manifest:
        version: v1beta1
        id: nginx
        containers:
          - name: nginx
            image: nginx
            ports:
              - containerPort: 80
    # Important: these labels need to match the selector above
    # The api server enforces this constraint.
    labels:
      name: nginx

References

https://github.com/GoogleCloudPlatform/kubernetes/blob/master/examples/walkthrough/k8s201.md

https://docs.docker.com/swarm/scheduler/filter/