Metadata Server - sonchang/cattle GitHub Wiki

Schema / API / Definitions

Rancher

The following two environment variables will be defined in the container so that Docker image bootstrapping scripts can reach the metadata server and look up information:

METADATA_CONNECTION_URL=http://52.10.215.107:8500/v1/kv/
DEPLOYMENT_UNIT_ID=5/test1/s4/f58ce48e-a4d5-4c1b-905f-ee3b15ccf115

In the POC, the DEPLOYMENT_UNIT_ID path had the form accountId/stack/service/deploymentUnitId. With secondary launch configs, we might want the launch config as an additional path segment between service and deploymentUnitId.
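As a rough sketch of how a bootstrapping script might use these two variables, the Python below joins them into a URL for a single key and splits the deployment unit ID into its four path segments. The key layout (`<DEPLOYMENT_UNIT_ID>/<key>`) and the helper names `parse_unit_id` and `metadata_url` are assumptions for illustration. The `?raw` query parameter is Consul's KV option for returning the bare value, which only applies while the POC endpoint is Consul.

```python
import os
from collections import namedtuple

# Path segments as described for the POC: accountId/stack/service/deploymentUnitId
UnitId = namedtuple("UnitId", "account_id stack service deployment_unit_id")


def parse_unit_id(unit_id):
    """Split a DEPLOYMENT_UNIT_ID value into its four path segments."""
    parts = unit_id.strip("/").split("/")
    if len(parts) != 4:
        raise ValueError("expected accountId/stack/service/deploymentUnitId, got %r" % unit_id)
    return UnitId(*parts)


def metadata_url(base_url, unit_id, key):
    """Build the URL for one metadata key (assumed layout: <unit id>/<key>)."""
    return "{}/{}/{}?raw".format(base_url.rstrip("/"), unit_id.strip("/"), key)


if __name__ == "__main__":
    # Fall back to the example values from this page when the variables are unset.
    base = os.environ.get("METADATA_CONNECTION_URL", "http://52.10.215.107:8500/v1/kv/")
    unit = os.environ.get("DEPLOYMENT_UNIT_ID", "5/test1/s4/f58ce48e-a4d5-4c1b-905f-ee3b15ccf115")
    print(parse_unit_id(unit))
    print(metadata_url(base, unit, "hostname"))
```

If a launch config segment is added later, only `parse_unit_id` would need to change; the URL builder treats the ID as an opaque prefix.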

key	value
cardinal_value	launch index or some unique number for an instance within a scaled service
hostname	name of host for this particular node within a scaled service
virt_ip_address	IP address within the managed network
service_domain	domain name for the service
*all_hosts	This might not be necessary, since the same information could be returned by requesting the path without the trailing `deploymentUnitId` folder, for example `/accountId/stack/service`. That request could return a JSON list of links to the service's deployment units: `[ "5/test1/s4/f58ce48e-a4d5-4c1b-905f-ee3b15ccf115", "5/test1/s4/f58ce48e-a4d5-4c1b-905f-ee3b15ccf117", "5/test1/s4/f58ce48e-a4d5-4c1b-905f-ee3b15ccf118" ]`. Alternatively, rather than links, it could return the expanded JSON structure to avoid repeated requests to the server.
port
image	I don't think we need this, since a container should already know which image it is running from. EC2, however, does expose the AMI ID as a piece of metadata.
volumes
service_name
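To make the `all_hosts` idea concrete, here is a minimal sketch of building the "expanded JSON structure" for a service from a recursive Consul KV read (`GET /v1/kv/<prefix>?recurse`), which returns a list of entries with a `Key` and a base64-encoded `Value`. The function name `expand_service` and the assumption that every key is stored as `<service path>/<deploymentUnitId>/<key>` are illustrative, not a decided schema.

```python
import base64


def expand_service(entries, service_prefix):
    """Group Consul KV entries under a service path by deployment unit.

    entries: decoded JSON list from a ?recurse read, each with "Key" and "Value".
    Returns {deployment unit path: {key name: value}}.
    """
    prefix = service_prefix.strip("/") + "/"
    units = {}
    for entry in entries:
        key = entry["Key"]
        if not key.startswith(prefix):
            continue  # belongs to another service
        unit, _, name = key[len(prefix):].partition("/")
        if not name:
            continue  # folder marker such as "5/test1/s4/<unit>/"
        raw = entry.get("Value")
        # Consul base64-encodes values; an empty value comes back as null.
        value = base64.b64decode(raw).decode("utf-8") if raw else None
        units.setdefault(prefix + unit, {})[name] = value
    return units
```

The keys of the returned dictionary are the same links shown in the `all_hosts` row, so a client can use either form from one request.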

Considerations:

Do we want key/value pairs similar to EC2, or a fuller data structure like Consul?
- Key/value pairs will likely result in more individual requests. On the other hand, not all information is always available at the same time (for example, host information for all instances in the service might not be available immediately), and with individual requests each one can carry its own response code.
The concept of a revision number makes sense within cattle's ConfigUpdate framework for managing the configs for agents. In this context, however, it's not clear what would delineate a new revision. Perhaps a timestamp would make more sense.
I'm also wondering how much of the HTTP framework we want to use (since I was using Consul for the POC, I didn't have control over this). Some headers that could be useful are: Last-Modified, Content-Type, Content-Encoding, Date, Expires.
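A small sketch of how a client could use these two ideas: per-key response codes to separate available values from ones not yet published, and `Last-Modified` as the timestamp that stands in for a revision. Treating any non-200 status as "not ready yet", the helper names, and the exact header spelling in the dictionary are assumptions; the server's actual behavior is still undecided.

```python
from email.utils import parsedate_to_datetime


def split_by_status(responses):
    """Separate per-key results into ready values and keys still pending.

    responses: {key: (status_code, body)} gathered from individual requests.
    """
    ready = {k: body for k, (status, body) in responses.items() if status == 200}
    pending = sorted(k for k, (status, _) in responses.items() if status != 200)
    return ready, pending


def changed_since(last_seen, headers):
    """Return True when Last-Modified is newer than the value seen previously.

    last_seen: the Last-Modified string from the previous response, or None.
    A missing header or a first request is treated as changed.
    """
    current = headers.get("Last-Modified")
    if current is None or last_seen is None:
        return True
    return parsedate_to_datetime(current) > parsedate_to_datetime(last_seen)
```

A bootstrapping script could retry only the `pending` keys and skip reconfiguration whenever `changed_since` returns False.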

Future Enhancements:

Scope visibility via an API token, passed in as another environment variable. For example, from within a container, we shouldn't be able to query metadata for a service in a different account.
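One possible shape for that check, sketched on the server side: once a token has been resolved to an account ID, a request is allowed only if the first segment of the requested path is that account. How tokens map to accounts is not defined here, so `is_visible` and its arguments are hypothetical.

```python
def is_visible(token_account_id, requested_path):
    """Allow a request only when the path's accountId matches the token's account.

    requested_path follows the accountId/stack/service/deploymentUnitId layout.
    """
    parts = requested_path.strip("/").split("/")
    # An empty path has no account segment, so it is never visible.
    return bool(parts[0]) and parts[0] == str(token_account_id)
```

With the example ID on this page, a token for account 5 could read `5/test1/s4/...` but not paths under any other account.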

Reference

EC2 style metadata

http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/ec2-instance-metadata.html It is very much key/value pairs:

Example:

$ curl http://169.254.169.254/latest/meta-data/ami-id
ami-2bb65342

$ curl http://169.254.169.254/latest/meta-data/network/interfaces/macs/02:29:96:8f:6a:2d/
device-number
local-hostname
local-ipv4s
mac
owner-id
public-hostname
public-ipv4s

Consul

Consul returns more of a JSON data structure. https://consul.io/docs/agent/http/agent.html

For example:

/v1/agent/members

returns:

[
  {
    "Name": "foobar",
    "Addr": "10.1.10.12",
    "Port": 8301,
    "Tags": {
      "bootstrap": "1",
      "dc": "dc1",
      "port": "8300",
      "role": "consul"
    },
    "Status": 1,
    "ProtocolMin": 1,
    "ProtocolMax": 2,
    "ProtocolCur": 2,
    "DelegateMin": 1,
    "DelegateMax": 3,
    "DelegateCur": 3
  }
]

or /v1/agent/self, which returns:

{
  "Config": {
    "Bootstrap": true,
    "Server": true,
    "Datacenter": "dc1",
    "DataDir": "/tmp/consul",
    "DNSRecursor": "",
    "DNSRecursors": [],
    "Domain": "consul.",
    "LogLevel": "INFO",
    "NodeName": "foobar",
    "ClientAddr": "127.0.0.1",
    "BindAddr": "0.0.0.0",
    "AdvertiseAddr": "10.1.10.12",
    "Ports": {
      "DNS": 8600,
      "HTTP": 8500,
      "RPC": 8400,
      "SerfLan": 8301,
      "SerfWan": 8302,
      "Server": 8300
    },
    "LeaveOnTerm": false,
    "SkipLeaveOnInt": false,
    "StatsiteAddr": "",
    "Protocol": 1,
    "EnableDebug": false,
    "VerifyIncoming": false,
    "VerifyOutgoing": false,
    "CAFile": "",
    "CertFile": "",
    "KeyFile": "",
    "StartJoin": [],
    "UiDir": "",
    "PidFile": "",
    "EnableSyslog": false,
    "RejoinAfterLeave": false
  },
  "Member": {
    "Name": "foobar",
    "Addr": "10.1.10.12",
    "Port": 8301,
    "Tags": {
      "bootstrap": "1",
      "dc": "dc1",
      "port": "8300",
      "role": "consul",
      "vsn": "1",
      "vsn_max": "1",
      "vsn_min": "1"
    },
    "Status": 1,
    "ProtocolMin": 1,
    "ProtocolMax": 2,
    "ProtocolCur": 2,
    "DelegateMin": 2,
    "DelegateMax": 4,
    "DelegateCur": 4
  }
}

DEPRECATED SECTION

Basically, the metadata server will just be a key/value store. There are many popular ones currently available:

etcd
- provides DNS-based service discovery for clustering (in addition to a static configuration). Registry-based service discovery is also available.
- allows us to specify a specific size for the cluster
- additional etcd processes can act as proxy nodes, optionally with read-only privileges
- written in Go
zookeeper
consul

Docker's libkv tries to abstract this into a general kv library.