Rancher Packaging and Catalog - sonchang/cattle GitHub Wiki

Overview

Current state

Users can define and launch a service, potentially consisting of multiple sub-services, into an environment. This service is defined with a docker+rancher.yml file that may contain information specific to this environment (such as passwords, port numbers, log directories, etc.).

Example:

service_a:
  labels:
    io.rancher.scheduler.global: 'true'
  ports:
    - 8080
  environment:
    - HOST_NAME=foo1
    - DB_USER=service_user
    - DB_PASSWORD=some_password
  image: sonchang/service_a
  volumes:
    - /var/log

If this docker+rancher.yml file were copied and given to someone else, they would have to manually identify what needs to change in the file before running compose up on it. For example, they would almost certainly have to change the DB_USER and DB_PASSWORD environment values, and possibly the exposed port if they have a port conflict.

It would be nice if it was possible to package this service up generically. Likewise, it would be nice if you could put this package into some repository. And even better would be if you could limit who has access to this repository.

Proposal

There are 2 elements to the proposal which are mostly orthogonal to each other:

Service Package: How to generically package services?
Rancher Catalog: How to manage or distribute these "packages"?

I listed "Service Package" first since I think this concept needs to be solidified before delving into "Rancher Catalog".

But ultimately, it would be nice to author reusable services or applications (service packages) and package them into something that is easy to understand, easy to distribute and share, and easy to run in other environments.

Service Package

In this document, I've used the term "service package". By analogy, a service package is to a service as a Java class is to an instance of that class.

`rancher-compose`

rancher-compose allows us to specify how an entire application or individual services are defined. However, services often require some degree of configuration: passwords, log directories, exposed port numbers, host names, what system capabilities are exposed, etc...

There are many ways this can be achieved:

A custom Docker image can be built with the configuration already baked into the image (however, this doesn't solve the exposed ports, mounted volumes, and Linux capabilities)
The service itself or some bootstrapping script can pull down these configuration values from some server. However, the location of this server as well as connection credentials will still need to be supplied somehow. Also, once again this doesn't solve the container's runtime configuration: ports, volumes, networking, Linux capabilities.
Configuration data can be specified in the environment variables in the compose file (not recommended for passing in secrets like passwords)
A mixture of the above mechanisms can be used. For example, the location of a configuration server and the connection credentials can be specified in environment variables.

For secrets, a system like Keywhiz should probably be used.

Generalizing `compose.yml` to `compose-template.yml`

What if we created a compose-template.yml file by replacing all the environment configuration information with macros? In the example below, we have the same .yml file except that some values have been replaced with a macro of the form ${name}.

Example:

service_a:
  labels:
    io.rancher.scheduler.global: 'true'
  ports:
    - ${port}
  environment:
    - HOST_NAME=${host_name}
    - DB_USER=${db_user}
    - DB_PASSWORD=${db_password}
  image: sonchang/service_a
  volumes:
    - ${log_dir}

With a generic .yml template file like the one above, we can feed it through a pre-processor that performs the variable substitutions and produces a concrete compose file that can be fed to docker compose.
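As a rough illustration of such a pre-processor, the sketch below uses Python's standard `string.Template`, which happens to use the same ${name} syntax. The `render` helper and the shortened template are hypothetical and not part of rancher-compose; this is only one way the substitution step might look.

```python
from string import Template

# Shortened, hypothetical copy of the compose-template.yml example above.
TEMPLATE = """\
service_a:
  ports:
    - ${port}
  environment:
    - DB_USER=${db_user}
    - DB_PASSWORD=${db_password}
  volumes:
    - ${log_dir}
"""


def render(template_text, values):
    """Replace every ${name} placeholder with its value.

    Raises KeyError when a placeholder has no value, so an incomplete
    compose file is never produced silently.
    """
    return Template(template_text).substitute(values)


concrete = render(TEMPLATE, {
    "port": 8080,
    "db_user": "service_user",
    "db_password": "some_password",
    "log_dir": "/var/log",
})
```

Note that `string.Template` also expands the unbraced form `$name` and treats `$$` as an escaped dollar sign, which may or may not match the behavior we eventually want.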

To do this variable substitution, it would help to know a little more about these variables. One possibility is to define them in a new field, user_fields_metadata, within a separate .yml file (or it could just be part of the service .yml file).

user_fields_metadata:
  port:
    type: int
    description: "API port #"
    default: 8080
  db_user:
    type: string
    description: "DB user"
  some_option:
    type: enum
    options:
    - 'option1'
    - 'option2'
    - 'option3'
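To make the role of this metadata concrete, here is a minimal sketch of how a pre-processor might validate one user-supplied value against it. The `USER_FIELDS` dict mirrors the example above, and the `coerce` function and its error messages are assumptions for illustration only.

```python
# Hypothetical in-memory form of the user_fields_metadata example above.
USER_FIELDS = {
    "port": {"type": "int", "description": "API port #", "default": 8080},
    "db_user": {"type": "string", "description": "DB user"},
    "some_option": {"type": "enum", "options": ["option1", "option2", "option3"]},
}


def coerce(name, raw, fields=USER_FIELDS):
    """Validate one raw input (a string, or None if omitted) against its metadata."""
    # Undeclared variables fall back to a plain string type.
    meta = fields.get(name, {"type": "string"})
    if raw is None or raw == "":
        if "default" in meta:
            return meta["default"]
        raise ValueError(f"{name}: a value is required")
    kind = meta.get("type", "string")
    if kind == "int":
        return int(raw)  # int() raises ValueError on non-numeric input
    if kind == "enum":
        if raw not in meta["options"]:
            raise ValueError(f"{name}: expected one of {meta['options']}")
        return raw
    return str(raw)
```

For example, `coerce("port", None)` falls back to the declared default, while `coerce("some_option", "option9")` is rejected.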

Some things to note:

If a variable substitution is used but nothing is declared for it, it will just be interpreted as a 'string' type with no description.
For places where we want the literal text ${foo} and not do substitution on it, we can use an escaping mechanism.
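Both notes can be sketched in a few lines. The escaping syntax is not decided in this proposal; the sketch below assumes that a doubled dollar sign (`$${foo}`) marks a literal `${foo}`, and the `discover_fields` helper is hypothetical.

```python
import re

# Assumption: "$${foo}" is the escape for a literal "${foo}".
# Group 1 captures the optional extra "$"; group 2 captures the name.
PLACEHOLDER = re.compile(r"(\$?)\$\{(\w+)\}")

# Metadata assumed for variables that are used but never declared.
DEFAULT_META = {"type": "string", "description": ""}


def discover_fields(template_text, declared=None):
    """Map each placeholder name to its metadata, in first-seen order.

    Escaped placeholders are skipped, and undeclared names receive the
    default 'string' metadata.
    """
    declared = declared or {}
    fields = {}
    for escaped, name in PLACEHOLDER.findall(template_text):
        if escaped or name in fields:
            continue
        fields[name] = declared.get(name, dict(DEFAULT_META))
    return fields
```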

Since the user will not know what host the service is deployed to, information like ${host_name} will be substituted automatically by cattle. ${host_name} can be part of a set of system_fields_metadata. These system_fields_metadata are special cattle supplied runtime environment variables.

${cardinal_number}: for scaled services, this is the cardinal number in the scale
${host_name}: domain name for individual host within the service
${service_domain}: domain name for service
${all_domain_names}: comma-separated list of all host domain names within the service
...TODO
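As an illustration of how cattle might compute these values for one container of a scaled service, here is a minimal sketch. The naming scheme (service name followed by a 1-based index) is only an assumption inferred from the sample API response later in this document, and `system_fields` is a hypothetical helper.

```python
def system_fields(service_name, scale, cardinal_number):
    """Build the system-supplied fields for one member of a scaled service.

    Assumes host names are the service name plus a 1-based index,
    e.g. serviceA1, serviceA2, serviceA3.
    """
    if not 1 <= cardinal_number <= scale:
        raise ValueError("cardinal_number must be between 1 and scale")
    hosts = [f"{service_name}{i}" for i in range(1, scale + 1)]
    return {
        "cardinal_number": cardinal_number,
        "host_name": hosts[cardinal_number - 1],
        "service_domain": service_name,
        "all_domain_names": ",".join(hosts),
    }
```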

How do we know about these variables and how do we get these values?

From UI

If the user starts up this service in the UI, cattle can examine the compose-template.yml file to discover the user input fields. The UI can then prompt for those, and cattle can do the variable substitutions and feed the result into rancher-compose, which executes the various cattle API calls to launch the service.

Command line

The rancher-compose binary (or some pre-processor binary that pipes into rancher-compose) can read the compose-template.yml file to discover the user input fields. It can then prompt the user to enter them at the command prompt. This allows for scripting and possibly integration with other systems.
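The prompting step might look roughly like the sketch below. The `collect` function is hypothetical and not an existing rancher-compose feature; the `ask` parameter defaults to the built-in `input` but can be replaced, which is one way scripted (non-interactive) use could be supported.

```python
def collect(fields, ask=input):
    """Prompt for each field and return a dict of name -> value.

    An empty answer falls back to the field's default. Fields without a
    default are asked again until something is entered. Typed answers are
    returned as strings; defaults keep their declared type.
    """
    values = {}
    for name, meta in fields.items():
        label = meta.get("description") or name
        default = meta.get("default")
        suffix = f" [{default}]" if default is not None else ""
        while True:
            answer = ask(f"{label}{suffix}: ").strip()
            if answer or default is not None:
                break
        values[name] = answer if answer else default
    return values
```

A script could pass `ask=lambda prompt: next(answers)` with a prepared iterator of answers instead of reading from the terminal.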

Meta-data Server

For user-supplied data, we could use a meta-data server to "save" or store the values after they have been entered. There can be an option in the UI and CLI to retrieve the values from this server instead of prompting the user. These values would have to be namespaced according to the environment/project name.
For system supplied data, the server will have to determine the values programmatically.
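To illustrate the namespacing requirement for user-supplied data, here is a minimal in-memory sketch. The `SavedValues` class is hypothetical; a real meta-data server would need persistent, ideally redundant, storage.

```python
class SavedValues:
    """Stores user-supplied values keyed by (environment, service)."""

    def __init__(self):
        self._data = {}

    def save(self, environment, service, values):
        """Merge new values into the namespace for this environment and service."""
        self._data.setdefault((environment, service), {}).update(values)

    def load(self, environment, service):
        """Return a copy of the saved values, or an empty dict if none exist."""
        return dict(self._data.get((environment, service), {}))


store = SavedValues()
store.save("staging", "service_a", {"port": 8080, "db_user": "service_user"})
store.save("production", "service_a", {"port": 80})
```

Because the environment is part of the key, the same service package can be installed in "staging" and "production" without the saved values colliding.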

Some questions

This meta-data server can be used in many different ways. It can be used in the compose-template.yml variable substitution. It can also be used by some bootstrap.sh script within the Docker image to pull down configuration data to auto-configure itself. The connection info for the meta-data server as well as the service's deployment_unit ID can be passed into the bootstrap.sh so it knows how to fetch information about itself.
This server could be the cattle server itself or an entirely new server. If it is separate from cattle, it will still likely need access to cattle's data.
The server should be able to operate in a cluster, ideally with a redundant data store.

Potential API

Request: curl http://10.42.102.189/v1/{deployment_unit_id}/config_data

Response:

{
   'host_name': 'serviceA1',
   'service_domain': 'serviceA',
   'ip_address': '10.42.103.123',
   'cardinal_number': 1,
   'all_domain_names': 'serviceA1,serviceA2,serviceA3',

   'log_dir': '/var/log',
   'port': '8080',
   'other_server_url': 'http://someotherserver:8081/v1',
    etc...
}
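A bootstrap script could consume this API along the lines of the sketch below, written in Python rather than shell for illustration. It assumes the server returns valid JSON (double-quoted keys, unlike the informal sample above); the `config_url` and `fetch_config` helpers are hypothetical, and the `opener` parameter exists only so the HTTP call can be replaced in tests.

```python
import json
from urllib.request import urlopen


def config_url(server, deployment_unit_id):
    """Build the config_data URL from the potential API shown above."""
    return f"http://{server}/v1/{deployment_unit_id}/config_data"


def fetch_config(server, deployment_unit_id, opener=urlopen):
    """Fetch and decode the configuration for one deployment unit.

    Assumes a JSON response body encoded as UTF-8.
    """
    with opener(config_url(server, deployment_unit_id)) as response:
        return json.loads(response.read().decode("utf-8"))
```

The server address and deployment unit ID would be passed into the container, for example as environment variables, so that the script knows how to fetch information about itself.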

Authoring "service packages" in the UI

Most likely, for phase 1 of development, the UI will just allow you to edit the compose .yml file as text.

A better UI may be developed in the future. One idea is that it could be similar to the existing UI for creating a new Service with the following modifications:

Addition of new tab for specifying user fields
In all other places where a free form text field exists, it's possible for the user to enter a macro substitution variable instead
Not sure what's the best way for handling non-free form text fields

I think one common use case is a user who spends time getting an actual service instance working properly and then wants to package it so someone else can easily install it.

Rancher Catalog

Recap of "packaging" above

From the packaging discussion above, a "service package" will likely consist of a compose-template.yml file and a custom Docker image that has a bootstrap.sh script to potentially pull down config data from a meta-data server. The .yml file can serve 2 purposes:

Enumerate and describe what fields users need to provide data for
Do the proper substitutions to establish the proper container runtime environment

The bootstrap.sh script that pulls down data from a meta-data server can be useful for:

Avoiding putting sensitive data in the compose file
Easily providing runtime data after the container has been deployed (for example, its IP address)

Repository, visibility scope/sharing rules

If everything is encapsulated into essentially a .yml file (which includes a reference to the specific Docker image), this can easily be represented as just a text file.

The current thought here is to provide integration with other repositories such as GitHub. These other repositories can then handle versioning and potentially visibility scope and sharing rules.

In the Services tab, in addition to the existing 'Add Service' option, we can potentially introduce 'Author Service' and 'Install Service'.

Choosing 'Author Service' takes you to the authoring UI (see above).

Choosing 'Install Service' provides a list of available services according to the visibility scope/sharing rules we decide on above.

Initial list of targeted services

zookeeper
etcd
kubernetes
swarm
mesos
prometheus
rancher-volume
mysql
redis
hadoop
spark
wordpress
drone
jenkins
elk
kafka
cassandra
mongodb
ceph
gluster
consul
docker-registry