Aquarium

Aquarium was born as an experiment in building a completely distributed and heterogeneous system to manage an organization's resources. It essentially serves as an internal cloud that is not bound to specific hardware: it runs as a userspace application and dynamically allocates the resources the organization needs.

Components

How does it work?

[Diagram: how Aquarium works - img/Aquarium-how_aquarium_works.svg]

Aquarium takes all the resource management operations on its shoulders, so your build scheduler and other worker-hungry systems no longer need to worry about them.

[Diagram: how Aquarium manages resources - img/Aquarium-how_aquarium_manages_resource.svg]

In simple terms: you pair your scheduler with the Aquarium cluster of individual Fish nodes via a plugin (or some other wrapper around the simple Aquarium API), and that plugin sends the requests needed to provide the scheduler with the required resource. When the resource is no longer needed, the same plugin destroys it (or Aquarium itself destroys it after a timeout). A rough sketch of this cycle follows below.
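
As an illustration of that cycle, the sketch below creates an Application for a Label, then deallocates it once the workload is done. The endpoint paths, field names, the deallocation call and the node address are assumptions for the sake of the example, and authentication is omitted; the authoritative API surface is the OpenAPI specification served by Aquarium Fish.

```go
// A minimal sketch of the allocate/use/destroy cycle, assuming hypothetical
// endpoint paths and field names - consult the Aquarium Fish OpenAPI spec
// for the real API (authentication is omitted for brevity).
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
)

const fishNode = "https://fish-node.example.com:8001" // hypothetical Fish node address

func main() {
	// 1. Ask the cluster for an environment by referencing a Label.
	reqBody, _ := json.Marshal(map[string]any{
		"label_UID": "11111111-2222-3333-4444-555555555555", // hypothetical Label UID
	})
	resp, err := http.Post(fishNode+"/api/v1/application/", "application/json", bytes.NewReader(reqBody))
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()

	var app struct {
		UID string `json:"UID"`
	}
	if err := json.NewDecoder(resp.Body).Decode(&app); err != nil {
		panic(err)
	}
	fmt.Println("Application created:", app.UID)

	// 2. A real plugin would now poll the Application state until the cluster
	//    has allocated the Resource and the agent has connected back.

	// 3. Once the workload is done (or a timeout fires), deallocate the
	//    Application so the environment gets destroyed (hypothetical call).
	req, _ := http.NewRequest(http.MethodGet, fishNode+"/api/v1/application/"+app.UID+"/deallocate", nil)
	dealloc, err := http.DefaultClient.Do(req)
	if err != nil {
		panic(err)
	}
	dealloc.Body.Close()
}
```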

A bit of history

Initially we had a huge demand for stability of the Build/Release infrastructure. Organizations struggle with the question of "how to manage the resources", because it is usually too complicated to set up and maintain an environment that lets teams quickly share compute resources. Build systems are particularly hard to manage, because they are heterogeneous (hello macOS & Windows) and need as much compute power as possible without much overhead. As a result of those conversations the following requirements were defined:

  • Save the build environment for releases and ensure they are buildable for 10+ years

    Jenkins will be able to make sure we can still build the old releases we have to support for government contracts.

  • Have a protected and immutable build/test environment

    A separate resources cluster will provide the resource on demand out of an image. Once a resource has been used, it is destroyed so it cannot accidentally be reused by another workload.

  • Share the available resources with anyone who needs them according to the defined rules

    The separate resources cluster will be able to share them with others according to the negotiated limitations.

  • Make Jenkins easier to maintain

    Such a schema makes it easy to add/remove a Jenkins to/from the cluster. It also does not require having just one Jenkins at a time, so a new Jenkins can replace the old one with a seamless migration of the jobs if needed.

  • Develop & test changes locally with no disturbance to prod services

    With full Jenkins config automation, a resources manager and pipeline support for proper testing, it will be possible to run the whole system locally and test it with any use case.

  • Have a completely automated build environment with no people inside

    To provide a sufficient level of security, we can't allow anyone to interfere with the build process or put manual changes into the build environments; that, in turn, allows the resources and artifacts to be used in both Release and Build automation.

  • Debug issues using a snapshot of the build environment

    On failure it is possible to take a snapshot of the pipeline environment and keep it somewhere in order to debug locally.

  • Easy to implement disaster recovery

    Basically this is about configuration automation: we can have multiple clusters, and smart routing (DNS-based with ping, for example) to those two LBs will provide quite similar disaster recovery with no waste of resources.

So the first step was to build the images and after that separate the Jenkins server from the agents - that's how the Aquarium PoC saw the light. Experience with clouds and Kubernetes pointed to a relatively good way to solve all those issues, so experimentation began.

Aquarium Bait

Aquarium Bait was the first part of the Aquarium system, developed in order to find the optimal way of building stable and reliable environments for Build/Release purposes. Configuration management had certainly shown its teeth by making maintenance of a huge pool of machines unbearable. This is not how we wanted to see the future, so first of all we found the root cause: the complexity of those scripts comes from the unknown state of the original system. They try to maintain the system state without knowing the previous state, which is by definition a ridiculous idea.

Layered images actually solve this unknown-state issue, because the configuration scripts are executed just once (during the build) and on a well-known state.

For the PoC we chose VMware VMX (Fusion/Workstation) for the following reasons:

  • Userspace application - easy to run locally and play with; developers are happy to reproduce the CI environments and actually have experience using these applications.
  • Cross-platform - supports Mac, Windows, Linux. Most probably we will stick with just Mac and Linux as HW platforms in order to lower the expertise and automation effort.
  • A site license is available for our organization.
  • Performance tests for VMware Fusion & macOS showed a 2.1-4.3% build performance hit in a real pipeline application.
  • In comparison with ESXi:
    • We don't need to support HW-level capabilities (Fusion/Workstation use the supported OS-level API to virtualize, instead of, for example, a reverse-engineered Linux kernel on Mac). VMware claims ESXi runs on Mac HW, but neither of them (VMware/Apple) actually supports us or will share future plans about that.
    • The license restrictions are much more complicated - ESXi limits the number of CPUs, which can hit us hard given the number of machines we have in the pool. Not to mention the separate vSphere license and who knows what else if we go for the full-blown VMware infrastructure.
    • ESXi requires switching the HW OS (with Fusion/Workstation we can use VMs and don't lose the ability to run macOS applications on the actual HW).
    • Portability is much higher - the same images work the same way on our systems and on dev systems. In theory ESXi machines are compatible with the other VMware products, but we know how that works out in the real world...
    • macOS license restrictions do not allow running more than 2 macOS VMs, so ESXi's capabilities are overkill and it loses the HW OS layer (which can actually serve as a third environment).
    • It's much easier to switch to open-source VM or container solutions if we don't go the ESXi route - otherwise we will get stuck in the proprietary swamp, and the money spent on licenses and expertise will be lost.
    • Performance is actually not that different between the VMware products - they use largely the same core and CPU features; ESXi could be slightly more performant (~5%).

The implementation took a while and was mostly aimed at layering the images of the macOS and, later, Windows platforms - because they are the most complicated to automate. For now VMX is still the best choice from the sandboxing, cross-platform, feature-support and control perspectives. The beauty of a VM is that it allows complete control over the OS, so complete automation (shell + UI through VNC) is possible. As for future plans, that allowed us to build the images in a VM and then move them to HW machines.

Aquarium Fish

Aquarium Fish was the second step, because it needs the images to operate on. We found that to create a completely distributed system we need some sort of internal database, and dqlite was seen as a good option that would spare us from implementing cluster algorithms ourselves. Unfortunately, due to its C nature and limited support across different OSes, dqlite was replaced by a Go-based SQLite with a custom cluster sync system. Golang was chosen because of the available bindings to dqlite, no GIL, the wide range of available packages and the ability to compile a single binary.

The database object model was designed to serve the distributed cluster requirements and to fulfill the Aquarium goals. For example, Labels came from Jenkins (it uses labels to identify agent capabilities and properly distribute the workloads); they became versioned environment definitions with strict image versions, able to survive over time (~10y) and serve the old pipelines the same way. The decision-making engine was based on cluster voting to reach consensus on who will execute the workload. A rough sketch of such a definition is shown below.
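
The sketch below only illustrates how such a versioned Label could be modelled; the field names, types and the driver/options split are assumptions rather than the actual Aquarium Fish schema, which is defined by its OpenAPI specification.

```go
// Illustrative model of a versioned Label (assumed shape, not the real schema).
package types

// Label is a versioned environment definition referenced by schedulers.
type Label struct {
	UID     string // unique identifier of this Label version
	Name    string // the Jenkins-style label, e.g. "macos-xcode" (hypothetical)
	Version int    // bumped on every change; old versions stay untouched

	Definitions []LabelDefinition // possible ways to provide this environment
}

// LabelDefinition describes one way to allocate the environment.
type LabelDefinition struct {
	Driver    string            // driver that allocates the resource (e.g. "vmx")
	Images    []string          // strictly pinned image versions used for the env
	Resources Resources         // how much compute the environment needs
	Options   map[string]string // driver-specific options
}

// Resources captures the compute requirements of the environment.
type Resources struct {
	CPU  uint // number of CPU cores
	RAM  uint // RAM in GB
	Disk uint // disk in GB
}
```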

The system quickly absorbed the best design patterns (ORM, OpenAPI, plugins, ...) and became a quite compact userspace service aiming to be a sandboxing solution for dynamic environments. That gives an answer to the regular security question "do we need to strictly control what's executed in the environment?" - and the answer is "NO, it's easier to sandbox the executable code, allow it limited, controlled access to the required services and destroy the environment afterwards".

Aquarium Net Jenkins

Aquarium Net Jenkins was the last part of the system, built to show how Jenkins interacts with Aquarium Fish to dynamically allocate agents which then connect back to the Jenkins server. In order to do that, Aquarium Fish and the Bait images need to support metadata processing (just like in clouds): the plugin sends the metadata (agent name, agent secret, Jenkins URL, ...) and the allocated environment picks it up, as sketched below.
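
For illustration, the sketch below shows what the image side of that metadata flow could look like: the environment fetches the metadata handed over by the plugin and uses it to start the Jenkins agent. The metadata endpoint, field names and agent start command are assumptions, not the actual Bait init scripts.

```go
// A minimal sketch of image-side metadata handling, assuming a hypothetical
// cloud-like metadata endpoint and field names.
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
	"os/exec"
)

// Metadata passed by the Jenkins plugin through Aquarium Fish to the environment.
type Metadata struct {
	JenkinsURL  string `json:"JENKINS_URL"`
	AgentName   string `json:"JENKINS_AGENT_NAME"`
	AgentSecret string `json:"JENKINS_AGENT_SECRET"`
}

func main() {
	// 1. On startup the environment asks its Fish node for the metadata
	//    (hypothetical address and path).
	resp, err := http.Get("http://192.168.1.1/meta/v1/data/")
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()

	var meta Metadata
	if err := json.NewDecoder(resp.Body).Decode(&meta); err != nil {
		panic(err)
	}

	// 2. Use the metadata to connect the agent back to the Jenkins server
	//    (the exact agent invocation depends on the image setup).
	fmt.Println("Connecting agent", meta.AgentName, "to", meta.JenkinsURL)
	cmd := exec.Command("java", "-jar", "agent.jar",
		"-jnlpUrl", meta.JenkinsURL+"/computer/"+meta.AgentName+"/jenkins-agent.jnlp",
		"-secret", meta.AgentSecret)
	if err := cmd.Run(); err != nil {
		panic(err)
	}
}
```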

The plugin was based on the Scripted Cloud Plugin, became a simplification of the Kubernetes plugin and works much the same way. It uses the Jenkins cloud extension point: when it sees a workload in the queue whose label is serviceable by the Aquarium Fish cluster, it creates a new node and sends an Application request to the cluster, which allocates the Resource; the Resource connects back to Jenkins and executes the workload, and after that the node destroys the Resource.

Overall the plugin is simple, but it shows most of the processes needed to use the Aquarium API correctly. It also uses OpenAPI to generate client source code out of the API specification provided by Aquarium Fish. It initializes from an init host and later can fetch the list of cluster nodes, so it can connect to one of them in case the init host becomes unavailable. A rough sketch of that failover logic follows below.
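
The sketch below illustrates that failover idea only; the node-list retrieval is hidden behind a hypothetical fetchClusterNodes helper, since in the real plugin it would go through the generated OpenAPI client rather than anything shown here.

```go
// A rough sketch of init-host failover: try the init host first, then fall
// back to previously cached cluster nodes. All names here are assumptions.
package main

import (
	"errors"
	"fmt"
	"net"
	"time"
)

// fetchClusterNodes stands in for asking a reachable node for the current
// cluster member list (hypothetical helper, placeholder implementation).
func fetchClusterNodes(host string) []string {
	return []string{host}
}

// pickReachableNode returns the first candidate that accepts a TCP connection.
func pickReachableNode(initHost string, cachedNodes []string) (string, error) {
	candidates := append([]string{initHost}, cachedNodes...)
	for _, host := range candidates {
		conn, err := net.DialTimeout("tcp", host, 2*time.Second)
		if err == nil {
			conn.Close()
			return host, nil
		}
	}
	return "", errors.New("no reachable Aquarium Fish node found")
}

func main() {
	initHost := "fish-init.example.com:8001" // hypothetical init host
	cached := fetchClusterNodes(initHost)    // refreshed periodically in a real client

	node, err := pickReachableNode(initHost, cached)
	if err != nil {
		panic(err)
	}
	fmt.Println("Using Aquarium Fish node:", node)
}
```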

Internal processes

You can find the internal processes document right here: Aquarium: Internal Processes