DOC: Key Concepts and Objects - ibmcb/cbtool GitHub Wiki

Experiment: A sequence of CBTOOL directives (with start times and duration for each execution), processed and executed sequentially.

An experiment is composed of a set of Experiment Objects. Experiment Objects are defined within CBTOOL and are used by it to control the deployment and execution of benchmark applications.

Objects are of three classes.

The configuration objects are used by CBTOOL for environment configuration and experiment execution customization. The parameters of these objects control several aspects of CBTOOL execution (e.g., the IP address of the node that runs the Metric Store, the default polling interval for provisioning operations, the location of the ssh private keys used to connect to the VMs). These objects are also known internally as global objects, since all other objects (concrete and abstract) have parameters derived from "templates" on configuration objects.
The concrete objects are managed and tracked by both CBTOOL and the cloud manager.
The abstract objects are the ones whose meaning and state are tracked only by CBTOOL, representing a logical aggregation of the multiple instances of concrete objects. Abstract objects can represent either a single Virtual Application deployed on a cloud, or a group of inter-related VApps. It is through the specification of abstract objects that an experiment assumes a truly dynamic behavior.

Concrete Objects are of four flavors: Clouds, Virtual Machine Containers (VMCs), Hosts (exposed by some Clouds) and Virtual Machines (VMs).

Cloud: The Cloud object represents the cloud manager, and includes in its description all information required to establish a connection to it, including access and authentication credentials.

By treating a whole cloud as an object, CloudBench allows one to direct an individual experiment plan at multiple clouds to compare them against each other.

Virtual Machine Container (VMC): The smallest point of access or "place" where a VM is instantiated.

Each VMC has a cloud-wide unique identifier.
While the meaning of VMC is invariant, its scope is very specific to each particular cloud. It can range from a single host (in a virtualized environment with the libvirt/KVM duo) to a whole geographic region with multiple “availability zones” (in the case of Amazon’s Elastic Compute Cloud).
The definition of the VMC as a distinct object is useful because it allows CBTOOL to exploit intra-cloud parallelism (e.g., in a geographically distributed cloud).

Host: Individual Hypervisors where VMs are effectively deployed.

Not all clouds allow Hosts to be discovered/monitored (e.g., Amazon EC2 does not).

Virtual Machine (VM): Individual Virtual Machine instances are the only elements whose state (i.e., created, running, destroyed) is effectively known by a given cloud.

CBTOOL instantiates and terminates VMs (the latter only in the case of a VAppS with a variable number of instances) by submitting the appropriate commands/operations/requests to the cloud.
In order to be properly created, a VM with a given role needs to be fully identified within the cloud by some cloud-specific information, such as image id, instance size and/or class. This set of information is called a VM template.
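
As a rough illustration of what a VM template carries, the sketch below models one as a small Python record. The class `VMTemplate`, its field names (`role`, `image_id`, `instance_size`) and the `describe_request` helper are hypothetical names chosen for this example; they are not CBTOOL's actual configuration syntax or API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VMTemplate:
    """Hypothetical record of the cloud-specific data needed to create a VM."""
    role: str           # role the VM plays inside its application
    image_id: str       # cloud-specific image identifier (assumed field name)
    instance_size: str  # cloud-specific size or class (assumed field name)


def describe_request(template: VMTemplate, cloud_name: str) -> dict:
    """Build the illustrative set of fields a creation request would carry."""
    return {
        "cloud": cloud_name,
        "role": template.role,
        "image": template.image_id,
        "size": template.instance_size,
    }


# Example values are placeholders, not real image ids or instance sizes.
db_template = VMTemplate(role="db", image_id="img-example", instance_size="medium")
request = describe_request(db_template, "mycloud")
print(request)
```

The same role would typically need a different template on each cloud, since image ids and size names are cloud-specific.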

Abstract Objects are of four flavors: Virtual Applications, Virtual Application Submitters, Virtual Machine Capture Request Submitters, and Fault Injection Request Submitters.

Virtual Application Submitter (VAppS): A collection of Application Instances of a given Application Type.

For "historical" reasons, this abstraction is also called Application Instance Deployment Request Submitter (AIDRS).
Every Virtual Application instance has an inter-arrival time and a lifetime, governed by two VAppS-wide random distributions.
Every Virtual Application instance in a VAppS has an individual time-varying load intensity and duration attributed to it, according to two VAppS-wide random distributions.
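
The role of the inter-arrival and lifetime distributions can be pictured with a small simulation. This is only a sketch: it assumes exponentially distributed inter-arrival times and uniformly distributed lifetimes, whereas a real experiment uses whatever distributions the VAppS is configured with. The function name `simulate_vapps` and its parameters are invented for illustration, and the load intensity and load duration distributions are left out.

```python
import random


def simulate_vapps(n_instances, mean_interarrival, lifetime_range, seed=0):
    """Return (arrival_time, departure_time) pairs for n_instances VApps.

    Assumptions (not CBTOOL defaults): exponential inter-arrival times and
    uniform lifetimes, both shared by every instance of the submitter.
    """
    rng = random.Random(seed)
    schedule = []
    clock = 0.0
    for _ in range(n_instances):
        # Time elapsed between the previous arrival and this one.
        clock += rng.expovariate(1.0 / mean_interarrival)
        # How long this instance lives before it is removed.
        lifetime = rng.uniform(*lifetime_range)
        schedule.append((clock, clock + lifetime))
    return schedule


schedule = simulate_vapps(5, mean_interarrival=60.0, lifetime_range=(300.0, 600.0))
for arrival, departure in schedule:
    print(f"arrive at {arrival:8.1f}s, leave at {departure:8.1f}s")
```

Because instances arrive and leave at random times, the number of VApps (and therefore VMs) alive at any moment varies over the course of the experiment.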

Virtual Application (VApp): A collection of VMs that run cooperatively to effectively execute a given application type.

For "historical" reasons, this abstraction is also called Application Instance (AI).
Every VM has a “role” within the VApp.
One of the VMs has to have the roles load manager and metrics aggregator. This VM will: (1) manage the load applied to the rest of the AI and (2) collect performance data from the VApp.
Each VApp has a “Virtual Application template”, containing a list of VM roles and their topology.
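
A Virtual Application template can be thought of as a list of roles plus the connections between them. The sketch below is one possible in-memory representation, not CBTOOL's own format: the classes `VMRole` and `VAppTemplate`, the duty names `load_manager` and `metrics_aggregator`, and the example application type are all assumptions. It also reads "one of the VMs" as "exactly one VM", which is an interpretation of the text above.

```python
from dataclasses import dataclass, field


@dataclass
class VMRole:
    """One VM in the template; names and fields are illustrative only."""
    name: str
    duties: set = field(default_factory=set)  # e.g. {"load_manager"}


@dataclass
class VAppTemplate:
    """Hypothetical VApp template: the VM roles plus who talks to whom."""
    app_type: str
    roles: list
    topology: list  # (source role, destination role) pairs

    def coordinator(self) -> VMRole:
        """Return the single VM that manages load and aggregates metrics."""
        required = {"load_manager", "metrics_aggregator"}
        matches = [r for r in self.roles if required <= r.duties]
        if len(matches) != 1:
            raise ValueError("expected exactly one load manager / metrics aggregator VM")
        return matches[0]


template = VAppTemplate(
    app_type="three_tier_example",
    roles=[
        VMRole("client", {"load_manager", "metrics_aggregator"}),
        VMRole("web"),
        VMRole("db"),
    ],
    topology=[("client", "web"), ("web", "db")],
)
print(template.coordinator().name)  # prints "client"
```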

Virtual Machine Capture Request Submitter (VMCRS): A long-running process that randomly selects VMs (within a scope) and then sends a request to the cloud manager to capture them.

This Abstract Object can only be used if the cloud supports the "capture" operation (in some clouds, this is termed "save instance" or "snapshot instance").
A Virtual Machine Capture Request Submitter operates independently of any other Abstract Object, but requires a set of previously created Virtual Applications (these can be either explicitly created, or implicitly, by a VAppS).
Every Virtual Machine Capture Request Submitter instance has the following parameters:

Scope: only VMs belonging to a VApp of a certain type, or all VMs belonging to a certain CloudBench user, are candidates for capture.
Inter-arrival time: governed by VMCRS-wide random distributions, very similar to the inter-arrival time of the VAppS.
Maximum simultaneous capture requests: since a capture operation can take a long time, the VMCRS can be instructed to wait, once a certain number of concurrent capture operations is reached, before issuing new ones (this takes precedence over the inter-arrival time).
Maximum number of total capture requests: in order not to starve the cloud of VMs (a capture operation is always preceded by an instance shutdown and followed by an instance removal), the VMCRS can be instructed to stop after a certain number of capture operations have been issued.
Minimum capture age: in order to be meaningful, the capture operation has to be issued against a VM whose data differs significantly from the base image template from which it was booted. Since the amount of data generated is typically proportional to the execution time of the application, VMs selected for capture should have run for at least a minimum amount of time.
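
The way these parameters interact can be sketched as a single selection step. The code below is a simplified model, not CBTOOL's implementation: the classes `VM` and `CapturePolicy` and the function `pick_capture_target` are hypothetical, only the "VApp type" form of scope is modelled, and the inter-arrival timer is assumed to be handled by the caller.

```python
import random
from dataclasses import dataclass


@dataclass
class VM:
    name: str
    vapp_type: str
    age_seconds: float  # time since the VM started running


@dataclass
class CapturePolicy:
    """Hypothetical container for the VMCRS parameters listed above."""
    scope_vapp_type: str
    max_simultaneous: int
    max_total: int
    min_age_seconds: float


def pick_capture_target(vms, policy, in_flight, issued_total, rng=random):
    """Return a VM to capture now, or None if the submitter must wait or stop."""
    if issued_total >= policy.max_total:
        return None  # total budget spent: stop issuing requests
    if in_flight >= policy.max_simultaneous:
        return None  # too many captures running: wait, whatever the timer says
    candidates = [
        vm for vm in vms
        if vm.vapp_type == policy.scope_vapp_type
        and vm.age_seconds >= policy.min_age_seconds
    ]
    # Random selection among the VMs that are in scope and old enough.
    return rng.choice(candidates) if candidates else None


policy = CapturePolicy("example_app", max_simultaneous=2, max_total=10,
                       min_age_seconds=600)
vms = [
    VM("vm_1", "example_app", 900),   # in scope and old enough
    VM("vm_2", "example_app", 120),   # in scope but too young
    VM("vm_3", "other_app", 5000),    # out of scope
]
target = pick_capture_target(vms, policy, in_flight=0, issued_total=0)
print(target.name)  # prints "vm_1", the only eligible VM
```

Checking the concurrency limit before looking at candidates mirrors the rule that the maximum number of simultaneous captures takes precedence over the inter-arrival time.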