HolDCSim

A Holistic Simulator for Data Centers

About

Background / Purpose

Cloud computing systems that span data centers are commonly deployed to deliver high performance for user service requests. As data centers continue to expand, computer architects and system designers face the challenge of balancing resource utilization efficiency, server and network performance, energy consumption, and quality-of-service (QoS) demands from users. Developing effective data center management policies requires an in-depth understanding and synergistic control of the sub-components of large-scale computing systems, including both computation and communication resources. Prior studies on performance and energy issues in data centers largely focus on either servers or the network while ignoring the other, or consider only high-level analytical models without sufficient detail, which can lead to suboptimal solutions. Unfortunately, complete access to an operational large-scale computing system (e.g., a production server farm) is prohibitively expensive or, in some cases, impossible. A comprehensive simulation infrastructure that models all major hardware and system components, and offers interfaces to manage the interplay between computation and communication resources, is therefore critical to advancing research on performance and energy optimization in data centers.

Our Implementation

We propose HolDCSim, a lightweight, holistic, extensible, event-driven data center simulation platform that models both server and network architectures. HolDCSim can be used in a variety of data center system studies, including job/task scheduling, resource provisioning, global and local server farm power management, and network and server performance analysis. We demonstrate the design of our simulation infrastructure and illustrate the usefulness of the framework with several case studies that analyze server/network performance and energy efficiency.

Topologies

Users can simulate multiple topology types, including fat tree and flattened butterfly, with various routing algorithms, and run many different kinds of experiments. Details on each can be found in the Features section.
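
To give a sense of how topology choice drives network size, here is a minimal sketch that counts the components of a standard three-tier k-ary fat tree built from k-port switches. This is an assumption for illustration: the function name `fat_tree_size` is hypothetical, and HolDCSim's own fat tree construction may differ in detail.

```python
def fat_tree_size(k: int) -> dict:
    """Component counts for a standard three-tier k-ary fat tree.

    Assumes k-port switches with k even; this is the textbook
    construction, not necessarily the one HolDCSim builds.
    """
    if k < 2 or k % 2 != 0:
        raise ValueError("k must be an even integer >= 2")
    half = k // 2
    return {
        "pods": k,
        "edge_switches": k * half,         # k/2 edge switches per pod
        "aggregation_switches": k * half,  # k/2 aggregation switches per pod
        "core_switches": half * half,      # (k/2)^2 core switches
        "hosts": k * half * half,          # k/2 hosts per edge switch, k^3/4 total
    }


if __name__ == "__main__":
    for k in (4, 8, 16):
        print(k, fat_tree_size(k))
```

The host count grows cubically with the switch port count, so network size can change substantially between experiments that differ only in this one parameter.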

Easy Configuration

A configuration file is included so the user can easily modify experiment parameters such as network size, topology, link bandwidth, and network device power consumption, among many others.

Power Consumption Analysis

Our implementation allows the user to set different power consumption levels for the networking switches and servers. HolDCSim calculates power usage over the entire experiment based on these values. This allows the user to analyze not only the performance of a system configuration but also its power efficiency.
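
As a rough illustration of this kind of accounting, the sketch below sums energy over a device's timeline by multiplying the power level of each state by the time spent in it. The state names, wattages, and the `energy_joules` function are hypothetical and are not taken from HolDCSim; they only show the general idea of deriving energy from per-state power levels.

```python
def energy_joules(intervals, power_watts):
    """Sum energy for a piecewise-constant power timeline.

    intervals:   list of (state, start_seconds, end_seconds)
    power_watts: mapping from state name to power draw in watts
    """
    total = 0.0
    for state, start, end in intervals:
        if end < start:
            raise ValueError(f"interval for {state!r} ends before it starts")
        total += power_watts[state] * (end - start)
    return total


# Hypothetical power levels for one switch (not HolDCSim defaults).
SWITCH_POWER = {"active": 100.0, "idle": 60.0, "sleep": 10.0}

# Hypothetical timeline: 10 s active, 5 s idle, 20 s asleep.
timeline = [
    ("active", 0.0, 10.0),
    ("idle", 10.0, 15.0),
    ("sleep", 15.0, 35.0),
]

if __name__ == "__main__":
    print(energy_joules(timeline, SWITCH_POWER))  # 1000 + 300 + 200 = 1500.0
```

Changing the values in `SWITCH_POWER` and recomputing is the same kind of what-if comparison that the configurable power levels are meant to support.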

Custom Sleep State Thresholds

HolDCSim also lets the user modify the sleep state thresholds for the various sleep levels of the networking devices in the simulated system. These thresholds control how long it takes a networking device to enter each sleep level, so the user can experiment to find the thresholds that minimize power consumption across the system.

Details on how to edit these values are given in the Features section.
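
The idea of threshold-driven sleep levels can be sketched as follows: a device that has been idle for at least a given time drops into the deepest sleep level whose threshold it has passed. The level names, threshold values, and the `state_after_idle` function are assumptions made for this example and do not reflect HolDCSim's actual parameters.

```python
# Hypothetical thresholds: seconds of idleness required to enter each
# sleep level, listed from shallowest to deepest (ascending order).
THRESHOLDS = [
    (0.5, "light_sleep"),
    (2.0, "deep_sleep"),
    (10.0, "off"),
]


def state_after_idle(idle_seconds, thresholds=THRESHOLDS):
    """Return the deepest sleep level reached after idle_seconds of idleness."""
    state = "awake"
    for limit, name in thresholds:
        if idle_seconds >= limit:
            state = name
        else:
            break  # thresholds are ascending, so no deeper level applies
    return state


if __name__ == "__main__":
    for idle in (0.1, 0.5, 3.0, 100.0):
        print(idle, state_after_idle(idle))
```

Lower thresholds put devices to sleep sooner, which tends to save power at the cost of more frequent wake-ups; sweeping the threshold values is one way to explore that trade-off.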

Visualization Tool

In addition to the core features mentioned above, we are currently developing a Visualization Tool to pair with the simulator. It will generate tables, graphs, topologies, and animations to help the user quickly visualize simulation results, rather than relying only on the raw data written to the log files. More information can be found in the Visualization section.