12 04 Dynamo: Amazon's Highly Available Key-value Store - lkuper/CSE232-2024-09 GitHub Wiki
Discussion for: Giuseppe DeCandia et al., "Dynamo: Amazon’s Highly Available Key-value Store" (SOSP 2007)
Paper overview
The paper introduces Dynamo, a distributed key-value store developed by Amazon to provide high availability and fault tolerance for services like shopping carts and session management. Dynamo is designed to meet Amazon's operational requirements of reliability and scalability while remaining "always writeable" for applications, even under server failures, network partitions, or extreme load.
To achieve these goals, Dynamo sacrifices strong consistency in favor of eventual consistency, making it a system optimized for high availability and responsiveness. Key techniques include consistent hashing for partitioning and replication, hinted handoff to handle temporary failures, vector clocks for tracking updates and resolving conflicts, and protocols for membership and failure detection.
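As a toy illustration of the partitioning technique mentioned above, the sketch below implements a basic consistent-hash ring in Python. This is not Dynamo's actual implementation: the class and function names are invented, MD5 stands in for whatever hash function Dynamo uses, and real Dynamo additionally places several virtual nodes per physical host on the ring.

```python
import hashlib
from bisect import bisect_right

def ring_position(key: str) -> int:
    # Map a key to a point on the ring (MD5 here is just a stand-in hash).
    return int(hashlib.md5(key.encode()).hexdigest(), 16)

class ConsistentHashRing:
    """Toy consistent-hash ring. Dynamo additionally assigns multiple
    virtual nodes per physical host and walks the ring clockwise to
    build each key's "preference list" of N replica holders."""

    def __init__(self, nodes, n_replicas=3):
        self.n_replicas = n_replicas
        self.ring = sorted((ring_position(n), n) for n in nodes)

    def preference_list(self, key):
        positions = [pos for pos, _ in self.ring]
        i = bisect_right(positions, ring_position(key)) % len(self.ring)
        nodes = []
        # Walk clockwise, collecting distinct nodes until we have N.
        while len(nodes) < min(self.n_replicas, len(self.ring)):
            node = self.ring[i][1]
            if node not in nodes:
                nodes.append(node)
            i = (i + 1) % len(self.ring)
        return nodes
```

The key property is that adding or removing a node only remaps the keys adjacent to it on the ring, rather than rehashing everything, which is what makes incremental scaling cheap.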
Discussion questions
Q1
Previously, we read the "Paxos Made Live" paper, which discussed Google's Chubby lock service. Discuss ways in which Dynamo's design principles and implementation differ from Chubby's.
Discussion summary for Q1
Dynamo and Chubby represent two very different approaches to distributed systems, reflecting their distinct use cases. Chubby is a centralized system that uses leader election and the Paxos consensus algorithm to ensure strong consistency, making it well-suited for managing locks. Dynamo, on the other hand, is fully decentralized and prioritizes availability and scalability over strict consistency. It is designed for loosely coupled systems where high availability and fault tolerance are critical, even at the expense of occasional inconsistencies.
Chubby achieves fault tolerance through replication across five replicas, ensuring availability as long as a majority of these replicas are functional. In contrast, Dynamo operates at a much larger scale, managing hundreds to thousands of nodes across multiple data centers. This allows it to handle much larger systems, but it requires techniques like application-level conflict resolution to deal with inevitable failures and inconsistencies. Overall, while Chubby emphasizes strong guarantees and coordination, Dynamo focuses on liveness and performance, making it better suited for large, dynamic systems like Amazon’s platform.
Q2
The paper is quite emphatic about the number 99.9. For example, Section 2.2 of the paper says, "at Amazon, SLAs are expressed and measured at the 99.9th percentile of the distribution". What is an SLA? What "distribution" are the authors referring to, and why do we care about the 99.9th percentile specifically? What would be some alternative ways to express and measure an SLA, and what's wrong with those alternatives (according to the Dynamo authors)?
Discussion summary for Q2
An SLA, or Service Level Agreement, is a contract that defines the level of performance a service is expected to deliver. In the context of Dynamo, the SLA focuses on request latency, and the "distribution" refers to the distribution of response latencies across all requests handled by the system. The paper emphasizes the 99.9th percentile of this distribution: the latency bound that 99.9% of requests fall at or under, which only the slowest 0.1% of requests exceed. By targeting this percentile, Dynamo ensures that even outlier requests are handled within acceptable bounds, rather than optimizing only for the average or median experience.
This approach is particularly important in large-scale systems like Amazon, where tail latencies can severely degrade the experience for users during peak loads. Alternative metrics like the mean or median fail to capture these outliers, which can make the system appear to perform well overall while some users experience significant delays. The focus on the 99.9th percentile aligns with Amazon's customer-centric philosophy, ensuring that even edge cases are handled reliably.
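To make the contrast concrete, here is a small hypothetical example (the latency numbers are invented, not from the paper) showing how the mean and median can look healthy while the 99.9th percentile exposes a painful tail:

```python
import math
from fractions import Fraction

def percentile(sorted_data, p):
    """Nearest-rank percentile: the smallest value that at least p%
    of the sample falls at or below. Exact arithmetic avoids losing
    boundary ranks to floating-point error."""
    rank = math.ceil(Fraction(str(p)) / 100 * len(sorted_data))
    return sorted_data[max(rank, 1) - 1]

# 1000 hypothetical request latencies (ms): mostly fast, a few very slow.
latencies = sorted([10] * 900 + [20] * 90 + [500] * 9 + [5000])

mean = sum(latencies) / len(latencies)   # 20.3 ms -- looks fine
median = percentile(latencies, 50)       # 10 ms   -- looks great
p999 = percentile(latencies, 99.9)       # 500 ms  -- the tail the SLA targets
worst = latencies[-1]                    # 5000 ms -- the 0.1% beyond the SLA
```

A service whose SLA promised a 300 ms *average* would pass here even though a full 1% of requests take half a second or more, which is exactly why the authors reject mean- and median-based SLAs.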
Q3
(related to questions asked by multiple students)
During lecture on Monday, we discussed representing the contents of a shopping cart as a set, and using the set union operation to merge different versions of a shopping cart. This example is discussed in section 2 of the Dynamo paper:
For instance, the application that maintains customer shopping carts can choose to “merge” the conflicting versions and return a single unified shopping cart.
Why does this merge need to be handled at the application level instead of at the data store level, i.e., in Dynamo itself? If an application does not provide its own merge mechanism, what does Dynamo do in that case? What are the pros and cons of handling merges at the application level, versus handling merges at the data-store level?
Discussion summary for Q3
The merge of conflicting versions in Dynamo is handled at the application level because application developers have the best understanding of their specific data and how conflicts should be resolved. For example, in the case of a shopping cart, developers may choose to merge items by taking the union of two conflicting versions, ensuring that no items are lost. This flexibility allows developers to create domain-specific logic that aligns with their application’s requirements and ensures the best possible user experience.
If an application does not provide its own merge mechanism, Dynamo falls back to a simple default policy, "last write wins," which keeps the version with the most recent timestamp and discards the others. While this approach is efficient, it may not preserve the correct or intended state of the data. Application-level merging gives developers more control and customization, but it also increases development complexity, since they must implement the merge logic themselves. Handling merges at the data-store level, by contrast, simplifies development and ensures consistent behavior across applications, but it cannot address the unique needs of individual use cases. Dynamo's design reflects this trade-off, emphasizing flexibility and empowering developers to make decisions based on their application's requirements.
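As an illustration of application-level reconciliation, the sketch below (invented names; a simplification of how Dynamo surfaces conflicting versions together with their vector clocks) keeps only the causally concurrent versions of a cart and merges them by set union:

```python
def descends(a: dict, b: dict) -> bool:
    """True if vector clock `a` dominates (or equals) `b`,
    i.e. version `a` has already seen everything `b` has."""
    return all(a.get(node, 0) >= count for node, count in b.items())

def merge_carts(versions):
    """Application-level merge: drop versions that some other version
    causally supersedes, then union the item sets of the survivors.
    `versions` is a list of (vector_clock, item_set) pairs."""
    concurrent = [
        (vc, items)
        for vc, items in versions
        if not any(descends(other, vc) and other != vc
                   for other, _ in versions)
    ]
    merged = set()
    for _, items in concurrent:
        merged |= items
    return merged

# Two replicas diverged after the {"A": 1} write; neither of their
# clocks dominates the other, so both item sets survive the merge.
versions = [
    ({"A": 1}, {"book"}),                  # superseded ancestor
    ({"A": 1, "B": 1}, {"book", "pen"}),   # concurrent update at B
    ({"A": 1, "C": 1}, {"book", "mug"}),   # concurrent update at C
]
assert merge_carts(versions) == {"book", "pen", "mug"}
```

A data-store-level default like "last write wins" would instead pick one of the two concurrent versions by timestamp and silently drop the other's items, which is exactly the lost-update risk the union-based merge avoids.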
Q4 (optional; if time)
(question contributed by former CSE232 student Bobby Dhillon)
Section 2.3 of the paper states,
Many traditional data stores execute conflict resolution during writes and keep the read complexity simple [7]. In such systems, writes may be rejected if the data store cannot reach all (or a majority of) the replicas at a given time. On the other hand, Dynamo targets the design space of an “always writeable” data store (i.e., a data store that is highly available for writes) [...] This requirement forces us to push the complexity of conflict resolution to the reads in order to ensure that writes are never rejected.
What types of systems would benefit from being "always readable", with conflict resolution on writes rather than reads? Name a few examples.
Discussion summary for Q4, if we get to it
[to be filled in by scribes, if we get to it]