ds note - modrpc/info GitHub Wiki

Table of Contents

Topics

Performance

  • Distribution can hurt: network B/W and latency bottlenecks
    • lots of tricks: e.g. caching, threaded serves
  • Distribution can help: parallelism, pick server near client
  • Idea: scalable design
    • Nx servers -> Nx total performance
  • Need a way to divide the load by N
    • device the state by N
      • split by user
      • split by file name
      • "sharding" or "partitioning"
  • Rarely perfect -> only scales so far
    • Global operations: e.g. search
    • Load imbalance
      • One very active user
      • One very popular file
        • one server 100%, added servers mostly idle
        • Nx servers -> 1X performance

Fault Tolerance

  • Dropbox: ~10,000 servers: some fail
  • Can I use my files if there's a failure?
    • Some part of the network, some set of servers
  • Maybe: replicate the data on multiple servers
    • Perhaps client sends every operation to both
    • Maybe only needs to wait for one reply
  • Opportunity: operate from two "replicas" independenty if partitioned
  • Opportunity: can 2 servers yield 2x availability AND 2x performance?

Consistency

  • "contract with apps/users" about meaning of operations
    • e.g. "read yields most recently written value"
    • hard due to partial failure, replication/caching, concurrency
  • Problem: keep replicas identical
    • If one is down, it will miss operations
      • Must be brought up to date after reboot
    • If net is broken, *both* replicas maybe live, and see different ops
      • Delete file, still visible via other replica
      • "split brain" -- usually bad
  • Problem: clients may see updates in different orders
    • Due to caching or replication
    • I make grades.txt unreadable, then TA writes grades to it
    • What if the operations run in different order on different replicas?
  • Consistency often hurts performance (communication, blocking)
    • Many systems cut corners -- "relaxed consistency"
    • Shifts burden to applications

Distributed Communication (Silberschatz Ch.15)

Sockets

  • socket: endpoint for communication
    • a pair of (how about group?) processes communicating over a network employs a pair of sockets
    • socket: IP address + port

RPC

RPC semantics

  • SETUP
    • WLOG, there are only two machines: M0 and M1
    • one-caller, one-callee
      • M0 calls FOO on M1
    • one-caller, two-callee
      • M0 calls FOO and BAR on M1
    • two-caller, one-callee
      • M0 calls FOO twice (from different thread)
  • at-least-once semantics
  • at-most-once semantics
  • exactly-once semantics
    • at-most-once semantics + unbounded retries + fault-tolerant service
⚠️ **GitHub.com Fallback** ⚠️