System Design Material - s530479-ShivaKumar/Leet GitHub Wiki

System Design

Basic approach

  1. Ask a lot of questions to clarify the requirements.
  2. List all the functional and non-functional requirements.
  3. Design the required APIs.
  4. Talk about the database schema.
  5. Explain the architectural design.

Key characteristics of distributed systems

  • Scalability
  • Reliability
  • Availability
  • Efficiency
  • Serviceability or Manageability.

Load Balancing

1. LB algorithms

  • Least Connection Method
  • Least Response Time Method
  • Least Bandwidth Method
  • Round Robin Method
  • Weighted Round Robin Method
  • IP Hash.
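
Three of the algorithms above can be sketched in a few lines. This is a minimal illustration, not a production balancer: the class names, the `pick`/`release` methods, and the server labels are assumptions made for the example, and real load balancers also handle health checks and weights.

```python
import hashlib
import itertools


class RoundRobin:
    """Hand out servers in a fixed, repeating order."""

    def __init__(self, servers):
        self._cycle = itertools.cycle(servers)

    def pick(self):
        return next(self._cycle)


class LeastConnections:
    """Pick the server that currently has the fewest open connections."""

    def __init__(self, servers):
        self.active = {server: 0 for server in servers}

    def pick(self):
        server = min(self.active, key=self.active.get)
        self.active[server] += 1  # a new connection is now open
        return server

    def release(self, server):
        self.active[server] -= 1  # the connection was closed


def ip_hash(client_ip, servers):
    """Map a client IP to the same server on every request."""
    digest = hashlib.sha256(client_ip.encode()).digest()
    return servers[int.from_bytes(digest[:8], "big") % len(servers)]
```

Round robin ignores load entirely, least connections tracks it, and IP hash trades even distribution for session stickiness.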

Caching

  • Application Server Cache

Cache invalidation

  1. Write through cache
  2. Write around cache
  3. Write back cache.
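
The three policies differ only in where a write lands first. The sketch below is a hedged illustration: the `Cache` class, its `policy` strings, and the dictionaries standing in for the cache and the database are all assumptions made for the example.

```python
class Cache:
    """Toy cache in front of a toy database, with a selectable write policy."""

    def __init__(self, policy):
        self.policy = policy
        self.cache = {}
        self.db = {}        # stands in for the backing store
        self.dirty = set()  # keys written to the cache but not yet to the db

    def write(self, key, value):
        if self.policy == "write-through":
            # Write to the cache and the database together.
            self.cache[key] = value
            self.db[key] = value
        elif self.policy == "write-around":
            # Write only to the database; drop any stale cached copy.
            self.db[key] = value
            self.cache.pop(key, None)
        elif self.policy == "write-back":
            # Write only to the cache; the database is updated on flush.
            self.cache[key] = value
            self.dirty.add(key)
        else:
            raise ValueError(f"unknown policy: {self.policy}")

    def flush(self):
        """Persist pending write-back entries to the database."""
        for key in self.dirty:
            self.db[key] = self.cache[key]
        self.dirty.clear()

    def read(self, key):
        if key not in self.cache:  # cache miss: load from the database
            self.cache[key] = self.db[key]
        return self.cache[key]
```

Write-through keeps both copies consistent at the cost of write latency, write-around causes a miss on the first read after a write, and write-back risks losing data if the cache fails before a flush.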

Cache eviction policies

  1. First In First Out (FIFO)
  2. Last In First Out (LIFO)
  3. Least Recently Used (LRU)
  4. Most Recently Used (MRU)
  5. Least Frequently Used (LFU)
  6. Random Replacement (RR)
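
As one concrete instance of the policies above, LRU can be sketched with `collections.OrderedDict` from the Python standard library. The `LRUCache` class name and its `get`/`put` methods are assumptions made for the example.

```python
from collections import OrderedDict


class LRUCache:
    """Evict the least recently used entry once capacity is exceeded."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.items = OrderedDict()  # oldest entry first, newest last

    def get(self, key):
        if key not in self.items:
            return None
        self.items.move_to_end(key)  # mark as most recently used
        return self.items[key]

    def put(self, key, value):
        if key in self.items:
            self.items.move_to_end(key)
        self.items[key] = value
        if len(self.items) > self.capacity:
            self.items.popitem(last=False)  # drop the least recently used
```

With a capacity of 2, putting `a` and `b`, reading `a`, and then putting `c` evicts `b`, because `a` was touched more recently.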

Data Partitioning

Methods

  1. Horizontal partitioning
  2. Vertical Partitioning
  3. Directory Based Partitioning

Partitioning Criteria

  1. Key or Hash-based partitioning
  2. List partitioning
  3. Round-robin partitioning
  4. Composite partitioning
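
The first three criteria can each be expressed as a small function that maps a record to a partition number. This is only a sketch: the partition count, the region table, and the function names are assumptions made for the example.

```python
import hashlib
import itertools

NUM_PARTITIONS = 4  # assumed partition count for the example


def hash_partition(key, n=NUM_PARTITIONS):
    """Key or hash-based: hash the key, then take it modulo the partition count."""
    digest = hashlib.sha256(key.encode()).digest()
    return int.from_bytes(digest[:8], "big") % n


# List partitioning: each partition owns an explicit list of values.
REGION_TO_PARTITION = {"IN": 0, "US": 1, "DE": 2}  # hypothetical mapping


def list_partition(region, default=3):
    """Look the value up in the list; unknown values go to a default partition."""
    return REGION_TO_PARTITION.get(region, default)


_counter = itertools.count()


def round_robin_partition(n=NUM_PARTITIONS):
    """Round-robin: the i-th record goes to partition i mod n."""
    return next(_counter) % n
```

Composite partitioning combines these, for example by choosing a group with `list_partition` and then a partition within the group with `hash_partition`.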

Common problems of Data Partitioning

  1. Joins and Denormalization
  2. Referential integrity
  3. Rebalancing
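
To see why rebalancing is a problem, the sketch below measures how many keys change partition when a plain modulo hash scheme grows from 4 to 5 partitions. The key names and helper functions are assumptions made for the example.

```python
import hashlib


def partition_for(key, num_partitions):
    """Plain hash-modulo placement."""
    digest = hashlib.sha256(key.encode()).digest()
    return int.from_bytes(digest[:8], "big") % num_partitions


def fraction_moved(keys, old_n, new_n):
    """Share of keys whose partition changes when the count goes old_n -> new_n."""
    moved = sum(
        1 for key in keys
        if partition_for(key, old_n) != partition_for(key, new_n)
    )
    return moved / len(keys)


keys = [f"user-{i}" for i in range(10_000)]
print(fraction_moved(keys, 4, 5))
```

With a uniform hash, a key stays in place only when its hash gives the same remainder modulo 4 and modulo 5, which happens for about one key in five, so roughly 80% of the data has to move.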

Indexing

  • Indexes speed up database queries, but they should be chosen carefully: every insert, update, or delete must also update each affected index, so extra indexes can reduce overall performance when a table sees more writes than reads.
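
The read-side benefit can be observed with the `sqlite3` module from the Python standard library. This is a small sketch: the `users` table, its columns, and the index name are assumptions made for the example, and the exact query plan text varies between SQLite versions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT, name TEXT)")
conn.executemany(
    "INSERT INTO users (email, name) VALUES (?, ?)",
    [(f"user{i}@example.com", f"User {i}") for i in range(1000)],
)


def plan(sql):
    """Return SQLite's query plan description for a statement."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " ".join(row[3] for row in rows)  # column 3 holds the detail text


query = "SELECT name FROM users WHERE email = 'user42@example.com'"

before = plan(query)  # no index on email: a full table scan
conn.execute("CREATE INDEX idx_users_email ON users (email)")
after = plan(query)   # the planner can now search via the index

print(before)
print(after)
```

The cost is on the write side: after the index exists, each insert into `users` also has to insert into `idx_users_email`.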

Proxies

  • A proxy server is a piece of software or hardware that sits between the client and the back-end server and acts as an intermediary for requests from clients seeking resources from other servers. Clients connect to the proxy server to request a service such as a web page, file, or connection.
  • Typically, proxies are used to filter requests, log requests, or sometimes transform requests (by adding/removing headers, encrypting/decrypting, or compressing a resource). Another advantage of a proxy server is that its cache can serve a lot of requests. If multiple clients access a particular resource, the proxy server can cache it and serve it to all the clients without going to the remote server.
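
The filter, log, transform, and cache roles described above can be sketched as a single class. This is a toy model with no real networking: `CachingProxy`, the `fetch_from_origin` callback, the `/admin` filter rule, and the `sketch-proxy` name are all assumptions made for the example.

```python
class CachingProxy:
    """Toy proxy that logs, filters, transforms, and caches requests."""

    def __init__(self, fetch_from_origin):
        # fetch_from_origin(path, headers) stands in for the remote server.
        self.fetch_from_origin = fetch_from_origin
        self.cache = {}
        self.log = []

    def handle(self, path, headers=None):
        self.log.append(path)                    # log every request
        if path.startswith("/admin"):            # filter: block some paths
            return 403, "forbidden"
        if path not in self.cache:
            forwarded = dict(headers or {})
            forwarded["Via"] = "1.1 sketch-proxy"  # transform: add a header
            self.cache[path] = self.fetch_from_origin(path, forwarded)
        return 200, self.cache[path]             # later clients hit the cache
```

If several clients request the same path, only the first request reaches the origin; the rest are served from `self.cache`.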

Open Proxy

  • Anonymous
  • Transparent

Reverse Proxy

Databases

SQL

  • Relational databases store data in rows and columns.
  • MySQL, Oracle, MS SQL Server, SQLite, Postgres, and MariaDB.

NoSQL

  1. Key-Value Stores
     • Redis, Voldemort, and Dynamo.
  2. Document Databases
     • CouchDB and MongoDB.
  3. Wide-Column Databases
     • Cassandra and HBase.
  4. Graph Databases
     • Neo4j and InfiniteGraph.

Differences between SQL and NoSQL

  1. Storage
  2. Schema
  3. Querying
  4. Scalability
  5. Reliability or ACID compliance

System Design Problems

TinyURL

  1. Functional requirements
     • Should create a fixed-size short URL when given a long URL.
     • Should return the long URL when given the short URL.
     • User login.
  2. Non-functional requirements
     • System should be available.
     • System should be scalable.
     • System should be durable.
     • System should be consistent.
  3. API design
     • CreateURL(userToken, longURL, expirationTime)
     • get(userToken, shortURL)
     • deleteURL(userToken, shortURL)
  4. DB schema
     • NoSQL is a good fit: the data has few relationships between records, and a NoSQL store scales out easily.
  5. Architectural design
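
One possible way to produce a short URL of fixed size is to give each long URL a numeric ID and encode it in base 62. The sketch below is an assumption-heavy illustration, not the design these notes prescribe: the 7-character code length, the in-memory dictionary standing in for the NoSQL store, and the `UrlShortener` class are all hypothetical, and `userToken` and `expirationTime` from the API list are left out for brevity.

```python
import itertools
import string

ALPHABET = string.digits + string.ascii_letters  # 62 symbols
CODE_LENGTH = 7  # assumed fixed size of the short code


def encode(n, length=CODE_LENGTH):
    """Encode a non-negative integer in base 62, left-padded to a fixed length."""
    chars = []
    while n:
        n, rem = divmod(n, 62)
        chars.append(ALPHABET[rem])
    return "".join(reversed(chars)).rjust(length, ALPHABET[0])


class UrlShortener:
    def __init__(self):
        self._ids = itertools.count(1)  # stands in for a distributed ID generator
        self._store = {}                # short code -> long URL (key-value store)

    def create_url(self, long_url):
        code = encode(next(self._ids))
        self._store[code] = long_url
        return code

    def get(self, short_url):
        return self._store.get(short_url)

    def delete_url(self, short_url):
        return self._store.pop(short_url, None) is not None
```

Seven base-62 characters give 62^7 (about 3.5 trillion) distinct codes. A counter-based scheme yields sequential, guessable codes, so a real design might hash or shuffle IDs instead.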