System Design - robbiehume/CS-Notes GitHub Wiki

General Notes

Load Balancing

  • Load balancing helps distribute incoming requests and traffic evenly across multiple servers
  • The main goal of load balancing is to ensure high availability, reliability, and performance by avoiding overloading a single server and avoiding downtime
  • To get the full benefit of scalability and redundancy, we can balance the load at each layer of the system. LBs can be added in three places:
    • Between the user and the web server
    • Between web servers and an internal platform layer, like application servers or cache servers
    • Between the internal platform layer and the database
  • Common use cases:
    • Improving website performance: it can distribute incoming web traffic among multiple servers, reducing the load on individual servers and ensuring faster response times for end users
    • Ensuring high availability and reliability: by distributing the workload among multiple servers, load balancing helps prevent single points of failure
    • Scalability: load balancing allows organizations to easily scale their infrastructure as traffic and demand increase
      • Additional servers can be added to the load balancing pool to accommodate increased demand, without the need for significant infrastructure changes
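
The distribution and scaling behavior described above can be sketched with a round-robin balancer. This is a minimal illustration, not a production design: round-robin is only one of several possible strategies (the notes do not name one), and the class name `RoundRobinBalancer`, its methods, and the server names are all hypothetical.

```python
class RoundRobinBalancer:
    """Hands out servers in rotation, skipping any marked unhealthy."""

    def __init__(self, servers):
        if not servers:
            raise ValueError("at least one server is required")
        self._servers = list(servers)
        self._healthy = set(self._servers)
        self._index = 0

    def add_server(self, server):
        # Scaling out: a new server joins the pool without other changes.
        if server not in self._servers:
            self._servers.append(server)
        self._healthy.add(server)

    def mark_down(self, server):
        # A failed health check removes the server from rotation.
        self._healthy.discard(server)

    def mark_up(self, server):
        if server in self._servers:
            self._healthy.add(server)

    def next_server(self):
        # Try each server at most once per call, starting at the cursor.
        for _ in range(len(self._servers)):
            server = self._servers[self._index]
            self._index = (self._index + 1) % len(self._servers)
            if server in self._healthy:
                return server
        raise RuntimeError("no healthy servers available")


if __name__ == "__main__":
    lb = RoundRobinBalancer(["web-1", "web-2", "web-3"])
    for _ in range(4):
        print(lb.next_server())
```

Skipping unhealthy servers is what lets the pool avoid a single point of failure: requests keep flowing to the remaining servers while one is down.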

Database Design

SQL

Structured, strict, relational

  • Use SQL when:
    • Data is structured with complex relationships (e.g., banking, ERP)
    • ACID compliance and consistency are critical
      • Atomic: each transaction is all or nothing
      • Consistent: data is valid before and after every transaction
      • Isolated: concurrent transactions do not interfere with each other
      • Durable: committed data survives crashes and restarts
    • Complex queries (joins, aggregations) are needed
  • Weaknesses:
    • The structure (schema) must be created ahead of time
    • Not effective for storing / querying unstructured data
    • Difficult to scale horizontally because of their relational nature
      • For read-heavy systems, it's easy to create multiple read-only replicas
      • For write-heavy systems, the usual option is to scale the DB vertically, which tends to be more expensive than provisioning additional servers
  • Scales vertically
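
The "all or nothing" property from the ACID list above can be sketched with Python's built-in `sqlite3` module. This is a minimal example under stated assumptions: the `accounts` table, the `transfer` and `make_demo_db` helpers, and the balances are all hypothetical, chosen to echo the banking use case.

```python
import sqlite3


def make_demo_db():
    """Create an in-memory database with two hypothetical accounts."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE accounts (id INTEGER PRIMARY KEY, balance INTEGER NOT NULL)"
    )
    conn.executemany("INSERT INTO accounts VALUES (?, ?)", [(1, 100), (2, 50)])
    conn.commit()
    return conn


def transfer(conn, src, dst, amount):
    """Move `amount` from src to dst; both updates apply or neither does."""
    try:
        # Used as a context manager, the connection commits if the block
        # succeeds and rolls back if it raises.
        with conn:
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE id = ?",
                (amount, src),
            )
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE id = ?",
                (amount, dst),
            )
            (balance,) = conn.execute(
                "SELECT balance FROM accounts WHERE id = ?", (src,)
            ).fetchone()
            if balance < 0:
                raise ValueError("insufficient funds")
        return True
    except ValueError:
        return False


if __name__ == "__main__":
    conn = make_demo_db()
    print(transfer(conn, 1, 2, 30))   # succeeds
    print(transfer(conn, 1, 2, 500))  # fails and is rolled back
    print(conn.execute("SELECT id, balance FROM accounts").fetchall())
```

The failed transfer leaves both balances untouched even though both `UPDATE` statements ran before the check, which is the behavior the Atomic bullet describes.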

NoSQL

Scalable, flexible, distributed

  • Use NoSQL when:
    • Scalability and high availability matter (e.g., real-time apps, IoT)
    • Data is semi-structured or schema is flexible
    • You need fast, distributed storage (e.g., caching, big data)
  • Strengths
    • They're flexible and simpler to set up because they don't enforce table relationships or a fixed schema
    • Better for storing unstructured data
      • Because of this, they can also shard this data across different data stores, allowing for distributed databases
  • Weaknesses
    • For write-heavy systems, even with sharding and peer-to-peer replication, there can be a loss of consistency
    • Reads can return stale data until all replicas have caught up; this model is known as eventual consistency
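
Sharding and stale reads can be sketched together in a few lines. This is a toy model under stated assumptions: it uses one primary and one lazily updated replica per shard rather than true peer-to-peer replication, and the names `shard_for` and `ReplicatedShard` are hypothetical.

```python
import hashlib


def shard_for(key, num_shards):
    """Map a key to a shard index using a stable hash of the key."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


class ReplicatedShard:
    """One shard: a primary copy plus a replica that lags behind it."""

    def __init__(self):
        self.primary = {}
        self.replica = {}
        self._pending = []  # writes not yet copied to the replica

    def write(self, key, value):
        # Writes are acknowledged as soon as the primary has them.
        self.primary[key] = value
        self._pending.append((key, value))

    def read_from_replica(self, key):
        # May return stale or missing data until replicate() runs.
        return self.replica.get(key)

    def replicate(self):
        # Apply queued writes in order so the replica converges.
        for key, value in self._pending:
            self.replica[key] = value
        self._pending.clear()


if __name__ == "__main__":
    shards = [ReplicatedShard() for _ in range(3)]
    shard = shards[shard_for("user:1", len(shards))]
    shard.write("user:1", "alice")
    print(shard.read_from_replica("user:1"))  # stale: not replicated yet
    shard.replicate()
    print(shard.read_from_replica("user:1"))  # replica has converged
```

The window between `write` and `replicate` is where a reader can see stale data; once replication completes, every copy agrees again.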