System Design - robbiehume/CS-Notes GitHub Wiki
General Notes
Load Balancing
- Load balancing helps distribute incoming requests and traffic evenly across multiple servers
- The main goal of load balancing is to ensure high availability, reliability, and performance by keeping any single server from being overloaded and by avoiding downtime
- To get full scalability and redundancy, we can balance the load at each layer of the system. We can add LBs at three places:
- Between the user and the web server
- Between web servers and an internal platform layer, like application servers or cache servers
- Between internal platform layer and database
- Common use cases:
- Improving website performance: it can distribute incoming web traffic among multiple servers, reducing the load on individual servers and ensuring faster response times for end users
- Ensuring high availability and reliability: by distributing the workload among multiple servers, load balancing helps prevent single points of failure
- Scalability: load balancing allows organizations to easily scale their infrastructure as traffic and demand increase
- Additional servers can be added to the load balancing pool to accommodate increased demand, without the need for significant infrastructure changes
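As a rough illustration of the points above, here is a minimal round-robin sketch. Round robin is only one possible distribution strategy and is chosen here as an assumption; the `RoundRobinBalancer` class and the server names (`web-1`, etc.) are hypothetical and not taken from any real load balancer.

```python
class RoundRobinBalancer:
    """Hypothetical balancer that hands out servers in rotation."""

    def __init__(self, servers):
        self.servers = list(servers)
        self._index = 0  # counts how many requests have been routed so far

    def add_server(self, server):
        # Growing the pool needs no change on the caller's side.
        self.servers.append(server)

    def next_server(self):
        # Pick the next server in rotation, wrapping around the pool.
        server = self.servers[self._index % len(self.servers)]
        self._index += 1
        return server


balancer = RoundRobinBalancer(["web-1", "web-2", "web-3"])

# Six requests are spread evenly: each server receives two of them.
assignments = [balancer.next_server() for _ in range(6)]

# Scaling out: a new server joins the pool and starts receiving traffic.
balancer.add_server("web-4")
after_scaling = {balancer.next_server() for _ in range(4)}
```

Real load balancers usually add health checks and weighting on top of this, so a failed server is taken out of rotation instead of receiving requests.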
Database Design
SQL
Structured, strict, relational
- Use SQL when:
- Data is structured with complex relationships (e.g., banking, ERP)
- ACID compliance and consistency are critical
- Atomic: all or nothing transactions
- Consistent: data is valid before and after each transaction
- Isolated: transactions running at the same time do not interfere with each other
- Durable: once committed, data survives crashes and restarts
- Complex queries (joins, aggregations) are needed
- Weaknesses:
- The structure (schema) must be created ahead of time
- Not effective for storing / querying unstructured data
- Difficult to scale horizontally because of their relational nature
- For read-heavy systems, it's easy to create multiple read-only replicas
- For write-heavy systems, usually the only option is to vertically scale the DB up, which is usually more expensive than provisioning additional servers
- Scales vertically
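The "all or nothing" behavior of a transaction can be sketched with Python's built-in `sqlite3` module. This is a minimal example under assumptions: the `accounts` table, the `transfer` and `balances` helpers, and the sample names are made up for illustration, and SQLite stands in for whichever SQL database is actually used.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# The schema is declared up front; the CHECK constraint forbids overdrafts.
conn.execute(
    "CREATE TABLE accounts ("
    "name TEXT PRIMARY KEY, "
    "balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.executemany(
    "INSERT INTO accounts VALUES (?, ?)", [("alice", 100), ("bob", 50)]
)
conn.commit()


def transfer(conn, sender, receiver, amount):
    """Move money between two accounts; returns False if rolled back."""
    try:
        # "with conn" commits if the block succeeds and rolls back on error.
        with conn:
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, receiver),
            )
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, sender),
            )
        return True
    except sqlite3.IntegrityError:
        # The debit violated the CHECK constraint, so the credit is undone too.
        return False


def balances(conn):
    return dict(conn.execute("SELECT name, balance FROM accounts ORDER BY name"))


ok = transfer(conn, "alice", "bob", 30)       # succeeds and commits
failed = transfer(conn, "alice", "bob", 500)  # overdraft, rolled back as a whole
final = balances(conn)
```

The second transfer credits bob before the debit fails, yet the final balances show no trace of that credit, which is the atomic and consistent behavior described in the list above.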
NoSQL
Scalable, flexible, distributed
- Use NoSQL when:
- Scalability and high availability matter (e.g., real-time apps, IoT)
- Data is semi-structured or schema is flexible
- You need fast, distributed storage (e.g., caching, big data)
- General
- When
- Strengths
- They're flexible and simpler to set up because they don't support table relationships
- Better for storing unstructured data
- Because records don't depend on relationships to other tables, the data can also be sharded across different data stores, allowing for distributed databases
- Weaknesses
- For write-heavy systems, even with sharding and peer-to-peer replication, there can be a loss of consistency
- Reads can return stale data until the replicas catch up; this model is known as eventual consistency
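Sharding and stale reads can be sketched together in a few lines. This is a simplified model under stated assumptions: the `Shard` and `ShardedStore` classes are hypothetical, each shard is modeled as a primary copy plus one lagging replica (simpler than true peer-to-peer replication), and replication only happens when `replicate_all()` is called explicitly.

```python
import hashlib


class Shard:
    """Hypothetical shard: a primary copy plus one lagging replica."""

    def __init__(self):
        self.primary = {}
        self.replica = {}
        self._pending = []  # writes not yet copied to the replica

    def write(self, key, value):
        self.primary[key] = value
        self._pending.append((key, value))

    def read_replica(self, key):
        # Reads are served by the replica, which may lag behind the primary.
        return self.replica.get(key)

    def replicate(self):
        # Apply the queued writes in order so the replica converges.
        for key, value in self._pending:
            self.replica[key] = value
        self._pending.clear()


class ShardedStore:
    def __init__(self, shard_count):
        self.shards = [Shard() for _ in range(shard_count)]

    def shard_for(self, key):
        # A stable hash maps a given key to the same shard every time.
        digest = hashlib.sha256(key.encode("utf-8")).digest()
        return self.shards[int.from_bytes(digest[:8], "big") % len(self.shards)]

    def write(self, key, value):
        self.shard_for(key).write(key, value)

    def read(self, key):
        return self.shard_for(key).read_replica(key)

    def replicate_all(self):
        for shard in self.shards:
            shard.replicate()


store = ShardedStore(shard_count=3)
store.write("user:1", "v1")
store.replicate_all()

store.write("user:1", "v2")
stale = store.read("user:1")   # replica has not seen the second write yet
store.replicate_all()
fresh = store.read("user:1")   # replica has now converged
```

The gap between the `stale` and `fresh` reads is the window in which a client sees old data; in a real system that window depends on replication lag rather than an explicit call.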