System design
Source of information
The material has been gathered from LinkedIn posts by my favorite authors. The listed authors retain all their rights.
ByteByteGo
https://www.linkedin.com/company/bytebytego/
Top 5 Kafka use cases
Kafka was originally built for massive log processing. It retains messages until expiration and lets consumers pull messages at their own pace.
Let's review the popular Kafka use cases.
- Log processing and analysis
- Data streaming in recommendations
- System monitoring and alerting
- CDC (Change Data Capture)
- System migration
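As a minimal illustration of the retain-and-pull model described above, here is a sketch using the kafka-python client. It assumes a broker running at localhost:9092; the topic and consumer-group names are made up for the example.

```python
from kafka import KafkaProducer, KafkaConsumer

# The producer appends events to a topic; Kafka retains them until
# expiration, whether or not anyone has consumed them yet.
producer = KafkaProducer(bootstrap_servers="localhost:9092")
producer.send("app-logs", b'{"level": "ERROR", "msg": "disk full"}')
producer.flush()

# The consumer pulls at its own pace; a slow consumer never blocks producers.
consumer = KafkaConsumer(
    "app-logs",                      # hypothetical topic name
    bootstrap_servers="localhost:9092",
    group_id="log-analyzer",         # hypothetical consumer group
    auto_offset_reset="earliest",    # start from the retained history
)
for record in consumer:
    print(record.value)
    break
```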
12-Factor App
The "12 Factor App" offers a set of best practices for building modern software applications. Following these 12 principles can help developers and teams in building reliable, scalable, and manageable applications.
Here's a brief overview of each principle:
Codebase: Have one place to keep all your code, and manage it using version control like Git.
Dependencies: List all the things your app needs to work properly, and make sure they're easy to install.
Config: Keep important settings like database credentials separate from your code, so you can change them without rewriting code (see the sketch after this list).
Backing Services: Use other services (like databases or payment processors) as separate components that your app connects to.
Build, Release, Run: Make a clear distinction between preparing your app, releasing it, and running it in production.
Processes: Run your app as one or more stateless processes, so no part relies on a specific machine's memory or disk; anything that must persist belongs in a backing service. It's like building with interchangeable LEGO blocks.
Port Binding: Make your app self-contained and expose its services by binding to a network port, rather than relying on a web server provided by the environment.
Concurrency: Make your app able to handle more work by adding more copies of the same thing, like hiring more workers for a busy restaurant.
Disposability: Your app should start quickly and shut down gracefully, like turning off a light switch instead of yanking out the power cord.
Dev/Prod Parity: Ensure that what you use for developing your app is very similar to what you use in production, to avoid surprises.
Logs: Keep a record of what happens in your app so you can understand and fix issues, like a diary for your software.
Admin Processes: Run special tasks separately from your app, like doing maintenance work in a workshop instead of on the factory floor.
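To make the Config principle concrete, here is a minimal sketch of reading settings from the environment, as 12-factor apps do. The variable names DATABASE_URL and CACHE_URL are hypothetical.

```python
import os

# Factor III (Config): credentials and endpoints live in the environment,
# not in the codebase, so the same build runs in dev, staging, and prod.
DATABASE_URL = os.environ["DATABASE_URL"]   # fail fast if it is missing
CACHE_URL = os.environ.get("CACHE_URL", "redis://localhost:6379")

print(f"connecting to {DATABASE_URL}")
```

Changing an environment variable reconfigures the app without rebuilding it, which also supports Dev/Prod Parity.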
High availability, high scalability, and high throughput
The diagram below is a system design cheat sheet with common solutions.
High Availability

This means we need to ensure an agreed, high level of uptime. We often describe the design target as "3 nines" or "4 nines". "4 nines", 99.99% uptime, means the service can only be down 8.64 seconds per day. To achieve high availability, we need to design redundancy into the system. There are several ways to do this:
Hot-hot: two instances receive the same input and send the output to the downstream service. In case one side is down, the other side can immediately take over. Since both sides send output to the downstream, the downstream system needs to dedupe.
Hot-warm: two instances receive the same input and only the hot side sends the output to the downstream service. In case the hot side is down, the warm side takes over and starts to send output to the downstream service.
Single-leader cluster: one leader instance receives data from the upstream system and replicates to other replicas.
Leaderless cluster: there is no leader in this type of cluster. Any write is replicated to multiple instances. As long as the number of replicas a write must reach (W) plus the number of replicas a read must consult (R) is larger than the total number of replicas (N), every read overlaps the latest write and we get valid data, as illustrated below.
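A small sketch of the quorum rule W + R > N, using made-up replica counts, shows why a read always intersects the latest write:

```python
# Quorum rule for a leaderless cluster: with N replicas, a write
# acknowledged by W replicas and a read consulting R replicas must
# overlap in at least one replica whenever W + R > N.
N, W, R = 3, 2, 2
assert W + R > N

# Worst case: the write landed on replicas {0, 1} and the read picks
# the R replicas sharing the fewest members with the write set.
write_set = set(range(W))           # {0, 1}
read_set = set(range(N - R, N))     # {1, 2}
print(write_set & read_set)         # {1} -- never empty when W + R > N
```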
High Throughput

This means the service needs to handle a high number of requests in a given period of time. Commonly used metrics are QPS (queries per second) and TPS (transactions per second). To achieve high throughput, we often add caches to the architecture so that requests can return without hitting slower I/O devices like databases or disks. We can also increase the number of threads for computation-intensive tasks; however, adding too many threads can degrade performance. We then need to identify the bottlenecks in the system and increase their throughput. Asynchronous processing can often effectively isolate heavy-lifting components.
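As one example of the caching idea, here is a minimal cache-aside sketch. The in-process dict stands in for a real cache such as Redis, and slow_db_lookup is a hypothetical database call.

```python
import time

cache = {}    # stand-in for a real cache such as Redis
TTL = 60      # cache entries expire after 60 seconds

def slow_db_lookup(user_id):
    time.sleep(0.1)                  # simulate database I/O latency
    return {"id": user_id, "name": "Alice"}

def get_user(user_id):
    # Cache-aside: serve from the cache when fresh; on a miss, hit the
    # slower database once and populate the cache for later requests.
    entry = cache.get(user_id)
    if entry and time.time() - entry["at"] < TTL:
        return entry["value"]
    value = slow_db_lookup(user_id)
    cache[user_id] = {"value": value, "at": time.time()}
    return value
```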
High Scalability

This means a system can quickly and easily extend to accommodate more volume (horizontal scalability) or more functionalities (vertical scalability). Normally we watch the response time to decide if we need to scale the system.
Data Pipelines Overview
Data pipelines are a fundamental component of managing and processing data efficiently within modern systems. These pipelines typically encompass 5 predominant phases: Collect, Ingest, Store, Compute, and Consume.
- Collect: Data is acquired from data stores, data streams, and applications, sourced remotely from devices, applications, or business systems.
- Ingest: During the ingestion process, data is loaded into systems and organized within event queues.
- Store: After ingestion, the organized data is stored in data warehouses, data lakes, or data lakehouses, along with systems like databases.
- Compute: Data undergoes aggregation, cleansing, and manipulation to conform to company standards, including tasks such as format conversion, data compression, and partitioning. This phase employs both batch and stream processing techniques.
- Consume: Processed data is made available through analytics and visualization tools, operational data stores, decision engines, user-facing applications, dashboards, data science and machine learning services, business intelligence, and self-service analytics.
The efficiency and effectiveness of each phase contribute to the overall success of data-driven operations within an organization.
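As a toy end-to-end sketch of the five phases, the following uses only the Python standard library; the sample events are made up, queue.Queue stands in for an event queue, and an in-memory SQLite table stands in for the storage layer.

```python
import json
import queue
import sqlite3

# Collect: events arrive from applications or devices (made-up samples).
events = [{"user": "a", "amount": 10}, {"user": "a", "amount": 5},
          {"user": "b", "amount": 7}]

# Ingest: load the events into an event queue.
q = queue.Queue()
for e in events:
    q.put(json.dumps(e))

# Store: persist raw events (SQLite stands in for a warehouse or lake).
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE raw_events (body TEXT)")
while not q.empty():
    db.execute("INSERT INTO raw_events VALUES (?)", (q.get(),))

# Compute: a batch aggregation step shapes the data to a standard form.
totals = {}
for (body,) in db.execute("SELECT body FROM raw_events"):
    e = json.loads(body)
    totals[e["user"]] = totals.get(e["user"], 0) + e["amount"]

# Consume: the aggregate is now ready for a dashboard or BI tool.
print(totals)   # {'a': 15, 'b': 7}
```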
REST API Design
What is GraphQL? Is it a replacement for the REST API?
The diagram below shows the quick comparison between REST and GraphQL.
🔹GraphQL is a query language for APIs developed by Meta. It provides a complete description of the data in the API and gives clients the power to ask for exactly what they need.
🔹GraphQL servers sit between the client and the backend services.
🔹GraphQL can aggregate multiple REST requests into one query. A GraphQL server organizes the resources in a graph.
🔹GraphQL supports queries, mutations (applying data modifications to resources), and subscriptions (receiving real-time notifications when data changes).
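To make the aggregation point concrete, here is a sketch of one GraphQL query replacing what would otherwise be several REST calls (e.g. GET /users/42 followed by GET /users/42/orders). The endpoint URL and the schema fields are hypothetical.

```python
import requests

# One query asks for exactly the fields the client needs, spanning
# what REST would expose as multiple resources.
query = """
query {
  user(id: "42") {
    name
    orders { id total }
  }
}
"""

resp = requests.post("https://api.example.com/graphql",  # hypothetical endpoint
                     json={"query": query})
print(resp.json()["data"]["user"])
```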
9 best practices for developing microservices
A picture is worth a thousand words: 9 best practices for developing microservices.
When we develop microservices, we need to follow the following best practices:
- Use separate data storage for each microservice
- Keep code at a similar level of maturity
- Separate build for each microservice
- Assign each microservice with a single responsibility
- Deploy into containers
- Design stateless services
- Adopt domain-driven design
- Design micro frontend
- Orchestrate microservices
Concurrency is **not** parallelism.
In system design, it is important to understand the difference between concurrency and parallelism.
As Rob Pike (one of the creators of Go) stated: "Concurrency is about **dealing with** lots of things at once. Parallelism is about **doing** lots of things at once." This distinction emphasizes that concurrency is more about the **design** of a program, while parallelism is about the **execution**.
Concurrency is about dealing with multiple things at once. It involves structuring a program to handle multiple tasks whose lifetimes overlap: the tasks can start, run, and complete in overlapping time periods, but not necessarily at the same instant.
Concurrency is about the composition of independently executing processes and describes a program's ability to manage multiple tasks by making progress on them without necessarily completing one before it starts another.
Parallelism, on the other hand, refers to the simultaneous execution of multiple computations. It is the technique of running two or more tasks or computations at the same time, utilizing multiple processors or cores within a computer to perform several operations concurrently. Parallelism requires hardware with multiple processing units, and its primary goal is to increase the throughput and computational speed of a system.
In practical terms, concurrency enables a program to remain responsive to input, perform background tasks, and handle multiple operations in a seemingly simultaneous manner, even on a single-core processor. It's particularly useful in I/O-bound and high-latency operations where programs need to wait for external events, such as file, network, or user interactions.
Parallelism, with its ability to perform multiple operations at the same time, is crucial in CPU-bound tasks where computational speed and throughput are the bottlenecks. Applications that require heavy mathematical computations, data analysis, image processing, and real-time processing can significantly benefit from parallel execution.
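A minimal sketch of the distinction in Python: asyncio interleaves I/O-bound tasks on a single thread (concurrency), while multiprocessing spreads CPU-bound work across cores (parallelism). The sleep and the sum are stand-ins for real network calls and real computation.

```python
import asyncio
import multiprocessing as mp
import time

async def fetch(i):
    await asyncio.sleep(1)        # stand-in for a network call
    return i

async def concurrent_demo():
    # Ten tasks overlap in time on one thread: ~1s total, not ~10s.
    return await asyncio.gather(*(fetch(i) for i in range(10)))

def burn(n):
    # Stand-in for a CPU-bound computation.
    return sum(i * i for i in range(n))

if __name__ == "__main__":
    t = time.perf_counter()
    asyncio.run(concurrent_demo())
    print(f"concurrent I/O took {time.perf_counter() - t:.2f}s")

    # Parallelism needs multiple cores: one worker process per core.
    with mp.Pool() as pool:
        results = pool.map(burn, [10_000_000] * 4)
    print("parallel CPU work done:", len(results))
```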
How does Docker work? Is Docker still relevant?
Docker's architecture comprises three main components:
🔹 Docker Client: This is the interface through which users interact. It communicates with the Docker daemon.
🔹 Docker Host: Here, the Docker daemon listens for Docker API requests and manages various Docker objects, including images, containers, networks, and volumes.
🔹 Docker Registry: This is where Docker images are stored. Docker Hub, for instance, is a widely used public registry.
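All three components show up in a few lines with the Docker SDK for Python (a sketch assuming the docker package is installed and a local daemon is running): the client sends API requests to the daemon, which pulls the image from a registry and runs the container.

```python
import docker

# The client talks to the Docker daemon over its API socket.
client = docker.from_env()

# The daemon pulls the image from a registry (Docker Hub by default)...
client.images.pull("hello-world")

# ...then creates and runs a container from it, returning its output.
output = client.containers.run("hello-world", remove=True)
print(output.decode())
```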
Explaining JSON Web Token (JWT) in simple terms.
Imagine you have a special box called a JWT. Inside this box, there are three parts: a header, a payload, and a signature.
The header is like the label on the outside of the box. It tells us what type of box it is and how it's secured. It's usually written in a format called JSON, which is just a way to organize information using curly braces { } and colons : .
The payload is like the actual message or information you want to send. It could be your name, age, or any other data you want to share. It's also written in JSON format, so it's easy to understand and work with.
Now, the signature is what makes the JWT secure. It's like a special seal that only the sender knows how to create. The signature is created using a secret code, kind of like a password. This signature ensures that nobody can tamper with the contents of the JWT without the sender knowing about it.
When you want to send the JWT to a server, you put the header, payload, and signature inside the box. Then you send it over to the server. The server can easily read the header and payload to understand who you are and what you want to do.
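Here is a sketch of the three parts using only the Python standard library, assuming the common HS256 (HMAC-SHA256) signing scheme; the secret and the payload fields are made up.

```python
import base64
import hashlib
import hmac
import json

def b64url(data: bytes) -> str:
    # JWTs use base64url encoding without '=' padding.
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode()

secret = b"my-secret-key"   # made-up secret, known only to the sender/server

# 1. Header: the label on the box (token type + signing algorithm).
header = b64url(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())

# 2. Payload: the message inside the box.
payload = b64url(json.dumps({"name": "Alice", "age": 30}).encode())

# 3. Signature: the seal, an HMAC over "header.payload" with the secret.
signature = b64url(hmac.new(secret, f"{header}.{payload}".encode(),
                            hashlib.sha256).digest())

token = f"{header}.{payload}.{signature}"
print(token)

# The server recomputes the seal; any tampering makes the check fail.
expected = b64url(hmac.new(secret, f"{header}.{payload}".encode(),
                           hashlib.sha256).digest())
assert hmac.compare_digest(signature, expected)
```

In practice, a library such as PyJWT handles this encoding and verification for you.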
Improving API Performance with Database Connection Pooling
The diagram below shows 5 common API optimization techniques. Today, I'll focus on number 5, connection pooling. For some languages, it is not as trivial to implement as it sounds.
When fulfilling API requests, we often need to query the database. Opening a new connection for every API call adds overhead. Connection pooling helps avoid this penalty by reusing connections.
**How Connection Pooling Works**
- For each API server, establish a pool of database connections at startup.
- Workers share these connections, requesting one when needed and returning it afterward (see the sketch below).
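A minimal sketch of this pattern, assuming PostgreSQL and the psycopg2 driver; the DSN, table, and handler are made up for illustration. Note that, per the next subsection, each Python process would need its own such pool.

```python
from psycopg2 import pool

# Create the pool once at API-server startup.
db_pool = pool.ThreadedConnectionPool(
    minconn=2,
    maxconn=10,
    dsn="dbname=app user=app host=localhost",   # hypothetical DSN
)

def handle_request(user_id):
    conn = db_pool.getconn()        # borrow an existing connection
    try:
        with conn.cursor() as cur:
            cur.execute("SELECT name FROM users WHERE id = %s", (user_id,))
            return cur.fetchone()
    finally:
        db_pool.putconn(conn)       # return it to the pool for reuse
```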
**Challenges for Some Languages**
However, setting up connection pooling can be more complex for languages like PHP, Python, and Node.js. These languages handle scale by running multiple processes, each serving a subset of requests.
- In these languages, database connections get tied to each **process**.
- Connections can't be efficiently shared across processes. Each process needs its own pool, wasting resources.
In contrast, languages like Java and Go use threads within a single process to handle requests. Connections are bound at the application level, allowing easy sharing of a centralized pool.
**Connection Pooling Solution**
Tools like PgBouncer work around these challenges by **proxying connections** at the application level.
PgBouncer creates a centralized pool that all processes can access. No matter which process makes the request, PgBouncer efficiently handles the pooling.
At high scale, all languages can benefit from running PgBouncer on a dedicated server. Now the connection pool is shared over the network for all API servers. This conserves finite database connections.
Connection pooling improves efficiency, but its implementation complexity varies across languages.