System design

Source of information

The material has been gathered from LinkedIn posts by my favorite authors. The listed authors retain all rights to their work.

ByteByteGo

https://www.linkedin.com/company/bytebytego/

Top 5 Kafka use cases

Kafka was originally built for massive log processing. It retains messages until expiration and lets consumers pull messages at their own pace.

Let's review the popular Kafka use cases:

  1. Log processing and analysis
  2. Data streaming in recommendations
  3. System monitoring and alerting
  4. CDC (change data capture)
  5. System migration
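
As a rough sketch of this pull model using the kafka-python client (the broker address, topic, and consumer group are assumptions for the example):

```python
from kafka import KafkaProducer, KafkaConsumer  # pip install kafka-python

# Producer appends log events to a topic; Kafka retains them until expiration.
producer = KafkaProducer(bootstrap_servers="localhost:9092")
producer.send("app-logs", b"user 42 logged in")
producer.flush()

# Consumer pulls messages at its own pace; auto_offset_reset="earliest"
# replays everything still retained on the topic.
consumer = KafkaConsumer(
    "app-logs",
    bootstrap_servers="localhost:9092",
    group_id="log-analyzer",
    auto_offset_reset="earliest",
)
for message in consumer:
    print(message.value.decode())
```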


12-Factor App

The "12 Factor App" offers a set of best practices for building modern software applications. Following these 12 principles can help developers and teams in building reliable, scalable, and manageable applications.

Here's a brief overview of each principle:

Codebase: Have one place to keep all your code, and manage it using version control like Git.

Dependencies: List all the things your app needs to work properly, and make sure they're easy to install.

Config: Keep important settings like database credentials separate from your code, so you can change them without rewriting code (see the config sketch after this list).

Backing Services: Use other services (like databases or payment processors) as separate components that your app connects to.

Build, Release, Run: Make a clear distinction between preparing your app, releasing it, and running it in production.

Processes: Run your app as stateless processes that don't rely on a particular machine's memory or disk, like LEGO blocks that can be swapped in and out.

Port Binding: Make your app self-contained and expose its services by binding to a network port, rather than relying on an external web server.

Concurrency: Make your app able to handle more work by adding more copies of the same thing, like hiring more workers for a busy restaurant.

Disposability: Your app should start quickly and shut down gracefully, like turning off a light switch instead of yanking out the power cord.

Dev/Prod Parity: Ensure that what you use for developing your app is very similar to what you use in production, to avoid surprises.

Logs: Keep a record of what happens in your app so you can understand and fix issues, like a diary for your software.

Admin Processes: Run special tasks separately from your app, like doing maintenance work in a workshop instead of on the factory floor.
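
As one concrete example, here is the Config principle in a minimal Python sketch; the variable names are hypothetical, and the point is that settings come from the environment, not the code:

```python
import os

# Factor III (Config): credentials and settings live in the environment,
# so they can change per deployment without touching the code.
DATABASE_URL = os.environ["DATABASE_URL"]           # fail fast if unset
DEBUG = os.environ.get("DEBUG", "false") == "true"  # safe default
PORT = int(os.environ.get("PORT", "8080"))          # also used for port binding (factor VII)
```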


High availability, high scalability, and high throughput

The diagram below is a system design cheat sheet with common solutions.

High Availability This means we need to ensure an agreed level of uptime. We often describe the design target as "3 nines" or "4 nines". "4 nines", 99.99% uptime, means the service can only be down 8.64 seconds per day (86,400 seconds × 0.0001). To achieve high availability, we need to design redundancy into the system. There are several ways to do this:

Hot-hot: two instances receive the same input and send the output to the downstream service. In case one side is down, the other side can immediately take over. Since both sides send output to the downstream, the downstream system needs to dedupe.

Hot-warm: two instances receive the same input and only the hot side sends the output to the downstream service. In case the hot side is down, the warm side takes over and starts to send output to the downstream service.

Single-leader cluster: one leader instance receives data from the upstream system and replicates it to the other replicas.

Leaderless cluster: there is no leader in this type of cluster. Any write gets replicated to multiple instances. As long as the number of replicas that acknowledge a write (W) plus the number of replicas consulted on a read (R) is larger than the total number of replicas (N), the read and write sets overlap and we should get valid data.
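
This leaderless condition is the classic quorum rule W + R > N. A minimal sketch, with the replica counts chosen purely for illustration:

```python
# With N replicas, W write acknowledgements, and R read responses,
# W + R > N guarantees the read set and write set overlap in at least
# one replica, so a read sees the latest acknowledged write.
def quorum_ok(n: int, w: int, r: int) -> bool:
    return w + r > n

print(quorum_ok(n=3, w=2, r=2))  # True: a common Dynamo-style configuration
print(quorum_ok(n=3, w=1, r=1))  # False: a read can miss the latest write
```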

High Throughput This means the service needs to handle a high number of requests in a given period of time. Commonly used metrics are QPS (queries per second) and TPS (transactions per second). To achieve high throughput, we often add caches to the architecture so that requests can return without hitting slower I/O devices like databases or disks. We can also increase the number of threads for computation-intensive tasks; however, adding too many threads can degrade performance, so we need to identify the bottlenecks in the system and increase their throughput. Asynchronous processing can often effectively isolate heavy-lifting components.
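
One common caching pattern, sketched minimally in Python (the sleep stands in for a slow database or disk read):

```python
import time
from functools import lru_cache

@lru_cache(maxsize=1024)
def get_user_profile(user_id: int) -> str:
    time.sleep(0.05)  # stand-in for a slow database or disk read
    return f"profile-{user_id}"

get_user_profile(42)  # first call: pays the slow I/O cost
get_user_profile(42)  # repeat call: served from the in-process cache, no I/O
```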

High Scalability This means a system can quickly and easily extend to accommodate more volume (horizontal scalability) or more functionality (vertical scalability). Normally we watch the response time to decide whether we need to scale the system.


Data Pipelines Overview

Data pipelines are a fundamental component of managing and processing data efficiently within modern systems. These pipelines typically encompass 5 predominant phases: Collect, Ingest, Store, Compute, and Consume.

  1. Collect: Data is acquired from data stores, data streams, and applications, sourced remotely from devices, applications, or business systems.

  2. Ingest: During the ingestion process, data is loaded into systems and organized within event queues.

  3. Store: After ingestion, the organized data is stored in data warehouses, data lakes, or data lakehouses, along with various systems like operational databases.

  4. Compute: Data undergoes aggregation, cleansing, and manipulation to conform to company standards, including tasks such as format conversion, data compression, and partitioning. This phase employs both batch and stream processing techniques.

  5. Consume: Processed data is made available for consumption through analytics and visualization tools, operational data stores, decision engines, user-facing applications, dashboards, data science, machine learning services, business intelligence, and self-service analytics.

The efficiency and effectiveness of each phase contribute to the overall success of data-driven operations within an organization.
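
As a toy illustration of the five phases in plain Python (the events and field names are made up):

```python
import json

raw = ['{"user": "a", "amount": 10}',
       '{"user": "b", "amount": 25}']              # 1. Collect: acquire raw events
queue = list(raw)                                  # 2. Ingest: place events on a queue
store = [json.loads(event) for event in queue]     # 3. Store: keep structured records
total = sum(record["amount"] for record in store)  # 4. Compute: aggregate/cleanse
print(f"Total amount: {total}")                    # 5. Consume: report or dashboard
```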


REST API Design


What is GraphQL? Is it a replacement for the REST API?

The diagram below shows the quick comparison between REST and GraphQL.

🔹 GraphQL is a query language for APIs developed by Meta. It provides a complete description of the data in the API and gives clients the power to ask for exactly what they need.

🔹 GraphQL servers sit in between the client and the backend services.

🔹 GraphQL can aggregate multiple REST requests into one query. The GraphQL server organizes the resources in a graph.

🔹 GraphQL supports queries, mutations (applying data modifications to resources), and subscriptions (receiving real-time notifications when data changes).
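
For example, a single GraphQL query can replace separate REST calls such as GET /users/42 and GET /users/42/orders. A minimal sketch with Python's requests library; the endpoint and schema are hypothetical:

```python
import requests

# One query fetches the user and their orders in a single round trip.
query = """
query {
  user(id: 42) {
    name
    orders { id total }
  }
}
"""
response = requests.post("https://api.example.com/graphql", json={"query": query})
print(response.json()["data"]["user"])
```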


9 best practices for developing microservices

When we develop microservices, we need to follow these best practices:

  1. Use separate data storage for each microservice
  2. Keep code at a similar level of maturity
  3. Use a separate build for each microservice
  4. Assign each microservice a single responsibility
  5. Deploy into containers
  6. Design stateless services
  7. Adopt domain-driven design
  8. Design micro frontends
  9. Orchestrate microservices


Concurrency is ๐๐Ž๐“ parallelism.

In system design, it is important to understand the difference between concurrency and parallelism.

As Rob Pike (one of the creators of Go) stated: "Concurrency is about dealing with lots of things at once. Parallelism is about doing lots of things at once." This distinction emphasizes that concurrency is more about the design of a program, while parallelism is about the execution.

Concurrency is about dealing with multiple things at once. It involves structuring a program to handle multiple tasks simultaneously, where the tasks can start, run, and complete in overlapping time periods, but not necessarily at the same instant.

Concurrency is about the composition of independently executing processes and describes a program's ability to manage multiple tasks by making progress on them without necessarily completing one before it starts another.

Parallelism, on the other hand, refers to the simultaneous execution of multiple computations. It is the technique of running two or more tasks or computations at the same time, utilizing multiple processors or cores within a computer to perform several operations concurrently. Parallelism requires hardware with multiple processing units, and its primary goal is to increase the throughput and computational speed of a system.

In practical terms, concurrency enables a program to remain responsive to input, perform background tasks, and handle multiple operations in a seemingly simultaneous manner, even on a single-core processor. It's particularly useful in I/O-bound and high-latency operations where programs need to wait for external events, such as file, network, or user interactions.

Parallelism, with its ability to perform multiple operations at the same time, is crucial in CPU-bound tasks where computational speed and throughput are the bottlenecks. Applications that require heavy mathematical computations, data analysis, image processing, and real-time processing can significantly benefit from parallel execution.
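
A minimal Python sketch of the distinction: threads overlap I/O-bound waiting (concurrency, even under CPython's GIL), while separate processes run CPU-bound work simultaneously (parallelism):

```python
import time
from concurrent.futures import ThreadPoolExecutor, ProcessPoolExecutor

def io_task(i: int) -> int:
    time.sleep(1)                  # I/O-bound: mostly waiting
    return i

def cpu_task(i: int) -> int:
    return sum(range(10_000_000))  # CPU-bound: pure computation

if __name__ == "__main__":
    # Concurrency: four waits overlap, finishing in ~1s even on one core.
    with ThreadPoolExecutor(max_workers=4) as executor:
        list(executor.map(io_task, range(4)))

    # Parallelism: four computations run at the same time on multiple cores.
    with ProcessPoolExecutor(max_workers=4) as executor:
        list(executor.map(cpu_task, range(4)))
```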


How does Docker work? Is Docker still relevant?

Docker's architecture comprises three main components:

🔹 Docker Client: This is the interface through which users interact. It communicates with the Docker daemon.

🔹 Docker Host: Here, the Docker daemon listens for Docker API requests and manages various Docker objects, including images, containers, networks, and volumes.

🔹 Docker Registry: This is where Docker images are stored. Docker Hub, for instance, is a widely used public registry.
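
The three components map directly onto code. A minimal sketch using the Docker SDK for Python (pip install docker), assuming a local Docker daemon is running:

```python
import docker

client = docker.from_env()                 # Docker client: talks to the daemon over the Docker API
client.images.pull("alpine", tag="latest") # daemon pulls the image from the registry (Docker Hub)

# The daemon creates and runs a container from the image.
output = client.containers.run("alpine", ["echo", "hello from a container"], remove=True)
print(output.decode())
```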


Explaining JSON Web Token (JWT) with simple terms.

Imagine you have a special box called a JWT. Inside this box, there are three parts: a header, a payload, and a signature.

The header is like the label on the outside of the box. It tells us what type of box it is and how it's secured. It's usually written in a format called JSON, which is just a way to organize information using curly braces and colons.

The payload is like the actual message or information you want to send. It could be your name, age, or any other data you want to share. It's also written in JSON format, so it's easy to understand and work with.

Now, the signature is what makes the JWT secure. It's like a special seal that only the sender knows how to create. The signature is created using a secret code, kind of like a password. This signature ensures that nobody can tamper with the contents of the JWT without the sender knowing about it.

When you want to send the JWT to a server, you put the header, payload, and signature inside the box and send it over. The server can easily read the header and payload to understand who you are and what you want to do, and it recomputes the signature with the shared secret to verify that nothing has been tampered with.
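
To make the three parts concrete, here is a minimal hand-rolled sketch using only Python's standard library; the secret and claims are made up, and a real system should use a vetted library such as PyJWT:

```python
import base64, hashlib, hmac, json

def b64url(data: bytes) -> str:
    # JWTs use URL-safe base64 with the trailing '=' padding stripped.
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode()

secret = b"my-secret-key"                # known only to the sender/verifier (made up)
header = {"alg": "HS256", "typ": "JWT"}  # the label on the box
payload = {"sub": "alice", "age": 30}    # the message inside

signing_input = b64url(json.dumps(header).encode()) + "." + b64url(json.dumps(payload).encode())
signature = b64url(hmac.new(secret, signing_input.encode(), hashlib.sha256).digest())
token = signing_input + "." + signature  # header.payload.signature

# Verification: recompute the seal and compare; any tampering breaks the match.
expected = b64url(hmac.new(secret, signing_input.encode(), hashlib.sha256).digest())
assert hmac.compare_digest(signature, expected)
print(token)
```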


Improving API Performance with Database Connection Pooling

The diagram below shows 5 common API optimization techniques. Today, I'll focus on number 5, connection pooling. For some languages, it is not as trivial to implement as it sounds.

When fulfilling API requests, we often need to query the database. Opening a new connection for every API call adds overhead. Connection pooling helps avoid this penalty by reusing connections.

๐—›๐—ผ๐˜„ ๐—–๐—ผ๐—ป๐—ป๐—ฒ๐—ฐ๐˜๐—ถ๐—ผ๐—ป ๐—ฃ๐—ผ๐—ผ๐—น๐—ถ๐—ป๐—ด ๐—ช๐—ผ๐—ฟ๐—ธ๐˜€

  1. For each API server, establish a pool of database connections at startup.
  2. Workers share these connections, requesting one when needed and returning it after.
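
A minimal sketch of this pattern using psycopg2's built-in pool (pip install psycopg2-binary); the DSN, pool sizes, and query are placeholders:

```python
from psycopg2 import pool

# Created once at API-server startup and shared by all workers.
db_pool = pool.SimpleConnectionPool(
    minconn=2,
    maxconn=10,
    dsn="dbname=app user=api password=secret host=localhost",
)

def handle_request(user_id: int):
    conn = db_pool.getconn()   # borrow an existing connection, no new handshake
    try:
        with conn.cursor() as cur:
            cur.execute("SELECT name FROM users WHERE id = %s", (user_id,))
            return cur.fetchone()
    finally:
        db_pool.putconn(conn)  # return it to the pool for reuse
```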

Challenges for Some Languages

However, setting up connection pooling can be more complex for languages like PHP, Python, and Node.js. These languages handle scale by having multiple processes, each serving a subset of requests.

  • In these languages, database connections get tied to each process.
  • Connections can't be efficiently shared across processes. Each process needs its own pool, wasting resources.

In contrast, languages like Java and Go use threads within a single process to handle requests. Connections are bound at the application level, allowing easy sharing of a centralized pool.

Connection Pooling Solution

Tools like PgBouncer work around these challenges by proxying connections at the application level.

PgBouncer creates a centralized pool that all processes can access. No matter which process makes the request, PgBouncer efficiently handles the pooling.

At high scale, all languages can benefit from running PgBouncer on a dedicated server. Now the connection pool is shared over the network for all API servers. This conserves finite database connections.

Connection pooling improves efficiency, but its implementation complexity varies across languages.
