System Expert - MateuszMazurkiewicz/coding-and-learning GitHub Wiki

Notes:

[3] Client-Server Model

Port (communication endpoint):

22: Secure Shell (SSH)

53: DNS lookup

80: HTTP

443: HTTPS

DNS: Domain Name System - describes the entities and protocols involved in translating domain names to IP addresses. Typically, a machine sends a DNS query to a well-known entity that is responsible for returning the IP address of the requested domain name in the response.

[4] Network Protocols:

IP: Internet Protocol, IP packet (information unit) - header (source & destination IP address), payload (data), max 2^16 bytes

TCP: Transmission Control Protocol, TCP packet is carried inside an IP packet

HTTP: request-response paradigm

[6] Latency And Throughput:

Latency: how long it takes?

read 1 MB from memory: 250 us < read 1 MB from SSD: 1,000 us < send 1 MB over a 1 Gbps network (no distance): 10k us < read 1 MB from HDD: 20k us < round trip of a very small packet CA -> Netherlands -> CA: 150k us

Throughput: how much work can be performed in a given amount of time? e.g. requests per second (RPS)
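The network figure above can be sanity-checked with simple arithmetic. The sketch below assumes 1 MB = 1,000,000 bytes and ignores protocol overhead and distance; `transfer_time_us` is a name made up for this note.

```python
def transfer_time_us(size_bytes: int, bandwidth_bits_per_s: float) -> float:
    """Time to push size_bytes through a link, in microseconds.

    Ignores protocol overhead, congestion and propagation delay.
    """
    bits = size_bytes * 8
    seconds = bits / bandwidth_bits_per_s
    return seconds * 1_000_000


# 1 MB over a 1 Gbps link: 8,000,000 bits / 1,000,000,000 bits per second
one_mb_over_1gbps = transfer_time_us(1_000_000, 1e9)
print(f"{one_mb_over_1gbps:.0f} us")
```

The raw result is about 8,000 us, which the rule of thumb rounds up to 10k us.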

[7] Availability:

System fault tolerance: odds of a server or service being up and running at any point in time

Availability: percentage of uptime per year (in nines)

99% (two nines): 87.7 h down per year

99.9% (three nines): 8.8 h down per year

5 nines (99.999%, about 5.3 min down per year) - gold standard

High Availability - min. 5 nines

SLA: service-level agreement (on service availability)

SLO: service-level objective (an SLA is made of one or more SLOs)

Redundancy might prevent system failures by duplicating parts.

Passive redundancy - all parts work at the same time

Active redundancy - only some parts work while others wait to replace failed ones
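The downtime numbers follow directly from the availability percentage. A minimal sketch, assuming a year of 365.25 days (8,766 hours); the function name is made up for this note.

```python
HOURS_PER_YEAR = 365.25 * 24  # 8766 hours, averaging in leap years


def downtime_hours_per_year(availability_percent: float) -> float:
    """Hours per year a system may be down at the given availability."""
    return HOURS_PER_YEAR * (1 - availability_percent / 100)


for pct in (99.0, 99.9, 99.999):
    hours = downtime_hours_per_year(pct)
    print(f"{pct}% -> {hours:.2f} h/year ({hours * 60:.1f} min/year)")
```

Each extra nine divides the allowed downtime by ten.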

[8] Caching:

Write-through cache: write to cache and DB at the same time

Write-back cache: write to cache, update DB asynchronously
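The two write policies can be contrasted with a toy sketch. It assumes the DB is just a Python dict and models the asynchronous update as an explicit `flush()` call; class and method names are made up for this note.

```python
class WriteThroughCache:
    """Every write goes to the cache and the DB in the same operation."""

    def __init__(self, db: dict):
        self.db = db
        self.cache = {}

    def write(self, key, value):
        self.cache[key] = value
        self.db[key] = value  # DB is always in sync with the cache


class WriteBackCache:
    """Writes go to the cache only; the DB is updated later."""

    def __init__(self, db: dict):
        self.db = db
        self.cache = {}
        self.dirty = set()  # keys written to the cache but not yet to the DB

    def write(self, key, value):
        self.cache[key] = value
        self.dirty.add(key)

    def flush(self):
        # Stand-in for the asynchronous DB update (e.g. a periodic job).
        for key in self.dirty:
            self.db[key] = self.cache[key]
        self.dirty.clear()
```

Write-back gives faster writes, but anything still in `dirty` is lost if the cache dies before a flush.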

Caches can become stale if they are not updated.

Redis as cache

Caching works well when data is static, when consistency is not critical, or when only one agent reads and writes. Otherwise it is tricky.

Eviction policy (limiting/clearing cache): get rid of least recently used / least frequently used / FIFO
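A minimal sketch of the least-recently-used policy, built on `collections.OrderedDict` from the standard library. The class name and capacity are made up for this note; real caches such as Redis use their own implementations.

```python
from collections import OrderedDict


class LRUCache:
    """Keeps at most `capacity` items, evicting the least recently used."""

    def __init__(self, capacity: int):
        self.capacity = capacity
        self.items = OrderedDict()  # oldest use first, newest use last

    def get(self, key):
        if key not in self.items:
            return None
        self.items.move_to_end(key)  # reading counts as a use
        return self.items[key]

    def put(self, key, value):
        self.items[key] = value
        self.items.move_to_end(key)  # writing counts as a use
        if len(self.items) > self.capacity:
            self.items.popitem(last=False)  # drop the least recently used
```

With capacity 2, after `put("a")`, `put("b")`, `get("a")`, `put("c")`, the evicted key is `"b"` because `"a"` was touched more recently.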

[9] Proxies:

Forward Proxy: server between Client and Server that acts on behalf of Client (Client -> Forward Proxy -> Server)

Reverse Proxy: acts on behalf of Server, request from Client goes to Reverse Proxy first and Reverse Proxy forwards it to Server (Client -> Reverse Proxy -> Server)

Reverse Proxy can distribute load between servers

(!) Nginx

[10] Load Balancers:

Hardware/Software Load balancers - software is more flexible

Server Selection Strategy: random / round robin / weighted round robin / based on load or performance / IP based routing - hash IP addresses of clients and use the hash to choose a server (useful with caches) / path based (servers have roles)
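Two of these strategies, round robin and IP based routing, can be sketched in a few lines. The server names are hypothetical, and SHA-256 is just one reasonable choice of hash.

```python
import hashlib
import itertools

SERVERS = ["server-a", "server-b", "server-c"]  # hypothetical server pool

_rotation = itertools.cycle(SERVERS)


def round_robin() -> str:
    """Return servers one after another, wrapping around at the end."""
    return next(_rotation)


def ip_hash(client_ip: str) -> str:
    """Map a client IP to a server; the same IP always gets the same server."""
    digest = hashlib.sha256(client_ip.encode()).digest()
    index = int.from_bytes(digest[:8], "big") % len(SERVERS)
    return SERVERS[index]
```

IP hashing keeps a client on one server, so that server's cache stays warm for the client. The simple modulo breaks down when the pool changes size, which is what consistent and rendezvous hashing address.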

Use multiple load balancers for various parts of system and for redundancy

[11] Hashing:

Consistent hashing: https://www.toptal.com/big-data/consistent-hashing
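A rough sketch of a consistent hash ring, assuming SHA-256 for placement and 100 virtual nodes per server. Both are arbitrary choices for this note, and `HashRing` is a made-up name.

```python
import bisect
import hashlib


def _position(key: str) -> int:
    """Place a string on the ring as a 64-bit integer."""
    return int.from_bytes(hashlib.sha256(key.encode()).digest()[:8], "big")


class HashRing:
    def __init__(self, servers, replicas: int = 100):
        self.replicas = replicas  # virtual nodes per server, smooths the load
        self._ring = []           # sorted list of (position, server)
        for server in servers:
            self.add(server)

    def add(self, server: str):
        for i in range(self.replicas):
            bisect.insort(self._ring, (_position(f"{server}#{i}"), server))

    def remove(self, server: str):
        self._ring = [(p, s) for p, s in self._ring if s != server]

    def lookup(self, key: str) -> str:
        """Walk clockwise from the key's position to the next server point."""
        if not self._ring:
            raise LookupError("ring is empty")
        idx = bisect.bisect_left(self._ring, (_position(key), ""))
        return self._ring[idx % len(self._ring)][1]  # wrap around at the end
```

Removing a server only remaps the keys that pointed at it. Every other key keeps its server.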

Rendezvous hashing: a score is computed for each (key, server) pair and the key goes to the server with the highest score; allows for minimal re-distribution of mappings when a server goes down
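Rendezvous (highest random weight) hashing fits in a few lines. The sketch assumes SHA-256 as the scoring hash; function names are made up for this note.

```python
import hashlib


def score(server: str, key: str) -> int:
    """Pseudo-random but deterministic score for a (server, key) pair."""
    digest = hashlib.sha256(f"{server}:{key}".encode()).digest()
    return int.from_bytes(digest[:8], "big")


def pick_server(servers, key: str) -> str:
    """The key goes to the server with the highest score for that key."""
    return max(servers, key=lambda server: score(server, key))
```

When a server goes down, only the keys for which it had the top score move, each to its second-ranked server. The cost is computing one score per server on every lookup.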

[12] Relational Databases:

DB Index: auxiliary data structure that allows the DB to perform certain queries faster. Usually faster reads, but slower writes because the index must also be updated

Strong Consistency: ACID transactions
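The effect of an index shows up in the query plan. The sketch below uses SQLite from the Python standard library as a stand-in (not Postgres); table, column and index names are made up for this note.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT)")
conn.executemany(
    "INSERT INTO users (email) VALUES (?)",
    [(f"user{i}@example.com",) for i in range(1000)],
)


def query_plan(sql: str, params=()) -> str:
    """Return SQLite's description of how it would run the query."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return " ".join(row[-1] for row in rows)  # last column holds the detail text


QUERY = "SELECT id FROM users WHERE email = ?"
plan_before = query_plan(QUERY, ("user42@example.com",))  # full table scan

conn.execute("CREATE INDEX idx_users_email ON users (email)")
plan_after = query_plan(QUERY, ("user42@example.com",))   # lookup via the index

print(plan_before)
print(plan_after)
```

The exact plan wording depends on the SQLite version, but the second plan should name `idx_users_email` while the first cannot.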

Eventual Consistency: state of the database will eventually reflect writes within a time period

(!) Postgres

[13] Key-Value Stores:

NoSQL database that's often used for caching and dynamic configuration. DynamoDB, etcd, Redis (in-memory), ZooKeeper

[14] Specialized Storage Paradigms:

Blob Storage: GCS, S3

Time Series DB: InfluxDB, Prometheus

Graph DB: multiple levels of relationships in data (naturally forms graph)

Spatial DB: optimized for storing and querying data like locations on map (Quadtree)

[15] Replication and Sharding:

Replication: failure tolerance / decreasing latency

Sharding: data partitioning - splitting db into two or more pieces to increase throughput of db. Sharding based on region / type of data stored / hash of column
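Sharding by hash of a column can be sketched as below. The shard count and the `user_id` column are assumptions for this note, and SHA-256 is one of many usable hash functions.

```python
import hashlib
from collections import Counter

NUM_SHARDS = 4  # hypothetical number of DB shards


def shard_for(user_id: str) -> int:
    """Pick a shard from a stable hash of the sharding key."""
    digest = hashlib.sha256(user_id.encode()).digest()
    return int.from_bytes(digest[:8], "big") % NUM_SHARDS


# How evenly do 10,000 made-up user ids spread across the shards?
counts = Counter(shard_for(f"user-{i}") for i in range(10_000))
print(sorted(counts.items()))
```

A good hash spreads keys roughly evenly. A skewed key (for example sharding by country when most users are in one country) would concentrate traffic on a single shard.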

Hot Spot: when some servers receive a lot more traffic than others because the sharding key or hashing function is suboptimal

[16] Leader Election:

Paxos & Raft: consensus algorithms used for leader election

ZooKeeper, etcd: used for leader election

[17] Peer-To-Peer Networks:

Distributed Hash Table - maps data chunk ids to peers' IP addresses

Gossip Protocol vs Tracker - peers share information with each other in an uncoordinated way vs looking it up in a central DB

[18] Polling And Streaming

[19] Configuration

[20] Rate Limiting

[21] Logging And Monitoring

Grafana / Prometheus

[22] Publish/Subscribe Pattern

[25] API Design

Swagger

Pagination - offset, limit
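A minimal sketch of offset/limit pagination over an in-memory list. The response shape (`items`, `next_offset`) is an assumption for this note, not a standard.

```python
def paginate(items: list, offset: int = 0, limit: int = 10) -> dict:
    """Return one page of items plus the offset of the next page, if any."""
    if offset < 0 or limit < 1:
        raise ValueError("offset must be >= 0 and limit must be >= 1")
    end = offset + limit
    return {
        "items": items[offset:end],
        "offset": offset,
        "limit": limit,
        # None tells the client there is nothing left to fetch
        "next_offset": end if end < len(items) else None,
    }
```

A client keeps requesting with `offset=next_offset` until `next_offset` is `None`.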

Create, Read, Update, Delete, (List)