Relational Databases - kdwivedi1985/system-design GitHub Wiki

Common features and differences among Postgres, Oracle, MySQL, SQLServer, YugabyteDB and CockroachDB:

Feature PostgreSQL MySQL (InnoDB) SQL Server Oracle Database YugabyteDB CockroachDB
License Open-Source (PostgreSQL License) Open Source (GPL) / Commercial Microsoft Oracle Open Source (Apache 2.0) CockroachLabs
Partitioning Manual,Declarative (HASH) Manual No native sharding Yes, RAC (Real Application Cluster) Native, Automatic,Declarative Native, Automatic, Declarative
Distributed 2PC No No Yes, using MSDTC Microsoft Distibuted Transaction Coordinator) Yes Yes Yes
Spatial Features PostGIS Extension Limited. Yes, supports bothGeometry and Geography Extensive Sapatial capabilities Not native but, PostGIS-Compatible Not native but, PostGIS-Compatible
Concurrency Control MVCC + Pessimistic Locking MVCC + Pessimistic Locking Pessimistic Locking (default), row versioning to emulate MVCC if enabled MVCC + Pessimistic Locking Distributed MVCC + OCC Distributed MVCC + OCC
Data Type Support Extensive (JSONB, UUID, etc.) JSON, UUID JSON, XML, UUID JSON, XML, UUID JSON, UUID, Arrays JSON, UUID, Arrays

What is Multi-version concurrency(MVCC)?

  • Multi-version concurrency control (MVCC) is a method used by database systems to manage concurrent access to data, allowing multiple transactions to read and write data simultaneously without blocking each other. It achieves this by maintaining multiple versions of each row, allowing each transaction to see a consistent snapshot of the data at the time it started.
  • It defines how versions are created, used, garbage-collected, and how conflicts are resolved.

What is Row Versioning?

  • It is a mechanism to achieve MVCC. It stores historical versions of rows. Used in SQL Server to achieve MVCC.
  • Row Versioning = Building block for MVCC
  • MVCC = The rules or Full Architecture (create, use, garbage-collected, and resolve conflicts of versioning)

What is Shared Lock?

  • A shared lock is a type of database lock that allows multiple transactions to read the same data simultaneously, but prevents any of them from modifying that data until all shared locks are released. e.g. Most of RDBMS databases use shared lock for read operations e.g. SQLServer, Oracle without MVCC mode.

What is row level Lock?

  • A row-level lock is a type of lock used in most of the databases (like Oracle, Postgres, SQL Server, MySQL with InnoDB, etc.) that locks only the specific row(s) being accessed or modified during a transaction. It ensure data consistency and isolation.
  • A row-level lock allows read operation in Databases which supports MVCC. e.g. MySQL, Oracle, Postgres.

What is table level Lock?

  • A table-level lock is a type of database lock that restricts access to an entire table. It blocks both read and write operations on the table based on lock mode.
  • Generally used in DDL operations (Create, Crop, Alter etc).

What MVCC + OCC?

  • MVCC allows non-blocking reads by maintaining multiple versions.
  • If conflict is detected during commit, it enforce rollback and retry (re-read/update).
  • Each row has timestamp and version. OCC is achieved by matching timestamp and version.
  • Used in Distributed databased like CockroachDB and YugabyteDB
  • e.g. Two processes may read the same original version of the row. When the first transaction updates and commits, PostgreSQL creates a new version of the row with a new transaction ID. The second transaction, when it tries to update and commit, finds that the row version it read is now obsolete (because of the first transaction’s update).

What MVCC + Pessimistic Locking?

  • Allow non-blocking reads by maintaining multiple versions.
  • Locks the transaction/row(s) while being updated. No other process can update while locked.
  • Used in Distributed databased like Postgres and Oracle etc.
  • e.g. e.g. In simultaneous update, the first transaction acquires the lock reads the value, increments, and writes back.The second transaction waits for the lock. Once it acquires the lock, it re-reads the (now updated) value, increments, and writes back.