Database part2 - rahul7838/quora.clone GitHub Wiki

Isolation level (@Transactioanl)

Here are the 4 standard database transaction isolation levels, from lowest to highest isolation:

  1. Read Uncommitted
  • Allows dirty reads (reading uncommitted changes)
  • No protection against any concurrency issues
  1. Read Committed
  • Prevents dirty reads
  • Each query sees only committed data
  • Default in many databases like PostgreSQL
  1. Repeatable Read
  • Prevents dirty and non-repeatable reads
  • Same row returns same data in transaction
  • May allow phantom reads
  1. Serializable
  • Highest isolation level
  • Prevents all concurrency phenomena
  • Transactions execute as if sequential
  • Lowest performance due to strict locking

@Transactional serves multiple purposes:

  1. Atomicity (ACID):
@Transactional
public void transferMoney() {
    account1.debit(100);
    account2.credit(100); // If this fails, debit is rolled back
}
  1. Connection/Resource Management:
  • Acquires and releases database connections
  • Manages transaction boundaries
  • Handles session context
  1. Isolation Level Control:
@Transactional(isolation = Isolation.READ_COMMITTED)
public void sensitiveOperation() {
    // Controls how concurrent transactions interact
}
  1. Propagation Behavior:
@Transactional(propagation = Propagation.REQUIRES_NEW)
public void newTransaction() {
    // Forces new transaction even if called within existing one
}
  1. Exception Handling:
@Transactional(rollbackFor = CustomException.class)
public void operation() {
    // Specifies which exceptions trigger rollback
}

General database questions

  1. Can single db connection can make multiple request concurrently.

No, a single database connection cannot process multiple queries simultaneously. Each connection can only handle one query at a time. For concurrent queries, you need multiple connections from the connection pool:

// Two concurrent requests
@Transactional
public void request1() {
    // Uses connection 1
    userRepository.findById(1);
}

@Transactional
public void request2() {
    // Uses connection 2
    userRepository.findById(2); 
}

This is why connection pools maintain multiple connections - to handle concurrent requests from different users.

  1. Transaction Propagation

Transaction propagation controls how database transactions behave when one transactional method calls another. Here's why it matters:

Consider two database operations:

@Transactional
void methodA() {
    saveCustomer();
    methodB(); // What happens to the transaction here?
}

@Transactional
void methodB() {
    saveOrder();
}

Without propagation rules, you'd face questions like:

  • Should methodB run in methodA's transaction?
  • Should it start a new transaction?
  • What if methodA's transaction fails?

Propagation answers these questions:

  • REQUIRED: methodB joins methodA's transaction
  • REQUIRES_NEW: methodB starts fresh transaction
  • SUPPORTS: methodB uses methodA's transaction if it exists
  • MANDATORY: methodB must have a transaction from methodA

This ensures data consistency and proper transaction management across method calls.

  1. Are we using http or something else for db communication from spring boot.?

Database communication between Spring Boot and MySQL uses the MySQL Protocol, not HTTP. This is a TCP-based protocol specifically designed for client-server database communication.

The connection flow works like this:

  1. Application opens TCP connection to MySQL's port (default 3306)
  2. MySQL Protocol handles authentication
  3. Commands/queries are sent via MySQL Protocol
  4. JDBC driver (mysql-connector-j) translates between Java and MySQL Protocol
  5. Results return through same protocol

Connection Example:

spring.datasource.url=jdbc:mysql://localhost:3306/dbname

The "jdbc:mysql://" prefix indicates MySQL Protocol usage rather than HTTP.

4.### Comparing DB connection reuse and creation

Let's break down what happens when a connection is created:

TCP connection establishment (3-way handshake) Database authentication Session state initialization Security context setup Memory allocation both on application and database side

This process typically takes between 10-50 milliseconds, which might not seem like much, but it adds up significantly under load

Connection Acquisition Time:

Under 1ms: Excellent (reusing pooled connection) 1-5ms: Good (minimal contention) 5-10ms: Warning (pool saturation)

10ms: Critical (connection starvation)

Number of Thread required for 1000 request per second.

Your calculations are on the right track, but let’s clarify and refine the logic.

Request Handling Breakdown

  1. Requests per second: 1000 requests/second.
  2. Processing time per request: 300ms = 0.3 seconds.

This implies that 1 thread can process ( \frac{1}{0.3} \approx 3.33 ) requests per second.

Threads Needed To handle 1000 requests per second: [ \text{Threads required} = \frac{1000 \text{ requests/second}}{3.33 \text{ requests/thread/second}} \approx 300 \text{ threads}. ]

So, you need approximately 300 threads to handle the load.


Core Utilization Modern CPUs can run multiple threads per core, but typically:

  • 1 core can handle ~1 thread at 100% CPU utilization (for CPU-bound tasks).
  • For I/O-bound tasks (like database queries), threads spend some time waiting, so more threads per core can be managed efficiently.

Assume a Thread per Core ratio:

  1. For CPU-bound tasks: 1 thread/core.
  2. For I/O-bound tasks: ( 4-6 \text{ threads/core} ).

Since database interactions are usually I/O-bound, a single core can handle more than 1 thread. Let’s assume 5 threads per core for I/O-bound tasks.

[ \text{Cores needed} = \frac{\text{Threads required}}{\text{Threads per core}} = \frac{300}{5} = 60 \text{ cores}. ]


Database Connection Pool If each thread requires a database connection and all 300 threads are actively handling requests, you’d need ~300 database connections in your connection pool. However:

  • Connection pooling often reuses connections, so you may not need exactly 300 active connections simultaneously unless all threads are fully utilized.

Consider starting with 200-300 connections in the pool and monitor performance under load.


Horizontal Scaling If a single machine with 60 cores is not feasible, you can horizontally scale your application by running multiple instances.

  1. Suppose each instance has 4 cores:

    • Threads per instance = ( 4 \times 5 = 20 \text{ threads}).
    • Requests per instance = ( 20 \times 3.33 \approx 66 \text{ requests/second}).
  2. Total instances needed: [ \text{Instances needed} = \frac{\text{Total requests/second}}{\text{Requests per instance}} = \frac{1000}{66} \approx 15. ]

You’d need approximately 15 instances to handle 1000 requests/second.


Optimizations to Consider

  1. Asynchronous Processing: Use asynchronous frameworks like Spring WebFlux to reduce the number of threads required, especially for I/O-bound operations.
  2. Caching: Reduce the load on the database by caching frequently accessed data.
  3. Load Testing: Simulate your target traffic and observe actual performance metrics to fine-tune thread counts, connection pools, and scaling needs.