Database part2 - rahul7838/quora.clone GitHub Wiki
Isolation level (@Transactioanl)
Here are the 4 standard database transaction isolation levels, from lowest to highest isolation:
- Read Uncommitted
- Allows dirty reads (reading uncommitted changes)
- No protection against any concurrency issues
- Read Committed
- Prevents dirty reads
- Each query sees only committed data
- Default in many databases like PostgreSQL
- Repeatable Read
- Prevents dirty and non-repeatable reads
- Same row returns same data in transaction
- May allow phantom reads
- Serializable
- Highest isolation level
- Prevents all concurrency phenomena
- Transactions execute as if sequential
- Lowest performance due to strict locking
@Transactional serves multiple purposes:
- Atomicity (ACID):
@Transactional
public void transferMoney() {
account1.debit(100);
account2.credit(100); // If this fails, debit is rolled back
}
- Connection/Resource Management:
- Acquires and releases database connections
- Manages transaction boundaries
- Handles session context
- Isolation Level Control:
@Transactional(isolation = Isolation.READ_COMMITTED)
public void sensitiveOperation() {
// Controls how concurrent transactions interact
}
- Propagation Behavior:
@Transactional(propagation = Propagation.REQUIRES_NEW)
public void newTransaction() {
// Forces new transaction even if called within existing one
}
- Exception Handling:
@Transactional(rollbackFor = CustomException.class)
public void operation() {
// Specifies which exceptions trigger rollback
}
General database questions
-
Can single db connection can make multiple request concurrently.
No, a single database connection cannot process multiple queries simultaneously. Each connection can only handle one query at a time. For concurrent queries, you need multiple connections from the connection pool:
// Two concurrent requests
@Transactional
public void request1() {
// Uses connection 1
userRepository.findById(1);
}
@Transactional
public void request2() {
// Uses connection 2
userRepository.findById(2);
}
This is why connection pools maintain multiple connections - to handle concurrent requests from different users.
-
Transaction Propagation
Transaction propagation controls how database transactions behave when one transactional method calls another. Here's why it matters:
Consider two database operations:
@Transactional
void methodA() {
saveCustomer();
methodB(); // What happens to the transaction here?
}
@Transactional
void methodB() {
saveOrder();
}
Without propagation rules, you'd face questions like:
- Should methodB run in methodA's transaction?
- Should it start a new transaction?
- What if methodA's transaction fails?
Propagation answers these questions:
- REQUIRED: methodB joins methodA's transaction
- REQUIRES_NEW: methodB starts fresh transaction
- SUPPORTS: methodB uses methodA's transaction if it exists
- MANDATORY: methodB must have a transaction from methodA
This ensures data consistency and proper transaction management across method calls.
-
Are we using http or something else for db communication from spring boot.?
Database communication between Spring Boot and MySQL uses the MySQL Protocol, not HTTP. This is a TCP-based protocol specifically designed for client-server database communication.
The connection flow works like this:
- Application opens TCP connection to MySQL's port (default 3306)
- MySQL Protocol handles authentication
- Commands/queries are sent via MySQL Protocol
- JDBC driver (mysql-connector-j) translates between Java and MySQL Protocol
- Results return through same protocol
Connection Example:
spring.datasource.url=jdbc:mysql://localhost:3306/dbname
The "jdbc:mysql://" prefix indicates MySQL Protocol usage rather than HTTP.
4.### Comparing DB connection reuse and creation
Let's break down what happens when a connection is created:
TCP connection establishment (3-way handshake) Database authentication Session state initialization Security context setup Memory allocation both on application and database side
This process typically takes between 10-50 milliseconds, which might not seem like much, but it adds up significantly under load
Connection Acquisition Time:
Under 1ms: Excellent (reusing pooled connection) 1-5ms: Good (minimal contention) 5-10ms: Warning (pool saturation)
10ms: Critical (connection starvation)
Number of Thread required for 1000 request per second.
Your calculations are on the right track, but let’s clarify and refine the logic.
Request Handling Breakdown
- Requests per second: 1000 requests/second.
- Processing time per request: 300ms = 0.3 seconds.
This implies that 1 thread can process ( \frac{1}{0.3} \approx 3.33 ) requests per second.
Threads Needed To handle 1000 requests per second: [ \text{Threads required} = \frac{1000 \text{ requests/second}}{3.33 \text{ requests/thread/second}} \approx 300 \text{ threads}. ]
So, you need approximately 300 threads to handle the load.
Core Utilization Modern CPUs can run multiple threads per core, but typically:
- 1 core can handle ~1 thread at 100% CPU utilization (for CPU-bound tasks).
- For I/O-bound tasks (like database queries), threads spend some time waiting, so more threads per core can be managed efficiently.
Assume a Thread per Core ratio:
- For CPU-bound tasks: 1 thread/core.
- For I/O-bound tasks: ( 4-6 \text{ threads/core} ).
Since database interactions are usually I/O-bound, a single core can handle more than 1 thread. Let’s assume 5 threads per core for I/O-bound tasks.
[ \text{Cores needed} = \frac{\text{Threads required}}{\text{Threads per core}} = \frac{300}{5} = 60 \text{ cores}. ]
Database Connection Pool If each thread requires a database connection and all 300 threads are actively handling requests, you’d need ~300 database connections in your connection pool. However:
- Connection pooling often reuses connections, so you may not need exactly 300 active connections simultaneously unless all threads are fully utilized.
Consider starting with 200-300 connections in the pool and monitor performance under load.
Horizontal Scaling If a single machine with 60 cores is not feasible, you can horizontally scale your application by running multiple instances.
-
Suppose each instance has 4 cores:
- Threads per instance = ( 4 \times 5 = 20 \text{ threads}).
- Requests per instance = ( 20 \times 3.33 \approx 66 \text{ requests/second}).
-
Total instances needed: [ \text{Instances needed} = \frac{\text{Total requests/second}}{\text{Requests per instance}} = \frac{1000}{66} \approx 15. ]
You’d need approximately 15 instances to handle 1000 requests/second.
Optimizations to Consider
- Asynchronous Processing: Use asynchronous frameworks like Spring WebFlux to reduce the number of threads required, especially for I/O-bound operations.
- Caching: Reduce the load on the database by caching frequently accessed data.
- Load Testing: Simulate your target traffic and observe actual performance metrics to fine-tune thread counts, connection pools, and scaling needs.