Databases - ashishranjandev/interview-wiki GitHub Wiki

When to use which database

  • Relational - Payment Systems, Booking Systems
  • Wide Column Store - Reporting, Business Intelligence
  • In-memory store - Flash sales, caching
  • Graph - Social relations, Anti-Money Laundering, Machine Learning
  • Time Series Database - Stock Market, Trading platform, Stock Exchange, Ads real-time bidding
  • Object Store - File, Audio, Video, Picture

Postgres

Postgres Performance Improvements

Tuning query performance in PostgreSQL spans indexing, query design, configuration, routine maintenance, hardware, and monitoring. The key steps are:

1. Indexing

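  • Create Indexes: Well-chosen indexes can significantly speed up query processing. Identify columns frequently used in WHERE clauses and index them. B-tree (the default) supports equality and range comparisons; hash indexes support equality only⁵.
  • Use Partial Indexes: A partial index covers only the rows that match a WHERE predicate. For a column that is mostly NULL, indexing only the non-NULL rows keeps the index smaller and cheaper to maintain.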
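A minimal sketch of the partial-index idea, assuming a hypothetical users table whose referral_code column is mostly NULL. SQLite (from Python's standard library) stands in for PostgreSQL here so the example runs without a server; the CREATE INDEX ... WHERE form shown is also accepted by PostgreSQL, but the planner output is SQLite's, not PostgreSQL's.

```python
import sqlite3

# Assumption: in-memory SQLite is a stand-in for PostgreSQL.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, referral_code TEXT)")

# Hypothetical data: only every 100th user has a referral code, the rest are NULL.
rows = [(i, f"REF{i}" if i % 100 == 0 else None) for i in range(1, 10001)]
conn.executemany("INSERT INTO users VALUES (?, ?)", rows)

# Partial index: only rows with a non-NULL referral_code are indexed.
conn.execute(
    "CREATE INDEX idx_users_referral ON users (referral_code) "
    "WHERE referral_code IS NOT NULL"
)

query = "SELECT id FROM users WHERE referral_code = 'REF500'"

# Column 3 of each EXPLAIN QUERY PLAN row holds the human-readable plan step.
plan = " ".join(row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + query))
matches = conn.execute(query).fetchall()

print(plan)     # exact wording varies by SQLite version
print(matches)
```

The index stores entries for about 100 rows instead of 10,000, yet an equality lookup on referral_code can still use it, because such a comparison can never match a NULL.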

2. Query Optimization

  • Analyze Queries: Use the EXPLAIN command to see the plan PostgreSQL chooses for a query. EXPLAIN ANALYZE also executes the query and reports actual row counts and timings, which helps pinpoint bottlenecks.
  • Rewrite Queries: Simplify complex queries, avoid unnecessary subqueries, and use joins efficiently.
  • Prepared Statements: Use prepared statements to reduce parsing and planning time for frequently executed queries³.
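The effect of inspecting a plan can be sketched as follows. This is an illustration under stated assumptions, not PostgreSQL output: SQLite's EXPLAIN QUERY PLAN stands in for PostgreSQL's EXPLAIN, and the orders table, its columns, and the query_plan helper are hypothetical.

```python
import sqlite3

# Assumption: in-memory SQLite is a stand-in for PostgreSQL.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL)"
)
conn.executemany(
    "INSERT INTO orders (customer_id, total) VALUES (?, ?)",
    [(i % 500, float(i)) for i in range(5000)],
)


def query_plan(sql):
    """Return the planner's description of how it would run `sql`."""
    return " | ".join(row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql))


sql = "SELECT total FROM orders WHERE customer_id = 42"

plan_before = query_plan(sql)   # no usable index yet: full table scan
conn.execute("CREATE INDEX idx_orders_customer ON orders (customer_id)")
plan_after = query_plan(sql)    # the planner can now search the index

print(plan_before)
print(plan_after)
```

Comparing the two plans shows the bottleneck (a full scan) and confirms that the new index is actually chosen, which is the same workflow EXPLAIN supports in PostgreSQL.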

3. Configuration Tuning

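  • Shared Buffers: shared_buffers sets how much memory PostgreSQL uses for its own data cache. On a dedicated database server, the PostgreSQL documentation suggests about 25% of system RAM as a starting point.
  • Work Mem: work_mem limits the memory each sort or hash operation may use before spilling to disk. A single complex query can run several such operations at once, so raise it cautiously.
  • Effective Cache Size: Set effective_cache_size to reflect the amount of memory available for caching data (PostgreSQL's buffers plus the OS cache). It is only a planner estimate and allocates no memory, but it helps the query planner make better decisions¹.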
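As a rough sketch of how these three settings relate to available RAM, the hypothetical helper below derives starting values. Only the 25% figure for shared_buffers comes from the PostgreSQL documentation; the 75% figure for effective_cache_size and the work_mem split are common rules of thumb assumed here for illustration, and real values should be validated against the actual workload.

```python
def suggest_memory_settings(total_ram_mb: int, max_connections: int) -> dict:
    """Rough starting values in MB. Heuristics only, not official formulas."""
    # ~25% of RAM: starting point suggested by the PostgreSQL docs (dedicated server).
    shared_buffers = total_ram_mb // 4
    # Assumption: ~75% of RAM is a common rule of thumb for the planner hint.
    effective_cache_size = total_ram_mb * 3 // 4
    # Assumption: budget ~25% of RAM for sorts/hashes, split across connections.
    # Never go below 4 MB, which is PostgreSQL's default work_mem.
    work_mem = max(4, total_ram_mb // 4 // max_connections)
    return {
        "shared_buffers": shared_buffers,
        "effective_cache_size": effective_cache_size,
        "work_mem": work_mem,
    }


settings = suggest_memory_settings(total_ram_mb=16384, max_connections=100)
for name, mb in settings.items():
    print(f"{name} = {mb}MB")
```

Dividing the work_mem budget by max_connections is deliberately conservative, since each connection may use work_mem more than once per query.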

4. Maintenance Tasks

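  • VACUUM: Run VACUUM regularly (autovacuum does this by default) to reclaim space held by dead rows. Plain VACUUM does not refresh the planner's column statistics; use VACUUM ANALYZE to do both.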
  • ANALYZE: Use ANALYZE to update the query planner statistics, ensuring it has accurate data for optimization⁵.

5. Hardware and OS Optimization

  • Upgrade Hardware: Ensure your server has sufficient CPU, RAM, and fast storage (e.g., SSDs).
  • OS Tuning: Optimize your operating system settings for PostgreSQL, such as adjusting kernel parameters and file system settings⁴.

6. Monitoring and Tools

  • pg_stat_statements: Use this extension to track and analyze query performance.
  • Logging: Enable slow-query logging (for example, via log_min_duration_statement) to identify slow queries and performance issues.
  • Performance Monitoring Tools: Utilize tools like pgAdmin, pgBadger, and third-party solutions to monitor and analyze database performance⁶.


¹: PostgreSQL Documentation
³: Dreams of Code
⁴: Stackify
⁵: Techie Clues
⁶: Atatus

Indexing

Indexing with multi-column

What is a Multi-Column Index? A multi-column index, also known as a composite index, is an index that includes more than one column from a database table. This type of index is useful when you frequently query a table using multiple columns in the WHERE clause.

How Does It Work? Imagine you have a table of students with columns for first_name, last_name, and grade. If you often search for students by both their last_name and first_name, you might create a multi-column index on these two columns.

Structure and Order

  • Order Matters: The order of columns in a multi-column index is crucial. For example, an index on (last_name, first_name) is different from an index on (first_name, last_name). The first one efficiently serves queries that filter on last_name alone or on both columns, but it is of little help for queries that filter only on first_name.
  • Hierarchical Search: The index is structured hierarchically. First, it sorts by the first column (last_name), and within each last_name, it sorts by the second column (first_name).
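The ordering described above can be sketched with a sorted list of (last_name, first_name) keys. This is a simplified model, assuming hypothetical student rows; a real B-tree index is a tree rather than a flat list, but the sort order and its consequences are the same.

```python
from bisect import bisect_left

# Hypothetical rows: (last_name, first_name, grade)
students = [
    ("Smith", "Alice", 9),
    ("Jones", "Bob", 10),
    ("Smith", "Zoe", 11),
    ("Brown", "Carol", 9),
    ("Jones", "Adam", 12),
]

# The "index": (last_name, first_name) keys in sorted order, each pointing at a row position.
index = sorted((last, first, pos) for pos, (last, first, _grade) in enumerate(students))


def find_by_last_name(last_name):
    """Leading-column lookup: binary search to the first match, then read the contiguous run."""
    start = bisect_left(index, (last_name,))
    result = []
    for last, _first, pos in index[start:]:
        if last != last_name:
            break  # entries are sorted, so no later entry can match
        result.append(students[pos])
    return result


def find_by_first_name(first_name):
    """Second-column-only lookup: matches are scattered, so every entry must be checked."""
    return [students[pos] for _last, first, pos in index if first == first_name]


print(find_by_last_name("Smith"))
print(find_by_first_name("Bob"))
```

Entries sharing a last_name sit next to each other, already sorted by first_name, so a lookup on the leading column can stop early. A lookup on first_name alone gets no such benefit and has to walk the whole index.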