Databases - ashishranjandev/interview-wiki GitHub Wiki
When to use which database
- Relational - Payment Systems, Booking Systems
- Wide Column Store - Reporting, Business Intelligence
- In-memory store - Flash sales, in-memory caching
- Graph - Social relations, Anti-Money Laundering, Machine Learning
- Time Series Database - Stock Market, Trading platform, Stock Exchange, Ads real-time bidding
- Object Store - File, Audio, Video, Picture
Postgres
Postgres Performance Improvements
Tuning query performance in a PostgreSQL database involves several strategies and best practices. Here are some key steps you can take:
1. Indexing
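- Create Indexes: Properly indexed tables can significantly speed up query processing. Identify columns frequently used in WHERE clauses and create B-tree or hash indexes on those columns⁵. Hash indexes only help equality comparisons; B-tree indexes also serve range conditions and sorting.
- Use Partial Indexes: A partial index covers only the rows that match a predicate. For columns with many null values, indexing only the non-null rows keeps the index smaller and can be more efficient.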
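As a rough illustration of the partial-index bullet above, the sketch below indexes only the rows where a mostly-NULL column has a value. It makes several assumptions: SQLite (via Python's built-in sqlite3 module) stands in for PostgreSQL so that it runs without a server, and the orders table, its columns, and the index name are hypothetical. The CREATE INDEX ... WHERE syntax shown is also accepted by PostgreSQL, but planner behaviour there may differ.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for PostgreSQL; all names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, coupon_code TEXT, total REAL)"
)

# Most orders have no coupon, so coupon_code is NULL for 95 of 100 rows.
rows = [(None, 10.0)] * 95 + [(f"SAVE{i}", 20.0) for i in range(5)]
conn.executemany("INSERT INTO orders (coupon_code, total) VALUES (?, ?)", rows)

# Partial index: only rows with a non-NULL coupon_code are stored in the index.
conn.execute(
    "CREATE INDEX idx_orders_coupon ON orders (coupon_code) "
    "WHERE coupon_code IS NOT NULL"
)


def query_plan(sql, params=()):
    """Return the planner's description of how it would run the query."""
    plan_rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    # The last column of each plan row holds the human-readable detail.
    return " | ".join(row[-1] for row in plan_rows)


lookup_sql = "SELECT total FROM orders WHERE coupon_code = ?"
plan = query_plan(lookup_sql, ("SAVE3",))
result = conn.execute(lookup_sql, ("SAVE3",)).fetchall()
print(plan)
print(result)
```

An equality test on coupon_code can never match NULL, which is why the planner is free to answer this lookup from the smaller partial index.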
2. Query Optimization
- Analyze Queries: Use the EXPLAIN command to understand how PostgreSQL plans a query and identify bottlenecks. EXPLAIN ANALYZE also executes the query and reports actual row counts and timings.
- Rewrite Queries: Simplify complex queries, avoid unnecessary subqueries, and use joins efficiently.
- Prepared Statements: Use prepared statements to reduce parsing and planning time for frequently executed queries³.
3. Configuration Tuning
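- Shared Buffers: Increase the shared_buffers setting to allocate more memory for caching data.
- Work Mem: Adjust the work_mem setting to optimize memory usage for complex queries and sorts. The limit applies to each sort or hash operation, so a large value can be multiplied across concurrent queries.
- Effective Cache Size: Set effective_cache_size to reflect the amount of memory available for caching data, helping the query planner make better decisions¹. It is a planner estimate only and does not allocate memory.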
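The three settings above can be turned into rough starting values with a small calculation. The sketch below is only a heuristic: the 25% figure for shared_buffers follows the starting point commonly suggested for a dedicated server, while the 75% share for effective_cache_size and the work_mem budget are rule-of-thumb assumptions, not official formulas. The function name and its parameters are hypothetical, and any real values should be validated against the actual workload.

```python
def suggest_memory_settings(total_ram_mb, max_connections):
    """Return rough starting values in MB (heuristics, not official formulas)."""
    # Assumption: dedicated database server; about 25% of RAM for shared_buffers.
    shared_buffers = total_ram_mb // 4
    # Assumption: about 75% of RAM is available as cache (planner hint only).
    effective_cache_size = total_ram_mb * 3 // 4
    # work_mem applies per sort/hash operation, so budget a quarter of RAM
    # across all connections and never go below PostgreSQL's 4MB default.
    work_mem = max(4, total_ram_mb // 4 // max_connections)
    return {
        "shared_buffers": shared_buffers,
        "effective_cache_size": effective_cache_size,
        "work_mem": work_mem,
    }


settings = suggest_memory_settings(total_ram_mb=16384, max_connections=100)
for name, megabytes in settings.items():
    # Prints lines in postgresql.conf style, e.g. "shared_buffers = 4096MB".
    print(f"{name} = {megabytes}MB")
```

Dividing the work_mem budget by max_connections is deliberately conservative, since a single query can run several sorts or hashes at once.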
4. Maintenance Tasks
- VACUUM: Run VACUUM regularly (or rely on autovacuum) to reclaim storage held by dead rows.
- ANALYZE: Use ANALYZE to update the query planner statistics, ensuring it has accurate data for optimization⁵. VACUUM ANALYZE performs both steps in one command.
5. Hardware and OS Optimization
- Upgrade Hardware: Ensure your server has sufficient CPU, RAM, and fast storage (e.g., SSDs).
- OS Tuning: Optimize your operating system settings for PostgreSQL, such as adjusting kernel parameters and file system settings⁴.
6. Monitoring and Tools
- pg_stat_statements: Use this extension to track and analyze query performance.
- Logging: Enable detailed logging to identify slow queries and performance issues.
- Performance Monitoring Tools: Utilize tools like pgAdmin, pgBadger, and third-party solutions to monitor and analyze database performance⁶.
¹ PostgreSQL Documentation
³ Dreams of Code
⁴ Stackify
⁵ Techie Clues
⁶ Atatus
Indexing
Multi-column indexes
What is a Multi-Column Index?
A multi-column index, also known as a composite index, is an index that includes more than one column from a database table. This type of index is useful when you frequently query a table using multiple columns in the WHERE clause.
How Does It Work?
Imagine you have a table of students with columns for first_name, last_name, and grade. If you often search for students by both their last_name and first_name, you might create a multi-column index on these two columns.
Structure and Order
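- Order Matters: The order of columns in a multi-column index is crucial. For example, an index on (last_name, first_name) is different from an index on (first_name, last_name). An index on (last_name, first_name) serves queries that filter on last_name alone or on both columns, but it is much less useful for queries that filter only on first_name.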
- Hierarchical Search: The index is structured hierarchically. First, it sorts by the first column (last_name), and within each last_name, it sorts by the second column (first_name).
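The effect of column order can be sketched with the students example. The sketch assumes SQLite (Python's built-in sqlite3 module) as a runnable stand-in for PostgreSQL; the index name and sample rows are hypothetical, and PostgreSQL's planner may choose differently, especially on small tables.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for PostgreSQL; sample data is made up.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE students (first_name TEXT, last_name TEXT, grade INTEGER)"
)
conn.executemany(
    "INSERT INTO students VALUES (?, ?, ?)",
    [
        ("Ada", "Lovelace", 9),
        ("Alan", "Turing", 10),
        ("Grace", "Hopper", 11),
        ("Ada", "Turing", 12),
    ],
)

# Composite index: entries are sorted by last_name, then first_name within it.
conn.execute("CREATE INDEX idx_students_name ON students (last_name, first_name)")


def query_plan(sql):
    """Return the planner's description of how it would run the query."""
    plan_rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    # The last column of each plan row holds the human-readable detail.
    return " | ".join(row[-1] for row in plan_rows)


# Filters on both columns, or on the leading column, can search the index.
both = query_plan(
    "SELECT * FROM students WHERE last_name = 'Turing' AND first_name = 'Ada'"
)
leading = query_plan("SELECT * FROM students WHERE last_name = 'Turing'")
# A filter on the trailing column alone cannot use the index's sort order.
trailing = query_plan("SELECT * FROM students WHERE first_name = 'Ada'")

for label, plan in [("both", both), ("leading", leading), ("trailing", trailing)]:
    print(f"{label}: {plan}")
```

If first_name-only lookups are common, a separate index on first_name (or an index with the columns reversed) would be the usual remedy.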