Introduction to Data Warehouse - sachit914/datawarehouse GitHub Wiki
Introduction To Data Warehouse
A data warehouse (DW) is a central repository of integrated data from one or more disparate sources.
It is a system used for reporting and data analysis.
Why We Need a Data Warehouse
You need to integrate many different sources of data.
You want to stop users from running reports directly against operational systems.
(Example: a banking system)
In a banking system, running reports directly from the operational systems (such as those that handle real-time transactions like deposits, withdrawals, and loans) can cause several problems.
System Crashes: Running big reports can overload the system, causing it to slow down or even crash. This would disrupt critical banking activities, leaving customers unable to access their accounts or make transactions.
Solution
Instead of directly running reports on the operational system, banks use a data warehouse. This is a separate system where data from the operational system is regularly copied, cleaned, and stored.
Reports are then generated from this warehouse without affecting the live operations.
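The copy, clean, and store flow described above can be sketched in a few lines. This is only an illustration under stated assumptions: both databases are in-memory SQLite, and the table names (`transactions`, `fact_transactions`), the sample rows, and the `load_warehouse` helper are hypothetical. A real warehouse would normally run on a separate server and be loaded on a schedule.

```python
import sqlite3

# Hypothetical operational (live) banking database.
operational = sqlite3.connect(":memory:")
operational.execute(
    "CREATE TABLE transactions ("
    "id INTEGER PRIMARY KEY, account TEXT, kind TEXT, amount REAL)"
)
operational.executemany(
    "INSERT INTO transactions (account, kind, amount) VALUES (?, ?, ?)",
    [
        ("ACC-1", "deposit", 500.0),
        ("ACC-1", "withdrawal", 120.0),
        ("acc-2 ", "deposit", 300.0),  # messy account id, cleaned during the copy
        ("ACC-2", "deposit", None),    # incomplete row, dropped during the copy
    ],
)

# Separate (hypothetical) warehouse database used only for reporting.
warehouse = sqlite3.connect(":memory:")
warehouse.execute(
    "CREATE TABLE fact_transactions ("
    "id INTEGER PRIMARY KEY, account TEXT, kind TEXT, amount REAL)"
)


def load_warehouse(source, target):
    """Copy rows from the operational system, clean them, store them."""
    rows = source.execute(
        "SELECT id, account, kind, amount FROM transactions"
    ).fetchall()
    cleaned = [
        (row_id, account.strip().upper(), kind, amount)
        for row_id, account, kind, amount in rows
        if amount is not None  # skip rows with a missing amount
    ]
    target.executemany(
        "INSERT OR REPLACE INTO fact_transactions VALUES (?, ?, ?, ?)", cleaned
    )
    target.commit()
    return len(cleaned)


loaded = load_warehouse(operational, warehouse)

# The report query touches only the warehouse, never the live system.
report = warehouse.execute(
    "SELECT account, "
    "SUM(CASE WHEN kind = 'deposit' THEN amount ELSE -amount END) "
    "FROM fact_transactions GROUP BY account ORDER BY account"
).fetchall()
print(loaded, report)
```

The only query that reaches the operational database is the plain `SELECT` inside `load_warehouse`. The heavier aggregation for the report runs entirely against the warehouse copy.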
You have tons of historical data that you need to gather in one easily accessible place, even if the source transaction systems don't keep it.
For example, a transaction system may keep only recent data (such as the last 2 years).
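To make the retention point concrete, here is a minimal sketch, assuming a source system that purges anything older than 2 years while the warehouse keeps every row it has ever loaded. The table names, the sample rows, and the fixed `TODAY` date are all hypothetical and chosen only so the example is reproducible.

```python
import sqlite3
from datetime import date

RETENTION_YEARS = 2        # the "2 years" from the example above
TODAY = date(2024, 6, 30)  # fixed hypothetical date for reproducibility

# Hypothetical source transaction system.
source = sqlite3.connect(":memory:")
source.execute(
    "CREATE TABLE transactions (id INTEGER PRIMARY KEY, tx_date TEXT, amount REAL)"
)
source.executemany(
    "INSERT INTO transactions VALUES (?, ?, ?)",
    [
        (1, "2020-03-15", 100.0),
        (2, "2022-01-10", 250.0),
        (3, "2023-11-05", 75.0),
        (4, "2024-06-01", 40.0),
    ],
)

# Hypothetical warehouse that accumulates history.
warehouse = sqlite3.connect(":memory:")
warehouse.execute(
    "CREATE TABLE fact_transactions (id INTEGER PRIMARY KEY, tx_date TEXT, amount REAL)"
)

# Step 1: copy everything into the warehouse before the source purges it.
rows = source.execute("SELECT id, tx_date, amount FROM transactions").fetchall()
warehouse.executemany("INSERT OR IGNORE INTO fact_transactions VALUES (?, ?, ?)", rows)
warehouse.commit()

# Step 2: the source system drops rows older than its retention window.
cutoff = TODAY.replace(year=TODAY.year - RETENTION_YEARS).isoformat()
source.execute("DELETE FROM transactions WHERE tx_date < ?", (cutoff,))
source.commit()

source_count = source.execute("SELECT COUNT(*) FROM transactions").fetchone()[0]
warehouse_count = warehouse.execute("SELECT COUNT(*) FROM fact_transactions").fetchone()[0]
oldest_in_warehouse = warehouse.execute(
    "SELECT MIN(tx_date) FROM fact_transactions"
).fetchone()[0]
print(source_count, warehouse_count, oldest_in_warehouse)
```

After the purge the source holds only the rows inside the retention window, while the warehouse can still answer questions about the older transactions.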