## Problem Statement

Store, visualize and analyze SCADA history data in the IT LAN or over secure internet access

## Challenges

- Store the data in a database for better data integrity, faster data access and easier data manipulation
- Since we store minute-wise data, each measurement generates 1440 samples per day. The requirement is to ingest at least 1000 measurements, so we need to store more than 1000 * 1440 = 1,440,000 samples (database rows) per day. This means we must store huge volumes of data while preserving the speed of data access and data manipulation (see the estimate after this list)
- The user should have the flexibility to add or remove the measurements that are ingested into the database
- Create a script that triggers the data ingestion process from the SCADA text files every day, so that the database stays up to date
- Create a secure web portal that enables users to visualize or analyze the database from the public internet
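
To put the storage requirement in perspective, here is a quick back-of-the-envelope estimate. The 1000-measurement figure comes from the requirement above; the yearly figure is a simple extrapolation.

```python
# Rough estimate of the row counts implied by the requirements above
measurements = 1000          # number of measurements to be ingested
samples_per_day = 24 * 60    # minute-wise data -> 1440 samples per measurement per day

rows_per_day = measurements * samples_per_day
rows_per_year = rows_per_day * 365

print(f"rows per day : {rows_per_day:,}")    # 1,440,000
print(f"rows per year: {rows_per_year:,}")   # 525,600,000
```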
## Approach

- We are using PostgreSQL, an open-source database, to store the data
- To store millions of rows in a single database table, we are using TimescaleDB, an open-source extension of PostgreSQL. TimescaleDB partitions a single time-series table into multiple child tables (chunks) based on time. This table partitioning enables faster data access and data manipulation compared to traditional relational databases (see the hypertable sketch after this list)
- The user can edit an Excel sheet that defines which measurements are ingested into the database (a sketch of reading such a sheet is shown below)
- To ingest the data into the database every day, we have created a Python script that is triggered daily at a user-defined time (see the scheduling sketch below)
- To enable secure data access and data analysis over the internet, we have hosted a web server built with the ASP.NET Core web framework
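
As an illustration of the TimescaleDB approach, the sketch below creates a time-series table and converts it into a hypertable so that TimescaleDB automatically partitions it into time-based chunks. The table name, column names and connection details are placeholders for illustration, not the repository's actual schema.

```python
# Minimal sketch: create a TimescaleDB hypertable for minute-wise SCADA samples.
# Assumes the timescaledb extension is already enabled in the target database
# and that psycopg2 is installed (pip install psycopg2-binary).
import psycopg2

# connection details are placeholders
conn = psycopg2.connect("dbname=scada_history user=postgres password=secret host=localhost")
with conn, conn.cursor() as cur:
    cur.execute("""
        CREATE TABLE IF NOT EXISTS scada_data (
            data_time      TIMESTAMPTZ      NOT NULL,
            measurement_id TEXT             NOT NULL,
            value          DOUBLE PRECISION NOT NULL
        );
    """)
    # Convert the plain table into a hypertable partitioned by time;
    # TimescaleDB then stores the rows in time-based chunks behind the scenes
    cur.execute(
        "SELECT create_hypertable('scada_data', 'data_time', if_not_exists => TRUE);"
    )
conn.close()
```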
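The measurement configuration could be read from the Excel sheet along these lines. The file name, sheet name and column headers used here are assumptions for illustration, not the actual format used by the project.

```python
# Minimal sketch: read the list of measurements to be ingested from an Excel sheet.
# Requires pandas and openpyxl (pip install pandas openpyxl).
import pandas as pd

def load_measurement_config(xlsx_path: str) -> list[dict]:
    # Assumed columns: "measurement_id", "scada_point_name", "is_enabled"
    df = pd.read_excel(xlsx_path, sheet_name="measurements")
    enabled = df[df["is_enabled"] == True]
    return enabled.to_dict(orient="records")

if __name__ == "__main__":
    # "measurement_config.xlsx" is a placeholder file name
    for meas in load_measurement_config("measurement_config.xlsx"):
        print(meas["measurement_id"], meas["scada_point_name"])
```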
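One common way to trigger the ingestion at a user-defined time each day is the `schedule` package, sketched below. The project may instead rely on Windows Task Scheduler, cron or its own timer loop, so treat this as an assumed pattern; the function name and trigger time are placeholders.

```python
# Minimal sketch: run the ingestion job once a day at a configured time.
# Requires the 'schedule' package (pip install schedule).
import time
import schedule

def run_daily_ingestion():
    # Placeholder for the real job: parse the previous day's SCADA text files
    # and push the extracted rows into the TimescaleDB hypertable
    print("ingestion triggered")

# "02:30" stands in for the user-defined trigger time (HH:MM, 24-hour clock)
schedule.every().day.at("02:30").do(run_daily_ingestion)

while True:
    schedule.run_pending()
    time.sleep(60)  # check once a minute whether the job is due
```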