## Problem Statement

Store, visualize and analyze SCADA history data in the IT LAN or over secure internet access

## Challenges

- Store the data in a database for better data integrity, faster data access and easier data manipulation
- Since we store minute-wise data, each measurement generates 1440 samples per day. The requirement is to ingest at least 1000 measurements, so we need to store more than 1000 * 1440 = 1,440,000 samples (database rows) per day. This means we must store huge volumes of data while preserving the speed of data access and data manipulation (see the estimate after this list)
- The user should have the flexibility to add or remove the measurements that are ingested into the database
- Create a script that triggers the data ingestion process from the SCADA text files every day, so that the database stays up to date
- Create a secure web portal that enables users to visualize or analyze the database from the public internet
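
To put the storage requirement in perspective, here is a quick back-of-the-envelope estimate. The 1000-measurement figure comes from the requirement above; the yearly figure is a simple extrapolation.

```python
# Rough estimate of the row counts implied by the requirements above
measurements = 1000          # number of measurements to be ingested
samples_per_day = 24 * 60    # minute-wise data -> 1440 samples per measurement per day

rows_per_day = measurements * samples_per_day
rows_per_year = rows_per_day * 365

print(f"rows per day : {rows_per_day:,}")    # 1,440,000
print(f"rows per year: {rows_per_year:,}")   # 525,600,000
```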
## Approach

- We are using PostgreSQL, an open-source database, to store the data
- To store millions of rows in a single database table, we are using TimescaleDB, an open-source extension of PostgreSQL. TimescaleDB partitions a single time-series table into multiple child tables (chunks) based on time. This table partitioning enables faster data access and data manipulation compared to traditional relational databases (see the hypertable sketch after this list)
- The user can edit an Excel sheet that defines which measurements are ingested into the database (a sketch of reading such a sheet is shown below)
- To ingest the data into the database every day, we have created a Python script that is triggered daily at a user-defined time (see the scheduling sketch below)
- To enable secure data access and data analysis over the internet, we have hosted a web server built with the ASP.NET Core web framework
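
As an illustration of the TimescaleDB approach, the sketch below creates a time-series table and converts it into a hypertable so that TimescaleDB automatically partitions it into time-based chunks. The table name, column names and connection details are placeholders for illustration, not the repository's actual schema.

```python
# Minimal sketch: create a TimescaleDB hypertable for minute-wise SCADA samples.
# Assumes the timescaledb extension is already enabled in the target database
# and that psycopg2 is installed (pip install psycopg2-binary).
import psycopg2

# connection details are placeholders
conn = psycopg2.connect("dbname=scada_history user=postgres password=secret host=localhost")
with conn, conn.cursor() as cur:
    cur.execute("""
        CREATE TABLE IF NOT EXISTS scada_data (
            data_time      TIMESTAMPTZ      NOT NULL,
            measurement_id TEXT             NOT NULL,
            value          DOUBLE PRECISION NOT NULL
        );
    """)
    # Convert the plain table into a hypertable partitioned by time;
    # TimescaleDB then stores the rows in time-based chunks behind the scenes
    cur.execute(
        "SELECT create_hypertable('scada_data', 'data_time', if_not_exists => TRUE);"
    )
conn.close()
```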
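The measurement configuration could be read from the Excel sheet along these lines. The file name, sheet name and column headers used here are assumptions for illustration, not the actual format used by the project.

```python
# Minimal sketch: read the list of measurements to be ingested from an Excel sheet.
# Requires pandas and openpyxl (pip install pandas openpyxl).
import pandas as pd

def load_measurement_config(xlsx_path: str) -> list[dict]:
    # Assumed columns: "measurement_id", "scada_point_name", "is_enabled"
    df = pd.read_excel(xlsx_path, sheet_name="measurements")
    enabled = df[df["is_enabled"] == True]
    return enabled.to_dict(orient="records")

if __name__ == "__main__":
    # "measurement_config.xlsx" is a placeholder file name
    for meas in load_measurement_config("measurement_config.xlsx"):
        print(meas["measurement_id"], meas["scada_point_name"])
```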
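One common way to trigger the ingestion at a user-defined time each day is the `schedule` package, sketched below. The project may instead rely on Windows Task Scheduler, cron or its own timer loop, so treat this as an assumed pattern; the function name and trigger time are placeholders.

```python
# Minimal sketch: run the ingestion job once a day at a configured time.
# Requires the 'schedule' package (pip install schedule).
import time
import schedule

def run_daily_ingestion():
    # Placeholder for the real job: parse the previous day's SCADA text files
    # and push the extracted rows into the TimescaleDB hypertable
    print("ingestion triggered")

# "02:30" stands in for the user-defined trigger time (HH:MM, 24-hour clock)
schedule.every().day.at("02:30").do(run_daily_ingestion)

while True:
    schedule.run_pending()
    time.sleep(60)  # check once a minute whether the job is due
```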