Database Server - OtagoPolytechnic/Cloudy-with-a-Chance-of-LoRa GitHub Wiki

Overview

This database server hosts a Postgres database that receives data from our LoRa-connected sensors, giving us a single store we can build our own API on top of. We chose Postgres because the CO2 team already uses it and it is straightforward to set up alongside LoRaWAN.

Installation

Install documentation

sudo apt install postgresql

Connect to the template database as the postgres user: sudo -u postgres psql template1
Then set a password for the postgres user:
ALTER USER postgres WITH ENCRYPTED PASSWORD 'password';

Schema

CREATE TABLE devices (
    device_id VARCHAR(30) PRIMARY KEY,
    device_name VARCHAR(255) NOT NULL
);

CREATE TABLE temperature (
    timestamp TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
    device_id VARCHAR(30) REFERENCES devices(device_id),
    temperature DECIMAL(7, 2),
    PRIMARY KEY (device_id, timestamp)
);

CREATE TABLE pressure (
    timestamp TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
    device_id VARCHAR(30) REFERENCES devices(device_id),
    pressure DECIMAL(7, 2),
    PRIMARY KEY (device_id, timestamp)
);

CREATE TABLE humidity (
    timestamp TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
    device_id VARCHAR(30) REFERENCES devices(device_id),
    humidity DECIMAL(7, 2),
    PRIMARY KEY (device_id, timestamp)
);

CREATE TABLE wind (
    timestamp TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
    device_id VARCHAR(30) REFERENCES devices(device_id),
    wind_speed DECIMAL(6, 3),
    wind_direction VARCHAR(3),
    PRIMARY KEY (device_id, timestamp)
);

CREATE TABLE rain_gauge (
    timestamp TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
    device_id VARCHAR(30) REFERENCES devices(device_id),
    rain_gauge DECIMAL(5, 2),
    PRIMARY KEY (device_id, timestamp)
);

CREATE TABLE dust (
    timestamp TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
    device_id VARCHAR(30) REFERENCES devices(device_id),
    dust DECIMAL(5, 2),
    PRIMARY KEY (device_id, timestamp)
);

CREATE TABLE cloud_images (
    timestamp TIMESTAMP DEFAULT CURRENT_TIMESTAMP PRIMARY KEY,
    image_data BYTEA
);

CREATE TABLE ai_result (
    timestamp TIMESTAMP DEFAULT CURRENT_TIMESTAMP PRIMARY KEY,
    api_response JSONB NOT NULL
);

CREATE TABLE co2 (
    timestamp TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
    device_id VARCHAR(30) REFERENCES devices(device_id),
    co2_level DECIMAL(7, 2),
    PRIMARY KEY (device_id, timestamp)
);

CREATE TABLE gas (
    timestamp TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
    device_id VARCHAR(30) REFERENCES devices(device_id),
    gas_level DECIMAL(7, 2),
    PRIMARY KEY (device_id, timestamp)
);

CREATE TABLE rainfall_measurement (
    timestamp TIMESTAMP,
    device_id VARCHAR(30),
    rainfall_mm DECIMAL(5, 4)
);

Allowing connections

The connection string is stored in the .env file on the server and allows a Postgres connection:
DATABASE_URL=postgres://{username}:{password}@{internal ip}:5432/{database name}
5432 is the default Postgres port.
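For the server to accept connections from other hosts, Postgres also needs to listen on the right network interface. A sketch of the stock Postgres setting (assumed here, not confirmed from this server's actual configuration):

```sql
-- Listen on all interfaces; written to postgresql.auto.conf, takes effect after a restart
ALTER SYSTEM SET listen_addresses = '*';
```

A matching host entry for the client's address range is also needed in pg_hba.conf.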

PG_CRON

pg_cron is used to clear old rows out of the cloud_images table after 48 hours, to stop the database from filling up with image data.
On the database server:

  1. sudo apt-get install postgresql-16-cron
  2. In postgresql.conf, set shared_preload_libraries = 'pg_cron'
  3. The change requires a restart: sudo systemctl restart postgresql
  4. In the database postgres: CREATE EXTENSION pg_cron;
Create a function:

CREATE OR REPLACE FUNCTION cleanup_old_cloud_images()
RETURNS VOID AS $$
BEGIN
    DELETE FROM cloud_images
    WHERE timestamp < NOW() - INTERVAL '48 hours';
END;
$$ LANGUAGE plpgsql;

Add a Cron schedule: SELECT cron.schedule('cleanup_old_cloud_images', '0 * * * *', 'SELECT cleanup_old_cloud_images();');
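pg_cron stores registered jobs in its cron.job metadata table, which is a quick way to confirm the schedule above was created:

```sql
-- List registered pg_cron jobs and their schedules
SELECT jobid, jobname, schedule, command FROM cron.job;
```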

Cron schedule for backups:
pg_cron can only run SQL statements, so the nightly pg_dump backup is scheduled with the system cron instead (crontab -e):
0 0 * * * pg_dump -U weather_admin -h localhost weather > /home/weather-app/backup/backup_$(date +\%Y\%m\%d).sql

Send Data

Each device must first be inserted into the devices table. The device_id correlates to the LoRaWAN fPort and is used to route each device's readings to the right table.

INSERT INTO devices (device_id, device_name)
    VALUES ('dust-sensor-01', 'Dust Sensor');
INSERT INTO devices (device_id, device_name)
    VALUES ('weather-station-01', 'Master Board');
INSERT INTO devices (device_id, device_name)
    VALUES ('gas-sensor-01', 'Gas Sensor');
INSERT INTO devices (device_id, device_name)
    VALUES ('temp-humidity', 'Temperature & Humidity Sensor');
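Re-running these inserts fails on the duplicate primary keys. If the script is ever re-run (for example during provisioning), an ON CONFLICT clause makes it idempotent — a sketch, not part of the current setup:

```sql
-- Re-runnable variant: skips rows whose device_id already exists
INSERT INTO devices (device_id, device_name)
    VALUES ('dust-sensor-01', 'Dust Sensor')
    ON CONFLICT (device_id) DO NOTHING;
```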

Functions

Temperature: averages the three most recent temperature readings.

CREATE OR REPLACE FUNCTION average_temp()
RETURNS DECIMAL AS $$
DECLARE
    avg_temperature DECIMAL;
BEGIN
    SELECT ROUND(AVG(temperature), 2)
    INTO avg_temperature
    FROM (
        SELECT temperature
        FROM temperature
        ORDER BY timestamp DESC
        LIMIT 3
    ) AS top_temperatures;
    RETURN avg_temperature;
END;
$$ LANGUAGE plpgsql;
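Once created, the function can be called directly (assuming the temperature table already holds some readings):

```sql
SELECT average_temp();
```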

Return the 7- or 30-day daily average for any data table. Any other interval is rejected with an exception.

CREATE OR REPLACE FUNCTION get_daily_avg_data(table_name TEXT, column_name TEXT, days_interval INT)
RETURNS TABLE(day DATE, day_of_week TEXT, avg_value DECIMAL(7,2)) AS $$
DECLARE
    query TEXT;
BEGIN
    -- Check if days_interval is either 7 or 30; if not, raise an exception
    IF days_interval != 7 AND days_interval != 30 THEN
        RAISE EXCEPTION 'Invalid days_interval: %; only 7 or 30 are allowed.', days_interval;
    END IF;
    query := format(
        'SELECT 
            DATE_TRUNC(''day'', timestamp)::DATE AS day,
            TO_CHAR(timestamp, ''Day'') AS day_of_week,
            ROUND(AVG(%I), 2)::DECIMAL(7,2) AS avg_value
         FROM %I
         WHERE timestamp >= NOW() - INTERVAL ''1 day'' * ($1 - 1)
         GROUP BY day, day_of_week
         ORDER BY day DESC',
        column_name, table_name
    );

    -- Execute the dynamic query and return the result
    RETURN QUERY EXECUTE query USING days_interval;
END;
$$ LANGUAGE plpgsql;
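Example calls against the schema above:

```sql
-- Daily temperature averages over the last 7 days
SELECT * FROM get_daily_avg_data('temperature', 'temperature', 7);

-- Any other interval is rejected:
-- SELECT * FROM get_daily_avg_data('temperature', 'temperature', 14);
-- ERROR:  Invalid days_interval: 14; only 7 or 30 are allowed.
```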

24 hour average:

CREATE OR REPLACE FUNCTION get_hourly_avg_data(table_name TEXT, column_name TEXT)
RETURNS TABLE(hour TIMESTAMP, avg_value DECIMAL(7,2)) AS $$
DECLARE
    query TEXT;
BEGIN
    query := format(
        'SELECT 
            DATE_TRUNC(''hour'', timestamp) AS hour,
            ROUND(AVG(%I), 2)::DECIMAL(7,2) AS avg_value
         FROM %I
         WHERE timestamp >= NOW() - INTERVAL ''24 hours''
         GROUP BY hour
         ORDER BY hour DESC',
        column_name, table_name
    );
    RETURN QUERY EXECUTE query;
END;
$$ LANGUAGE plpgsql;
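It is called the same way as the daily function, e.g. for humidity:

```sql
-- Hourly humidity averages over the last 24 hours
SELECT * FROM get_hourly_avg_data('humidity', 'humidity');
```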

Rainfall calculation

The rain gauge reports a cumulative tick count, and each tick represents 0.2794 mm of rain, so rainfall is derived from the difference between consecutive readings.

CREATE OR REPLACE FUNCTION calculate_rainfall()
RETURNS TRIGGER AS $$
DECLARE
    previous_value DECIMAL(6, 4);
    rainfall_increment DECIMAL(6, 4) := 0.2794; -- mm per tick
    rainfall_amount DECIMAL(6, 4);
BEGIN
    -- Retrieve the previous rain_gauge value for the same device
    SELECT rain_gauge INTO previous_value
    FROM rain_gauge
    WHERE device_id = NEW.device_id
    ORDER BY timestamp DESC
    LIMIT 1
    OFFSET 1;

    -- Check if there is a previous value
    IF previous_value IS NOT NULL THEN
        -- Convert the tick difference to millimetres
        rainfall_amount := (NEW.rain_gauge - previous_value) * rainfall_increment;

        -- Ensure non-negative rainfall and insert into rainfall_measurement
        INSERT INTO rainfall_measurement (timestamp, device_id, rainfall_mm)
        VALUES (NEW.timestamp, NEW.device_id, GREATEST(rainfall_amount, 0));
    ELSE
        -- if no previous value, log 0 rainfall
        INSERT INTO rainfall_measurement (timestamp, device_id, rainfall_mm)
        VALUES (NEW.timestamp, NEW.device_id, 0);
    END IF;

    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

Trigger for rainfall

CREATE TRIGGER rain_gauge_trigger
AFTER INSERT ON rain_gauge
FOR EACH ROW
EXECUTE FUNCTION calculate_rainfall();
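A quick smoke test of the trigger, using the weather-station-01 device registered earlier (the gauge values here are made up):

```sql
INSERT INTO rain_gauge (device_id, rain_gauge) VALUES ('weather-station-01', 100);
INSERT INTO rain_gauge (device_id, rain_gauge) VALUES ('weather-station-01', 103);

-- The second row should yield 3 ticks * 0.2794 mm = 0.8382 mm
SELECT * FROM rainfall_measurement ORDER BY timestamp DESC LIMIT 2;
```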

Old (superseded by the generic get_daily_avg_data function above)

Last 7 days of temperature averaged by day

CREATE OR REPLACE FUNCTION get_last_7_days_avg_temperatures()
RETURNS TABLE(day DATE, day_of_week TEXT, avg_temperature DECIMAL(5,2)) AS $$
BEGIN
    RETURN QUERY
    SELECT 
        DATE_TRUNC('day', timestamp)::DATE AS day,  -- Cast to DATE
        TO_CHAR(timestamp, 'Day') AS day_of_week,
        ROUND(AVG(temperature), 2)::DECIMAL(5,2) AS avg_temperature  -- Round and cast to 2 decimals
    FROM 
        temperature
    WHERE 
        timestamp >= NOW() - INTERVAL '7 days'
    GROUP BY 
        day, day_of_week
    ORDER BY 
        day DESC;
END;
$$ LANGUAGE plpgsql;


Last 30 days:

CREATE OR REPLACE FUNCTION get_last_30_days_daily_avg_temperatures()
RETURNS TABLE(day DATE, day_of_week TEXT, avg_temperature DECIMAL(5,2)) AS $$
BEGIN
    RETURN QUERY
    SELECT 
        DATE_TRUNC('day', timestamp)::DATE AS day,          -- Cast to DATE
        TO_CHAR(timestamp, 'Day') AS day_of_week,           -- Day of the week
        ROUND(AVG(temperature), 2)::DECIMAL(5,2) AS avg_temperature  -- Average temperature to 2 decimals
    FROM 
        temperature
    WHERE 
        timestamp >= NOW() - INTERVAL '30 days'             -- Last 30 days
    GROUP BY 
        day, day_of_week
    ORDER BY 
        day DESC;
END;
$$ LANGUAGE plpgsql;


Creating Fake Data

First, change the .env file in fullstack to the following:

DATABASE_URL=postgres://postgres:password@localhost:5432/local_db
NEXT_PUBLIC_API_BASE_URL=http://localhost:3000/

  1. Open a Git Bash terminal in /fullstack
  2. docker --version
    Checks that Docker is installed
  3. docker pull postgres
    Downloads the latest PostgreSQL image
  4. docker run --name local_postgres -e POSTGRES_PASSWORD=password -d -p 5432:5432 postgres
    Starts a PostgreSQL container
  5. docker ps
    Checks that the container is running
  6. docker exec -it local_postgres psql -U postgres
    Opens the PostgreSQL interactive shell
  7. CREATE DATABASE local_db;
  8. \c local_db

Once you are in the database, copy in the schema and functions from this wiki. Then use ChatGPT to generate fake data: paste in the schema and functions mentioned above and ask for fake data for temperature and humidity.

It should look something like this:

INSERT INTO temperature (timestamp, device_id, temperature)
SELECT NOW() - INTERVAL '1 hour' * n,
       'temp-humidity',
       ROUND((RANDOM() * 20 + 10)::numeric, 2)
FROM generate_series(1, 100) n;

Continue to do this for all the other tables.

Run npm run dev and the front end should display the fake data.