SQLite MySQL MSSQL Postgres - sgml/signature GitHub Wiki

Keywords

Fuzzy Matching

Postgres

Release Notes

Extensions

Functions

JSON/JSONB/Hstore/XML

Reliability

Password Encryption

testgres

Psql

pg_dump / pg_restore

Host-Based Authentication (hba)

postgresql.conf

PostGIS

GUIs

RegExp

# UUID4 regexp: wrap the call in outer parentheses so you can dereference the first match; array indices start at one

(REGEXP_MATCH(foo, '[0-9a-f]{8}-[0-9a-f]{4}-4[0-9a-f]{3}-[89ab][0-9a-f]{3}-[0-9a-f]{12}'))[1] 
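The same pattern can be sanity-checked outside the database; a small Python sketch (the sample UUID is illustrative):

```python
import re

# Same pattern as the REGEXP_MATCH call above: the version nibble must
# be 4 and the variant nibble must be 8, 9, a, or b.
UUID4_RE = re.compile(
    r'[0-9a-f]{8}-[0-9a-f]{4}-4[0-9a-f]{3}-[89ab][0-9a-f]{3}-[0-9a-f]{12}')

m = UUID4_RE.search('id=f47ac10b-58cc-4372-a567-0e02b2c3d479')
print(m.group(0) if m else None)
```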

Copying tables from one database to another

import subprocess

# Table to copy and the source/target databases
table_name = 'your_table_name'
source_db = 'foo'
target_db = 'bar'

# Stream the rows directly between the two databases: COPY TO STDOUT in
# the source is piped into COPY FROM STDIN in the target. (A single psql
# script with \c cannot do this, because the COPY output is never fed
# back into the next COPY's stdin.)
dump = subprocess.Popen(
    ['psql', '-d', source_db, '-c',
     f'COPY {table_name} TO STDOUT WITH CSV HEADER'],
    stdout=subprocess.PIPE,
)
subprocess.run(
    ['psql', '-d', target_db, '-c',
     f'COPY {table_name} FROM STDIN WITH CSV HEADER'],
    stdin=dump.stdout,
)
dump.stdout.close()
dump.wait()

Splitting Database Tables

steps:
  - step: Create a Partitioned Table
    details: >
      First, create a partitioned table that will be split by month. 
      You can use **range partitioning** based on a date column.
    code: |
      CREATE TABLE sales (
          id SERIAL PRIMARY KEY,
          sale_date DATE NOT NULL,
          amount NUMERIC(10, 2) NOT NULL
      ) PARTITION BY RANGE (sale_date);
  - step: Create Partitions for Each Month
    details: >
      Next, create partitions for each month. You can automate this process using a function.
    code: |
      CREATE OR REPLACE FUNCTION create_monthly_partitions() RETURNS VOID AS $$
      DECLARE
          month_start DATE;
          month_end DATE;
      BEGIN
          month_start := DATE_TRUNC('month', CURRENT_DATE);
          -- Range-partition bounds are FROM-inclusive and TO-exclusive,
          -- so the upper bound is the first day of the NEXT month, not
          -- the last day of this one.
          month_end := month_start + INTERVAL '1 month';

          EXECUTE 'CREATE TABLE sales_' || TO_CHAR(month_start, 'YYYY_MM') || ' PARTITION OF sales FOR VALUES FROM (''' || month_start || ''') TO (''' || month_end || ''')';
      END;
      $$ LANGUAGE plpgsql;
  - step: Schedule the Function to Run Monthly
    details: >
      You can use a **cron job** or a **scheduler** like pg_cron to run this function at the start of each month.
  - step: Create a New Database at the Start of Each Month
    details: >
      CREATE DATABASE cannot run inside a transaction block, so it cannot
      be executed from a PL/pgSQL function or trigger. Run it from an
      external scheduler such as cron instead.
    code: |
      # crontab entry: at midnight on the first day of each month
      # (% must be escaped as \% inside crontab)
      0 0 1 * * psql -c "CREATE DATABASE db_$(date +\%Y_\%m)"
  - summary: >
      By following these steps, you can automate the process of splitting your table into monthly partitions and creating a new database at the start of each month. This approach ensures that your data is organized efficiently and that new databases are created regularly.
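The partitioning steps hinge on the fact that PostgreSQL range-partition bounds are FROM-inclusive and TO-exclusive; a small Python sketch of computing one month's bounds (the helper name is hypothetical):

```python
from datetime import date

def month_bounds(day):
    # FROM is inclusive, TO is exclusive: the upper bound is the first
    # day of the *next* month, so every date in the month is covered.
    start = day.replace(day=1)
    if start.month == 12:
        end = start.replace(year=start.year + 1, month=1)
    else:
        end = start.replace(month=start.month + 1)
    return start, end

print(month_bounds(date(2024, 2, 15)))
```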

ANSI SQL

Triggers

The return value of a row-level trigger fired AFTER or a statement-level trigger fired BEFORE or AFTER is always ignored; it might as well be null. However, any of these types of triggers might still abort the entire operation by raising an error.

This example trigger ensures that any time a row is inserted or updated in the table, the current user name and time are stamped into the row. It also checks that an employee's name is given and that the salary is a positive value.

CREATE TABLE emp (
    empname           text,
    salary            integer,
    last_date         timestamp,
    last_user         text
);

CREATE FUNCTION emp_stamp() RETURNS trigger AS $emp_stamp$
    BEGIN
        -- Check that empname and salary are given
        IF NEW.empname IS NULL THEN
            RAISE EXCEPTION 'empname cannot be null';
        END IF;
        IF NEW.salary IS NULL THEN
            RAISE EXCEPTION '% cannot have null salary', NEW.empname;
        END IF;

        -- Who works for us when they must pay for it?
        IF NEW.salary < 0 THEN
            RAISE EXCEPTION '% cannot have a negative salary', NEW.empname;
        END IF;

        -- Remember who changed the payroll when
        NEW.last_date := current_timestamp;
        NEW.last_user := current_user;
        RETURN NEW;
    END;
$emp_stamp$ LANGUAGE plpgsql;

CREATE TRIGGER emp_stamp BEFORE INSERT OR UPDATE ON emp
    FOR EACH ROW EXECUTE FUNCTION emp_stamp();

Statement-level BEFORE triggers naturally fire before the statement starts to do anything, while statement-level AFTER triggers fire at the very end of the statement. These types of triggers may be defined on tables, views, or foreign tables. Row-level BEFORE triggers fire immediately before a particular row is operated on, while row-level AFTER triggers fire at the end of the statement (but before any statement-level AFTER triggers). These types of triggers may only be defined on tables and foreign tables, not views. INSTEAD OF triggers may only be defined on views, and only at row level; they fire immediately as each row in the view is identified as needing to be operated on.

The execution of an AFTER trigger can be deferred to the end of the transaction, rather than the end of the statement, if it was defined as a constraint trigger. In all cases, a trigger is executed as part of the same transaction as the statement that triggered it, so if either the statement or the trigger causes an error, the effects of both will be rolled back.

SQLAlchemy

Postgres

Postgres Extensions

Profiling

import os
import subprocess
import sys

def run_psql_profiler(pgbouncer_instance):
    command = [
        'psql',
        '-h', pgbouncer_instance['host'],
        '-p', str(pgbouncer_instance['port']),
        '-U', pgbouncer_instance['user'],
        '-d', 'pgbouncer',
        '-c', 'SHOW POOLS;'
    ]

    # Extend the current environment rather than replacing it, so PATH
    # (needed to locate psql) survives alongside PGPASSWORD.
    result = subprocess.run(
        command,
        capture_output=True,
        text=True,
        env={**os.environ, "PGPASSWORD": pgbouncer_instance['password']},
    )

    if result.returncode == 0:
        print("PSQL Profiler Output:")
        print(result.stdout)
    else:
        print("Error running PSQL Profiler:")
        print(result.stderr)

if __name__ == "__main__":
    if len(sys.argv) != 5:
        print("Usage: python script.py <host> <port> <user> <password>")
        sys.exit(1)

    pgbouncer_instance = {
        'host': sys.argv[1],
        'port': int(sys.argv[2]),
        'user': sys.argv[3],
        'password': sys.argv[4]
    }

    run_psql_profiler(pgbouncer_instance)

PGBouncer

PGBouncer Copy/Paste

import psycopg2

# Define the connection parameters
pgbouncer_host = 'localhost'
pgbouncer_port = 6432
pgbouncer_user = 'pgbouncer'
pgbouncer_password = 'your_password'

# Connect to the PgBouncer admin console
conn = psycopg2.connect(
    host=pgbouncer_host,
    port=pgbouncer_port,
    user=pgbouncer_user,
    password=pgbouncer_password,
    dbname='pgbouncer'
)
# The admin console does not support transactions, so disable
# psycopg2's implicit BEGIN
conn.autocommit = True

# Create a cursor object
cur = conn.cursor()

# Query the runtime configuration
cur.execute("SHOW CONFIG;")
settings = cur.fetchall()

# Print the settings
for setting in settings:
    print(setting)

# Close the cursor and connection
cur.close()
conn.close()

JSONB

## Aliases
Output aliases cannot be used in the WHERE clause. You have two choices: wrap the query in a subquery, or duplicate the expression.
```
select jsonb_array_elements((ARRAY(select jsonb_array_elements(msg->'root') ele ))[2]::jsonb) filterin
from js
where jsonb_array_elements((ARRAY(select jsonb_array_elements(msg->'root') ele ))[2]::jsonb)->>'cid'='CID1';
```
or
```
SELECT filterin FROM 
(select jsonb_array_elements((ARRAY(select jsonb_array_elements(msg->'root') ele ))[2]::jsonb) filterin
from js) data
WHERE filterin->>'cid'='CID1';
```

Malformed JSONB

JSONB columns used to store metadata can contain malformed JSON. Use regexp replacements to extract values. For example, this extracts any value that begins with a capital letter, such as `'Foo'`, which `regexp_match` returns as `{'Foo'}`:

select uuid, id, (regexp_replace( regexp_match(meta::text, '\u0027[A-Z][^\u0027]+.')::text, '[^A-Za-z\s]+', '','g' ) ) as name
from accounts
where uuid = 'fdfdfa-3r3434-sdfaf-334343'
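A rough Python analogue of the same extract-then-strip idea (the sample string and names are illustrative, not the actual accounts data):

```python
import re

# Malformed, single-quoted "JSON" that json.loads would reject.
meta = "{'name': 'Foo', 'id': 42}"

# Step 1: grab a quoted value that begins with a capital letter.
match = re.search(r"'([A-Z][^']+)'", meta)
# Step 2: strip the remaining non-letter characters, as the SQL
# regexp_replace does to the "{'Foo'}" array literal.
name = re.sub(r'[^A-Za-z\s]+', '', match.group(0)) if match else None
print(name)
```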

SQLite

Architecture

Flat File Interop

Restoration

Usage

Funding

Internals

MySQL

MSSQL

PostgreSQL

Triggers

Abstraction

Implicit/Explicit Optimization

Implementing a Graph Database

Replication

https://kb.promise.com/thread/what-s-the-difference-between-asynchronous-and-synchronous-mirrors/
https://library.netapp.com/ecmdocs/ECMP12404965/html/GUID-B513B031-D524-4E0D-8FB1-3984C9D9FA20.html
https://www.brianstorti.com/replication/
http://www.cattell.net/datastores/Datastores.pdf
http://www.cs.sjsu.edu/faculty/kim/nosql/contents/presentation/slides/ScalableSQLandNoSQLDataStores.pdf

Pandas

Serverless

Excel

XSLT / XQuery Integration

Write Ahead Log / Change Data Capture

NoSQL / NewSQL

MMA

-- Zero-pad a single-digit month and day, e.g. 2021-5-9 -> 2021-05-09
select
    (regexp_replace(regexp_replace('2021-5-9', '(-)([0-9]$)', '\10\2', 'g'), '(-)([0-9]-)', '\10\2', 'g'))
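The same zero-padding can be reproduced with Python's `re.sub` (the sample date is illustrative):

```python
import re

d = '2021-5-9'
# Pad a trailing single-digit day...
d = re.sub(r'-([0-9])$', r'-0\1', d)
# ...then a single-digit month sandwiched between hyphens.
d = re.sub(r'-([0-9]-)', r'-0\1', d)
print(d)  # 2021-05-09
```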

Write Ahead Log

+----------------+     +-----------------+     +-----------------+
|                |     |                 |     |                 |
|  Transaction   |     |  Write-Ahead    |     |  Database Disk  |
|    (Begin)     | --> |      Log        | --> |     Storage     |
|                |     |                 |     |                 |
+----------------+     +-----------------+     +-----------------+
       |                     |                       |
       |                     |                       |
+----------------+     +-----------------+     +-----------------+
|                |     |                 |     |                 |
|  Transaction   |     |  Write-Ahead    |     |  Database Disk  |
|    (Commit)    | --> |      Log        | --> |     Storage     |
|                |     |                 |     |                 |
+----------------+     +-----------------+     +-----------------+

Process:

  • A transaction is first recorded in memory
  • Any CRUD mutations are written to the Write-Ahead Log (WAL) from memory next
  • Finally, the transaction is committed to the database storage, and the changes are flushed from the WAL

If a crash occurs, all committed transactions can be replayed from the WAL to bring the database back to a consistent state
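The three-step process can be sketched as a toy log-then-apply store (all names are illustrative):

```python
# Toy write-ahead log: every mutation is appended to the log before it
# touches the "database", so the log alone can rebuild the state.
log = []
db = {}

def put(key, value):
    log.append(('put', key, value))  # 1. durable log record first
    db[key] = value                  # 2. then the in-place change

put('a', 1)
put('b', 2)

# Crash recovery: replay the log into an empty store.
recovered = {}
for op, key, value in log:
    if op == 'put':
        recovered[key] = value

print(recovered == db)  # True
```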

Comparison

| | Teradata | Oracle |
| --- | --- | --- |
| Use Case | Large-scale data warehousing and analytics | Wide range of applications including transaction processing, data warehousing, and enterprise applications |
| Architecture | Shared-nothing, Massively Parallel Processing (MPP) | Shared-everything (shared-disk with RAC) |

Change Data Capture

| Feature | PostgreSQL | Oracle | Teradata | MySQL |
| --- | --- | --- | --- | --- |
| WAL Purpose | Ensures data integrity and durability | Provides atomicity and durability | Ensures data consistency and recovery | Ensures data durability and recovery |
| WAL Mechanism | Records changes before applying them | Logs changes before applying them | Uses a combination of logging and checkpointing | Logs changes before applying them |
| Checkpointing | Periodic checkpoints to flush WAL to disk | Periodic checkpoints to flush WAL to disk | Frequent checkpoints to ensure data consistency | Periodic checkpoints to flush WAL to disk |
| Recovery | Redo and undo operations for crash recovery | Redo and undo operations for crash recovery | Redo operations for crash recovery | Redo operations for crash recovery |
| Performance Impact | Reduced disk writes due to sequential logging | Reduced disk writes due to sequential logging | Optimized for large-scale data warehousing | Reduced disk writes due to sequential logging |
| Archiving | Supports continuous archiving and point-in-time recovery | Supports continuous archiving and point-in-time recovery | Supports data replication and recovery | Supports point-in-time recovery |
| Configurability | Various WAL levels (minimal, replica, logical) | Configurable logging levels and checkpoint intervals | Configurable logging and checkpoint settings | Configurable logging and checkpoint settings |