Database Migration - openguard-bot/openguard GitHub Wiki

Database Migration

This guide covers the migration process from JSON file storage to PostgreSQL database, including data preservation, validation, and rollback procedures.

๐Ÿ“‹ Migration Overview

Migration Scope

AIMod's migration from JSON to PostgreSQL involves converting the following data:

Data Files to Migrate:

  • guild_config.json โ†’ guild_config table
  • user_infractions.json โ†’ user_infractions table
  • appeals.json โ†’ appeals table
  • global_bans.json โ†’ global_bans table
  • logging_data.json โ†’ event_logs table
  • botdetect_config.json โ†’ botdetect_config table
  • user_data.json โ†’ user_data table

Benefits of Migration

Performance Improvements:

  • 10x faster query performance
  • Concurrent access support
  • ACID transaction guarantees
  • Efficient indexing and searching

Scalability Benefits:

  • Support for millions of records
  • Connection pooling
  • Query optimization
  • Horizontal scaling potential

Data Integrity:

  • Foreign key constraints
  • Data type validation
  • Atomic operations
  • Backup and recovery

๐Ÿ”ง Pre-Migration Setup

Prerequisites

System Requirements:

  • PostgreSQL 13+ installed and running
  • Python 3.11+ with asyncpg
  • Sufficient disk space (2x current JSON data size)
  • Database user with appropriate permissions

Backup Existing Data:

# Create backup directory
mkdir -p backups/pre-migration

# Backup JSON data
cp -r wdiscordbot-json-data/ backups/pre-migration/
tar -czf backups/pre-migration/json-backup-$(date +%Y%m%d).tar.gz wdiscordbot-json-data/

# Backup current bot configuration
cp .env backups/pre-migration/
cp bot.py backups/pre-migration/

Database Preparation

Install PostgreSQL

# Ubuntu/Debian
sudo apt update
sudo apt install postgresql postgresql-contrib

# Start and enable PostgreSQL
sudo systemctl start postgresql
sudo systemctl enable postgresql

Create Database and User

# Switch to postgres user
sudo -u postgres psql

# Create database and user
CREATE DATABASE aimod_bot;
CREATE USER aimod_user WITH PASSWORD 'secure_password';
GRANT ALL PRIVILEGES ON DATABASE aimod_bot TO aimod_user;
ALTER USER aimod_user CREATEDB;
\q

Configure Database Connection

# Update .env file
echo "DATABASE_URL=postgresql://aimod_user:secure_password@localhost:5432/aimod_bot" >> .env
echo "REDIS_URL=redis://localhost:6379" >> .env

๐Ÿ—„๏ธ Database Schema Creation

Automated Schema Setup

The migration script automatically creates all required tables:

# Run schema creation
python -c "
from database.connection import initialize_database
import asyncio
asyncio.run(initialize_database())
print('Database schema created successfully')
"

Manual Schema Creation

If you need to create the schema manually:

-- Connect to database
psql -h localhost -U aimod_user -d aimod_bot

-- Guild configuration table
CREATE TABLE IF NOT EXISTS guild_config (
    guild_id BIGINT NOT NULL,
    key VARCHAR(255) NOT NULL,
    value JSONB,
    created_at TIMESTAMP WITH TIME ZONE DEFAULT CURRENT_TIMESTAMP,
    updated_at TIMESTAMP WITH TIME ZONE DEFAULT CURRENT_TIMESTAMP,
    PRIMARY KEY (guild_id, key)
);

-- User infractions table
CREATE TABLE IF NOT EXISTS user_infractions (
    id SERIAL PRIMARY KEY,
    guild_id BIGINT NOT NULL,
    user_id BIGINT NOT NULL,
    timestamp TIMESTAMP WITH TIME ZONE NOT NULL,
    rule_violated VARCHAR(50),
    action_taken VARCHAR(100),
    reasoning TEXT,
    created_at TIMESTAMP WITH TIME ZONE DEFAULT CURRENT_TIMESTAMP
);

-- Appeals table
CREATE TABLE IF NOT EXISTS appeals (
    id SERIAL PRIMARY KEY,
    user_id BIGINT NOT NULL,
    guild_id BIGINT NOT NULL,
    infraction_id INTEGER REFERENCES user_infractions(id),
    reason TEXT NOT NULL,
    status VARCHAR(20) DEFAULT 'PENDING',
    admin_response TEXT,
    created_at TIMESTAMP WITH TIME ZONE DEFAULT CURRENT_TIMESTAMP,
    updated_at TIMESTAMP WITH TIME ZONE DEFAULT CURRENT_TIMESTAMP
);

-- Global bans table
CREATE TABLE IF NOT EXISTS global_bans (
    user_id BIGINT PRIMARY KEY,
    reason TEXT NOT NULL,
    banned_by BIGINT NOT NULL,
    created_at TIMESTAMP WITH TIME ZONE DEFAULT CURRENT_TIMESTAMP
);

-- Event logs table
CREATE TABLE IF NOT EXISTS event_logs (
    id SERIAL PRIMARY KEY,
    guild_id BIGINT NOT NULL,
    event_type VARCHAR(50) NOT NULL,
    user_id BIGINT,
    channel_id BIGINT,
    data JSONB,
    timestamp TIMESTAMP WITH TIME ZONE DEFAULT CURRENT_TIMESTAMP
);

-- Bot detection config table
CREATE TABLE IF NOT EXISTS botdetect_config (
    guild_id BIGINT PRIMARY KEY,
    enabled BOOLEAN DEFAULT false,
    keywords JSONB DEFAULT '[]',
    action VARCHAR(50) DEFAULT 'timeout',
    timeout_duration INTEGER DEFAULT 3600,
    log_channel BIGINT,
    whitelist_roles JSONB DEFAULT '[]',
    whitelist_users JSONB DEFAULT '[]',
    created_at TIMESTAMP WITH TIME ZONE DEFAULT CURRENT_TIMESTAMP,
    updated_at TIMESTAMP WITH TIME ZONE DEFAULT CURRENT_TIMESTAMP
);

-- User data table
CREATE TABLE IF NOT EXISTS user_data (
    user_id BIGINT PRIMARY KEY,
    username VARCHAR(255),
    discriminator VARCHAR(10),
    avatar_url TEXT,
    first_seen TIMESTAMP WITH TIME ZONE DEFAULT CURRENT_TIMESTAMP,
    last_seen TIMESTAMP WITH TIME ZONE DEFAULT CURRENT_TIMESTAMP,
    message_count INTEGER DEFAULT 0,
    infraction_count INTEGER DEFAULT 0
);

-- Create indexes for performance
CREATE INDEX IF NOT EXISTS idx_guild_config_guild_id ON guild_config(guild_id);
CREATE INDEX IF NOT EXISTS idx_user_infractions_guild_user ON user_infractions(guild_id, user_id);
CREATE INDEX IF NOT EXISTS idx_user_infractions_timestamp ON user_infractions(timestamp);
CREATE INDEX IF NOT EXISTS idx_appeals_user_guild ON appeals(user_id, guild_id);
CREATE INDEX IF NOT EXISTS idx_event_logs_guild_type ON event_logs(guild_id, event_type);
CREATE INDEX IF NOT EXISTS idx_event_logs_timestamp ON event_logs(timestamp);

๐Ÿ”„ Migration Process

Running the Migration

The migration script handles all data conversion automatically:

# Run the complete migration
python migrate_json_to_postgresql.py

Migration Script Overview

The migration script performs the following steps:

  1. Validation: Checks JSON file integrity
  2. Conversion: Transforms JSON data to PostgreSQL format
  3. Import: Inserts data into database tables
  4. Verification: Validates migration success
  5. Cleanup: Optional JSON file archival

Detailed Migration Steps

1. Guild Configuration Migration

async def migrate_guild_config():
    """Migrate guild_config.json to PostgreSQL."""
    
    json_file = "wdiscordbot-json-data/guild_config.json"
    if not os.path.exists(json_file):
        print("No guild_config.json found, skipping...")
        return
    
    print("Migrating guild configuration...")
    
    with open(json_file, "r") as f:
        data = json.load(f)
    
    migrated_count = 0
    async with get_connection() as conn:
        for guild_id_str, config in data.items():
            guild_id = int(guild_id_str)
            
            for key, value in config.items():
                await conn.execute("""
                    INSERT INTO guild_config (guild_id, key, value)
                    VALUES ($1, $2, $3)
                    ON CONFLICT (guild_id, key) DO NOTHING
                """, guild_id, key, json.dumps(value))
                migrated_count += 1
    
    print(f"โœ… Migrated {migrated_count} guild configuration entries")

2. User Infractions Migration

async def migrate_user_infractions():
    """Migrate user_infractions.json to PostgreSQL."""
    
    json_file = "wdiscordbot-json-data/user_infractions.json"
    if not os.path.exists(json_file):
        print("No user_infractions.json found, skipping...")
        return
    
    print("Migrating user infractions...")
    
    with open(json_file, "r") as f:
        data = json.load(f)
    
    migrated_count = 0
    async with get_connection() as conn:
        for key, infractions in data.items():
            # Parse guild_id and user_id from key format: "guild_id_user_id"
            parts = key.split("_")
            if len(parts) >= 2:
                guild_id = int(parts[0])
                user_id = int(parts[1])
                
                for infraction in infractions:
                    # Convert timestamp format
                    timestamp_str = infraction.get("timestamp", "")
                    try:
                        timestamp = datetime.fromisoformat(timestamp_str.replace("Z", "+00:00"))
                    except:
                        timestamp = datetime.utcnow()
                    
                    await conn.execute("""
                        INSERT INTO user_infractions 
                        (guild_id, user_id, timestamp, rule_violated, action_taken, reasoning)
                        VALUES ($1, $2, $3, $4, $5, $6)
                    """, 
                        guild_id,
                        user_id,
                        timestamp,
                        infraction.get("rule_violated", ""),
                        infraction.get("action_taken", ""),
                        infraction.get("reasoning", "")
                    )
                    migrated_count += 1
    
    print(f"โœ… Migrated {migrated_count} user infractions")

3. Appeals Migration

async def migrate_appeals():
    """Migrate appeals.json to PostgreSQL."""
    
    json_file = "wdiscordbot-json-data/appeals.json"
    if not os.path.exists(json_file):
        print("No appeals.json found, skipping...")
        return
    
    print("Migrating appeals...")
    
    with open(json_file, "r") as f:
        data = json.load(f)
    
    migrated_count = 0
    async with get_connection() as conn:
        for appeal_data in data:
            # Convert timestamp
            created_at = datetime.fromisoformat(
                appeal_data.get("created_at", datetime.utcnow().isoformat())
            )
            
            await conn.execute("""
                INSERT INTO appeals 
                (user_id, guild_id, reason, status, admin_response, created_at)
                VALUES ($1, $2, $3, $4, $5, $6)
            """,
                appeal_data.get("user_id"),
                appeal_data.get("guild_id"),
                appeal_data.get("reason", ""),
                appeal_data.get("status", "PENDING"),
                appeal_data.get("admin_response"),
                created_at
            )
            migrated_count += 1
    
    print(f"โœ… Migrated {migrated_count} appeals")

Migration Validation

Data Integrity Checks

async def validate_migration():
    """Validate that migration was successful."""
    
    print("\n=== Migration Validation ===")
    
    # Check record counts
    validation_results = {}
    
    # Guild config validation
    json_guilds = len(load_json_data("guild_config.json", {}))
    async with get_connection() as conn:
        db_guilds = await conn.fetchval(
            "SELECT COUNT(DISTINCT guild_id) FROM guild_config"
        )
    validation_results["guild_config"] = (json_guilds, db_guilds)
    
    # User infractions validation
    json_infractions = sum(
        len(infractions) 
        for infractions in load_json_data("user_infractions.json", {}).values()
    )
    async with get_connection() as conn:
        db_infractions = await conn.fetchval(
            "SELECT COUNT(*) FROM user_infractions"
        )
    validation_results["user_infractions"] = (json_infractions, db_infractions)
    
    # Print validation results
    all_valid = True
    for table, (json_count, db_count) in validation_results.items():
        status = "โœ…" if json_count == db_count else "โŒ"
        print(f"{status} {table}: JSON={json_count}, DB={db_count}")
        if json_count != db_count:
            all_valid = False
    
    if all_valid:
        print("\n๐ŸŽ‰ Migration validation successful!")
    else:
        print("\nโš ๏ธ  Migration validation failed - please review the data")
    
    return all_valid

๐Ÿงช Testing Migration

Pre-Migration Testing

# Test database connection
python -c "
from database.connection import get_pool
import asyncio

async def test():
    pool = await get_pool()
    print('Database connection successful' if pool else 'Connection failed')

asyncio.run(test())
"

# Test JSON file integrity
python -c "
import json
import os

files = ['guild_config.json', 'user_infractions.json', 'appeals.json', 'global_bans.json']
for file in files:
    path = f'wdiscordbot-json-data/{file}'
    if os.path.exists(path):
        try:
            with open(path) as f:
                json.load(f)
            print(f'โœ… {file} is valid JSON')
        except Exception as e:
            print(f'โŒ {file} is invalid: {e}')
    else:
        print(f'โš ๏ธ  {file} not found')
"

Post-Migration Testing

# Run comprehensive migration test
python test_postgresql_migration.py

# Test specific functionality
python -c "
from database.operations import get_guild_config, set_guild_config
import asyncio

async def test():
    # Test configuration operations
    await set_guild_config(123456789, 'TEST_KEY', 'test_value')
    value = await get_guild_config(123456789, 'TEST_KEY')
    print(f'Config test: {value}')
    
    # Test infraction operations
    from database.operations import get_user_infractions
    infractions = await get_user_infractions(123456789, 987654321)
    print(f'Infractions test: {len(infractions)} records')

asyncio.run(test())
"

๐Ÿ”„ Rollback Procedures

Emergency Rollback

If migration fails or issues are discovered:

# Stop the bot
sudo systemctl stop aimod-bot.service

# Restore JSON files
cp -r backups/pre-migration/wdiscordbot-json-data/ ./

# Revert to JSON-based code
git checkout json-storage-branch  # If you have a backup branch

# Restart with JSON storage
python bot.py

Partial Rollback

To rollback specific tables:

-- Connect to database
psql -h localhost -U aimod_user -d aimod_bot

-- Clear specific table
TRUNCATE TABLE guild_config CASCADE;

-- Re-run specific migration
-- Then run: python migrate_guild_config_only.py

Data Recovery

# Recover from backup
tar -xzf backups/pre-migration/json-backup-YYYYMMDD.tar.gz

# Verify data integrity
python -c "
import json
import os

for file in os.listdir('wdiscordbot-json-data'):
    if file.endswith('.json'):
        with open(f'wdiscordbot-json-data/{file}') as f:
            data = json.load(f)
            print(f'{file}: {len(data)} records')
"

๐Ÿ“Š Migration Monitoring

Progress Tracking

The migration script provides detailed progress information:

=== AIMod JSON to PostgreSQL Migration ===

Initializing database connection...
โœ… Database connection established

Migrating guild configuration...
Progress: [โ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆ] 100% (1250/1250)
โœ… Migrated 1250 guild configuration entries

Migrating user infractions...
Progress: [โ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆ] 100% (5430/5430)
โœ… Migrated 5430 user infractions

Migrating appeals...
Progress: [โ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆ] 100% (23/23)
โœ… Migrated 23 appeals

=== Migration Complete ===
Total time: 2m 34s
All JSON data has been successfully migrated to PostgreSQL

Performance Metrics

Monitor migration performance:

import time
import psutil

def monitor_migration():
    """Monitor system resources during migration."""
    
    start_time = time.time()
    process = psutil.Process()
    
    print(f"CPU usage: {process.cpu_percent()}%")
    print(f"Memory usage: {process.memory_info().rss / 1024 / 1024:.2f} MB")
    print(f"Elapsed time: {time.time() - start_time:.2f}s")

๐Ÿ”ง Troubleshooting

Common Issues

Permission Errors:

# Fix database permissions
sudo -u postgres psql
GRANT ALL PRIVILEGES ON ALL TABLES IN SCHEMA public TO aimod_user;
GRANT ALL PRIVILEGES ON ALL SEQUENCES IN SCHEMA public TO aimod_user;

Memory Issues:

# Process large files in chunks
# Increase PostgreSQL memory settings
sudo nano /etc/postgresql/13/main/postgresql.conf
# shared_buffers = 256MB
# work_mem = 4MB

Encoding Issues:

# Handle encoding problems
with open(json_file, "r", encoding="utf-8") as f:
    data = json.load(f)

Connection Timeouts:

# Increase connection timeout
DATABASE_URL = "postgresql://user:pass@localhost:5432/db?connect_timeout=60"

Migration Logs

Check migration logs for issues:

# View migration output
tail -f migration.log

# Check for errors
grep -i error migration.log

# Check database logs
sudo tail -f /var/log/postgresql/postgresql-13-main.log

Next: Developer Guide - Development setup and contribution guidelines