Deployment Infrastructure - sbuddharaju369/WebsiteAnalyzer GitHub Wiki

PoC/Demo Deployment Architecture

Lightweight Demo Infrastructure

Platform: Replit Autoscale Deployment

  • Runtime: Python 3.11 with Nix package management
  • Resources: 2 vCPU, 4GB RAM, 20GB storage
  • Port Configuration: 5000 (optimized for Replit's network)
  • Auto-scaling: Scales from zero to roughly 100 concurrent users without manual intervention

Database Strategy:

  • Primary: Replit PostgreSQL (Neon-backed) for metadata and query history
  • Vector Storage: Local ChromaDB with SQLite backend
  • File Storage: Local filesystem with JSON cache files
  • Backup: Automatic Replit snapshots

Configuration:

```toml
# .streamlit/config.toml

[server]
headless = true
address = "0.0.0.0"
port = 5000
maxUploadSize = 50

[theme]
base = "light"
primaryColor = "#1f77b4"
```

Demo Limitations:

  • Maximum 50 pages per crawl session
  • 24-hour cache retention
  • Single concurrent user session
  • Basic rate limiting (1 request/second)
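
The basic 1 request/second limit could be enforced with a minimal interval check like the following; this is a sketch of the idea, not the project's actual implementation:

```python
import time

class SimpleRateLimiter:
    """Allow at most one request per `interval` seconds (demo: 1 req/s)."""

    def __init__(self, interval: float = 1.0):
        self.interval = interval
        self.last_request = 0.0

    def allow(self) -> bool:
        """Return True if enough time has passed since the last allowed request."""
        now = time.monotonic()
        if now - self.last_request >= self.interval:
            self.last_request = now
            return True
        return False
```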

Estimated Costs: $0-25/month depending on usage

Production-Hardened Deployment Architecture

Multi-Tier Production Infrastructure

Primary Deployment Platform: AWS/GCP with Container Orchestration

Application Tier

Containerization Strategy:

```dockerfile
# Production Dockerfile
FROM python:3.11-slim

WORKDIR /app

COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

COPY . .

EXPOSE 8080
CMD ["streamlit", "run", "app.py", "--server.port=8080", "--server.address=0.0.0.0"]
```

Kubernetes Deployment:

  • Pods: 3-5 replicas with horizontal auto-scaling
  • Resources: 4 vCPU, 8GB RAM per pod
  • Load Balancer: NGINX Ingress with SSL termination
  • Health Checks: Liveness and readiness probes

Database Architecture

Primary Database:

PostgreSQL 14+ (AWS RDS/Google Cloud SQL)

  • Configuration: db.r5.xlarge (4 vCPU, 32GB RAM)
  • Storage: 500GB SSD with automated backups
  • High Availability: Multi-AZ deployment with read replicas
  • Connection Pooling: PgBouncer with 100 max connections

Vector Database:

Self-hosted ChromaDB or managed Pinecone

  • ChromaDB: Self-hosted on dedicated instances
    • Specs: 8 vCPU, 32GB RAM, 1TB NVMe SSD
    • Clustering: 3-node cluster with replication
  • Alternative: Pinecone (managed service)
    • Plan: Standard tier with 100M+ vectors
    • Performance: Sub-50ms query latency
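
Under the hood, a vector query is a nearest-neighbor search over embeddings. ChromaDB and Pinecone use approximate indexes to hit sub-50ms latency at scale; the brute-force sketch below only illustrates the ranking they perform:

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def top_k(query: list[float], vectors: dict[str, list[float]], k: int = 2) -> list[str]:
    """Return the ids of the k stored embeddings most similar to the query."""
    scored = sorted(vectors.items(),
                    key=lambda kv: cosine_similarity(query, kv[1]),
                    reverse=True)
    return [doc_id for doc_id, _ in scored[:k]]
```

A real vector store replaces the linear scan with an ANN index (e.g. HNSW), trading a little recall for orders of magnitude less work per query.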

Caching Layer:

Redis Cluster

  • Configuration: 3-node cluster with 16GB memory each
  • Purpose: Session management, query caching, rate limiting
  • Persistence: RDB + AOF for durability

Content Delivery & Storage

Object Storage:

AWS S3/Google Cloud Storage

  • Cache Files: Compressed JSON with embeddings
  • Static Assets: CSS, JS, images with CloudFront CDN
  • Backup Strategy: Cross-region replication
  • Lifecycle Policies: Automatic archival after 90 days

CDN Configuration:

  • Primary: CloudFront/Cloud CDN
  • Edge Locations: Global distribution
  • Caching: Static assets (1 year), API responses (5 minutes)
  • Compression: Gzip/Brotli for text content

Security & Monitoring

API Security:

```python
# Rate limiting configuration
RATE_LIMITS = {
    'crawling': '5 per hour per user',
    'questions': '100 per hour per user',
    'search': '1000 per hour per user'
}

# Authentication
AUTH_PROVIDERS = ['OAuth2', 'API_Key', 'JWT']
```
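
A per-user quota such as "100 per hour per user" is typically enforced with a fixed-window counter (in production, Redis `INCR` plus `EXPIRE`). This in-memory sketch shows the idea:

```python
import time
from collections import defaultdict

class FixedWindowLimiter:
    """Redis-style fixed-window counter (INCR + EXPIRE), kept in memory here."""

    def __init__(self, limit: int, window_seconds: int):
        self.limit = limit
        self.window = window_seconds
        self.counts: dict = defaultdict(int)

    def allow(self, user_id: str) -> bool:
        """Count this request against the user's current window."""
        window_key = (user_id, int(time.time() // self.window))
        self.counts[window_key] += 1
        return self.counts[window_key] <= self.limit
```

With Redis, the window key becomes the Redis key and `EXPIRE` discards old windows automatically, so the limiter works across all application pods.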

Infrastructure Security:

  • WAF: AWS WAF/Google Cloud Armor with DDoS protection
  • VPC: Private subnets for database and application tiers
  • Secrets Management: AWS Secrets Manager/Google Secret Manager
  • SSL/TLS: Let's Encrypt with automatic renewal

Monitoring Stack:

  • Application Monitoring: DataDog/New Relic
  • Infrastructure: Prometheus + Grafana
  • Logging: ELK Stack (Elasticsearch, Logstash, Kibana)
  • Alerting: PagerDuty integration for critical issues

Performance Optimization

Auto-Scaling Configuration:

```yaml
# Kubernetes HPA Configuration
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: web-analyzer
spec:
  scaleTargetRef:          # target Deployment name assumed to match the image name
    apiVersion: apps/v1
    kind: Deployment
    name: web-analyzer
  minReplicas: 3
  maxReplicas: 20
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
    - type: Resource
      resource:
        name: memory
        target:
          type: Utilization
          averageUtilization: 80
```

Database Optimization:

  • Connection Pooling: PgBouncer with 500 max connections
  • Query Optimization: Automated EXPLAIN analysis
  • Indexing Strategy: Composite indexes on frequently queried columns
  • Partitioning: Date-based partitioning for large tables
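
PgBouncer handles pooling outside the application, but the concept it implements (a fixed set of reusable connections handed out and reclaimed) can be sketched in a few lines; the `factory` callable stands in for a real connection constructor:

```python
import queue

class ConnectionPool:
    """Minimal pool sketch: hand out and reclaim a fixed set of connections."""

    def __init__(self, factory, size: int):
        self._pool: queue.Queue = queue.Queue(maxsize=size)
        for _ in range(size):
            self._pool.put(factory())  # pre-create all connections up front

    def acquire(self, timeout: float = 5.0):
        """Block until a connection is free, up to `timeout` seconds."""
        return self._pool.get(timeout=timeout)

    def release(self, conn) -> None:
        """Return a connection to the pool for reuse."""
        self._pool.put(conn)
```

Capping the pool size is what keeps PostgreSQL's connection count bounded no matter how many application pods are running.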

Disaster Recovery & Backup

Backup Strategy:

  • Database: Automated daily backups with 30-day retention
  • Vector Database: Weekly full backups with incremental daily
  • Application State: Container image versioning and rollback capability
  • File Storage: Cross-region replication with versioning

Recovery Procedures:

  • RTO (Recovery Time Objective): 15 minutes
  • RPO (Recovery Point Objective): 1 hour
  • Failover: Automated with health check triggers
  • Testing: Monthly disaster recovery drills

Scaling Scenarios & Resource Planning

Small Scale (100-500 users)

Infrastructure:

  • 3 application pods (2 vCPU, 4GB RAM each)
  • PostgreSQL: db.t3.large (2 vCPU, 8GB RAM)
  • ChromaDB: Single instance (4 vCPU, 16GB RAM)
  • Redis: Single instance (2GB memory)

Estimated Costs: $800-1,200/month

Medium Scale (500-5,000 users)

Infrastructure:

  • 5-10 application pods with auto-scaling
  • PostgreSQL: db.r5.xlarge with read replicas
  • ChromaDB: 3-node cluster
  • Redis: 3-node cluster
  • CDN and advanced monitoring

Estimated Costs: $2,500-4,000/month

Large Scale (5,000+ users)

Infrastructure:

  • 10-50 application pods across multiple zones
  • PostgreSQL: Multi-AZ with multiple read replicas
  • Managed vector database (Pinecone/Weaviate Cloud)
  • Full observability and security stack
  • Dedicated DevOps engineer

Estimated Costs: $8,000-15,000/month

DevOps & CI/CD Pipeline

Development Workflow

GitHub Actions Pipeline

```yaml
name: Production Deployment

on:
  push:
    branches: [main]

jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - name: Run Tests
        run: |
          python -m pytest tests/
          python -m flake8 src/

  build:
    needs: test
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3   # checkout is needed here too for the build context
      - name: Build Docker Image
        run: docker build -t web-analyzer:$GITHUB_SHA .

  deploy:
    needs: build
    runs-on: ubuntu-latest
    steps:
      - name: Deploy to Kubernetes
        run: kubectl apply -f k8s/
```

Environment Management

  • Development: Local Docker Compose
  • Staging: Kubernetes cluster with production data subset
  • Production: Full production infrastructure with blue-green deployment

Performance Benchmarks

  • Crawling: 50-100 pages/minute per worker
  • Query Response: <2 seconds for most questions
  • Concurrent Users: 1,000+ with proper scaling
  • Embedding Processing: 10,000 chunks/minute
  • Database Queries: <100ms average response time
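
The embedding-throughput figure depends on how pages are split into chunks before embedding. A typical overlapping-chunk splitter looks like this; the 500-character size and 50-character overlap are illustrative defaults, not the project's actual settings:

```python
def chunk_text(text: str, chunk_size: int = 500, overlap: int = 50) -> list[str]:
    """Split text into overlapping character chunks for embedding."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        start += chunk_size - overlap  # step forward, keeping some shared context
    return chunks
```

The overlap keeps sentences that straddle a boundary retrievable from either neighboring chunk, at the cost of slightly more embedding work.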

This infrastructure design provides a clear evolution path from simple demo to enterprise-grade production deployment, with appropriate cost scaling and performance optimization at each tier.