Scalability - ruvnet/ruv-FANN GitHub Wiki

Scalability

Scalability Principles
Horizontal vs Vertical Scaling
Database Scalability
API Scalability
Performance Testing
Capacity Planning
Best Practices
Implementation Strategies

Scalability Principles

Definition

Scalability is the ability of a system to handle increased load by adding resources. It covers both scaling up (vertical scaling: a more powerful machine) and scaling out (horizontal scaling: more machines) to meet growing demand.

Core Principles

1. Statelessness

// ✅ Stateless design - scalable
class UserService {
  async getUserById(id, token) {
    // Validate the caller before touching the data store
    if (!this.authService.validateToken(token)) {
      throw new UnauthorizedException();
    }
    return this.userRepository.findById(id);
  }
}

// ❌ Stateful design - not scalable
class UserService {
  constructor() {
    this.currentUser = null; // State stored in memory
  }
  
  async login(credentials) {
    this.currentUser = await this.authenticate(credentials);
    return this.currentUser;
  }
}

2. Loose Coupling

Systems should be designed with minimal dependencies between components to allow independent scaling.

# Microservices architecture example (Docker Compose syntax)
services:
  user-service:
    image: user-service:latest
    deploy:
      replicas: 3
  
  order-service:
    image: order-service:latest
    deploy:
      replicas: 5
  
  notification-service:
    image: notification-service:latest
    deploy:
      replicas: 2

3. Asynchronous Processing

// Message queue for async processing
const queue = new MessageQueue('order-processing');

// Producer
async function createOrder(orderData) {
  const order = await saveOrder(orderData);
  await queue.publish('order.created', order);
  return order; // Return immediately
}

// Consumer
queue.subscribe('order.created', async (order) => {
  await processPayment(order);
  await updateInventory(order);
  await sendConfirmationEmail(order);
});

4. Data Partitioning

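// Horizontal partitioning (sharding)
class ShardedUserRepository {
  constructor(shards) {
    this.shards = shards;            // Array of per-shard repositories
    this.shardCount = shards.length;
  }
  
  // Assumes numeric user IDs; hash string IDs to an integer first
  getShardKey(userId) {
    return userId % this.shardCount;
  }
  
  async findById(userId) {
    const shard = this.getShardKey(userId);
    return this.shards[shard].findById(userId);
  }
}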
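
Modulo-based shard selection such as `userId % shardCount` remaps most keys whenever the shard count changes. A consistent-hash ring is one common way to limit that movement. The sketch below is illustrative only: `HashRing`, `hash32`, the shard names, and the virtual-node count are assumptions, not part of the design above.

```javascript
const crypto = require('crypto');

// Map a value to a 32-bit unsigned integer using the first 4 bytes of its MD5 digest.
function hash32(value) {
  return crypto.createHash('md5').update(String(value)).digest().readUInt32BE(0);
}

class HashRing {
  constructor(nodes, virtualNodes = 100) {
    this.ring = []; // Sorted list of { point, node }
    for (const node of nodes) {
      // Several points per node smooth out the key distribution
      for (let i = 0; i < virtualNodes; i++) {
        this.ring.push({ point: hash32(`${node}#${i}`), node });
      }
    }
    this.ring.sort((a, b) => a.point - b.point);
  }

  // Owner = node at the first ring point at or after the key's hash, wrapping around.
  getNode(key) {
    const h = hash32(key);
    let lo = 0;
    let hi = this.ring.length;
    while (lo < hi) {
      const mid = (lo + hi) >>> 1;
      if (this.ring[mid].point < h) lo = mid + 1;
      else hi = mid;
    }
    return this.ring[lo % this.ring.length].node;
  }
}

// Compare key placement before and after adding a fourth shard
const ring3 = new HashRing(['shard-0', 'shard-1', 'shard-2']);
const ring4 = new HashRing(['shard-0', 'shard-1', 'shard-2', 'shard-3']);
let moved = 0;
for (let id = 0; id < 10000; id++) {
  if (ring3.getNode(id) !== ring4.getNode(id)) moved++;
}
console.log(`keys moved after adding a shard: ${moved} / 10000`);
```

Keys that change owner move only to the new shard, so existing shards never exchange data with each other during a resize.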

Horizontal vs Vertical Scaling

Vertical Scaling (Scale Up)

Definition

Adding more power (CPU, RAM, Storage) to existing machines.

Advantages

Simplicity: No application changes required
Data consistency: Single database instance
Lower complexity: Fewer moving parts
Cost-effective for small scales: Initial scaling is straightforward

Disadvantages

Hardware limits: Physical constraints on single machine capacity
Single point of failure: System depends on one machine
Expensive at scale: High-end hardware costs rise disproportionately to the capacity gained
Downtime during upgrades: System unavailable during hardware changes

Implementation Example

# Before scaling
resources:
  cpu: "2"
  memory: "4Gi"
  storage: "100Gi"

# After vertical scaling
resources:
  cpu: "8"
  memory: "16Gi"
  storage: "500Gi"

Horizontal Scaling (Scale Out)

Definition

Adding more machines to the resource pool to handle increased load.

Advantages

No upper limit: Theoretically unlimited scaling
Fault tolerance: Failure of one machine doesn't bring down the system
Cost-effective: Use commodity hardware
Elastic scaling: Add/remove capacity based on demand

Disadvantages

Complexity: Application must be designed for distributed systems
Data consistency challenges: Distributed data management
Network overhead: Communication between nodes
State management: Session and data sharing complexities
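
How far scaling out helps depends on how much of the work can actually run in parallel. Amdahl's law is one rough model for this; the sketch below applies it with an assumed `parallelFraction` of 0.95, a made-up figure rather than a measurement from this guide.

```javascript
// Amdahl's law: speedup(n) = 1 / ((1 - p) + p / n)
// p = fraction of the work that can run in parallel, n = number of nodes.
function amdahlSpeedup(parallelFraction, nodes) {
  return 1 / ((1 - parallelFraction) + parallelFraction / nodes);
}

for (const n of [1, 2, 5, 10, 100]) {
  console.log(`${n} nodes -> ${amdahlSpeedup(0.95, n).toFixed(2)}x`);
}
```

With 95% of the work parallelizable, the speedup can never exceed 1 / (1 - 0.95) = 20x regardless of node count, which is why shrinking the serial portion (shared locks, a single primary database) often matters more than adding machines.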

Implementation Example

# Load balancer configuration
apiVersion: v1
kind: Service
metadata:
  name: web-service
spec:
  selector:
    app: web-app
  ports:
  - port: 80
    targetPort: 8080
  type: LoadBalancer

---
# Horizontal scaling with replicas
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web-app
spec:
  replicas: 5  # Scale to 5 instances
  selector:
    matchLabels:
      app: web-app
  template:
    metadata:
      labels:
        app: web-app
    spec:
      containers:
      - name: web-app
        image: web-app:latest
        ports:
        - containerPort: 8080

Hybrid Approach

// Auto-scaling configuration
const scalingPolicy = {
  vertical: {
    minCpu: "1",
    maxCpu: "4",
    minMemory: "2Gi",
    maxMemory: "8Gi"
  },
  horizontal: {
    minReplicas: 2,
    maxReplicas: 10,
    targetCPUUtilization: 70,
    scaleUpCooldown: "5m",
    scaleDownCooldown: "10m"
  }
};

Database Scalability

Read Replicas

// Database connection routing
class DatabaseManager {
  constructor() {
    this.writeDB = new Connection('primary-db');
    this.readReplicas = [
      new Connection('read-replica-1'),
      new Connection('read-replica-2'),
      new Connection('read-replica-3')
    ];
  }
  
  async write(query, params) {
    return this.writeDB.execute(query, params);
  }
  
  async read(query, params) {
    const replica = this.getRandomReplica();
    return replica.execute(query, params);
  }
  
  getRandomReplica() {
    const index = Math.floor(Math.random() * this.readReplicas.length);
    return this.readReplicas[index];
  }
}
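
Replicas usually lag the primary slightly, so a client that has just written may read stale data from a randomly chosen replica. One mitigation is to pin a session's reads to the primary for a short window after it writes. The sketch below assumes a hypothetical `ReadYourWritesRouter`, a `sessionId` supplied by the caller, and a `pinMs` value chosen to exceed the expected replication lag.

```javascript
class ReadYourWritesRouter {
  constructor(primary, replicas, { pinMs = 1000, now = Date.now } = {}) {
    this.primary = primary;
    this.replicas = replicas;
    this.pinMs = pinMs;          // Assumed upper bound on replication lag
    this.now = now;              // Injectable clock, useful in tests
    this.lastWrite = new Map();  // sessionId -> time of last write
    this.next = 0;               // Round-robin cursor
  }

  targetForWrite(sessionId) {
    this.lastWrite.set(sessionId, this.now());
    return this.primary;
  }

  targetForRead(sessionId) {
    const last = this.lastWrite.get(sessionId);
    if (last !== undefined && this.now() - last < this.pinMs) {
      return this.primary; // Recent writer: avoid a possibly stale replica
    }
    this.lastWrite.delete(sessionId);
    const replica = this.replicas[this.next % this.replicas.length];
    this.next++;
    return replica;
  }
}

const router = new ReadYourWritesRouter('primary-db', ['read-replica-1', 'read-replica-2']);
router.targetForWrite('session-42');
console.log(router.targetForRead('session-42')); // served by the primary while pinned
console.log(router.targetForRead('session-7'));  // served by a replica
```

The `lastWrite` map lives in process memory, so with several application instances the timestamp would need to travel with the request (for example in a cookie) or sit in a shared store to keep the service stateless.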

Database Sharding

// User-based sharding strategy
class ShardedDatabase {
  constructor() {
    this.shards = {
      'shard-1': new Connection('user-db-1'), // Users A-H
      'shard-2': new Connection('user-db-2'), // Users I-P
      'shard-3': new Connection('user-db-3'), // Users Q-Z
    };
  }
  
  getShardForUser(username) {
    const firstLetter = username[0].toLowerCase();
    if (firstLetter <= 'h') return 'shard-1';
    if (firstLetter <= 'p') return 'shard-2';
    return 'shard-3';
  }
  
  async getUserData(username) {
    const shardKey = this.getShardForUser(username);
    return this.shards[shardKey].findUser(username);
  }
}

Connection Pooling

// Database connection pool (node-postgres)
const { Pool } = require('pg');

const pool = new Pool({
  host: 'database-server',
  port: 5432,
  database: 'app_db',
  user: 'app_user',
  password: process.env.DB_PASSWORD,
  min: 5,    // Minimum connections
  max: 20,   // Maximum connections
  idleTimeoutMillis: 30000,
  connectionTimeoutMillis: 2000
});

class UserRepository {
  async findById(id) {
    const client = await pool.connect();
    try {
      const result = await client.query('SELECT * FROM users WHERE id = $1', [id]);
      return result.rows[0];
    } finally {
      client.release();
    }
  }
}

Caching Strategies

// Multi-level caching
class CachedUserService {
  constructor() {
    this.l1Cache = new Map(); // In-memory cache
    this.l2Cache = new RedisClient(); // Distributed cache
    this.database = new Database();
  }
  
  async getUser(id) {
    // L1 Cache check
    if (this.l1Cache.has(id)) {
      return this.l1Cache.get(id);
    }
    
    // L2 Cache check (values are stored as JSON strings)
    const cachedUser = await this.l2Cache.get(`user:${id}`);
    if (cachedUser) {
      const parsedUser = JSON.parse(cachedUser);
      this.l1Cache.set(id, parsedUser);
      return parsedUser;
    }
    
    // Database fetch
    const user = await this.database.getUserById(id);
    if (user) {
      this.l1Cache.set(id, user);
      await this.l2Cache.setex(`user:${id}`, 3600, JSON.stringify(user));
    }
    
    return user;
  }
}
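
A plain `Map` used as the L1 cache grows without bound and never expires entries, so each instance can accumulate stale users indefinitely. A minimal bounded alternative is sketched below; `LruTtlCache` and its option names are hypothetical, and it relies on `Map` preserving insertion order.

```javascript
class LruTtlCache {
  constructor({ maxEntries = 1000, ttlMs = 60000, now = Date.now } = {}) {
    this.maxEntries = maxEntries;
    this.ttlMs = ttlMs;
    this.now = now;        // Injectable clock, useful in tests
    this.map = new Map();  // First key in iteration order = least recently used
  }

  get(key) {
    const entry = this.map.get(key);
    if (!entry) return undefined;
    if (this.now() >= entry.expiresAt) {
      this.map.delete(key); // Expired: drop and report a miss
      return undefined;
    }
    // Re-insert to mark the entry as most recently used
    this.map.delete(key);
    this.map.set(key, entry);
    return entry.value;
  }

  set(key, value) {
    this.map.delete(key);
    this.map.set(key, { value, expiresAt: this.now() + this.ttlMs });
    if (this.map.size > this.maxEntries) {
      // Evict the least recently used entry
      const oldestKey = this.map.keys().next().value;
      this.map.delete(oldestKey);
    }
  }
}

const l1 = new LruTtlCache({ maxEntries: 2, ttlMs: 5000 });
l1.set('user:1', { name: 'Ada' });
l1.set('user:2', { name: 'Grace' });
l1.get('user:1');                    // touch user:1 so user:2 becomes the eviction candidate
l1.set('user:3', { name: 'Linus' }); // evicts user:2
console.log(l1.get('user:2'));
```

A short L1 TTL also limits how long instances can disagree after a user record changes, since nothing here invalidates L1 entries on other instances.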

API Scalability

Load Balancing

# Nginx load balancer configuration
upstream api_servers {
    least_conn;
    server api1.example.com:3000 weight=3;
    server api2.example.com:3000 weight=2;
    server api3.example.com:3000 weight=1;
    server api4.example.com:3000 backup;
}

server {
    listen 80;
    server_name api.example.com;
    
    location / {
        proxy_pass http://api_servers;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        
        # Upstream timeouts (fail fast so a slow server does not hold requests)
        proxy_connect_timeout 5s;
        proxy_send_timeout 10s;
        proxy_read_timeout 10s;
    }
}

Rate Limiting

// Rate limiting middleware
const rateLimit = require('express-rate-limit');
const RedisStore = require('rate-limit-redis');

const limiter = rateLimit({
  store: new RedisStore({
    client: redisClient,
    prefix: 'rl:'
  }),
  windowMs: 15 * 60 * 1000, // 15 minutes
  max: 100, // Limit each key (user ID or IP, see keyGenerator) to 100 requests per windowMs
  message: 'Too many requests, please try again later.',
  standardHeaders: true,
  legacyHeaders: false,
  keyGenerator: (req) => {
    // Rate limit by user ID if authenticated, otherwise by IP
    return req.user?.id || req.ip;
  }
});

app.use('/api/', limiter);
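
A `windowMs`/`max` pair counts requests per time window, which can let a client send a burst at the end of one window and another at the start of the next. A token bucket is one alternative that smooths this out. The sketch below is a standalone, in-memory illustration with hypothetical names (`TokenBucket`, `tryRemoveToken`); it is not part of `express-rate-limit`.

```javascript
class TokenBucket {
  constructor({ capacity, refillPerSecond, now = Date.now }) {
    this.capacity = capacity;               // Maximum burst size
    this.refillPerSecond = refillPerSecond; // Sustained request rate
    this.now = now;                         // Injectable clock, useful in tests
    this.tokens = capacity;
    this.lastRefill = now();
  }

  // Returns true if the request may proceed, false if it should be rejected.
  tryRemoveToken() {
    const current = this.now();
    const elapsedSeconds = (current - this.lastRefill) / 1000;
    // Add tokens earned since the last call, capped at the bucket capacity
    this.tokens = Math.min(this.capacity, this.tokens + elapsedSeconds * this.refillPerSecond);
    this.lastRefill = current;
    if (this.tokens >= 1) {
      this.tokens -= 1;
      return true;
    }
    return false;
  }
}

const bucket = new TokenBucket({ capacity: 5, refillPerSecond: 1 });
const results = [];
for (let i = 0; i < 7; i++) results.push(bucket.tryRemoveToken());
console.log(results);
```

Like the stateful service shown earlier, an in-process bucket only limits traffic seen by one instance; a shared store is still needed to enforce a global limit across replicas.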

Request Queuing

// Queue-based request processing
const Queue = require('bull');
const requestQueue = new Queue('request processing', {
  redis: { port: 6379, host: '127.0.0.1' }
});

// API endpoint
app.post('/api/heavy-task', async (req, res) => {
  const job = await requestQueue.add('process-request', {
    userId: req.user.id,
    data: req.body,
    timestamp: Date.now()
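  }, {
    attempts: 3,
    // Retry with exponential backoff, starting at 2 seconds
    backoff: { type: 'exponential', delay: 2000 }
  });
  
  // 202 Accepted: the work is queued, not finished
  res.status(202).json({ jobId: job.id, status: 'queued' });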
});

// Worker process
requestQueue.process('process-request', 5, async (job) => {
  const { userId, data } = job.data;
  
  job.progress(10);
  const result = await heavyProcessing(data);
  
  job.progress(50);
  await saveResults(userId, result);
  
  job.progress(100);
  await notifyUser(userId, result);
  
  return result;
});

API Versioning

// Version-aware routing
class APIRouter {
  constructor() {
    this.routes = new Map();
  }
  
  register(version, path, handler) {
    const key = `${version}:${path}`;
    this.routes.set(key, handler);
  }
  
  async handle(req, res) {
    const version = req.headers['api-version'] || '1.0';
    const path = req.path;
    const key = `${version}:${path}`;
    
    const handler = this.routes.get(key) || this.routes.get(`1.0:${path}`);
    if (!handler) {
      return res.status(404).json({ error: 'Endpoint not found' });
    }
    
    return handler(req, res);
  }
}

// Usage
const router = new APIRouter();
router.register('1.0', '/users', getUsersV1);
router.register('2.0', '/users', getUsersV2);
router.register('2.1', '/users', getUsersV21);

Circuit Breaker Pattern

class CircuitBreaker {
  constructor(service, options = {}) {
    this.service = service;
    this.failureThreshold = options.failureThreshold || 5;
    this.timeout = options.timeout || 60000;
    this.state = 'CLOSED'; // CLOSED, OPEN, HALF_OPEN
    this.failureCount = 0;
    this.lastFailureTime = null;
  }
  
  async call(...args) {
    if (this.state === 'OPEN') {
      if (Date.now() - this.lastFailureTime > this.timeout) {
        this.state = 'HALF_OPEN';
      } else {
        throw new Error('Circuit breaker is OPEN');
      }
    }
    
    try {
      const result = await this.service(...args);
      // Any success closes the circuit and clears the count,
      // so only consecutive failures can trip the breaker
      this.state = 'CLOSED';
      this.failureCount = 0;
      return result;
    } catch (error) {
      this.failureCount++;
      this.lastFailureTime = Date.now();
      
      if (this.failureCount >= this.failureThreshold) {
        this.state = 'OPEN';
      }
      
      throw error;
    }
  }
}

// Usage
const externalAPIBreaker = new CircuitBreaker(callExternalAPI, {
  failureThreshold: 3,
  timeout: 30000
});

async function getUserData(id) {
  try {
    return await externalAPIBreaker.call(id);
  } catch (error) {
    // Fallback to cached data
    return getCachedUserData(id);
  }
}

Performance Testing

Load Testing with Artillery

# artillery-config.yml
config:
  target: 'https://api.example.com'
  phases:
    - duration: 300  # 5 minutes
      arrivalRate: 10  # 10 new virtual users per second
    - duration: 600  # 10 minutes
      arrivalRate: 50  # 50 new virtual users per second
    - duration: 300  # 5 minutes
      arrivalRate: 100 # 100 new virtual users per second

scenarios:
  - name: "User Journey"
    weight: 70
    flow:
      - post:
          url: "/auth/login"
          json:
            email: "user@example.com"
            password: "password"
          capture:
            - json: "$.token"
              as: "authToken"
      - get:
          url: "/api/users/profile"
          headers:
            Authorization: "Bearer {{ authToken }}"
      - get:
          url: "/api/users/dashboard"
          headers:
            Authorization: "Bearer {{ authToken }}"

  - name: "Public API"
    weight: 30
    flow:
      - get:
          url: "/api/public/status"
      - get:
          url: "/api/public/metrics"

Stress Testing Script

// stress-test.js
const cluster = require('cluster');
const numCPUs = require('os').cpus().length;
const axios = require('axios');

if (cluster.isPrimary) { // Use cluster.isMaster on Node.js versions before 16
  console.log(`Primary ${process.pid} is running`);
  
  // Fork workers
  for (let i = 0; i < numCPUs; i++) {
    cluster.fork();
  }
  
  cluster.on('exit', (worker, code, signal) => {
    console.log(`Worker ${worker.process.pid} died`);
  });
} else {
  // Worker processes
  console.log(`Worker ${process.pid} started`);
  
  const stats = {
    requests: 0,
    errors: 0,
    totalTime: 0
  };
  
  async function stressTest() {
    const startTime = Date.now();
    
    while (Date.now() - startTime < 60000) { // Run for 1 minute
      try {
        const start = Date.now();
        await axios.get('https://api.example.com/health');
        stats.totalTime += Date.now() - start;
        stats.requests++;
      } catch (error) {
        stats.errors++;
      }
    }
    
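    console.log(`Worker ${process.pid} stats:`, {
      requests: stats.requests,
      errors: stats.errors,
      // Guard against division by zero when every request failed
      avgResponseTime: stats.requests ? stats.totalTime / stats.requests : 0,
      errorRate: (stats.errors / Math.max(stats.requests + stats.errors, 1)) * 100
    });
  }
  
  // Exit when done so the worker does not linger on its IPC channel
  stressTest().then(() => process.exit(0));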
}

Performance Monitoring

// Performance metrics collection
const axios = require('axios');

class PerformanceMonitor {
  constructor() {
    this.metrics = {
      requestCount: 0,
      errorCount: 0,
      responseTimes: [],
      activeConnections: 0
    };
    
    // Start metrics collection
    setInterval(() => this.collectMetrics(), 1000);
  }
  
  recordRequest(responseTime, isError = false) {
    this.metrics.requestCount++;
    this.metrics.responseTimes.push(responseTime);
    
    if (isError) {
      this.metrics.errorCount++;
    }
    
    // Keep only last 1000 response times
    if (this.metrics.responseTimes.length > 1000) {
      this.metrics.responseTimes.shift();
    }
  }
  
  getMetrics() {
    const responseTimes = this.metrics.responseTimes;
    const sorted = [...responseTimes].sort((a, b) => a - b);
    const { requestCount, errorCount } = this.metrics;
    
    return {
      requestsPerSecond: requestCount, // Counters are reset every second by collectMetrics()
      // Guard against division by zero during idle intervals
      errorRate: requestCount ? (errorCount / requestCount) * 100 : 0,
      averageResponseTime: sorted.length
        ? responseTimes.reduce((a, b) => a + b, 0) / sorted.length
        : 0,
      p50: sorted[Math.floor(sorted.length * 0.5)],
      p95: sorted[Math.floor(sorted.length * 0.95)],
      p99: sorted[Math.floor(sorted.length * 0.99)],
      activeConnections: this.metrics.activeConnections
    };
  }
  
  collectMetrics() {
    const metrics = this.getMetrics();
    console.log('Performance Metrics:', metrics);
    
    // Reset counters
    this.metrics.requestCount = 0;
    this.metrics.errorCount = 0;
    
    // Send to monitoring system
    this.sendToMonitoring(metrics);
  }
  
  async sendToMonitoring(metrics) {
    // Send to monitoring service (e.g., Prometheus, Datadog)
    try {
      await axios.post('http://monitoring-service/metrics', {
        timestamp: Date.now(),
        service: 'api-server',
        ...metrics
      });
    } catch (error) {
      console.error('Failed to send metrics:', error.message);
    }
  }
}
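
The percentiles in `getMetrics()` index directly into the sorted samples, which returns `undefined` for an empty window. The nearest-rank method is one common convention that makes the definition explicit. The helper below is a hypothetical standalone sketch, not a drop-in part of `PerformanceMonitor`.

```javascript
// Nearest-rank percentile: the smallest sample such that at least p% of
// all samples are less than or equal to it. Returns null for no samples.
function percentile(samples, p) {
  if (samples.length === 0) return null;
  const sorted = [...samples].sort((a, b) => a - b);
  // 1-based rank; multiply before dividing to avoid floating-point drift
  const rank = Math.ceil((p * sorted.length) / 100);
  return sorted[Math.max(rank, 1) - 1];
}

const latenciesMs = [50, 10, 40, 20, 30];
console.log(percentile(latenciesMs, 50)); // 30
console.log(percentile(latenciesMs, 95)); // 50
```

With only a handful of samples the high percentiles collapse onto the maximum, so p95 and p99 become meaningful only once the window holds enough requests.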

Capacity Planning

Resource Utilization Analysis

// Capacity planning calculator
class CapacityPlanner {
  constructor(currentMetrics, growthRate) {
    this.current = currentMetrics;
    this.growthRate = growthRate; // Monthly growth rate
  }
  
  calculateFutureCapacity(months) {
    const growthMultiplier = Math.pow(1 + this.growthRate, months);
    
    return {
      cpu: {
        current: this.current.cpu.utilization,
        projected: this.current.cpu.utilization * growthMultiplier,
        recommendedCapacity: this.current.cpu.capacity * growthMultiplier * 1.2 // 20% buffer
      },
      memory: {
        current: this.current.memory.usage,
        projected: this.current.memory.usage * growthMultiplier,
        recommendedCapacity: this.current.memory.total * growthMultiplier * 1.3 // 30% buffer
      },
      storage: {
        current: this.current.storage.used,
        projected: this.current.storage.used * growthMultiplier,
        recommendedCapacity: this.current.storage.total * growthMultiplier * 1.5 // 50% buffer
      },
      bandwidth: {
        current: this.current.bandwidth.usage,
        projected: this.current.bandwidth.usage * growthMultiplier,
        recommendedCapacity: this.current.bandwidth.capacity * growthMultiplier * 1.25 // 25% buffer
      }
    };
  }
  
  generateScalingPlan(timeHorizon = 12) {
    const plan = [];
    
    for (let month = 1; month <= timeHorizon; month++) {
      const capacity = this.calculateFutureCapacity(month);
      
      const recommendations = [];
      
      if (capacity.cpu.projected > this.current.cpu.capacity * 0.8) {
        recommendations.push({
          type: 'CPU',
          action: 'Scale up CPU resources',
          urgency: capacity.cpu.projected > this.current.cpu.capacity ? 'HIGH' : 'MEDIUM'
        });
      }
      
      if (capacity.memory.projected > this.current.memory.total * 0.8) {
        recommendations.push({
          type: 'Memory',
          action: 'Increase memory allocation',
          urgency: capacity.memory.projected > this.current.memory.total ? 'HIGH' : 'MEDIUM'
        });
      }
      
      plan.push({
        month: month,
        projectedLoad: capacity,
        recommendations: recommendations
      });
    }
    
    return plan;
  }
}

// Usage example
const currentMetrics = {
  cpu: { utilization: 45, capacity: 100 },
  memory: { usage: 8, total: 16 }, // GB
  storage: { used: 500, total: 1000 }, // GB
  bandwidth: { usage: 100, capacity: 1000 } // Mbps
};

const planner = new CapacityPlanner(currentMetrics, 0.15); // 15% monthly growth
const scalingPlan = planner.generateScalingPlan(12);
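
Under the same compound-growth assumption, the month in which a resource crosses its alert threshold can be computed directly instead of scanning the plan. The helper below is a hypothetical addition (`monthsUntilThreshold` is not part of `CapacityPlanner`); it solves current × (1 + growth)^m ≥ threshold for the smallest whole m.

```javascript
// Smallest whole number of months m such that
// current * (1 + monthlyGrowth)^m >= threshold.
function monthsUntilThreshold(current, monthlyGrowth, threshold) {
  if (current >= threshold) return 0;        // Already at or past the threshold
  if (monthlyGrowth <= 0) return Infinity;   // No growth: never reached
  return Math.ceil(Math.log(threshold / current) / Math.log(1 + monthlyGrowth));
}

// Figures from the usage example: 15% monthly growth, 80% alert threshold
console.log(monthsUntilThreshold(45, 0.15, 80));       // CPU: 45% utilization today
console.log(monthsUntilThreshold(8, 0.15, 16 * 0.8));  // Memory: 8 GB used of 16 GB
```

For the sample metrics this gives month 5 for CPU (45 × 1.15^4 ≈ 78.7, 45 × 1.15^5 ≈ 90.5) and month 4 for memory (8 × 1.15^3 ≈ 12.2 GB, 8 × 1.15^4 ≈ 14.0 GB against a 12.8 GB threshold).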

Auto-scaling Configuration

# Kubernetes Horizontal Pod Autoscaler
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: web-app-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: web-app
  minReplicas: 3
  maxReplicas: 50
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 70
  - type: Resource
    resource:
      name: memory
      target:
        type: Utilization
        averageUtilization: 80
  - type: Pods
    pods:
      metric:
        name: http_requests_per_second
      target:
        type: AverageValue
        averageValue: "100"
  behavior:
    scaleUp:
      stabilizationWindowSeconds: 60
      policies:
      - type: Percent
        value: 100
        periodSeconds: 15
    scaleDown:
      stabilizationWindowSeconds: 300
      policies:
      - type: Percent
        value: 50
        periodSeconds: 60
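
The Horizontal Pod Autoscaler derives its target from the ratio of the observed metric to the configured target: desiredReplicas = ceil(currentReplicas × currentValue / targetValue), skipping changes when the ratio is within a tolerance of 1 (0.1 by default), and taking the largest result when several metrics are configured. The sketch below reproduces only that arithmetic with hypothetical function names; the real controller also applies the stabilization windows and policies under `behavior`.

```javascript
function desiredReplicas(currentReplicas, currentValue, targetValue,
                         { min = 1, max = Infinity, tolerance = 0.1 } = {}) {
  const ratio = currentValue / targetValue;
  // Close enough to the target: leave the replica count unchanged
  if (Math.abs(ratio - 1) <= tolerance) return currentReplicas;
  const desired = Math.ceil(currentReplicas * ratio);
  return Math.min(max, Math.max(min, desired));
}

// With several metrics, the largest recommendation wins
function desiredForAllMetrics(currentReplicas, metrics, limits) {
  return Math.max(
    ...metrics.map((m) => desiredReplicas(currentReplicas, m.current, m.target, limits))
  );
}

const limits = { min: 3, max: 50 };
const observed = [
  { name: 'cpu', current: 90, target: 70 },    // 90% average utilization vs 70% target
  { name: 'memory', current: 60, target: 80 }  // 60% average utilization vs 80% target
];
console.log(desiredForAllMetrics(4, observed, limits)); // 6
```

Here CPU asks for ceil(4 × 90 / 70) = 6 replicas while memory would allow 3, so the autoscaler would move toward 6.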

Cost Analysis

// Cost optimization calculator
class CostOptimizer {
  constructor(pricingModel) {
    this.pricing = pricingModel;
  }
  
  calculateCosts(resources, timeframe = 'monthly') {
    const multiplier = timeframe === 'monthly' ? 24 * 30 : 
                     timeframe === 'daily' ? 24 : 1;
    
    return {
      compute: resources.instances * this.pricing.instance * multiplier,
      storage: resources.storage * this.pricing.storage * multiplier,
      bandwidth: resources.bandwidth * this.pricing.bandwidth * multiplier,
      database: resources.dbInstances * this.pricing.database * multiplier,
      total: function() {
        return this.compute + this.storage + this.bandwidth + this.database;
      }
    };
  }
  
  optimizeConfiguration(currentConfig, constraints) {
    const optimizations = [];
    
    // Analyze instance utilization
    if (currentConfig.cpuUtilization < 30) {
      optimizations.push({
        type: 'DOWNSIZE',
        resource: 'compute',
        suggestion: 'Consider smaller instance types',
        potentialSavings: currentConfig.cost * 0.3
      });
    }
    
    // Analyze storage patterns
    if (currentConfig.storageUtilization < 50) {
      optimizations.push({
        type: 'STORAGE_OPTIMIZATION',
        resource: 'storage',
        suggestion: 'Consider object storage for infrequently accessed data',
        potentialSavings: currentConfig.storageCost * 0.4
      });
    }
    
    // Analyze traffic patterns
    if (currentConfig.peakToAverageRatio > 3) {
      optimizations.push({
        type: 'AUTO_SCALING',
        resource: 'compute',
        suggestion: 'Implement auto-scaling to handle traffic spikes',
        potentialSavings: currentConfig.cost * 0.25
      });
    }
    
    return optimizations;
  }
}

Best Practices

Design Patterns for Scalability

1. CQRS (Command Query Responsibility Segregation)

// Command side - Write operations
class UserCommandHandler {
  constructor(eventStore, commandValidator) {
    this.eventStore = eventStore;
    this.validator = commandValidator;
  }
  
  async createUser(command) {
    await this.validator.validate(command);
    
    const event = {
      type: 'UserCreated',
      aggregateId: command.userId,
      data: command.userData,
      timestamp: Date.now()
    };
    
    await this.eventStore.append(command.userId, event);
    return { success: true, userId: command.userId };
  }
}

// Query side - Read operations
class UserQueryHandler {
  constructor(readModelDatabase) {
    this.db = readModelDatabase;
  }
  
  async getUserById(userId) {
    return this.db.users.findById(userId);
  }
  
  async searchUsers(criteria) {
    return this.db.users.find(criteria);
  }
}

2. Event Sourcing

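class EventStore {
  constructor(database, eventBus) {
    this.db = database;
    this.eventBus = eventBus; // Used by publishEvent()
  }
  
  async append(aggregateId, event) {
    await this.db.events.insert({
      aggregateId,
      eventType: event.type,
      eventData: event.data,
      eventVersion: await this.getNextVersion(aggregateId),
      timestamp: event.timestamp
    });
    
    // Publish event for read model updates
    await this.publishEvent(event);
  }
  
  // Next version = events already stored for the aggregate + 1.
  // Assumes the store exposes count(); concurrent appends can race, so
  // enforce a unique (aggregateId, eventVersion) constraint in the database.
  async getNextVersion(aggregateId) {
    const count = await this.db.events.count({ aggregateId });
    return count + 1;
  }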
  
  async getEvents(aggregateId, fromVersion = 0) {
    return this.db.events.find({
      aggregateId,
      eventVersion: { $gte: fromVersion }
    }).sort({ eventVersion: 1 });
  }
  
  async publishEvent(event) {
    // Publish to message queue for read model updates
    await this.eventBus.publish(event.type, event);
  }
}

3. Saga Pattern for Distributed Transactions

class OrderSaga {
  constructor(services) {
    this.paymentService = services.payment;
    this.inventoryService = services.inventory;
    this.shippingService = services.shipping;
  }
  
  async processOrder(orderData) {
    const sagaId = generateId();
    let compensations = [];
    
    try {
      // Step 1: Reserve inventory
      const inventoryReservation = await this.inventoryService.reserve(orderData.items);
      compensations.push(() => this.inventoryService.cancelReservation(inventoryReservation.id));
      
      // Step 2: Process payment
      const payment = await this.paymentService.charge(orderData.payment);
      compensations.push(() => this.paymentService.refund(payment.id));
      
      // Step 3: Arrange shipping
      const shipping = await this.shippingService.schedule(orderData.shipping);
      compensations.push(() => this.shippingService.cancel(shipping.id));
      
      return { success: true, orderId: sagaId };
      
    } catch (error) {
      // Execute compensations in reverse order
      for (let i = compensations.length - 1; i >= 0; i--) {
        try {
          await compensations[i]();
        } catch (compensationError) {
          console.error('Compensation failed:', compensationError);
        }
      }
      
      throw new SagaFailedException('Order processing failed', error);
    }
  }
}

Monitoring and Observability

// Distributed tracing
const opentelemetry = require('@opentelemetry/api');
const tracer = opentelemetry.trace.getTracer('user-service');

class UserService {
  async getUser(userId) {
    const span = tracer.startSpan('get-user');
    
    try {
      span.setAttributes({
        'user.id': userId,
        'service.name': 'user-service'
      });
      
      // Child spans are linked through a context that carries the parent span
      const ctx = opentelemetry.trace.setSpan(opentelemetry.context.active(), span);
      
      // Database call with child span
      const dbSpan = tracer.startSpan('database-query', undefined, ctx);
      const user = await this.database.findById(userId);
      dbSpan.end();
      
      // Cache call with child span
      const cacheSpan = tracer.startSpan('cache-set', undefined, ctx);
      await this.cache.set(`user:${userId}`, user);
      cacheSpan.end();
      
      span.setStatus({ code: opentelemetry.SpanStatusCode.OK });
      return user;
      
    } catch (error) {
      span.recordException(error);
      span.setStatus({ 
        code: opentelemetry.SpanStatusCode.ERROR, 
        message: error.message 
      });
      throw error;
    } finally {
      span.end();
    }
  }
}

Implementation Strategies

Gradual Migration Approach

// Feature flag for gradual rollout
class FeatureFlag {
  constructor(flagStore) {
    this.flags = flagStore;
  }
  
  async isEnabled(flagName, userId) {
    const flag = await this.flags.get(flagName);
    
    // Unknown or disabled flags are off
    if (!flag || !flag.enabled) return false;
    
    // Percentage rollout (an explicit 0 means nobody)
    if (typeof flag.percentage === 'number') {
      const hash = this.hashUserId(userId);
      return hash < flag.percentage;
    }
    
    // User list rollout
    if (flag.userList) {
      return flag.userList.includes(userId);
    }
    
    return true;
  }
  
  hashUserId(userId) {
    // Simple hash for percentage rollout. Assumes the ID ends in two hex
    // digits (e.g. a UUID); folding 0-255 into 0-99 slightly favors low buckets.
    return parseInt(userId.slice(-2), 16) % 100;
  }
}

// Usage in service
class UserService {
  async getUser(userId) {
    const useNewEndpoint = await this.featureFlag.isEnabled('new-user-endpoint', userId);
    
    if (useNewEndpoint) {
      return this.getUserV2(userId);
    } else {
      return this.getUserV1(userId);
    }
  }
}
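
Percentage rollouts depend on every user landing in a stable, evenly distributed bucket. `hashUserId` above only works for IDs that end in hexadecimal digits; hashing the whole ID removes that assumption. The sketch below uses hypothetical helpers (`rolloutBucket`, `isInRollout`) and includes the flag name in the hash so that different flags do not always select the same users.

```javascript
const crypto = require('crypto');

// Map (flagName, userId) to a stable bucket in the range 0-99.
function rolloutBucket(flagName, userId) {
  const digest = crypto.createHash('sha256').update(`${flagName}:${userId}`).digest();
  // The first 4 bytes form an unsigned 32-bit integer; the modulo bias is negligible
  return digest.readUInt32BE(0) % 100;
}

function isInRollout(flagName, userId, percentage) {
  return rolloutBucket(flagName, userId) < percentage;
}

// Rough check of the distribution for a 25% rollout
let enabled = 0;
for (let i = 0; i < 10000; i++) {
  if (isInRollout('new-user-endpoint', `user-${i}`, 25)) enabled++;
}
console.log(`enabled for ${enabled} of 10000 users`);
```

Because the bucket depends only on the flag name and user ID, raising the percentage keeps everyone who was already enabled and adds new users on top.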

Blue-Green Deployment

# Blue-Green deployment configuration
apiVersion: argoproj.io/v1alpha1
kind: Rollout
metadata:
  name: user-service
spec:
  replicas: 10
  strategy:
    blueGreen:
      activeService: user-service-active
      previewService: user-service-preview
      autoPromotionEnabled: false
      scaleDownDelaySeconds: 30
      prePromotionAnalysis:
        templates:
        - templateName: success-rate
        args:
        - name: service-name
          value: user-service-preview
      postPromotionAnalysis:
        templates:
        - templateName: success-rate
        args:
        - name: service-name
          value: user-service-active
  selector:
    matchLabels:
      app: user-service
  template:
    metadata:
      labels:
        app: user-service
    spec:
      containers:
      - name: user-service
        image: user-service:latest
        ports:
        - containerPort: 8080

Database Migration Strategy

// Database migration with zero downtime
class DatabaseMigrator {
  constructor(oldDB, newDB) {
    this.oldDB = oldDB;
    this.newDB = newDB;
    this.migrationState = 'DUAL_WRITE';
  }
  
  async dualWrite(operation, data) {
    const results = await Promise.allSettled([
      this.oldDB[operation](data),
      this.newDB[operation](data)
    ]);
    
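    // Use old DB as source of truth during migration
    const [primaryResult, secondaryResult] = results;
    if (secondaryResult.status === 'rejected') {
      // Surface new-DB failures so the two stores do not drift silently
      console.error('Dual write to new DB failed:', secondaryResult.reason);
    }
    if (primaryResult.status === 'fulfilled') {
      return primaryResult.value;
    } else {
      throw primaryResult.reason;
    }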
  }
  
  async read(query) {
    switch (this.migrationState) {
      case 'OLD_ONLY':
        return this.oldDB.query(query);
      
      case 'DUAL_WRITE':
        // Read from old, but verify with new
        const oldResult = await this.oldDB.query(query);
        
        // Compare with new DB in background (don't block)
        this.compareResults(query, oldResult);
        
        return oldResult;
      
      case 'DUAL_READ':
        // Read from both and compare
        const [oldRes, newRes] = await Promise.all([
          this.oldDB.query(query),
          this.newDB.query(query)
        ]);
        
        if (!this.resultsMatch(oldRes, newRes)) {
          console.warn('Migration inconsistency detected', { query, oldRes, newRes });
        }
        
        return newRes; // Start using new DB results
      
      case 'NEW_ONLY':
        return this.newDB.query(query);
    }
  }
  
  async write(data) {
    switch (this.migrationState) {
      case 'OLD_ONLY':
        return this.oldDB.write(data);
      
      case 'DUAL_WRITE':
      case 'DUAL_READ':
        return this.dualWrite('write', data);
      
      case 'NEW_ONLY':
        return this.newDB.write(data);
    }
  }
}

Conclusion

Scalability is a critical aspect of system design that requires careful planning, continuous monitoring, and iterative improvement. By implementing the principles, patterns, and strategies outlined in this guide, systems can effectively handle growth while maintaining performance and reliability.

Key takeaways:

Plan for scale from the beginning - Design systems with scalability in mind
Monitor continuously - Use metrics to identify bottlenecks early
Scale incrementally - Avoid over-engineering while planning for growth
Test thoroughly - Performance testing should be part of the development cycle
Consider costs - Balance performance requirements with budget constraints
Embrace automation - Use auto-scaling and automated deployment strategies
