Scalability - ruvnet/ruv-FANN GitHub Wiki

Scalability

Scalability Principles

Definition

Scalability is the ability of a system to handle increased load by adding resources to the system. It encompasses both the capability to scale up (vertical scaling) and scale out (horizontal scaling) to meet growing demands.

Core Principles

1. Statelessness

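// ✅ Stateless design - scalable
// Everything needed to serve the request arrives with the request,
// so any instance can handle it.
class UserService {
  async getUserById(id, token) {
    // Validate the token before touching the database
    if (!this.authService.validateToken(token)) {
      throw new UnauthorizedException();
    }
    return this.userRepository.findById(id);
  }
}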

// ❌ Stateful design - not scalable
class UserService {
  constructor() {
    this.currentUser = null; // State stored in memory
  }
  
  async login(credentials) {
    this.currentUser = await this.authenticate(credentials);
    return this.currentUser;
  }
}

2. Loose Coupling

Systems should be designed with minimal dependencies between components to allow independent scaling.

# Microservices architecture example
services:
  user-service:
    image: user-service:latest
    replicas: 3
  
  order-service:
    image: order-service:latest
    replicas: 5
  
  notification-service:
    image: notification-service:latest
    replicas: 2

3. Asynchronous Processing

// Message queue for async processing
const queue = new MessageQueue('order-processing');

// Producer
async function createOrder(orderData) {
  const order = await saveOrder(orderData);
  await queue.publish('order.created', order);
  return order; // Return immediately
}

// Consumer
queue.subscribe('order.created', async (order) => {
  await processPayment(order);
  await updateInventory(order);
  await sendConfirmationEmail(order);
});

4. Data Partitioning

// Horizontal partitioning (sharding)
class ShardedUserRepository {
  getShardKey(userId) {
    return userId % this.shardCount;
  }
  
  async findById(userId) {
    const shard = this.getShardKey(userId);
    return this.shards[shard].findById(userId);
  }
}
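
One consequence of modulo-based shard keys is easier to see in numbers: changing the shard count moves most keys to a different shard. The sketch below counts how many integer user IDs would change shard; the helper name `countRemappedKeys` and the ID range are illustrative assumptions, not part of the original code.

```javascript
// Hypothetical helper: counts IDs whose shard changes when the shard count changes.
function countRemappedKeys(totalIds, oldShardCount, newShardCount) {
  let moved = 0;
  for (let userId = 0; userId < totalIds; userId++) {
    if (userId % oldShardCount !== userId % newShardCount) {
      moved++;
    }
  }
  return moved;
}

const totalIds = 12000;
const moved = countRemappedKeys(totalIds, 3, 4);
console.log(`${moved} of ${totalIds} IDs change shard when going from 3 to 4 shards`);
```

With these numbers, three quarters of the IDs move. Schemes such as consistent hashing are commonly used to reduce this data movement when shards are added or removed.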

Horizontal vs Vertical Scaling

Vertical Scaling (Scale Up)

Definition

Adding more power (CPU, RAM, Storage) to existing machines.

Advantages

  • Simplicity: No application changes required
  • Data consistency: Single database instance
  • Lower complexity: Fewer moving parts
  • Cost-effective for small scales: Initial scaling is straightforward

Disadvantages

  • Hardware limits: Physical constraints on single machine capacity
  • Single point of failure: System depends on one machine
  • Expensive at scale: High-end hardware costs rise disproportionately to the capacity gained
  • Downtime during upgrades: System unavailable during hardware changes

Implementation Example

# Before scaling
resources:
  cpu: "2"
  memory: "4Gi"
  storage: "100Gi"

# After vertical scaling
resources:
  cpu: "8"
  memory: "16Gi"
  storage: "500Gi"

Horizontal Scaling (Scale Out)

Definition

Adding more machines to the resource pool to handle increased load.

Advantages

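  • No fixed upper limit: Capacity grows by adding nodes, although shared bottlenecks (such as a single database) and coordination overhead bound it in practice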
  • Fault tolerance: Failure of one machine doesn't bring down the system
  • Cost-effective: Use commodity hardware
  • Elastic scaling: Add/remove capacity based on demand

Disadvantages

  • Complexity: Application must be designed for distributed systems
  • Data consistency challenges: Distributed data management
  • Network overhead: Communication between nodes
  • State management: Session and data sharing complexities

Implementation Example

# Load balancer configuration
apiVersion: v1
kind: Service
metadata:
  name: web-service
spec:
  selector:
    app: web-app
  ports:
  - port: 80
    targetPort: 8080
  type: LoadBalancer

---
# Horizontal scaling with replicas
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web-app
spec:
  replicas: 5  # Scale to 5 instances
  selector:
    matchLabels:
      app: web-app
  template:
    metadata:
      labels:
        app: web-app
    spec:
      containers:
      - name: web-app
        image: web-app:latest
        ports:
        - containerPort: 8080

Hybrid Approach

// Auto-scaling configuration
const scalingPolicy = {
  vertical: {
    minCpu: "1",
    maxCpu: "4",
    minMemory: "2Gi",
    maxMemory: "8Gi"
  },
  horizontal: {
    minReplicas: 2,
    maxReplicas: 10,
    targetCPUUtilization: 70,
    scaleUpCooldown: "5m",
    scaleDownCooldown: "10m"
  }
};

Database Scalability

Read Replicas

// Database connection routing
class DatabaseManager {
  constructor() {
    this.writeDB = new Connection('primary-db');
    this.readReplicas = [
      new Connection('read-replica-1'),
      new Connection('read-replica-2'),
      new Connection('read-replica-3')
    ];
  }
  
  async write(query, params) {
    return this.writeDB.execute(query, params);
  }
  
  async read(query, params) {
    const replica = this.getRandomReplica();
    return replica.execute(query, params);
  }
  
  getRandomReplica() {
    const index = Math.floor(Math.random() * this.readReplicas.length);
    return this.readReplicas[index];
  }
}

Database Sharding

// User-based sharding strategy
class ShardedDatabase {
  constructor() {
    this.shards = {
      'shard-1': new Connection('user-db-1'), // Users A-H
      'shard-2': new Connection('user-db-2'), // Users I-P
      'shard-3': new Connection('user-db-3'), // Users Q-Z
    };
  }
  
  getShardForUser(username) {
    const firstLetter = username[0].toLowerCase();
    if (firstLetter <= 'h') return 'shard-1';
    if (firstLetter <= 'p') return 'shard-2';
    return 'shard-3';
  }
  
  async getUserData(username) {
    const shardKey = this.getShardForUser(username);
    return this.shards[shardKey].findUser(username);
  }
}

Connection Pooling

// Database connection pool (option names follow node-postgres, `pg`)
const { Pool } = require('pg');

const pool = new Pool({
  host: 'database-server',
  port: 5432,
  database: 'app_db',
  user: 'app_user',
  password: process.env.DB_PASSWORD,
  min: 5,    // Minimum connections
  max: 20,   // Maximum connections
  idleTimeoutMillis: 30000,
  connectionTimeoutMillis: 2000
});

class UserRepository {
  async findById(id) {
    const client = await pool.connect();
    try {
      const result = await client.query('SELECT * FROM users WHERE id = $1', [id]);
      return result.rows[0];
    } finally {
      client.release();
    }
  }
}
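
A rough way to choose the pool's `max` is Little's Law: the average number of busy connections equals the query arrival rate multiplied by the average time a query holds a connection. The helper below is a sketch under that assumption; `estimatePoolSize`, the headroom factor, and the sample numbers are illustrative, not measured values.

```javascript
// Little's Law: average concurrency = arrival rate x time spent in the system.
function estimatePoolSize(queriesPerSecond, avgQueryMs, headroom = 1.5) {
  const busyConnections = (queriesPerSecond * avgQueryMs) / 1000;
  // Round up and always keep at least one connection.
  return Math.max(1, Math.ceil(busyConnections * headroom));
}

// 400 queries/s at 25 ms each keeps about 10 connections busy on average;
// the default headroom of 1.5 leaves room for bursts.
console.log(estimatePoolSize(400, 25));
```

The estimate applies per application instance; with several instances, the sum of all pools must also stay below the database server's own connection limit.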

Caching Strategies

// Multi-level caching
class CachedUserService {
  constructor() {
    this.l1Cache = new Map(); // In-memory cache
    this.l2Cache = new RedisClient(); // Distributed cache
    this.database = new Database();
  }
  
  async getUser(id) {
    // L1 Cache check
    if (this.l1Cache.has(id)) {
      return this.l1Cache.get(id);
    }
    
    // L2 Cache check (values are stored as JSON strings)
    const cachedUser = await this.l2Cache.get(`user:${id}`);
    if (cachedUser) {
      const parsedUser = JSON.parse(cachedUser);
      this.l1Cache.set(id, parsedUser);
      return parsedUser;
    }
    
    // Database fetch
    const user = await this.database.getUserById(id);
    if (user) {
      this.l1Cache.set(id, user);
      await this.l2Cache.setex(`user:${id}`, 3600, JSON.stringify(user));
    }
    
    return user;
  }
}
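
A plain `Map` used as an in-memory cache grows without bound and never evicts anything. One possible replacement is a small size-bounded LRU cache; the sketch below relies on `Map` preserving insertion order. The class name `LruCache` and the `maxEntries` parameter are hypothetical, and the sketch does not handle expiry (TTL).

```javascript
class LruCache {
  constructor(maxEntries) {
    this.maxEntries = maxEntries;
    this.entries = new Map(); // Map iterates in insertion order
  }

  get(key) {
    if (!this.entries.has(key)) return undefined;
    const value = this.entries.get(key);
    // Re-insert so the key becomes the most recently used
    this.entries.delete(key);
    this.entries.set(key, value);
    return value;
  }

  set(key, value) {
    if (this.entries.has(key)) {
      this.entries.delete(key);
    } else if (this.entries.size >= this.maxEntries) {
      // Evict the least recently used entry (first key in iteration order)
      const oldestKey = this.entries.keys().next().value;
      this.entries.delete(oldestKey);
    }
    this.entries.set(key, value);
  }

  has(key) {
    return this.entries.has(key);
  }
}

const cache = new LruCache(2);
cache.set('user:1', { name: 'Ada' });
cache.set('user:2', { name: 'Lin' });
cache.get('user:1');                  // touch user:1
cache.set('user:3', { name: 'Sam' }); // evicts user:2, the least recently used
console.log(cache.has('user:1'), cache.has('user:2'), cache.has('user:3'));
```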

API Scalability

Load Balancing

# Nginx load balancer configuration
upstream api_servers {
    least_conn;
    server api1.example.com:3000 weight=3;
    server api2.example.com:3000 weight=2;
    server api3.example.com:3000 weight=1;
    server api4.example.com:3000 backup;
}

server {
    listen 80;
    server_name api.example.com;
    
    location / {
        proxy_pass http://api_servers;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        
        # Upstream timeouts (fail fast when a backend is slow or unreachable)
        proxy_connect_timeout 5s;
        proxy_send_timeout 10s;
        proxy_read_timeout 10s;
    }
}

Rate Limiting

// Rate limiting middleware
const rateLimit = require('express-rate-limit');
const RedisStore = require('rate-limit-redis');

const limiter = rateLimit({
  store: new RedisStore({
    client: redisClient,
    prefix: 'rl:'
  }),
  windowMs: 15 * 60 * 1000, // 15 minutes
  max: 100, // Limit each key (user ID or IP, see keyGenerator) to 100 requests per windowMs
  message: 'Too many requests, please try again later.',
  standardHeaders: true,
  legacyHeaders: false,
  keyGenerator: (req) => {
    // Rate limit by user ID if authenticated, otherwise by IP
    return req.user?.id || req.ip;
  }
});

app.use('/api/', limiter);
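
Conceptually, a windowed limit like this keeps one counter per key and resets it when the window expires. The following is a minimal in-memory sketch of that idea, not the middleware's actual implementation; `FixedWindowLimiter` and its injectable `now` clock are assumptions made for illustration.

```javascript
class FixedWindowLimiter {
  constructor({ windowMs, max, now = Date.now }) {
    this.windowMs = windowMs;
    this.max = max;
    this.now = now;           // injectable clock, useful for tests
    this.windows = new Map(); // key -> { start, count }
  }

  // Returns true if the request is allowed, false if the key is over its limit.
  allow(key) {
    const t = this.now();
    let current = this.windows.get(key);
    if (!current || t - current.start >= this.windowMs) {
      // First request for this key, or the previous window has expired
      current = { start: t, count: 0 };
      this.windows.set(key, current);
    }
    current.count++;
    return current.count <= this.max;
  }
}

let fakeTime = 0;
const windowLimiter = new FixedWindowLimiter({ windowMs: 1000, max: 2, now: () => fakeTime });
const results = [
  windowLimiter.allow('ip-1'),
  windowLimiter.allow('ip-1'),
  windowLimiter.allow('ip-1'), // third request in the same window is rejected
];
fakeTime = 1000; // the next window starts
results.push(windowLimiter.allow('ip-1'));
console.log(results);
```

Counters held in process memory are only correct for a single instance. Once the API runs on several instances, the counters need a shared store, which is the role Redis plays in the configuration above.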

Request Queuing

// Queue-based request processing
const Queue = require('bull');
const requestQueue = new Queue('request processing', {
  redis: { port: 6379, host: '127.0.0.1' }
});

// API endpoint
app.post('/api/heavy-task', async (req, res) => {
  const job = await requestQueue.add('process-request', {
    userId: req.user.id,
    data: req.body,
    timestamp: Date.now()
  }, {
    attempts: 3,
    // Retry with exponentially growing delays, starting at 2 seconds
    backoff: { type: 'exponential', delay: 2000 }
  });
  
  res.json({ jobId: job.id, status: 'queued' });
});

// Worker process
requestQueue.process('process-request', 5, async (job) => {
  const { userId, data } = job.data;
  
  job.progress(10);
  const result = await heavyProcessing(data);
  
  job.progress(50);
  await saveResults(userId, result);
  
  job.progress(100);
  await notifyUser(userId, result);
  
  return result;
});
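
To make the retry timing concrete, the sketch below computes a generic doubling backoff schedule. The exact formula a given queue library uses may differ; `backoffDelays` and the optional cap are assumptions for illustration only.

```javascript
// Delay before retry number `retry` (1-based): baseDelayMs * 2^(retry - 1), capped at maxDelayMs.
function backoffDelays(attempts, baseDelayMs, maxDelayMs = Infinity) {
  const delays = [];
  // `attempts` includes the first try, so there are attempts - 1 retries.
  for (let retry = 1; retry < attempts; retry++) {
    delays.push(Math.min(baseDelayMs * 2 ** (retry - 1), maxDelayMs));
  }
  return delays;
}

console.log(backoffDelays(3, 2000));       // three attempts, 2 s base delay
console.log(backoffDelays(5, 1000, 3000)); // five attempts, capped at 3 s
```

Capping the delay keeps late retries from waiting excessively long, and adding random jitter (not shown) helps avoid many failed jobs retrying at the same instant.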

API Versioning

// Version-aware routing
class APIRouter {
  constructor() {
    this.routes = new Map();
  }
  
  register(version, path, handler) {
    const key = `${version}:${path}`;
    this.routes.set(key, handler);
  }
  
  async handle(req, res) {
    const version = req.headers['api-version'] || '1.0';
    const path = req.path;
    const key = `${version}:${path}`;
    
    const handler = this.routes.get(key) || this.routes.get(`1.0:${path}`);
    if (!handler) {
      return res.status(404).json({ error: 'Endpoint not found' });
    }
    
    return handler(req, res);
  }
}

// Usage
const router = new APIRouter();
router.register('1.0', '/users', getUsersV1);
router.register('2.0', '/users', getUsersV2);
router.register('2.1', '/users', getUsersV21);

Circuit Breaker Pattern

class CircuitBreaker {
  constructor(service, options = {}) {
    this.service = service;
    this.failureThreshold = options.failureThreshold || 5;
    this.timeout = options.timeout || 60000;
    this.state = 'CLOSED'; // CLOSED, OPEN, HALF_OPEN
    this.failureCount = 0;
    this.lastFailureTime = null;
  }
  
  async call(...args) {
    if (this.state === 'OPEN') {
      if (Date.now() - this.lastFailureTime > this.timeout) {
        this.state = 'HALF_OPEN';
      } else {
        throw new Error('Circuit breaker is OPEN');
      }
    }
    
    try {
      const result = await this.service(...args);
      // Any success closes the circuit and clears the failure count,
      // so only consecutive failures can trip the breaker
      this.state = 'CLOSED';
      this.failureCount = 0;
      return result;
    } catch (error) {
      this.failureCount++;
      this.lastFailureTime = Date.now();
      
      if (this.failureCount >= this.failureThreshold) {
        this.state = 'OPEN';
      }
      
      throw error;
    }
  }
}

// Usage
const externalAPIBreaker = new CircuitBreaker(callExternalAPI, {
  failureThreshold: 3,
  timeout: 30000
});

async function getUserData(id) {
  try {
    return await externalAPIBreaker.call(id);
  } catch (error) {
    // Fallback to cached data
    return getCachedUserData(id);
  }
}

Performance Testing

Load Testing with Artillery

# artillery-config.yml
config:
  target: 'https://api.example.com'
  phases:
    - duration: 300  # 5 minutes
      arrivalRate: 10  # 10 new virtual users per second
    - duration: 600  # 10 minutes
      arrivalRate: 50  # 50 new virtual users per second
    - duration: 300  # 5 minutes
      arrivalRate: 100 # 100 new virtual users per second

scenarios:
  - name: "User Journey"
    weight: 70
    flow:
      - post:
          url: "/auth/login"
          json:
            email: "user@example.com"
            password: "password"
          capture:
            - json: "$.token"
              as: "authToken"
      - get:
          url: "/api/users/profile"
          headers:
            Authorization: "Bearer {{ authToken }}"
      - get:
          url: "/api/users/dashboard"
          headers:
            Authorization: "Bearer {{ authToken }}"

  - name: "Public API"
    weight: 30
    flow:
      - get:
          url: "/api/public/status"
      - get:
          url: "/api/public/metrics"
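
In Artillery, `arrivalRate` is the number of new virtual users started per second, and each virtual user runs one scenario chosen according to the weights. The expected load of this configuration can therefore be estimated with a little arithmetic. The sketch below mirrors the phases and scenarios of the config; the `requests` field (requests per scenario run) is counted by hand from the flows.

```javascript
// Phases and scenarios mirror the Artillery config above.
const phases = [
  { duration: 300, arrivalRate: 10 },
  { duration: 600, arrivalRate: 50 },
  { duration: 300, arrivalRate: 100 },
];
const scenarios = [
  { name: 'User Journey', weight: 70, requests: 3 }, // login + profile + dashboard
  { name: 'Public API', weight: 30, requests: 2 },   // status + metrics
];

// Total virtual users started over the whole test
const totalVirtualUsers = phases.reduce((sum, p) => sum + p.duration * p.arrivalRate, 0);

// Weighted average number of requests issued by one virtual user
const totalWeight = scenarios.reduce((sum, s) => sum + s.weight, 0);
const weightedRequests = scenarios.reduce((sum, s) => sum + s.weight * s.requests, 0);
const requestsPerUser = weightedRequests / totalWeight;

// Expected request volume and approximate peak request rate
const expectedRequests = (totalVirtualUsers * weightedRequests) / totalWeight;
const peakArrivalRate = Math.max(...phases.map((p) => p.arrivalRate));
const peakRequestRate = (peakArrivalRate * weightedRequests) / totalWeight;

console.log({ totalVirtualUsers, requestsPerUser, expectedRequests, peakRequestRate });
```

These are expected values: scenario selection is random, and the peak request rate is an approximation that assumes each scenario completes quickly.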

Stress Testing Script

// stress-test.js
const cluster = require('cluster');
const numCPUs = require('os').cpus().length;
const axios = require('axios');

if (cluster.isPrimary) { // Node.js 16+; older versions use cluster.isMaster
  console.log(`Primary ${process.pid} is running`);
  
  // Fork workers
  for (let i = 0; i < numCPUs; i++) {
    cluster.fork();
  }
  
  cluster.on('exit', (worker, code, signal) => {
    console.log(`Worker ${worker.process.pid} died`);
  });
} else {
  // Worker processes
  console.log(`Worker ${process.pid} started`);
  
  const stats = {
    requests: 0,
    errors: 0,
    totalTime: 0
  };
  
  async function stressTest() {
    const startTime = Date.now();
    
    while (Date.now() - startTime < 60000) { // Run for 1 minute
      try {
        const start = Date.now();
        await axios.get('https://api.example.com/health');
        stats.totalTime += Date.now() - start;
        stats.requests++;
      } catch (error) {
        stats.errors++;
      }
    }
    
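    const attempts = stats.requests + stats.errors;
    console.log(`Worker ${process.pid} stats:`, {
      requests: stats.requests,
      errors: stats.errors,
      // Guard against division by zero when every request failed
      avgResponseTime: stats.requests > 0 ? stats.totalTime / stats.requests : null,
      errorRate: attempts > 0 ? (stats.errors / attempts) * 100 : 0
    });
  }
  
  // Exit once the run finishes so the worker does not linger
  stressTest().then(() => process.exit(0));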
}

Performance Monitoring

// Performance metrics collection
const axios = require('axios');

class PerformanceMonitor {
  constructor() {
    this.metrics = {
      requestCount: 0,
      errorCount: 0,
      responseTimes: [],
      activeConnections: 0
    };
    
    // Start metrics collection
    setInterval(() => this.collectMetrics(), 1000);
  }
  
  recordRequest(responseTime, isError = false) {
    this.metrics.requestCount++;
    this.metrics.responseTimes.push(responseTime);
    
    if (isError) {
      this.metrics.errorCount++;
    }
    
    // Keep only last 1000 response times
    if (this.metrics.responseTimes.length > 1000) {
      this.metrics.responseTimes.shift();
    }
  }
  
  getMetrics() {
    const responseTimes = this.metrics.responseTimes;
    const sorted = [...responseTimes].sort((a, b) => a - b);
    const count = this.metrics.requestCount;
    // Index into the sorted samples; null when nothing has been recorded yet
    const pct = (p) => (sorted.length ? sorted[Math.min(sorted.length - 1, Math.floor(sorted.length * p))] : null);
    
    return {
      requestsPerSecond: count, // Counter is reset every second by collectMetrics()
      errorRate: count ? (this.metrics.errorCount / count) * 100 : 0,
      averageResponseTime: sorted.length ? sorted.reduce((a, b) => a + b, 0) / sorted.length : null,
      p50: pct(0.5),
      p95: pct(0.95),
      p99: pct(0.99),
      activeConnections: this.metrics.activeConnections
    };
  }
  
  collectMetrics() {
    const metrics = this.getMetrics();
    console.log('Performance Metrics:', metrics);
    
    // Reset counters
    this.metrics.requestCount = 0;
    this.metrics.errorCount = 0;
    
    // Send to monitoring system
    this.sendToMonitoring(metrics);
  }
  
  async sendToMonitoring(metrics) {
    // Send to monitoring service (e.g., Prometheus, Datadog)
    try {
      await axios.post('http://monitoring-service/metrics', {
        timestamp: Date.now(),
        service: 'api-server',
        ...metrics
      });
    } catch (error) {
      console.error('Failed to send metrics:', error.message);
    }
  }
}
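
Percentiles can be defined in several slightly different ways. One common convention is the nearest-rank method, sketched below: the p-th percentile is the smallest sample such that at least p percent of all samples are less than or equal to it. The function name `percentile` and the sample data are illustrative assumptions.

```javascript
// Nearest-rank percentile over an unsorted array of numbers.
function percentile(samples, p) {
  if (samples.length === 0) return null;
  const sorted = [...samples].sort((a, b) => a - b);
  // 1-based rank; multiply before dividing to avoid floating-point drift
  const rank = Math.ceil((p * sorted.length) / 100);
  return sorted[Math.max(rank, 1) - 1];
}

const responseTimes = [120, 80, 95, 300, 110, 105, 90, 100, 85, 115];
console.log(percentile(responseTimes, 50), percentile(responseTimes, 95));
```

With only ten samples, a single slow request (300 ms) determines the 95th percentile, which is why tail percentiles are only meaningful over a reasonably large sample window.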

Capacity Planning

Resource Utilization Analysis

// Capacity planning calculator
class CapacityPlanner {
  constructor(currentMetrics, growthRate) {
    this.current = currentMetrics;
    this.growthRate = growthRate; // Monthly growth rate
  }
  
  calculateFutureCapacity(months) {
    const growthMultiplier = Math.pow(1 + this.growthRate, months);
    
    return {
      // Recommended capacity is sized from projected demand plus a safety buffer
      cpu: {
        current: this.current.cpu.utilization,
        projected: this.current.cpu.utilization * growthMultiplier,
        recommendedCapacity: this.current.cpu.utilization * growthMultiplier * 1.2 // 20% buffer
      },
      memory: {
        current: this.current.memory.usage,
        projected: this.current.memory.usage * growthMultiplier,
        recommendedCapacity: this.current.memory.usage * growthMultiplier * 1.3 // 30% buffer
      },
      storage: {
        current: this.current.storage.used,
        projected: this.current.storage.used * growthMultiplier,
        recommendedCapacity: this.current.storage.used * growthMultiplier * 1.5 // 50% buffer
      },
      bandwidth: {
        current: this.current.bandwidth.usage,
        projected: this.current.bandwidth.usage * growthMultiplier,
        recommendedCapacity: this.current.bandwidth.usage * growthMultiplier * 1.25 // 25% buffer
      }
    };
  }
  
  generateScalingPlan(timeHorizon = 12) {
    const plan = [];
    
    for (let month = 1; month <= timeHorizon; month++) {
      const capacity = this.calculateFutureCapacity(month);
      
      const recommendations = [];
      
      if (capacity.cpu.projected > this.current.cpu.capacity * 0.8) {
        recommendations.push({
          type: 'CPU',
          action: 'Scale up CPU resources',
          urgency: capacity.cpu.projected > this.current.cpu.capacity ? 'HIGH' : 'MEDIUM'
        });
      }
      
      if (capacity.memory.projected > this.current.memory.total * 0.8) {
        recommendations.push({
          type: 'Memory',
          action: 'Increase memory allocation',
          urgency: capacity.memory.projected > this.current.memory.total ? 'HIGH' : 'MEDIUM'
        });
      }
      
      plan.push({
        month: month,
        projectedLoad: capacity,
        recommendations: recommendations
      });
    }
    
    return plan;
  }
}

// Usage example
const currentMetrics = {
  cpu: { utilization: 45, capacity: 100 },
  memory: { usage: 8, total: 16 }, // GB
  storage: { used: 500, total: 1000 }, // GB
  bandwidth: { usage: 100, capacity: 1000 } // Mbps
};

const planner = new CapacityPlanner(currentMetrics, 0.15); // 15% monthly growth
const scalingPlan = planner.generateScalingPlan(12);
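
The same compound-growth model can be solved directly for the month in which a resource first reaches a utilization threshold: usage × (1 + rate)^m ≥ threshold × capacity gives m = ⌈ln(threshold × capacity / usage) / ln(1 + rate)⌉. The helper below is a sketch of that calculation; `monthsUntilThreshold` is a hypothetical name, and the inputs reuse the example metrics above.

```javascript
// Months until usage * (1 + growthRate)^m first reaches threshold * capacity.
function monthsUntilThreshold(usage, capacity, growthRate, threshold = 0.8) {
  const limit = capacity * threshold;
  if (usage >= limit) return 0;           // already at or above the threshold
  if (growthRate <= 0) return Infinity;   // no growth, the threshold is never reached
  return Math.ceil(Math.log(limit / usage) / Math.log(1 + growthRate));
}

console.log(monthsUntilThreshold(45, 100, 0.15)); // CPU: 45% of 100, 15% monthly growth
console.log(monthsUntilThreshold(8, 16, 0.15));   // Memory: 8 GB of 16 GB
```

For the example metrics this gives month 5 for CPU and month 4 for memory, so memory is the first resource that needs attention.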

Auto-scaling Configuration

# Kubernetes Horizontal Pod Autoscaler
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: web-app-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: web-app
  minReplicas: 3
  maxReplicas: 50
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 70
  - type: Resource
    resource:
      name: memory
      target:
        type: Utilization
        averageUtilization: 80
  - type: Pods
    pods:
      metric:
        name: http_requests_per_second
      target:
        type: AverageValue
        averageValue: "100"
  behavior:
    scaleUp:
      stabilizationWindowSeconds: 60
      policies:
      - type: Percent
        value: 100
        periodSeconds: 15
    scaleDown:
      stabilizationWindowSeconds: 300
      policies:
      - type: Percent
        value: 50
        periodSeconds: 60
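
The core of the Horizontal Pod Autoscaler is a simple ratio: desiredReplicas = ceil(currentReplicas × currentMetricValue / targetMetricValue), clamped to the configured minimum and maximum, with no change when the ratio is close to 1. The sketch below reproduces only that calculation; it ignores the `behavior` policies and stabilization windows, and the function name and option names are illustrative assumptions.

```javascript
function desiredReplicas(currentReplicas, currentValue, targetValue, opts = {}) {
  const { minReplicas = 1, maxReplicas = Infinity, tolerance = 0.1 } = opts;
  const ratio = currentValue / targetValue;
  // Within tolerance of the target: keep the current replica count
  if (Math.abs(ratio - 1) <= tolerance) return currentReplicas;
  const desired = Math.ceil(currentReplicas * ratio);
  return Math.min(maxReplicas, Math.max(minReplicas, desired));
}

const limits = { minReplicas: 3, maxReplicas: 50 };
console.log(desiredReplicas(5, 90, 70, limits));   // 5 pods at 90% CPU, target 70%
console.log(desiredReplicas(10, 35, 70, limits));  // 10 pods at 35% CPU, target 70%
```

When several metrics are configured, as in the manifest above, the autoscaler evaluates each one and uses the largest resulting replica count.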

Cost Analysis

// Cost optimization calculator
class CostOptimizer {
  constructor(pricingModel) {
    this.pricing = pricingModel;
  }
  
  calculateCosts(resources, timeframe = 'monthly') {
    // Hours in the timeframe; all prices in pricingModel are assumed to be per hour
    const multiplier = timeframe === 'monthly' ? 24 * 30 : 
                     timeframe === 'daily' ? 24 : 1;
    
    return {
      compute: resources.instances * this.pricing.instance * multiplier,
      storage: resources.storage * this.pricing.storage * multiplier,
      bandwidth: resources.bandwidth * this.pricing.bandwidth * multiplier,
      database: resources.dbInstances * this.pricing.database * multiplier,
      total: function() {
        return this.compute + this.storage + this.bandwidth + this.database;
      }
    };
  }
  
  optimizeConfiguration(currentConfig, constraints) {
    const optimizations = [];
    
    // Analyze instance utilization
    if (currentConfig.cpuUtilization < 30) {
      optimizations.push({
        type: 'DOWNSIZE',
        resource: 'compute',
        suggestion: 'Consider smaller instance types',
        potentialSavings: currentConfig.cost * 0.3
      });
    }
    
    // Analyze storage patterns
    if (currentConfig.storageUtilization < 50) {
      optimizations.push({
        type: 'STORAGE_OPTIMIZATION',
        resource: 'storage',
        suggestion: 'Consider object storage for infrequently accessed data',
        potentialSavings: currentConfig.storageCost * 0.4
      });
    }
    
    // Analyze traffic patterns
    if (currentConfig.peakToAverageRatio > 3) {
      optimizations.push({
        type: 'AUTO_SCALING',
        resource: 'compute',
        suggestion: 'Implement auto-scaling to handle traffic spikes',
        potentialSavings: currentConfig.cost * 0.25
      });
    }
    
    return optimizations;
  }
}

Best Practices

Design Patterns for Scalability

1. CQRS (Command Query Responsibility Segregation)

// Command side - Write operations
class UserCommandHandler {
  constructor(eventStore, commandValidator) {
    this.eventStore = eventStore;
    this.validator = commandValidator;
  }
  
  async createUser(command) {
    await this.validator.validate(command);
    
    const event = {
      type: 'UserCreated',
      aggregateId: command.userId,
      data: command.userData,
      timestamp: Date.now()
    };
    
    await this.eventStore.append(command.userId, event);
    return { success: true, userId: command.userId };
  }
}

// Query side - Read operations
class UserQueryHandler {
  constructor(readModelDatabase) {
    this.db = readModelDatabase;
  }
  
  async getUserById(userId) {
    return this.db.users.findById(userId);
  }
  
  async searchUsers(criteria) {
    return this.db.users.find(criteria);
  }
}

2. Event Sourcing

class EventStore {
  constructor(database, eventBus) {
    this.db = database;
    this.eventBus = eventBus; // Used by publishEvent()
  }
  
  async append(aggregateId, event) {
    await this.db.events.insert({
      aggregateId,
      eventType: event.type,
      eventData: event.data,
      eventVersion: await this.getNextVersion(aggregateId),
      timestamp: event.timestamp
    });
    
    // Publish event for read model updates
    await this.publishEvent(event);
  }
  
  async getEvents(aggregateId, fromVersion = 0) {
    return this.db.events.find({
      aggregateId,
      eventVersion: { $gte: fromVersion }
    }).sort({ eventVersion: 1 });
  }
  
  async publishEvent(event) {
    // Publish to message queue for read model updates
    await this.eventBus.publish(event.type, event);
  }
}
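
With event sourcing, current state is not stored directly; it is rebuilt by replaying an aggregate's events in version order. The sketch below shows that fold for records shaped like the ones the event store inserts (`eventType`, `eventData`, `eventVersion`). The `UserEmailChanged` and `UserDeactivated` event types, and the function names, are hypothetical additions for illustration.

```javascript
// Applies a single event to the current state and returns the new state.
function applyEvent(state, event) {
  switch (event.eventType) {
    case 'UserCreated':
      return { ...event.eventData, version: event.eventVersion };
    case 'UserEmailChanged':
      return { ...state, email: event.eventData.email, version: event.eventVersion };
    case 'UserDeactivated':
      return { ...state, active: false, version: event.eventVersion };
    default:
      return state; // Unknown event types are ignored
  }
}

// Rebuilds an aggregate by replaying its events in version order.
function rehydrate(events) {
  return [...events]
    .sort((a, b) => a.eventVersion - b.eventVersion)
    .reduce((state, event) => applyEvent(state, event), null);
}

const user = rehydrate([
  { eventType: 'UserEmailChanged', eventData: { email: 'ada@new.example' }, eventVersion: 2 },
  { eventType: 'UserCreated', eventData: { name: 'Ada', email: 'ada@example.com', active: true }, eventVersion: 1 },
  { eventType: 'UserDeactivated', eventData: {}, eventVersion: 3 },
]);
console.log(user);
```

The same reducer can drive read-model projections on the query side: each published event is applied to the stored read model instead of replaying from the beginning.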

3. Saga Pattern for Distributed Transactions

class OrderSaga {
  constructor(services) {
    this.paymentService = services.payment;
    this.inventoryService = services.inventory;
    this.shippingService = services.shipping;
  }
  
  async processOrder(orderData) {
    const sagaId = generateId();
    let compensations = [];
    
    try {
      // Step 1: Reserve inventory
      const inventoryReservation = await this.inventoryService.reserve(orderData.items);
      compensations.push(() => this.inventoryService.cancelReservation(inventoryReservation.id));
      
      // Step 2: Process payment
      const payment = await this.paymentService.charge(orderData.payment);
      compensations.push(() => this.paymentService.refund(payment.id));
      
      // Step 3: Arrange shipping
      const shipping = await this.shippingService.schedule(orderData.shipping);
      compensations.push(() => this.shippingService.cancel(shipping.id));
      
      return { success: true, orderId: sagaId };
      
    } catch (error) {
      // Execute compensations in reverse order
      for (let i = compensations.length - 1; i >= 0; i--) {
        try {
          await compensations[i]();
        } catch (compensationError) {
          console.error('Compensation failed:', compensationError);
        }
      }
      
      throw new SagaFailedException('Order processing failed', error);
    }
  }
}

Monitoring and Observability

// Distributed tracing
const opentelemetry = require('@opentelemetry/api');
const tracer = opentelemetry.trace.getTracer('user-service');

class UserService {
  async getUser(userId) {
    const span = tracer.startSpan('get-user');
    
    try {
      span.setAttributes({
        'user.id': userId,
        'service.name': 'user-service'
      });
      
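      // Child spans are linked through a context that carries the parent span
      const ctx = opentelemetry.trace.setSpan(opentelemetry.context.active(), span);
      // Database call with child span
      const dbSpan = tracer.startSpan('database-query', undefined, ctx);
      let user;
      try {
        user = await this.database.findById(userId);
      } finally {
        dbSpan.end(); // End the span even if the query throws
      }
      
      // Cache call with child span
      const cacheSpan = tracer.startSpan('cache-set', undefined, ctx);
      try {
        await this.cache.set(`user:${userId}`, user);
      } finally {
        cacheSpan.end();
      }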
      
      span.setStatus({ code: opentelemetry.SpanStatusCode.OK });
      return user;
      
    } catch (error) {
      span.recordException(error);
      span.setStatus({ 
        code: opentelemetry.SpanStatusCode.ERROR, 
        message: error.message 
      });
      throw error;
    } finally {
      span.end();
    }
  }
}

Implementation Strategies

Gradual Migration Approach

// Feature flag for gradual rollout
class FeatureFlag {
  constructor(flagStore) {
    this.flags = flagStore;
  }
  
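  async isEnabled(flagName, userId) {
    const flag = await this.flags.get(flagName);
    
    // Unknown or disabled flags are off for everyone
    if (!flag || !flag.enabled) return false;
    
    // Percentage rollout (an explicit 0 means nobody, so test the type, not truthiness)
    if (typeof flag.percentage === 'number') {
      const hash = this.hashUserId(userId);
      return hash < flag.percentage;
    }
    
    // User list rollout
    if (flag.userList) {
      return flag.userList.includes(userId);
    }
    
    return flag.enabled;
  }
  
  hashUserId(userId) {
    // FNV-1a hash of the ID string, mapped to a bucket in [0, 100).
    // Works for any ID format and spreads users roughly evenly across buckets.
    let hash = 0x811c9dc5;
    for (const ch of String(userId)) {
      hash ^= ch.charCodeAt(0);
      hash = Math.imul(hash, 0x01000193) >>> 0;
    }
    return hash % 100;
  }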
}

// Usage in service
class UserService {
  async getUser(userId) {
    const useNewEndpoint = await this.featureFlag.isEnabled('new-user-endpoint', userId);
    
    if (useNewEndpoint) {
      return this.getUserV2(userId);
    } else {
      return this.getUserV1(userId);
    }
  }
}

Blue-Green Deployment

# Blue-Green deployment configuration
apiVersion: argoproj.io/v1alpha1
kind: Rollout
metadata:
  name: user-service
spec:
  replicas: 10
  strategy:
    blueGreen:
      activeService: user-service-active
      previewService: user-service-preview
      autoPromotionEnabled: false
      scaleDownDelaySeconds: 30
      prePromotionAnalysis:
        templates:
        - templateName: success-rate
        args:
        - name: service-name
          value: user-service-preview
      postPromotionAnalysis:
        templates:
        - templateName: success-rate
        args:
        - name: service-name
          value: user-service-active
  selector:
    matchLabels:
      app: user-service
  template:
    metadata:
      labels:
        app: user-service
    spec:
      containers:
      - name: user-service
        image: user-service:latest
        ports:
        - containerPort: 8080

Database Migration Strategy

// Database migration with zero downtime
class DatabaseMigrator {
  constructor(oldDB, newDB) {
    this.oldDB = oldDB;
    this.newDB = newDB;
    this.migrationState = 'DUAL_WRITE';
  }
  
  async dualWrite(operation, data) {
    const results = await Promise.allSettled([
      this.oldDB[operation](data),
      this.newDB[operation](data)
    ]);
    
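    // Use old DB as source of truth during migration
    const [primaryResult, secondaryResult] = results;
    if (secondaryResult.status === 'rejected') {
      // Surface new-DB failures; otherwise the two databases drift apart silently
      console.error('Dual write to new DB failed', { operation, error: secondaryResult.reason });
    }
    if (primaryResult.status === 'fulfilled') {
      return primaryResult.value;
    } else {
      throw primaryResult.reason;
    }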
  }
  
  async read(query) {
    switch (this.migrationState) {
      case 'OLD_ONLY':
        return this.oldDB.query(query);
      
      case 'DUAL_WRITE':
        // Read from old, but verify with new
        const oldResult = await this.oldDB.query(query);
        
        // Compare with new DB in background (don't block)
        this.compareResults(query, oldResult);
        
        return oldResult;
      
      case 'DUAL_READ':
        // Read from both and compare
        const [oldRes, newRes] = await Promise.all([
          this.oldDB.query(query),
          this.newDB.query(query)
        ]);
        
        if (!this.resultsMatch(oldRes, newRes)) {
          console.warn('Migration inconsistency detected', { query, oldRes, newRes });
        }
        
        return newRes; // Start using new DB results
      
      case 'NEW_ONLY':
        return this.newDB.query(query);
    }
  }
  
  async write(data) {
    switch (this.migrationState) {
      case 'OLD_ONLY':
        return this.oldDB.write(data);
      
      case 'DUAL_WRITE':
      case 'DUAL_READ':
        return this.dualWrite('write', data);
      
      case 'NEW_ONLY':
        return this.newDB.write(data);
    }
  }
}

Conclusion

Scalability is a critical aspect of system design that requires careful planning, continuous monitoring, and iterative improvement. By implementing the principles, patterns, and strategies outlined in this guide, systems can effectively handle growth while maintaining performance and reliability.

Key takeaways:

  • Plan for scale from the beginning - Design systems with scalability in mind
  • Monitor continuously - Use metrics to identify bottlenecks early
  • Scale incrementally - Avoid over-engineering while planning for growth
  • Test thoroughly - Performance testing should be part of the development cycle
  • Consider costs - Balance performance requirements with budget constraints
  • Embrace automation - Use auto-scaling and automated deployment strategies