# Rate Limiting

The IoT Agent Mesh API implements comprehensive rate limiting to ensure fair usage, system stability, and optimal performance for all users. This guide covers rate limiting policies, headers, handling strategies, and best practices.

Overview

Rate limiting controls the number of API requests that can be made within a specific time window. The IoT Agent Mesh platform uses multiple rate limiting strategies:

  • Per-User Limits: Individual user request quotas
  • Per-Organization Limits: Organization-wide request quotas
  • Per-Endpoint Limits: Specific limits for different API endpoints
  • Per-IP Limits: Network-level protection against abuse
  • Concurrent Connection Limits: Real-time connection restrictions

Rate Limiting Policies

Standard Rate Limits

| Endpoint Category | Rate Limit | Window | Scope |
|---|---|---|---|
| Authentication | 10 requests | 1 minute | Per IP address |
| Organizations | 100 requests | 1 minute | Per user per org |
| Users | 100 requests | 1 minute | Per user per org |
| IoT Agents | 200 requests | 1 minute | Per user per org |
| Telemetry Read | 500 requests | 1 minute | Per user per org |
| Telemetry Write | 1000 requests | 1 minute | Per user per org |
| Alerts | 200 requests | 1 minute | Per user per org |
| Edge Functions | 500 requests | 1 minute | Per user per org |
| Real-time Connections | 50 connections | Concurrent | Per user |

Burst Allowance

All endpoints support burst traffic up to 2x the standard rate limit for short periods (up to 30 seconds), after which the standard rate applies.
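For example, under the standard 1,000 requests/minute telemetry write limit, a client can briefly send up to 2,000 requests before throttling resumes.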

Premium Tier Limits

| Plan | Multiplier | Additional Features |
|---|---|---|
| Starter | 1x | Standard limits |
| Professional | 5x | Higher burst allowance |
| Enterprise | 10x | Custom limits available |
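
For example, a Professional plan raises the standard telemetry write limit from 1,000 to 5,000 requests per minute.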

Rate Limiting Headers

Response Headers

Every API response includes rate limiting information in the headers:

HTTP/1.1 200 OK
X-RateLimit-Limit: 1000
X-RateLimit-Remaining: 999
X-RateLimit-Reset: 1721750460
X-RateLimit-Window: 60
X-RateLimit-Policy: per-user-per-org

Header Descriptions

| Header | Description | Example |
|---|---|---|
| X-RateLimit-Limit | Maximum requests allowed in the current window | 1000 |
| X-RateLimit-Remaining | Requests remaining in the current window | 999 |
| X-RateLimit-Reset | Unix timestamp when the window resets | 1721750460 |
| X-RateLimit-Window | Window duration in seconds | 60 |
| X-RateLimit-Policy | Rate limiting policy applied | per-user-per-org |
| Retry-After | Seconds to wait before retrying (only on 429) | 30 |

Rate Limit Exceeded Response

When rate limits are exceeded, the API returns a 429 Too Many Requests status:

HTTP/1.1 429 Too Many Requests
X-RateLimit-Limit: 1000
X-RateLimit-Remaining: 0
X-RateLimit-Reset: 1721750460
X-RateLimit-Window: 60
Retry-After: 30
Content-Type: application/json

{
  "error": {
    "code": "RATE_LIMIT_EXCEEDED",
    "message": "Rate limit exceeded for telemetry ingestion",
    "details": {
      "limit": 1000,
      "window_seconds": 60,
      "policy": "per-user-per-org",
      "reset_at": "2024-07-23T17:01:00Z",
      "retry_after_seconds": 30
    },
    "request_id": "req_01H8K3L4M5N6O7P8Q9R0S1T2U3V4",
    "timestamp": "2024-07-23T17:00:00Z"
  },
  "status": 429
}
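
When handling a 429, the retry delay is available both in the Retry-After header and in the error body shown above. A minimal sketch of extracting it from a fetch-style Response (the helper name is illustrative):

// Illustrative helper: read the retry delay from a 429 response
async function getRetryDelaySeconds(response) {
  // Prefer the Retry-After header when present
  const header = response.headers.get('retry-after')
  if (header) return parseInt(header, 10)

  // Fall back to the structured error body
  const body = await response.json()
  return body.error?.details?.retry_after_seconds ?? 30
}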

Endpoint-Specific Rate Limits

Authentication Endpoints

| Endpoint | Rate Limit | Window | Scope | Notes |
|---|---|---|---|---|
| POST /auth/v1/token | 10 requests | 1 minute | Per IP | Sign in attempts |
| POST /auth/v1/signup | 5 requests | 1 minute | Per IP | Account creation |
| POST /auth/v1/recover | 3 requests | 1 minute | Per IP | Password reset |
| POST /auth/v1/token?grant_type=refresh_token | 20 requests | 1 minute | Per user | Token refresh |

Security Note: Authentication endpoints have stricter limits to prevent brute force attacks.

Organization Management

| Endpoint | Rate Limit | Window | Scope |
|---|---|---|---|
| GET /rest/v1/organizations | 100 requests | 1 minute | Per user |
| POST /rest/v1/organizations | 5 requests | 1 minute | Per user |
| PATCH /rest/v1/organizations | 30 requests | 1 minute | Per user per org |
| DELETE /rest/v1/organizations | 2 requests | 1 minute | Per user |

User Management

| Endpoint | Rate Limit | Window | Scope |
|---|---|---|---|
| GET /rest/v1/users | 100 requests | 1 minute | Per user per org |
| POST /functions/v1/invite-user | 10 requests | 1 minute | Per user per org |
| PATCH /rest/v1/users | 30 requests | 1 minute | Per user per org |
| GET /functions/v1/user-activity | 50 requests | 1 minute | Per user per org |

IoT Agent Management

| Endpoint | Rate Limit | Window | Scope |
|---|---|---|---|
| GET /rest/v1/agents | 200 requests | 1 minute | Per user per org |
| POST /rest/v1/agents | 20 requests | 1 minute | Per user per org |
| PATCH /rest/v1/agents | 100 requests | 1 minute | Per user per org |
| DELETE /rest/v1/agents | 10 requests | 1 minute | Per user per org |
| GET /functions/v1/agent-status | 1000 requests | 1 minute | Per user per org |
| POST /functions/v1/agents-bulk-register | 5 requests | 1 minute | Per user per org |

Telemetry Data

| Endpoint | Rate Limit | Window | Scope | Notes |
|---|---|---|---|---|
| GET /rest/v1/telemetry | 500 requests | 1 minute | Per user per org | Data queries |
| POST /rest/v1/telemetry | 1000 requests | 1 minute | Per user per org | Single ingestion |
| POST /functions/v1/telemetry-batch | 100 requests | 1 minute | Per user per org | Batch ingestion |
| GET /functions/v1/telemetry-latest | 200 requests | 1 minute | Per user per org | Latest data |
| GET /functions/v1/telemetry-aggregate | 100 requests | 1 minute | Per user per org | Aggregations |
| POST /functions/v1/telemetry-export | 10 requests | 1 hour | Per user per org | Data export |

High-Volume Ingestion: Contact support for custom telemetry ingestion limits if you need higher throughput.

Alert Management

| Endpoint | Rate Limit | Window | Scope |
|---|---|---|---|
| GET /rest/v1/alerts | 200 requests | 1 minute | Per user per org |
| POST /rest/v1/alert-rules | 20 requests | 1 minute | Per user per org |
| PATCH /rest/v1/alert-rules | 50 requests | 1 minute | Per user per org |
| POST /rest/v1/alerts/{id}/acknowledge | 100 requests | 1 minute | Per user per org |
| POST /rest/v1/alerts/{id}/resolve | 100 requests | 1 minute | Per user per org |
| GET /functions/v1/alert-statistics | 30 requests | 1 minute | Per user per org |

Edge Functions

| Function | Rate Limit | Window | Scope |
|---|---|---|---|
| POST /functions/v1/process-device-data | 1000 requests | 1 minute | Per user per org |
| POST /functions/v1/webhook-handler | 2000 requests | 1 minute | Per IP |
| POST /functions/v1/send-alert-notification | 200 requests | 1 minute | Per user per org |
| POST /functions/v1/generate-report | 10 requests | 1 hour | Per user per org |
| Custom functions | 500 requests | 1 minute | Per user per org |

Real-time Connections

| Connection Type | Limit | Scope | Notes |
|---|---|---|---|
| WebSocket connections | 50 concurrent | Per user | Real-time subscriptions |
| Channel subscriptions | 100 concurrent | Per user | Per connection |
| Message rate | 1000 messages per minute | Per connection | Incoming messages |

Rate Limiting Strategies

1. Token Bucket Algorithm

The primary rate limiter uses a token bucket approach (a client-side sketch follows the list):

  • Each user/organization gets a bucket with tokens
  • Each request consumes one token
  • Tokens are replenished at a steady rate
  • Burst traffic is allowed up to bucket capacity
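
A minimal client-side sketch of the same idea, useful for pacing outbound requests; capacity and refill rate are illustrative, based on the standard telemetry write limit:

// Illustrative client-side token bucket (not the platform's server implementation)
class TokenBucket {
  constructor(capacity = 1000, refillPerSecond = 1000 / 60) {
    this.capacity = capacity              // maximum burst size
    this.tokens = capacity
    this.refillPerSecond = refillPerSecond
    this.lastRefill = Date.now()
  }

  tryConsume() {
    // Replenish tokens at a steady rate, capped at bucket capacity
    const now = Date.now()
    const elapsed = (now - this.lastRefill) / 1000
    this.tokens = Math.min(this.capacity, this.tokens + elapsed * this.refillPerSecond)
    this.lastRefill = now

    if (this.tokens >= 1) {
      this.tokens -= 1                    // each request consumes one token
      return true
    }
    return false                          // caller should wait before retrying
  }
}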

2. Sliding Window

For more precise rate limiting, a sliding window approach is used (sketched after the list):

  • Tracks requests in rolling time windows
  • More accurate than fixed windows
  • Prevents burst at window boundaries
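
The same pattern on the client, as a sliding-window counter over request timestamps; limits are illustrative, and the Python example later in this guide implements the same idea:

// Illustrative sliding-window counter over request timestamps
class SlidingWindowLimiter {
  constructor(maxRequests = 500, windowMs = 60000) {
    this.maxRequests = maxRequests
    this.windowMs = windowMs
    this.timestamps = []
  }

  allow() {
    const now = Date.now()
    // Drop timestamps that have rolled out of the window
    this.timestamps = this.timestamps.filter(t => now - t < this.windowMs)

    if (this.timestamps.length < this.maxRequests) {
      this.timestamps.push(now)
      return true
    }
    return false
  }
}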

3. Hierarchical Limits

Multiple limits can apply simultaneously:

Request → IP Limit → User Limit → Organization Limit → Endpoint Limit → Allow/Deny
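
A sketch of that evaluation order on the client side; the individual limiter objects are hypothetical placeholders, each exposing an allow() check as in the sketches above:

// Hypothetical composition of the hierarchical checks shown above
function isRequestAllowed(request, limiters) {
  const checks = [
    limiters.ip,           // per-IP limit
    limiters.user,         // per-user limit
    limiters.organization, // per-organization limit
    limiters.endpoint      // per-endpoint limit
  ]
  // Deny as soon as any level is exhausted
  return checks.every(limiter => limiter.allow(request))
}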

Handling Rate Limits

Best Practices for Clients

1. Respect Rate Limit Headers

Always check rate limit headers and adjust behavior accordingly:

// JavaScript/TypeScript example
async function makeAPIRequest(requestFn) {
  try {
    const response = await requestFn()
    
    // Check rate limit headers
    const remaining = parseInt(response.headers['x-ratelimit-remaining'])
    const reset = parseInt(response.headers['x-ratelimit-reset'])
    
    if (remaining < 10) {
      const resetTime = new Date(reset * 1000)
      console.log(`Rate limit low. Resets at: ${resetTime}`)
      
      // Slow down requests
      await new Promise(resolve => setTimeout(resolve, 1000))
    }
    
    return response
    
  } catch (error) {
    if (error.status === 429) {
      await handleRateLimit(error) // helper (defined elsewhere) that waits per Retry-After
      return makeAPIRequest(requestFn) // Retry
    }
    throw error
  }
}

2. Implement Exponential Backoff

async function exponentialBackoff(requestFn, maxRetries = 5) {
  let attempt = 0
  
  while (attempt < maxRetries) {
    try {
      return await requestFn()
      
    } catch (error) {
      if (error.status === 429) {
        attempt++
        
        // Get retry delay from header or calculate
        const retryAfter = error.headers?.['retry-after'] || 
                          Math.pow(2, attempt) + Math.random()
        
        console.log(`Rate limited. Retrying in ${retryAfter}s (attempt ${attempt})`)
        await new Promise(resolve => setTimeout(resolve, retryAfter * 1000))
        
      } else {
        throw error
      }
    }
  }
  
  throw new Error('Max retries exceeded')
}

3. Request Queuing

Implement a queue to manage request rate:

class RequestQueue {
  constructor(requestsPerSecond = 10) {
    this.requestsPerSecond = requestsPerSecond
    this.queue = []
    this.processing = false
    this.lastRequestTime = 0
  }
  
  async enqueue(requestFn) {
    return new Promise((resolve, reject) => {
      this.queue.push({ requestFn, resolve, reject })
      this.processQueue()
    })
  }
  
  async processQueue() {
    if (this.processing || this.queue.length === 0) return
    
    this.processing = true
    
    while (this.queue.length > 0) {
      const { requestFn, resolve, reject } = this.queue.shift()
      
      // Ensure minimum time between requests
      const now = Date.now()
      const timeSinceLastRequest = now - this.lastRequestTime
      const minInterval = 1000 / this.requestsPerSecond
      
      if (timeSinceLastRequest < minInterval) {
        await new Promise(r => setTimeout(r, minInterval - timeSinceLastRequest))
      }
      
      try {
        const result = await requestFn()
        resolve(result)
      } catch (error) {
        if (error.status === 429) {
          // Put request back at front of queue
          this.queue.unshift({ requestFn, resolve, reject })
          
          const retryAfter = (error.headers?.['retry-after'] || 30) * 1000
          await new Promise(r => setTimeout(r, retryAfter))
        } else {
          reject(error)
        }
      }
      
      this.lastRequestTime = Date.now()
    }
    
    this.processing = false
  }
}

// Usage
const requestQueue = new RequestQueue(15) // 15 requests per second

const result = await requestQueue.enqueue(() => 
  supabase.from('agents').select('*')
)

Language-Specific Examples

Python Rate Limiting

import time
import requests
from functools import wraps

class RateLimiter:
    def __init__(self, max_requests=100, time_window=60):
        self.max_requests = max_requests
        self.time_window = time_window
        self.requests = []
    
    def wait_if_needed(self):
        now = time.time()
        
        # Remove old requests outside the window
        self.requests = [req_time for req_time in self.requests 
                        if now - req_time < self.time_window]
        
        if len(self.requests) >= self.max_requests:
            # Calculate wait time
            oldest_request = min(self.requests)
            wait_time = self.time_window - (now - oldest_request)
            
            if wait_time > 0:
                time.sleep(wait_time + 0.1)  # Add small buffer
        
        self.requests.append(now)

def with_rate_limiting(rate_limiter):
    def decorator(func):
        @wraps(func)
        def wrapper(*args, **kwargs):
            rate_limiter.wait_if_needed()
            
            max_retries = 3
            for attempt in range(max_retries):
                try:
                    response = func(*args, **kwargs)
                    
                    if hasattr(response, 'status_code') and response.status_code == 429:
                        retry_after = int(response.headers.get('retry-after', 30))
                        time.sleep(retry_after)
                        continue
                    
                    return response
                    
                except requests.exceptions.RequestException as e:
                    if attempt == max_retries - 1:
                        raise
                    time.sleep(2 ** attempt)  # Exponential backoff
            
            return response
        return wrapper
    return decorator

# Usage
rate_limiter = RateLimiter(max_requests=100, time_window=60)

@with_rate_limiting(rate_limiter)
def api_request(url, **kwargs):
    return requests.get(url, **kwargs)

Java Rate Limiting

import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;

public class RateLimiter {
    private final Semaphore semaphore;
    private final int maxRequests;
    private final long timeWindowMs;
    private volatile long windowStart;
    private volatile int requestCount;
    
    public RateLimiter(int maxRequests, long timeWindowMs) {
        this.maxRequests = maxRequests;
        this.timeWindowMs = timeWindowMs;
        this.semaphore = new Semaphore(maxRequests);
        this.windowStart = System.currentTimeMillis();
        this.requestCount = 0;
    }
    
    public void acquire() throws InterruptedException {
        long now = System.currentTimeMillis();
        
        synchronized (this) {
            if (now - windowStart >= timeWindowMs) {
                // Reset window
                semaphore.release(maxRequests - semaphore.availablePermits());
                windowStart = now;
                requestCount = 0;
            }
        }
        
        semaphore.acquire();
        requestCount++;
    }
    
    public void release() {
        // Don't release immediately - let time window handle it
        // This prevents burst beyond the rate limit
    }
    
    public boolean tryAcquire(long timeout, TimeUnit unit) throws InterruptedException {
        return semaphore.tryAcquire(timeout, unit);
    }
}

// Usage with HTTP client
public class APIClient {
    private final RateLimiter rateLimiter;
    private final HttpClient httpClient;
    
    public APIClient() {
        this.rateLimiter = new RateLimiter(100, 60000); // 100 requests per minute
        this.httpClient = HttpClient.newHttpClient();
    }
    
    public HttpResponse<String> makeRequest(HttpRequest request) throws Exception {
        rateLimiter.acquire();
        
        try {
            HttpResponse<String> response = httpClient.send(
                request, 
                HttpResponse.BodyHandlers.ofString()
            );
            
            if (response.statusCode() == 429) {
                String retryAfter = response.headers()
                    .firstValue("retry-after")
                    .orElse("30");
                
                Thread.sleep(Integer.parseInt(retryAfter) * 1000);
                return makeRequest(request); // Retry
            }
            
            return response;
            
        } finally {
            rateLimiter.release();
        }
    }
}

Monitoring Rate Limits

Rate Limit Dashboard

Monitor your API usage through the dashboard:

// Get rate limit status for organization
async function getRateLimitStatus() {
  const response = await supabase.functions.invoke('rate-limit-status')
  
  return {
    current_usage: response.data.current_usage,
    limits: response.data.limits,
    reset_times: response.data.reset_times,
    burst_available: response.data.burst_available
  }
}

// Example response:
{
  "current_usage": {
    "telemetry_write": 750,
    "telemetry_read": 340,
    "agents": 45,
    "alerts": 12
  },
  "limits": {
    "telemetry_write": 1000,
    "telemetry_read": 500,
    "agents": 200,
    "alerts": 200
  },
  "reset_times": {
    "telemetry_write": "2024-07-23T17:01:00Z",
    "telemetry_read": "2024-07-23T17:01:00Z"
  },
  "burst_available": {
    "telemetry_write": 1250,
    "telemetry_read": 750
  }
}

Alerting on Rate Limits

Set up monitoring alerts:

// Monitor rate limit usage
function monitorRateLimits() {
  setInterval(async () => {
    try {
      const status = await getRateLimitStatus()
      
      Object.entries(status.current_usage).forEach(([endpoint, usage]) => {
        const limit = status.limits[endpoint]
        const percentage = (usage / limit) * 100
        
        if (percentage > 80) {
          console.warn(`Rate limit warning: ${endpoint} at ${percentage.toFixed(1)}%`)
          
          // Send alert to monitoring system
          sendAlert({
            type: 'rate_limit_warning',
            endpoint,
            usage,
            limit,
            percentage
          })
        }
      })
      
    } catch (error) {
      console.error('Failed to check rate limits:', error)
    }
  }, 30000) // Check every 30 seconds
}

Optimization Strategies

1. Batch Operations

Use batch endpoints to maximize efficiency:

// Instead of multiple single requests
for (const telemetryPoint of data) {
  await supabase.from('telemetry').insert(telemetryPoint)  // 100+ requests
}

// Use batch endpoint
await supabase.functions.invoke('telemetry-batch', {
  body: { telemetry: data }  // 1 request
})

2. Caching

Implement client-side caching:

class CachedAPIClient {
  constructor(ttlSeconds = 300) {
    this.cache = new Map()
    this.ttl = ttlSeconds * 1000
  }
  
  async get(key, fetchFn) {
    const cached = this.cache.get(key)
    
    if (cached && Date.now() - cached.timestamp < this.ttl) {
      return cached.data
    }
    
    const data = await fetchFn()
    this.cache.set(key, {
      data,
      timestamp: Date.now()
    })
    
    return data
  }
  
  async getAgents() {
    return this.get('agents', async () => {
      const { data } = await supabase.from('agents').select('*')
      return data
    })
  }
}

3. Request Optimization

Optimize queries to reduce API calls:

// Instead of separate requests
const agents = await supabase.from('agents').select('*')
const telemetry = await supabase.from('telemetry').select('*')
const alerts = await supabase.from('alerts').select('*')

// Use joins or batch queries
const { data } = await supabase
  .from('agents')
  .select(`
    *,
    latest_telemetry:telemetry(*),
    active_alerts:alerts(*)
  `)
  // Order and limit the embedded telemetry rows to fetch only the latest reading
  .order('created_at', { foreignTable: 'telemetry', ascending: false })
  .limit(1, { foreignTable: 'telemetry' })

4. Connection Pooling

For real-time connections, manage connection pools:

class ConnectionPool {
  constructor(maxConnections = 10) {
    this.maxConnections = maxConnections
    this.created = 0
    this.available = []
    this.waitingQueue = []
  }
  
  async getConnection() {
    // Reuse a previously released connection when one is available
    if (this.available.length > 0) {
      return this.available.pop()
    }
    
    if (this.created < this.maxConnections) {
      this.created++
      return supabase.channel(`conn-${Date.now()}`)
    }
    
    // Pool exhausted: wait for a connection to be released
    return new Promise(resolve => {
      this.waitingQueue.push(resolve)
    })
  }
  
  releaseConnection(connection) {
    if (this.waitingQueue.length > 0) {
      // Hand the connection directly to the next waiter
      const resolve = this.waitingQueue.shift()
      resolve(connection)
    } else {
      // No one waiting: keep the connection available for reuse
      this.available.push(connection)
    }
  }
}

Custom Rate Limits

Enterprise Custom Limits

Enterprise customers can request custom rate limits:

// Contact support to configure custom limits
const customLimits = {
  organization_id: "org_01H8K2M3N4P5Q6R7S8T9U0V1W2",
  custom_limits: {
    "telemetry_write": 10000,    // 10x standard limit
    "telemetry_read": 5000,      // 10x standard limit
    "real_time_connections": 500, // 10x standard limit
    "custom_endpoints": {
      "/functions/v1/custom-processing": 2000
    }
  },
  burst_multiplier: 3.0,  // Allow 3x burst instead of 2x
  notes: "High-volume IoT deployment with 10,000+ sensors"
}

API Key Specific Limits

Different API keys can have different limits:

// Generate API key with specific limits
const response = await supabase.functions.invoke('generate-api-key', {
  body: {
    name: 'High Volume Telemetry Key',
    permissions: ['telemetry:write'],
    rate_limits: {
      'telemetry_write': 5000,  // Custom limit for this key
      'burst_allowance': 2.5
    },
    expires_in_days: 365
  }
})

Error Recovery

Graceful Degradation

Handle rate limits gracefully in your application:

class ResilientAPIClient {
  constructor() {
    this.fallbackData = new Map()
    this.rateLimitActive = false
  }
  
  async fetchWithFallback(key, fetchFn, fallbackFn) {
    try {
      if (this.rateLimitActive) {
        // Use fallback immediately if rate limited
        return await fallbackFn()
      }
      
      const data = await fetchFn()
      
      // Cache successful responses
      this.fallbackData.set(key, {
        data,
        timestamp: Date.now()
      })
      
      return data
      
    } catch (error) {
      if (error.status === 429) {
        this.rateLimitActive = true
        
        // Reset flag after rate limit window
        const retryAfter = error.headers?.['retry-after'] || 60
        setTimeout(() => {
          this.rateLimitActive = false
        }, retryAfter * 1000)
        
        // Use fallback data
        return await fallbackFn()
      }
      
      throw error
    }
  }
  
  async getAgentsWithFallback() {
    return this.fetchWithFallback(
      'agents',
      () => supabase.from('agents').select('*'),
      () => {
        // Return cached data or basic fallback
        const cached = this.fallbackData.get('agents')
        if (cached && Date.now() - cached.timestamp < 300000) { // 5 min cache
          return cached.data
        }
        
        // Return minimal fallback data
        return [{ id: 'offline', name: 'Data temporarily unavailable' }]
      }
    )
  }
}

Best Practices Summary

Do's ✅

  1. Monitor Headers: Always check X-RateLimit-* headers
  2. Implement Backoff: Use exponential backoff for retries
  3. Use Batch Operations: Combine multiple operations when possible
  4. Cache Responses: Implement appropriate caching strategies
  5. Handle 429 Gracefully: Provide fallback behavior
  6. Monitor Usage: Track your rate limit consumption
  7. Plan for Bursts: Account for traffic spikes in your design

Don'ts ❌

  1. Don't Ignore 429: Always handle rate limit responses
  2. Don't Retry Immediately: Wait for the specified retry period
  3. Don't Hard Code Delays: Use the Retry-After header
  4. Don't Spam Requests: Implement proper request spacing
  5. Don't Skip Error Handling: Always handle rate limit errors
  6. Don't Ignore Warnings: Monitor when approaching limits
  7. Don't Assume Limits: Check current limits via API

Support and Escalation

When to Contact Support

  • Consistently hitting rate limits despite optimization
  • Need for custom enterprise rate limits
  • Suspected rate limiting bugs or issues
  • Questions about appropriate limits for your use case

Support Information

Include when contacting support:

  • Organization ID
  • API endpoint experiencing issues
  • Current request patterns and volumes
  • Rate limit headers from responses
  • Business justification for higher limits

Rate limiting ensures fair usage and optimal performance for all users. Following these guidelines will help you build robust, efficient applications that work well within the platform limits.
